npm - @really-knows-ai/foundry - Versions diffs - 2.1.0 → 2.2.1 - Mend

@really-knows-ai/foundry 2.1.0 → 2.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (24)

package/.opencode/plugins/foundry.js +329 -46
package/CHANGELOG.md +55 -0
package/package.json +3 -2
package/scripts/lib/artefacts.js +6 -0
package/scripts/lib/feedback-transitions.js +25 -0
package/scripts/lib/feedback.js +146 -9
package/scripts/lib/finalize.js +41 -0
package/scripts/lib/history.js +15 -3
package/scripts/lib/pending.js +18 -0
package/scripts/lib/secret.js +23 -0
package/scripts/lib/stage-guard.js +25 -0
package/scripts/lib/state.js +31 -0
package/scripts/lib/token.js +26 -0
package/scripts/lib/workfile.js +12 -1
package/scripts/sort.js +89 -14
package/skills/add-cycle/SKILL.md +11 -6
package/skills/appraise/SKILL.md +33 -17
package/skills/cycle/SKILL.md +25 -19
package/skills/flow/SKILL.md +9 -2
package/skills/forge/SKILL.md +38 -26
package/skills/human-appraise/SKILL.md +29 -17
package/skills/quench/SKILL.md +31 -15
package/skills/sort/SKILL.md +60 -28
package/skills/upgrade-foundry/SKILL.md +33 -1

package/skills/appraise/SKILL.md CHANGED

@@ -14,34 +14,48 @@ Before running this skill, verify that the `foundry/` directory exists in the pr
 > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
+## Stage lifecycle (mandatory)
+Appraise runs inside an enforced stage. Your **first** and **last** tool calls are fixed:
+1. **First:** `foundry_stage_begin({stage, cycle, token})` — copy the token verbatim from the dispatch prompt.
+2. **Last:** `foundry_stage_end({summary})`.
+Appraise makes **no disk writes**. All output flows through `foundry_feedback_add`. `foundry_stage_finalize` flags any unexpected writes as a violation.
 ## Protocol
-1. Gather context:
-   - Call `foundry_workfile_get` — identify the artefact to appraise and its type
-   - Call `foundry_config_laws` — get all applicable laws (global + type-specific)
-   - Call `foundry_config_artefact_type` with the type ID — get the artefact type definition
-   - Call `foundry_appraisers_select` with the type ID — returns selected appraiser personalities with their raw model IDs
+1. `foundry_stage_begin(...)`.
+2. Gather context:
+   - `foundry_workfile_get` — read the `cycle` from frontmatter
+   - `foundry_artefacts_list({cycle: <current-cycle>})` — enumerate this cycle's artefacts. Always pass the `cycle` filter; omitting it returns stale rows from prior sessions. Skip rows whose status is `done` or `blocked`.
+   - For each remaining row, gather its type-specific context:
+     - `foundry_config_laws` with the row's type — applicable laws (global + type-specific)
+     - `foundry_config_artefact_type` with the type ID — the artefact type definition
+     - `foundry_appraisers_select` with the type ID — selected appraiser personalities with their raw model IDs
-2. Dispatch each appraiser as an independent sub-agent (see Dispatch below)
+3. Dispatch each appraiser as an independent sub-agent (see Dispatch below). If this cycle produced multiple artefacts, appraisers evaluate each.
-3. Collect results from all appraisers
+4. Collect results from all appraisers
-4. Consolidate (this is judgment):
+5. Consolidate (this is judgment):
    - Union of all issues — if any one appraiser flags it, it's feedback
    - De-duplicate: merge overlapping observations into a single feedback item
    - Preserve which appraiser(s) raised each issue (for traceability)
-5. For each consolidated issue: call `foundry_feedback_add` with the artefact file path, the issue description, and tag `law:<law-id>`
+6. For each consolidated issue: `foundry_feedback_add(file, text, tag: 'law:<law-id>')`. Tag MUST start with `law:` — the tool rejects other tags during appraise. The tool also de-duplicates by text-hash.
+7. If no appraiser found any issues, the artefact clears appraisal.
-6. If no appraiser found any issues, the artefact clears appraisal
+8. `foundry_stage_end({summary})`.
 ## Reviewing actioned and wont-fix feedback
 On subsequent passes, review previously actioned and wont-fix items:
-1. Call `foundry_feedback_list` to find `actioned` and `wontfix` items for this artefact
-2. For each item, the appraiser sub-agents evaluate whether the change addresses the issue (actioned) or the justification is sound (wont-fix)
-3. Call `foundry_feedback_resolve` with disposition `"approved"` or `"rejected"` (with reason) for each
+1. `foundry_feedback_list` — find `actioned` and `wont-fix` items for this artefact.
+2. Appraiser sub-agents evaluate whether the change addresses the issue (`actioned`) or the justification is sound (`wont-fix`).
+3. `foundry_feedback_resolve(file, index, resolution: 'approved'|'rejected', reason?)`. Appraise is the only stage (other than human-appraise) allowed to resolve `wont-fix` items.
 ## Dispatch
@@ -91,7 +105,7 @@ If there are no issues, return an empty list.
 ## History
-Do NOT call `foundry_history_append` — the sort skill (your caller) is responsible for writing history. Instead, return a clear summary of what you found (e.g., "3 issues found across 2 appraisers" or "No issues found") so sort can log it.
+Do NOT call `foundry_history_append` or `foundry_git_commit` — the sort skill handles those. Return a summary via `foundry_stage_end` (e.g., "3 issues found across 2 appraisers" or "No issues found").
 ### Human override awareness
@@ -99,6 +113,8 @@ When reviewing an artefact, check the feedback history for `#human` tagged items
 ## What you do NOT do
-- You do not revise the artefact
-- You do not check deterministic rules — that is the quench skill's job
-- You do not filter out feedback because only one appraiser raised it — one is enough
+- You do not write files — all output goes through `foundry_feedback_add`.
+- You do not revise the artefact.
+- You do not check deterministic rules — that is the quench skill's job.
+- You do not filter out feedback because only one appraiser raised it — one is enough.
+- You do not register artefacts — that happens automatically via `foundry_stage_finalize`.
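
The consolidation step in the appraise protocol above (union, de-duplicate, preserve which appraisers raised each issue) can be sketched as follows. This is a minimal illustration, not the package's implementation: `consolidate`, `textHash`, the report shape, and the normalisation rule are all assumptions, and the diff does not show how the real tool computes its text-hash. The sketch merges only issues whose text is identical after normalisation; merging merely overlapping observations remains a judgment call.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical normalisation: the real tool's text-hash input is not shown in the diff.
function textHash(text) {
  const normalised = text.trim().replace(/\s+/g, " ").toLowerCase();
  return createHash("sha256").update(normalised).digest("hex");
}

// reports: [{ appraiser, issues: [{ lawId, text }] }] (assumed shape)
function consolidate(reports) {
  const merged = new Map();
  for (const { appraiser, issues } of reports) {
    for (const { lawId, text } of issues) {
      const key = `${lawId}:${textHash(text)}`;
      if (!merged.has(key)) {
        // One appraiser is enough: every distinct issue becomes a feedback item.
        merged.set(key, { tag: `law:${lawId}`, text: text.trim(), raisedBy: [] });
      }
      const item = merged.get(key);
      if (!item.raisedBy.includes(appraiser)) item.raisedBy.push(appraiser);
    }
  }
  return [...merged.values()];
}

const consolidated = consolidate([
  { appraiser: "strict", issues: [{ lawId: "brevity", text: "Line 2 is too long." }] },
  {
    appraiser: "lenient",
    issues: [
      { lawId: "brevity", text: "line 2 is  too long." },
      { lawId: "imagery", text: "No seasonal reference." },
    ],
  },
]);
console.log(consolidated);
```

Each resulting item carries a `law:`-prefixed tag, which is the only tag form the skill text says is accepted during appraise.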

package/skills/cycle/SKILL.md CHANGED

@@ -22,38 +22,42 @@ Before running this skill, verify that the `foundry/` directory exists in the pr
 3. Determine the stage route:
    - Use the cycle definition's `stages` field if present
    - Otherwise generate defaults: always `forge`, add `quench` if `foundry_config_validation` returns non-null for the type, always `appraise`
-   - If the cycle definition has `human-appraise.enabled: true`, append `human-appraise` as the final stage
+   - If the cycle definition has `human-appraise: true`, append `human-appraise` as the final stage (runs every iteration). If `human-appraise: false` (default), do NOT include it in `stages` — sort will synthesize `human-appraise:<cycle>` on deadlock when needed.
    - Stages should use `base:alias` format (e.g. `forge:write-haiku`, `quench:check-syllables`). If you pass bare names, the tool will auto-append the cycle ID as the alias.
-4. Call `foundry_workfile_set` to configure the work file:
-   - `key: "cycle"`, `value: <cycle-id>`
-   - `key: "stages"`, `value: <determined stages list>`
-   - `key: "max-iterations"`, `value: <default 3 or from cycle definition>`
-   - If the cycle definition has a `models` map: `key: "models"`, `value: <models map>`
+4. Call `foundry_workfile_configure_from_cycle({cycleId, stages})` with the cycle ID and the stages list from step 3. The tool reads the cycle definition and writes `cycle`, `stages`, `max-iterations`, `human-appraise`, `deadlock-appraise`, `deadlock-iterations`, and (if present) `models` into WORK.md in a single call, applying defaults for anything the cycle def omits. Do **not** use `foundry_workfile_set` for this — the configure tool is the authoritative cycle-def → WORK.md translator.
 5. Invoke the sort skill
 ## Sort drives everything
-Once sort is invoked, it calls `foundry_sort` to determine the next stage, invokes the corresponding skill, then calls sort again. This repeats until sort returns `done` or `blocked`.
+Once sort is invoked, it calls `foundry_sort` to determine the next stage, dispatches the corresponding skill to a fresh subagent with a single-use token, calls `foundry_stage_finalize` to register outputs (or detect file-pattern violations), writes history, and commits. This repeats until sort returns `done`, `blocked`, or `violation`.
-The cycle skill does not contain routing logic — sort owns all of that.
+The cycle skill does not contain routing, finalization, history, or commit logic — sort owns all of that. The cycle skill only sets up the work file and reacts to sort's terminal result.
 ## Completing a foundry cycle
 When sort returns `done`:
-- Call `foundry_artefacts_set_status` with status `"done"`
-- Return control to the flow skill
+- Call `foundry_artefacts_set_status(file, 'done')` for the cycle's output artefact.
+- Return control to the flow skill.
 When sort returns `blocked`:
-- Call `foundry_artefacts_set_status` with status `"blocked"`
-- Return control to the flow skill (the flow decides how to handle it)
+- The target artefact is usually already marked `blocked` by sort (on violations) or by human-appraise (on explicit abort). If not, call `foundry_artefacts_set_status(file, 'blocked')`.
+- Return control to the flow skill — the flow decides how to handle it.
+When sort returns `violation` (e.g., `stage_finalize` `unexpected_files`, missing subagent, or file-pattern violation):
+- Sort has already marked affected artefacts blocked and returned. Treat as the blocked path.
+- Return control to the flow skill.
 ## Human Appraise
-If the cycle definition has `human-appraise.enabled: true`, the human-appraise stage is included after appraise. Sort will route to it after LLM appraisers pass, or earlier if a deadlock is detected.
+Human-appraise is controlled by two flat cycle-def keys:
+- `human-appraise: true` — human-appraise runs every iteration as part of the normal stage flow (appended to `stages`).
+- `deadlock-appraise: true` (default) — if LLM appraisers deadlock on the same feedback for `deadlock-iterations` rounds (default 5), sort routes to human-appraise to resolve it, even when it isn't in `stages`.
+- `deadlock-appraise: false` — no human intervention; deadlock → `blocked`.
 ## Micro commits
-Every stage must end with a micro commit. Call `foundry_git_commit` with message format: `[<cycle-id>] <base>:<alias>: <brief description>`
+Every stage ends with a micro commit, written by sort (not cycle, not subagents). The message format is `[<cycle-id>] <base>:<alias>: <brief description>`.
 Examples:
 - `[haiku-creation] forge:write-haiku: initial draft`
@@ -74,8 +78,10 @@ Tag types: `validation` (from quench), `law:<law-id>` (from appraise), `human` (
 ## What you do NOT do
-- You do not make routing decisions — sort does that
-- You do not change the laws mid-cycle
-- You do not decide the artefact is "close enough" — it passes or it doesn't
-- You do not proceed past a file modification violation
-- You do not modify input artefacts — they are read-only
+- You do not make routing decisions — sort does that.
+- You do not register artefacts — `foundry_stage_finalize` does that (invoked by sort).
+- You do not write history or commits — sort does that.
+- You do not change the laws mid-cycle.
+- You do not decide the artefact is "close enough" — it passes or it doesn't.
+- You do not proceed past a file modification violation — honor sort's `violation`/`blocked` return.
+- You do not modify input artefacts — they are read-only.
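
The stage-route rules and defaults described in the cycle skill above can be sketched as plain functions. This is an illustration under stated assumptions: `determineStages` and `workfileSettings` are hypothetical names, `hasValidation` stands in for a non-null `foundry_config_validation` result, and the `max-iterations` default of 3 is taken from the removed 2.1.0 text and assumed unchanged. Whether `human-appraise` is also appended to an explicit `stages` list is not stated in the diff; the sketch appends it only when it is missing.

```javascript
// Hypothetical helper mirroring step 3 of the cycle skill.
function determineStages(cycleId, cycleDef, hasValidation) {
  const stages = cycleDef.stages
    ? [...cycleDef.stages]
    : ["forge", ...(hasValidation ? ["quench"] : []), "appraise"];
  const hasHuman = stages.some((s) => s.split(":")[0] === "human-appraise");
  if (cycleDef["human-appraise"] === true && !hasHuman) {
    stages.push("human-appraise");
  }
  // Bare names get the cycle ID as alias, as the skill says the tool does.
  return stages.map((s) => (s.includes(":") ? s : `${s}:${cycleId}`));
}

// Hypothetical view of the keys the configure tool is described as writing.
function workfileSettings(cycleId, cycleDef, hasValidation) {
  return {
    cycle: cycleId,
    stages: determineStages(cycleId, cycleDef, hasValidation),
    "max-iterations": cycleDef["max-iterations"] ?? 3, // assumed default
    "human-appraise": cycleDef["human-appraise"] ?? false,
    "deadlock-appraise": cycleDef["deadlock-appraise"] ?? true,
    "deadlock-iterations": cycleDef["deadlock-iterations"] ?? 5,
    ...(cycleDef.models ? { models: cycleDef.models } : {}),
  };
}

const defaults = workfileSettings("haiku-creation", {}, true);
const explicit = workfileSettings(
  "haiku-creation",
  { stages: ["forge:write-haiku", "appraise"], "human-appraise": true },
  false
);
console.log(defaults.stages, explicit.stages);
```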

package/skills/flow/SKILL.md CHANGED

@@ -23,8 +23,15 @@ Before running this skill, verify that the `foundry/` directory exists in the pr
    - If only one starting cycle, use it
    - If multiple starting cycles, check whether the user's request makes the choice obvious (e.g., "write a haiku" clearly maps to `create-haiku`)
    - If ambiguous, prompt the user to choose
-4. Call `foundry_workfile_create` with **only** the flow ID, chosen cycle ID, and goal — do **not** pass `stages` or `maxIterations`. The `cycle` skill will read the cycle definition and populate those via `foundry_workfile_set` in the next step.
-5. Execute the cycle by invoking the cycle skill
+4. Pre-check for an existing workfile (prevents silent data loss from an aborted prior session):
+   a. Call `foundry_workfile_get`.
+   b. If it returns `{error: ...}` (no WORK.md), proceed to step 5.
+   c. If it returns an existing workfile, present its `flow`, `cycle`, and `goal` to the user alongside the values just requested, then prompt for one of:
+      - **Resume** — keep the existing workfile and skip to step 6. **Only offer resume if the existing `flow` AND `cycle` match what the user just asked for.** If either differs, do not offer resume — running the wrong cycle against stale state corrupts the workflow.
+      - **Discard** — call `foundry_workfile_delete`, then proceed to step 5.
+      - **Abort** — stop the skill without modifying anything.
+5. Call `foundry_workfile_create` with **only** the flow ID, chosen cycle ID, and goal; do **not** pass `stages` or `maxIterations`. The `cycle` skill will read the cycle definition and populate those via `foundry_workfile_configure_from_cycle` in the next step.
+6. Execute the cycle by invoking the cycle skill
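
The pre-check in step 4 above reduces to a small decision: which options may be offered to the user given an existing workfile. A minimal sketch, assuming a hypothetical `existingWorkfileOptions` helper and the `{error: ...}` return shape the step describes for a missing WORK.md:

```javascript
// existing: the result of foundry_workfile_get; requested: { flow, cycle } just asked for.
function existingWorkfileOptions(existing, requested) {
  // No WORK.md: nothing to protect, go straight to creating one.
  if (!existing || existing.error) return ["create"];
  // Resume is only safe when both flow AND cycle match the new request.
  const sameRun =
    existing.flow === requested.flow && existing.cycle === requested.cycle;
  return sameRun ? ["resume", "discard", "abort"] : ["discard", "abort"];
}

console.log(existingWorkfileOptions({ error: "no WORK.md" }, { flow: "f", cycle: "c" }));
console.log(existingWorkfileOptions({ flow: "f", cycle: "c", goal: "g" }, { flow: "f", cycle: "c" }));
console.log(existingWorkfileOptions({ flow: "f", cycle: "other", goal: "g" }, { flow: "f", cycle: "c" }));
```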
 ## Between cycles

package/skills/forge/SKILL.md CHANGED

@@ -14,33 +14,42 @@ Before running this skill, verify that the `foundry/` directory exists in the pr
 > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
+## Stage lifecycle (mandatory)
+Forge runs inside an enforced stage. Your **first** and **last** tool calls are fixed:
+1. **First:** `foundry_stage_begin({stage, cycle, token})` — the orchestrator hands you `stage`, `cycle`, and an opaque `token` string in the dispatch prompt. Copy the token verbatim; never invent, edit, or re-sign it. No other tool call is permitted before this one. Any writes before `stage_begin` will be blocked by preconditions.
+2. **Last:** `foundry_stage_end({summary})` — return control to the orchestrator. After `stage_end`, the orchestrator calls `foundry_stage_finalize` which scans the disk and registers your output artefact. **You do not register artefacts yourself.**
 ## Protocol
 ### First generation (no artefact registered yet)
-1. Call `foundry_workfile_get` — understand the goal
-2. Call `foundry_config_cycle` — understand what to produce and what inputs are available
-3. Call `foundry_config_artefact_type` with the output type ID — get the artefact type definition
-4. Call `foundry_config_laws` — get all applicable laws (global + type-specific)
-5. If the cycle has inputs, read the input artefacts (read-only context)
-6. Produce the artefact, respecting all applicable laws from the start (this is judgment — use your craft)
-7. Write the artefact file to the location specified in the artefact type definition
-8. Call `foundry_artefacts_add` with the file path, type, and cycle to register it with status `"draft"`
+1. `foundry_stage_begin(...)` with the token from the dispatch prompt.
+2. `foundry_workfile_get` — understand the goal.
+3. `foundry_config_cycle` — understand what to produce and what inputs are available.
+4. `foundry_config_artefact_type` with the output type ID — get the artefact type definition, especially its `file-patterns`.
+5. `foundry_config_laws` — get all applicable laws (global + type-specific).
+6. If the cycle has inputs, read the input artefacts (read-only context).
+7. Produce the artefact, respecting all applicable laws from the start.
+8. Write the artefact file to a location that matches the artefact type's `file-patterns`.
+9. `foundry_stage_end({summary})`.
 ### Revision (feedback exists)
-1. Call `foundry_feedback_list` to find unresolved feedback for the artefact
-2. Read the artefact file
-3. If the cycle has inputs, read the input artefacts (read-only context)
-4. For each unresolved feedback item, either:
-   - Address it and call `foundry_feedback_action` with the item ID (marks as actioned)
-   - Call `foundry_feedback_wontfix` with the item ID and a justification (appraisal feedback only)
-5. Update the artefact file
-6. Wont-fix is only available for `law:` feedback (subjective appraisal). Validation feedback must be actioned — deterministic rules are not negotiable.
+1. `foundry_stage_begin(...)`.
+2. `foundry_feedback_list` — find unresolved feedback for the artefact.
+3. Read the artefact file.
+4. If the cycle has inputs, read the input artefacts (read-only context).
+5. For each unresolved feedback item, either:
+   - Address it and call `foundry_feedback_action` (marks item `actioned`), or
+   - Call `foundry_feedback_wontfix` with a justification — available only for `law:` / `human` tags (validation feedback must be actioned).
+6. Update the artefact file.
+7. `foundry_stage_end({summary})`.
-### After (both paths)
+## File-pattern hygiene
-Do NOT call `foundry_history_append` — the sort skill (your caller) is responsible for writing history. Instead, return a clear summary of what you did so sort can log it.
+Writes during forge must match the output artefact type's `file-patterns`. The only other files that may change are `WORK.md` and `WORK.history.yaml`, which are managed by tools. Writing to any other path causes `foundry_stage_finalize` to return `{error: 'unexpected_files'}`, and the orchestrator will mark the cycle's target artefact `blocked`. You will not get a retry.
 ## Unresolved feedback
@@ -53,14 +62,17 @@ An item is resolved if it is `approved`.
 ## #human feedback
 Feedback tagged `human` (from the human-appraise stage) takes absolute priority:
-- You MUST address it — you cannot wont-fix `#human` feedback
-- When `#human` feedback contradicts LLM appraiser feedback on the same topic, follow the human's direction
-- Acknowledge the human's input in your revision
+- You MUST address it — you cannot wont-fix `#human` feedback.
+- When `#human` feedback contradicts LLM appraiser feedback on the same topic, follow the human's direction.
+- Acknowledge the human's input in your revision.
 ## What you do NOT do
-- You do not evaluate or score the artefact
-- You do not add feedback — that is the quench skill's and appraise skill's job
-- You do not mark feedback as actioned unless you actually changed the artefact to address it
-- You do not wont-fix validation feedback
-- You do not modify input artefacts — they are read-only
+- You do not add feedback — that is the quench and appraise skills' job. (`foundry_feedback_add` is blocked for you at the tool layer.)
+- You do not `foundry_feedback_resolve` — that belongs to quench/appraise/human-appraise.
+- You do not register artefacts — `foundry_stage_finalize` handles that automatically.
+- You do not call `foundry_history_append` or `foundry_git_commit` — the sort skill does.
+- You do not evaluate or score the artefact.
+- You do not mark feedback as actioned unless you actually changed the artefact to address it.
+- You do not wont-fix validation feedback.
+- You do not modify input artefacts — they are read-only.

package/skills/human-appraise/SKILL.md CHANGED

@@ -14,17 +14,27 @@ Before running this skill, verify that the `foundry/` directory exists in the pr
 > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
+## Stage lifecycle (mandatory)
+Human-appraise runs inside an enforced stage. Your **first** and **last** tool calls are fixed:
+1. **First:** `foundry_stage_begin({stage, cycle, token})` — copy the token verbatim from the dispatch prompt.
+2. **Last:** `foundry_stage_end({summary})`.
+Human-appraise makes **no disk writes**. All output flows through `foundry_feedback_add` / `foundry_feedback_resolve` / `foundry_artefacts_set_status`. `foundry_stage_finalize` flags unexpected writes as a violation.
 ## Protocol
-1. Gather context by calling:
-   - `foundry_workfile_get` — current state, goal, artefacts
-   - `foundry_artefacts_list` — current artefact files and status
+1. `foundry_stage_begin(...)`.
+2. Gather context by calling:
+   - `foundry_workfile_get` — current state, goal, cycle
+   - `foundry_artefacts_list({cycle: <current-cycle>})` — this cycle's artefact files and status (always pass the `cycle` filter; omitting it returns stale rows from prior sessions)
    - `foundry_feedback_list` — all existing feedback
    - `foundry_history_list` — what has happened so far
-2. Read the artefact file(s) for this cycle.
+3. Read the artefact file(s) for this cycle.
-3. Present to the human:
+4. Present to the human:
    - The current artefact content (full file content or multi-file diff)
    - A summary of this iteration's feedback (resolved and open)
    - If this is a deadlock escalation, clearly explain the deadlock:
@@ -33,15 +43,15 @@ Before running this skill, verify that the `foundry/` directory exists in the pr
      - Forge's wont-fix or revision justification
      - Ask the human to resolve the disagreement
-4. Wait for the human's response.
+5. Wait for the human's response.
-5. Act on the response:
-   - **Approve** — "looks good" / "continue" — no feedback added, sort will advance
-   - **Provide feedback** — call `foundry_feedback_add` with the human's feedback and tag `human`. Sort will route back to forge.
-   - **Dismiss deadlocked feedback** — call `foundry_feedback_resolve` with `resolution: "approved"` on the deadlocked item(s). This overrides the appraiser.
-   - **Abort** — call `foundry_artefacts_set_status` with status `"blocked"`, cycle ends
+6. Act on the response (tag MUST be `human` on any added feedback — the tool rejects other tags during human-appraise):
+   - **Approve** — "looks good" / "continue" — no feedback added, sort will advance.
+   - **Provide feedback** — `foundry_feedback_add(file, text, tag: 'human')`. Sort will route back to forge.
+   - **Dismiss deadlocked feedback** — `foundry_feedback_resolve(file, index, resolution: 'approved')`. Human-appraise may resolve items in state `actioned` or `wont-fix`. This overrides the appraiser.
+   - **Abort** — `foundry_artefacts_set_status(file, 'blocked')`, cycle ends.
-6. Return a clear summary of what the human decided so sort can log it in history.
+7. `foundry_stage_end({summary})` — describe what the human decided so sort can log it.
 ## #human feedback rules
@@ -51,8 +61,10 @@ Before running this skill, verify that the `foundry/` directory exists in the pr
 ## What you do NOT do
-- You do not make decisions for the human — present the state and wait
-- You do not modify the artefact
-- You do not skip the pause — the human must respond before continuing
-- You do not filter or summarise away important details — show the full picture
-- You do not call `foundry_history_append` — sort owns history writing
+- You do not write files — all output goes through foundry tools.
+- You do not make decisions for the human — present the state and wait.
+- You do not modify the artefact.
+- You do not skip the pause — the human must respond before continuing.
+- You do not filter or summarise away important details — show the full picture.
+- You do not call `foundry_history_append` or `foundry_git_commit` — sort owns those.
+- You do not register artefacts — handled by `foundry_stage_finalize`.
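
The "first and last tool calls are fixed" rule shared by the stage skills above can be expressed as a check over a stage's ordered tool-call names. A minimal sketch; `checkLifecycle` and its return shape are hypothetical and do not reflect how the package's stage guard is implemented:

```javascript
// toolCalls: ordered list of tool names a stage subagent invoked.
function checkLifecycle(toolCalls) {
  if (toolCalls.length === 0) {
    return { ok: false, reason: "no tool calls" };
  }
  if (toolCalls[0] !== "foundry_stage_begin") {
    return { ok: false, reason: "first call must be foundry_stage_begin" };
  }
  if (toolCalls[toolCalls.length - 1] !== "foundry_stage_end") {
    return { ok: false, reason: "last call must be foundry_stage_end" };
  }
  return { ok: true };
}

const good = checkLifecycle([
  "foundry_stage_begin",
  "foundry_workfile_get",
  "foundry_feedback_add",
  "foundry_stage_end",
]);
const bad = checkLifecycle(["foundry_workfile_get", "foundry_stage_end"]);
console.log(good, bad);
```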

package/skills/quench/SKILL.md CHANGED

@@ -14,33 +14,49 @@ Before running this skill, verify that the `foundry/` directory exists in the pr
 > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
+## Stage lifecycle (mandatory)
+Quench runs inside an enforced stage. Your **first** and **last** tool calls are fixed:
+1. **First:** `foundry_stage_begin({stage, cycle, token})` — copy the token verbatim from the dispatch prompt. Any other tool call before this will be blocked.
+2. **Last:** `foundry_stage_end({summary})`.
+Quench makes **no disk writes**. You produce feedback via `foundry_feedback_add`, never by creating or modifying files. `foundry_stage_finalize` (run by the orchestrator after you return) will flag any unexpected writes as a violation.
 ## Protocol
-1. Call `foundry_workfile_get` to identify the artefact and its type.
-2. Call `foundry_config_validation` with the artefact type ID. If it returns null, output SKIP and stop — there is no validation for this type.
-3. Call `foundry_validate_run` with the type ID and artefact file path. It executes all validation commands and returns results.
-4. For each failure: call `foundry_feedback_add` with the artefact file path, a description of the failure, and tag `"validation"`.
-5. If all commands pass, add no new feedback.
+1. `foundry_stage_begin(...)`.
+2. `foundry_workfile_get` — read the `cycle` from frontmatter.
+3. `foundry_artefacts_list({cycle: <current-cycle>})` — enumerate the artefacts produced by **this** cycle. Always pass the `cycle` filter; omitting it returns rows from prior sessions and validates stale files. Skip rows whose status is `done` or `blocked`.
+4. For each remaining row:
+   a. `foundry_config_validation` with the row's type. If it returns null, skip this row.
+   b. `foundry_validate_run` with the type ID and the row's file path — executes all validation commands and returns results.
+   c. For each failure: `foundry_feedback_add(file, text, tag: 'validation')`. Tag MUST be `validation` — the tool rejects other tags during quench.
+5. If every command passes for every row, add no new feedback.
+6. If the artefact table has no rows for this cycle, `foundry_stage_end({summary: 'SKIP: no artefacts registered for this cycle'})` and stop.
+7. `foundry_stage_end({summary})`.
 ## Reviewing actioned feedback
 On subsequent passes, review previously actioned items:
-1. Call `foundry_feedback_list` to find `actioned` items tagged `validation` for this artefact.
+1. `foundry_feedback_list` — find `actioned` items tagged `validation` for artefacts in this cycle (use the file list from step 3 above).
 2. Re-run the relevant command via `foundry_validate_run`.
-3. If the check now passes: call `foundry_feedback_resolve` with disposition `"approved"`.
-4. If it still fails: call `foundry_feedback_resolve` with disposition `"rejected"` and a reason.
+3. If the check now passes: `foundry_feedback_resolve(file, index, resolution: 'approved')`.
+4. If it still fails: `foundry_feedback_resolve(file, index, resolution: 'rejected', reason)`.
-There is no wont-fix for validation feedback. Deterministic rules are not negotiable.
+There is no wont-fix for validation feedback — deterministic rules are not negotiable. Quench may only resolve items in state `actioned`; the feedback tool enforces this.
 ## History
-Do NOT call `foundry_history_append` — the sort skill (your caller) is responsible for writing history. Instead, return a clear summary of what you found (e.g., "2 validation issues found" or "Validation passed") so sort can log it.
+Do NOT call `foundry_history_append` or `foundry_git_commit` — the sort skill handles those. Return a clear summary via `foundry_stage_end` (e.g., "2 validation issues found" or "Validation passed").
 ## What you do NOT do
-- You do not make subjective judgments
-- You do not revise the artefact
-- You do not evaluate laws — that is the appraise skill's job
-- You do not invent validation rules — you only run commands from the validation config
-- You do not duplicate feedback that already exists
+- You do not write files — all output goes through `foundry_feedback_add`.
+- You do not make subjective judgments.
+- You do not revise the artefact (forge's job).
+- You do not evaluate laws — that is the appraise skill's job.
+- You do not invent validation rules — you only run commands from the validation config.
+- You do not duplicate feedback that already exists (the tool de-duplicates by text-hash, but don't rely on it).
+- You do not register artefacts — that happens automatically.
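
Taken together, the quench, appraise, human-appraise, and forge skills describe per-stage rules for which feedback tags may be added and which item states may be resolved. The table-driven sketch below restates those rules; `canAddFeedback`, `canResolve`, and both maps are hypothetical and only summarise the skill text, not the package's `feedback.js`.

```javascript
// Which tag a stage may use with foundry_feedback_add (forge has no entry: blocked).
const TAG_RULES = new Map([
  ["quench", (tag) => tag === "validation"],
  ["appraise", (tag) => tag.startsWith("law:")],
  ["human-appraise", (tag) => tag === "human"],
]);

// Which feedback states a stage may pass to foundry_feedback_resolve.
const RESOLVABLE_STATES = new Map([
  ["quench", ["actioned"]],
  ["appraise", ["actioned", "wont-fix"]],
  ["human-appraise", ["actioned", "wont-fix"]],
]);

// Stages arrive in base:alias form, e.g. "quench:check-syllables".
const baseOf = (stage) => stage.split(":")[0];

function canAddFeedback(stage, tag) {
  const rule = TAG_RULES.get(baseOf(stage));
  return rule ? rule(tag) : false;
}

function canResolve(stage, itemState) {
  return (RESOLVABLE_STATES.get(baseOf(stage)) ?? []).includes(itemState);
}

console.log(canAddFeedback("quench:check-syllables", "validation")); // true
console.log(canAddFeedback("forge:write-haiku", "law:brevity")); // false
console.log(canResolve("quench:check-syllables", "wont-fix")); // false
```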

package/skills/sort/SKILL.md CHANGED

@@ -6,7 +6,7 @@ description: Deterministic routing for a foundry cycle. Runs the foundry_sort to
 # Sort
-You are the central dispatcher for a foundry cycle. You call the `foundry_sort` tool to determine what stage to execute next, then dispatch that stage to a fresh subagent.
+You are the central dispatcher for a foundry cycle. You call `foundry_sort` to determine what stage to execute next, dispatch that stage to a fresh subagent, finalize the stage's disk output, and log history. You are the sole writer of history and git commits.
 ## Prerequisites
@@ -16,31 +16,40 @@ Before running this skill, verify that the `foundry/` directory exists in the pr
 ## Protocol
-1. Call `foundry_sort` (optionally passing `cycleDef` if the cycle definition has a non-standard path). It returns `{route, model?, details?}`.
+1. Call `foundry_sort` (optionally passing `cycleDef`). It returns `{route, model?, token?, details?}`. For dispatchable routes (`forge|quench|appraise|human-appraise:*`) the tool mints a single-use, time-limited `token`.
-2. Call `foundry_history_append` with the current cycle, stage `"sort"`, and a comment explaining the routing decision in natural language. This is your audit trail — if something goes wrong, this comment is what someone will read to understand what happened.
+2. Call `foundry_history_append({cycle, stage: 'sort', comment, route})` — the `route` field records what sort decided, and **subsequent** `history_append` calls for non-sort stages are enforced to match this route. This is your audit trail.
 3. Act on the route:
-   - `forge:*` — **dispatch** (see §Dispatch below)
-   - `quench:*` — **dispatch**
-   - `appraise:*` — **dispatch**. Note: the appraise skill handles its own per-appraiser model resolution internally.
-   - `human-appraise:*` — invoke the human-appraise skill inline (human stage, no subagent)
-   - `done` — foundry cycle is complete, return to the cycle skill
-   - `blocked` — foundry cycle is blocked (iteration limit hit with unresolved feedback), return to the cycle skill
-   - `violation` — a validation, file-modification, or missing-subagent violation was detected (see `details`). The cycle halts — call `foundry_artefacts_set_status` with status `"blocked"` for each affected artefact, and return to the cycle skill. If `details` mentions a missing subagent, tell the user to run the `refresh-agents` skill and restart.
+   - `forge:*` / `quench:*` / `appraise:*` — **dispatch** (see §Dispatch).
+   - `human-appraise:*` — invoke the human-appraise skill inline (human stage, no subagent) but still pass the `token`; the skill must call `foundry_stage_begin` with it.
+   - `done` — cycle is complete, return to the cycle skill.
+   - `blocked` — iteration limit hit with unresolved feedback, return to the cycle skill.
+   - `violation` — a validation, file-modification, or missing-subagent violation was detected (see `details`). Halt the cycle: call `foundry_artefacts_set_status(file, 'blocked')` for each affected artefact, and return to the cycle skill. If `details` mentions a missing subagent, tell the user to run `refresh-agents` and restart.
-4. After the subagent completes, call `foundry_history_append` with the current cycle, the **dispatched stage alias** (e.g., `forge:write-haiku`), and a comment summarizing what the subagent reported doing. This is critical — sort is the only reliable writer of stage history. Subagents must NOT write their own history entries.
+4. **After** the dispatched subagent returns, call `foundry_stage_finalize({cycle})`. Handle three outcomes:
+   - `{ok: true, artefacts: [...]}` — the tool has already registered output artefact rows in WORK.md. Proceed to step 5.
+   - `{error: 'unexpected_files', files: [...]}` — the subagent wrote outside the artefact type's `file-patterns`. Mark the cycle's target artefact `blocked` via `foundry_artefacts_set_status` and do **not** re-run the stage. Add a `violation` feedback item describing the offending files, then return to the cycle skill.
+   - Any other error — surface it to the user and halt.
-5. After logging the stage history, call `foundry_sort` again. Repeat from step 1 until it returns `done`, `blocked`, or `violation`.
+5. Call `foundry_history_append({cycle, stage: <dispatched-stage-alias>, comment})` summarizing what the subagent reported. The tool enforces that the stage alias matches the most recent sort's `route` — this is why step 2's `route` field matters.
+6. Call `foundry_git_commit({cycle, stage, description})` to record the stage's disk changes. **This is mandatory.** The next `foundry_sort` call will return `{route: 'violation', details: 'Uncommitted tool-managed files...'}` if WORK.md, WORK.history.yaml, or anything under `.foundry/` is dirty — the tool enforces one commit per stage.
+7. Return to step 1. Repeat until `done`, `blocked`, or `violation`.
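
The three `foundry_stage_finalize` outcomes in step 4 can be sketched as a small decision function. This is only an illustration: `nextActionAfterFinalize` and its `action` labels are hypothetical names, not part of Foundry; only the result shapes (`{ok, artefacts}` and `{error: 'unexpected_files', files}`) come from the text above.

```javascript
// Hypothetical helper (not a Foundry API): maps a foundry_stage_finalize
// result to the orchestrator's next move, following step 4 above.
function nextActionAfterFinalize(result) {
  if (result && result.ok === true) {
    // Artefact rows are already registered in WORK.md; go on to step 5.
    return { action: 'continue', artefacts: result.artefacts ?? [] };
  }
  if (result && result.error === 'unexpected_files') {
    // Mark the target artefact blocked, add feedback, do not re-run the stage.
    return { action: 'block', files: result.files ?? [] };
  }
  // Any other error: surface it to the user and halt.
  return { action: 'halt', error: result ? result.error : 'no result' };
}
```

Keeping the decision separate from the tool calls makes it easy to confirm that no branch re-runs a stage.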
 ## Dispatch
-Every forge, quench, and appraise stage runs in a **fresh subagent**. Never inline the stage work in the orchestrator conversation — even if the chosen model happens to match the orchestrator's model. The orchestrator's job is to route and log, nothing else.
+Every forge, quench, and appraise stage runs in a **fresh subagent**. Never inline the stage work in the orchestrator conversation — even if the chosen model matches the orchestrator's. The orchestrator's job is to route, dispatch, finalize, and log. Nothing else.
 ### Choosing the subagent
-- If `foundry_sort` returned a `model` field in its response, use that value verbatim as `subagent_type`. It is already in `foundry-<slug>` form (the tool does the slug computation by replacing both `/` and `.` with `-` in the model ID).
-- If `foundry_sort` returned **no** `model` field (the cycle has no `models:` map, or no entry for this stage base), dispatch to the default general-purpose subagent: `general`.
+- If `foundry_sort` returned a `model` field, use it verbatim as `subagent_type`. It is already in `foundry-<slug>` form.
+- If there is no `model` field, dispatch to `general`.
+### Token handling
+The `token` returned by `foundry_sort` is an opaque signed string. Pass it through the dispatch prompt verbatim. **Never** invent, edit, or re-sign tokens. The subagent's first tool call must be `foundry_stage_begin({stage, cycle, token})` using this exact string; `stage_begin` verifies the signature, expiry, and single-use nonce.
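
To make the three checks concrete (signature, expiry, single-use nonce), here is a minimal sketch of how an HMAC-signed token could be verified. The token layout, the payload fields (`nonce`, `exp`), and the function names are assumptions for illustration only; the real format is opaque and may differ.

```javascript
const crypto = require('node:crypto');

// Assumed layout (NOT Foundry's actual format):
//   "<base64url JSON payload>.<base64url HMAC-SHA256 of the payload part>"
function signToken(secret, payload) {
  const body = Buffer.from(JSON.stringify(payload)).toString('base64url');
  const mac = crypto.createHmac('sha256', secret).update(body).digest('base64url');
  return `${body}.${mac}`;
}

// usedNonces is a Set of nonces already consumed (hypothetical storage).
function verifyToken(secret, token, usedNonces, now = Date.now()) {
  const [body, mac] = String(token).split('.');
  if (!body || !mac) return { ok: false, reason: 'malformed' };

  // 1. Signature: recompute the HMAC and compare in constant time.
  const expected = crypto.createHmac('sha256', secret).update(body).digest();
  const given = Buffer.from(mac, 'base64url');
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) {
    return { ok: false, reason: 'bad_signature' };
  }

  // 2. Expiry, then 3. single-use nonce.
  const payload = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
  if (payload.exp <= now) return { ok: false, reason: 'expired' };
  if (usedNonces.has(payload.nonce)) return { ok: false, reason: 'replayed' };
  usedNonces.add(payload.nonce);
  return { ok: true, payload };
}
```

Under these assumptions, any edit to the token string changes the payload bytes or the MAC, so verification fails. That is why the token must be copied verbatim.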
 ### Dispatch call shape
@@ -48,32 +57,55 @@ Use the `task` tool:
 ```
 task tool:
-  subagent_type: <model-slug-from-foundry_sort-response, or "general">
+  subagent_type: <model-slug-from-foundry_sort, or "general">
   description: "Run <stage-alias> for <cycle-id>"
   prompt: |
     You are a Foundry stage agent. Invoke the <stage-base> skill and follow its instructions exactly.
-    Current cycle: <cycle-id>
-    Current stage: <stage-alias>
+    Stage: <stage-alias>
+    Cycle: <cycle-id>
+    Token: <token-verbatim>
     Working directory: <worktree>
+    File patterns (forge only): <file-patterns-list>
+    Your FIRST tool call MUST be foundry_stage_begin({stage, cycle, token}) using the values above.
+    Your LAST tool call MUST be foundry_stage_end({summary}).
-    When done, report back a brief summary of what you did. Do NOT call foundry_history_append — the orchestrator handles history.
+    When done, report back a brief summary. Do NOT call foundry_history_append, foundry_git_commit, or foundry_artefacts_add — the orchestrator handles all of those.
 ```
 Substitute:
 - `<stage-alias>` — the full route string from `foundry_sort` (e.g., `forge:write-haiku`)
-- `<stage-base>` — the base of the alias (e.g., `forge`, `quench`, `appraise`)
-- `<cycle-id>` — the current cycle ID from WORK.md frontmatter
-- `<worktree>` — the current working directory
+- `<stage-base>` — the base of the alias
+- `<cycle-id>` — current cycle ID from WORK.md frontmatter
+- `<token-verbatim>` — exactly the `token` string from `foundry_sort` — no quoting transforms, no re-encoding
+- `<file-patterns-list>` — for forge stages, read via `foundry_config_artefact_type` and include so the subagent can avoid violations
+- `<worktree>` — current working directory
 ### Missing subagent (fail-fast)
-The `foundry_sort` tool verifies that the required `.opencode/agents/foundry-<slug>.md` file exists before returning a `model`. If it doesn't, sort returns `{route: 'violation', details: 'Missing required subagent: ...'}`. Handle this as described in step 3 above — halt the cycle, mark artefacts blocked, and instruct the user to run the `refresh-agents` skill.
+`foundry_sort` verifies that `.opencode/agents/foundry-<slug>.md` exists before returning a `model`. If it doesn't, sort returns `{route: 'violation', details: 'Missing required subagent: ...'}`. Handle this like any other `violation` route, as described in step 3 above.
+## Violation handling
+If `foundry_stage_finalize` returns `{error: 'unexpected_files', files}`:
+- The stage wrote outside its permitted `file-patterns`. This is unrecoverable within the current cycle.
+- Mark the target artefact `blocked`: `foundry_artefacts_set_status(file, 'blocked')`.
+- Add a feedback item describing the offense: `foundry_feedback_add(file, text: 'unexpected files: …', tag: 'violation')` (if permitted by your stage), or record the offending files in the history comment.
+- Do NOT attempt to re-run the stage — the subagent already consumed the stage slot.
+- Return to the cycle skill so the operator can intervene.
+If `foundry_sort` returns `{route: 'violation', details: 'Uncommitted tool-managed files...'}`:
+- A prior stage skipped step 6 (the micro-commit). The work is not lost — it's still in the working tree.
+- Call `foundry_git_commit({cycle, stage: <the-stage-that-just-ran>, description})` to record it.
+- Then call `foundry_sort` again. It should route normally.
+- If you genuinely don't know which stage produced the dirty files, read `WORK.history.yaml` — the most recent non-sort entry is the culprit.
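
The "most recent non-sort entry" lookup can be sketched as below. It assumes `WORK.history.yaml` has already been parsed into a chronologically ordered array of entries with a `stage` field; both the helper name and that entry shape are assumptions made for this example.

```javascript
// Hypothetical helper: given parsed history entries (oldest first), return
// the stage alias of the most recent entry that is not a `sort` entry.
function lastNonSortStage(entries) {
  for (let i = entries.length - 1; i >= 0; i--) {
    const stage = entries[i].stage;
    if (stage && stage !== 'sort') return stage;
  }
  return null; // no stage has run yet
}
```

The returned alias is the value to pass as `stage` when making the missing `foundry_git_commit` call.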
 ## What you do NOT do
-- You do not make routing decisions yourself — the tool decides.
-- You do not skip calling `foundry_sort`.
-- You do not override the tool's output.
-- You do not skip the history entry — every sort invocation gets a `sort` entry, and every completed stage gets a stage entry (e.g., `forge:write-haiku`). You are the sole writer of history.
-- You do **not** inline forge/quench/appraise work — always dispatch to a subagent via the `task` tool, even when the resolved model matches the orchestrator's own model.
+- You do not inline forge/quench/appraise work — always dispatch.
+- You do not mint, modify, or cache tokens — they come from `foundry_sort` and go straight to `foundry_stage_begin`.
+- You do not skip `foundry_stage_finalize` — it is the only mechanism that registers artefacts and detects file-pattern violations.
+- You do not let subagents call `foundry_history_append`, `foundry_git_commit`, or `foundry_artefacts_add` (`foundry_artefacts_add` was removed in v2.2.0 in any case).

package/skills/upgrade-foundry/SKILL.md CHANGED

@@ -46,7 +46,7 @@ Check each file against the current expected format:
 - Has `targets` field? If not → needs target routing
 - Has `inputs.type` (`any-of`/`all-of`)? If `inputs` is a plain list → needs contract type
 - Has `hitl` in stages or frontmatter? → needs human-appraise migration
-- Has `human-appraise` config? Check format is correct
+- Has nested `human-appraise: {enabled, deadlock-threshold}`? → needs v2.2.1 flat-keys migration (see §4b)
 - Has `models` map? Check format
 **Artefact types:**
@@ -98,6 +98,38 @@ For each `.opencode/agents/foundry-*.md` file with a `.` in its filename:
 After renaming, remind the user: **Restart OpenCode** for the new agent filenames to register.
+### 4a. v2.2.0 lifecycle upgrade
+Foundry v2.2.0 introduces a tool-enforced stage lifecycle (`stage_begin` / `stage_end` / `stage_finalize`) backed by a per-project state directory and HMAC-signed dispatch tokens. The upgrade is non-destructive (no WORK.md or artefact migration is required), but the project needs two small changes and one check:
+1. **Create `.foundry/`** (if absent):
+   - `mkdir -p .foundry`
+   - The plugin auto-creates `.foundry/.secret` on first boot via `readOrCreateSecret`. You do not need to generate it by hand; just ensure the directory exists and is writable.
+2. **Gitignore `.foundry/`**:
+   - Ensure `.gitignore` contains a line `.foundry/` (append if missing; do not duplicate). The directory holds a per-worktree HMAC secret and transient active-stage state — neither should be committed.
+3. **Pre-existing state:** v2.2.0 is a fresh state system. There is no `active-stage.json` to migrate. If one happens to exist from a manually-aborted prior run, leave it alone — the new plugin treats its absence as "no active stage" and its presence as a legitimate in-flight stage.
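
The "append if missing; do not duplicate" rule from item 2 can be sketched as a pure function over the `.gitignore` text. The name `ensureFoundryIgnored` is hypothetical. The sketch matches only an exact `.foundry/` line, so an equivalent pattern written differently (for example `.foundry` with no trailing slash) would still get a second entry.

```javascript
// Hypothetical helper: returns the .gitignore text with a `.foundry/` line,
// appending it only when no such line exists already.
function ensureFoundryIgnored(gitignoreText) {
  const lines = gitignoreText.split(/\r?\n/);
  if (lines.some((line) => line.trim() === '.foundry/')) {
    return gitignoreText; // already present: leave the file untouched
  }
  // Add a newline first if the file does not end with one.
  const sep = gitignoreText === '' || gitignoreText.endsWith('\n') ? '' : '\n';
  return `${gitignoreText}${sep}.foundry/\n`;
}
```

Running it twice gives the same text as running it once, so the upgrade stays safe to repeat.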
+The `foundry_artefacts_add` tool has been removed in v2.2.0 — artefact registration now happens automatically via `foundry_stage_finalize`. No existing config references this tool, so there is nothing to migrate in `foundry/`.
+### 4b. v2.2.1 cycle-definition flat human-appraise keys
+v2.2.1 replaces the nested `human-appraise: {enabled, deadlock-threshold}` block in cycle definitions with three flat keys:
+```yaml
+human-appraise: <true|false>         # default: false — run human-appraise every iteration
+deadlock-appraise: <true|false>      # default: true — pull in human-appraise when LLM appraisers deadlock
+deadlock-iterations: <number>        # default: 5 — deadlock detection threshold
+```
+For each `foundry/cycles/*.md` whose frontmatter has the old nested form, migrate:
+- `human-appraise.enabled: true` → `human-appraise: true`
+- `human-appraise.enabled: false` (or missing) → `human-appraise: false`
+- `human-appraise.deadlock-threshold: N` → `deadlock-iterations: N`
+- Always add `deadlock-appraise: true` unless the user explicitly wants the stricter "no human ever" behavior (`deadlock-appraise: false` → deadlock marks the cycle `blocked`).
+The old nested form is no longer read. After migration, verify by asking: "cycle `<id>`: human-appraise every iteration? deadlock-appraise on? deadlock-iterations = N?".
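
The mapping rules above can be sketched as a function over already-parsed cycle frontmatter. `migrateHumanAppraise` is a hypothetical name, and YAML parsing and writing are left out. The sketch always sets `deadlock-appraise: true`, so a user who wants the stricter behavior must change that value afterwards.

```javascript
// Hypothetical helper: converts the old nested `human-appraise` block in a
// parsed frontmatter object to the v2.2.1 flat keys. Returns a new object.
function migrateHumanAppraise(frontmatter) {
  const old = frontmatter['human-appraise'];
  // Already flat (boolean) or absent: nothing to migrate.
  if (old === null || typeof old !== 'object') return frontmatter;

  const { 'human-appraise': _nested, ...rest } = frontmatter;
  const migrated = {
    ...rest,
    'human-appraise': old.enabled === true, // false or missing -> false
    'deadlock-appraise': true,              // default unless the user opts out
  };
  if (typeof old['deadlock-threshold'] === 'number') {
    migrated['deadlock-iterations'] = old['deadlock-threshold'];
  }
  return migrated;
}
```

When the old block has no `deadlock-threshold`, the sketch leaves `deadlock-iterations` unset so that the documented default of 5 applies.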
 ### 5. Migrate flows
 For each flow needing migration: