@really-knows-ai/foundry 3.8.2 → 3.8.4

@@ -133,6 +133,7 @@ function artefactTypeArgs(s) { return {
    name: s.string().describe('Human-readable display name (accepted at boundary, not persisted — id becomes frontmatter.name)'),
    filePatterns: s.array(s.string()).describe('Glob patterns defining forge write scope (written to frontmatter.file-patterns)'),
    description: s.string().describe('Prose description placed under ## Definition'),
+   example: s.string().optional().describe('Example artefact structure (markdown with code blocks). Written to example.md alongside definition.md. Guides forge agents on the expected output format.'),
    appraisers: s.object({
      count: s.number().optional().describe('Number of appraisers per cycle'),
      allowed: s.array(s.string()).optional().describe('Restrict to specific appraiser IDs'),
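For illustration, an argument object that uses the new optional `example` field might look like the sketch below. Only the field names come from the schema and skill text in this diff; the id `haiku`, the patterns, and the example text are hypothetical values.

```javascript
// Hypothetical arguments for foundry_config_create_artefact_type.
// Field names follow the schema in this hunk; every value is invented.
const args = {
  id: 'haiku',
  name: 'Haiku',
  filePatterns: ['haikus/*.md'],
  description: 'A short three-line poem.',
  // Optional: leave this key out when there is no structured format to show.
  example: [
    '# Example haiku',
    '',
    '~~~text',
    'An old silent pond',
    'A frog jumps into the pond',
    'Splash! Silence again',
    '~~~',
  ].join('\n'),
};

console.log(typeof args.example);
```

Because the field is declared with `.optional()`, leaving `example` out should still satisfy the schema.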
package/dist/CHANGELOG.md CHANGED
@@ -1,5 +1,31 @@
  # Changelog

+ ## [3.8.4] - 2026-05-27
+
+ ### Added
+
+ - `foundry_config_create_artefact_type` now accepts an optional `example` arg. When provided, it writes `foundry/artefacts/<id>/example.md` alongside `definition.md`. The example file is a structure document — markdown with code blocks showing the expected output format, plus documentation for the forge agent.
+
+ - The `add-artefact-type` skill now prompts users to provide an example artefact during the Understand phase, includes it in the Plan, and passes it to the create tool in Build.
+
+ ### Fixed
+
+ - Human-appraise stage tokens no longer carry a subagent model scope. Previously, `sort.js` resolved a model for human-appraise routes (falling through to `defaultModel`), which got embedded in the token. `foundry_stage_begin` then rejected the token because the main Foundry agent is not the scoped subagent, causing the human-appraise stage to fail to start and user feedback to be lost. Human-appraise always runs inline by the Foundry agent.
+
+ ## [3.8.3] - 2026-05-27
+
+ ### Changed
+
+ - Appraise subagents no longer receive artefact content or laws in their prompt. The dispatch prompt contains only the appraiser's personality and the artefact type ID. The subagent discovers artefact files, laws, and file-patterns via tool calls (`foundry_config_artefact_type`, `foundry_config_laws`, `foundry_artefacts_list`) and reads files from the worktree.
+
+ - Appraise subagent output format changed from YAML to JSONL (one JSON object per line), matching the quench validator protocol. Required fields: `file`, `text`. Recommended: `law`, `evidence`. Optional: `severity`, `location`. The consolidate phase parses JSONL and posts feedback with tag `law:<slug>`.
+
+ - Gather phase now creates one task per appraiser (not per artefact × appraiser). Each appraiser covers all artefacts of the given type via tool-based discovery.
+
+ ### Fixed
+
+ - Removed `js-yaml` dependency from appraise-module.js. All YAML parsing and fallback line-parsing code replaced with JSONL parsing.
+
  ## [3.8.2] - 2026-05-27

  ### Changed
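As a rough illustration of the 3.8.3 output format, the sketch below parses two sample JSONL lines and derives the `law:<slug>` tag mentioned in the changelog. The file paths, law slug, and issue texts are invented, and returning `null` when `law` is absent is an assumption of this sketch, not documented behaviour.

```javascript
// Sample appraiser output: one JSON object per line (values are invented).
const sampleOutput = [
  '{"file":"haikus/pond.md","law":"season-word","text":"No seasonal reference found","evidence":"A frog jumps into the pond"}',
  '{"file":"haikus/pond.md","text":"Title is missing"}',
].join('\n');

// Build the feedback tag in the `law:<slug>` form described in the changelog.
// Assumption: issues without a `law` field get no tag here.
function feedbackTag(issue) {
  return issue.law ? `law:${issue.law}` : null;
}

const issues = sampleOutput.split('\n').map(line => JSON.parse(line));
const tags = issues.map(feedbackTag);
console.log(tags);
```

The second sample line carries only the two required fields, `file` and `text`, which is the minimum the changelog describes.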
@@ -2,28 +2,33 @@
   * Appraise module — gathers context for parallel appraiser dispatch and
   * consolidates results after all appraisers have run.
   *
-  * Gather phase: reads artefacts, laws, and appraiser personalities, builds
-  * subagent prompts, and returns a dispatch_multi action so the orchestrator's
-  * LLM dispatches appraisers in parallel.
+  * Gather phase: reads artefacts, selects appraisers, builds subagent prompts
+  * with only personality + type ID (no artefact content or laws inlined), and
+  * returns a dispatch_multi action so the orchestrator's LLM dispatches
+  * appraisers in parallel.
   *
-  * Consolidate phase: receives lastResults from the orchestrator, unions and
-  * de-duplicates appraiser issues, posts feedback, and finalises the stage
-  * so the orchestrator can re-sort and determine the next action.
+  * Each appraiser subagent discovers artefacts, laws, and file-patterns via
+  * tool calls and returns JSONL, one JSON object per line.
+  *
+  * Consolidate phase: receives lastResults from the orchestrator, parses JSONL
+  * from each appraiser, unions and de-duplicates issues, posts feedback, and
+  * finalises the stage so the orchestrator can re-sort and determine the next
+  * action.
   */

  import { getArtefactFiles, computeArtefactVersion } from './lib/artefacts.js';
- import { selectAppraisers, getLaws, getCycleDefinition } from './lib/config.js';
+ import { selectAppraisers, getCycleDefinition } from './lib/config.js';
  import { openFeedbackStore } from './lib/feedback-store.js';
- import yaml from 'js-yaml';

  // ---------------------------------------------------------------------------
  // Public API — gather
  // ---------------------------------------------------------------------------

  /**
-  * Gather appraise context: read draft artefacts, select appraisers, read laws
-  * and artefact content, then build a dispatch_multi action with one task per
-  * (artefact, appraiser) pair.
+  * Gather appraise context: read draft artefacts, select appraisers, build a
+  * dispatch_multi action with one task per appraiser. The subagent prompt
+  * contains only the appraiser personality and artefact type ID — the subagent
+  * discovers artefact files, laws, and file-patterns via tool calls.
   *
   * @param {object} ctx
   * @param {string} ctx.cycleId
@@ -36,91 +41,54 @@ import yaml from 'js-yaml';
   * @returns {Promise<{action: string, tasks: Array, stage: string, cycle: string}>}
   */
  export async function gatherAppraiseContext(ctx) {
-   if (!ctx.cycleId) {
-     return violation('cycleId is required', []);
-   }
+   const guarded = guardAppraiseGather(ctx);
+   if (guarded) return guarded;

    await resolveStaleAppraiseFeedback(ctx);

    const cd = await getCycleDefinition(ctx.foundryDir, ctx.cycleId, ctx.io);
-   const outputType = cd.frontmatter['output-type'];
-   if (!outputType) {
-     return violation(`cycle ${ctx.cycleId} missing output-type field`, []);
-   }
-   const baseBranch = ctx.baseBranch || 'main';
-   const artefacts = await getArtefactFiles(ctx.foundryDir, outputType, ctx.io, { baseBranch });
-   if (artefacts.length === 0) {
-     return emptyDispatch(ctx.cycleId);
-   }
-
-   const typedArtefacts = artefacts.map(artefact => ({ ...artefact, type: outputType }));
-   const tasks = await collectTasks(typedArtefacts, ctx);
-
-   return {
-     action: 'dispatch_multi',
-     tasks,
-     stage: `appraise:${ctx.cycleId}`,
-     cycle: ctx.cycleId,
-   };
- }
+   const outputType = validateOutputType(cd, ctx.cycleId);
+   if (typeof outputType !== 'string') return outputType;

- /**
-  * Build all appraiser tasks across artefacts, caching per type.
-  */
- async function collectTasks(artefacts, ctx) {
-   const tasks = [];
-   const typeCache = new Map();
-
-   for (const artefact of artefacts) {
-     const entry = await resolveTypeEntry(artefact.type, typeCache, ctx);
-     if (!entry) continue;
+   const artefacts = await fetchAppraiseArtefacts(ctx, outputType);
+   if (!Array.isArray(artefacts)) return artefacts;

-     addTasksForArtefact(tasks, artefact, entry, ctx);
+   const appraisers = await selectAppraisers(ctx.foundryDir, outputType, { io: ctx.io });
+   if (appraisers.length === 0) {
+     return emptyDispatch(ctx.cycleId);
    }

-   return tasks;
+   return buildGatherResponse(appraisers, outputType, ctx);
  }

- /**
-  * Get or create a cached (appraisers, laws) entry for an artefact type.
-  * Returns null when no appraisers are available for the type.
-  */
- async function resolveTypeEntry(typeId, cache, ctx) {
-   if (cache.has(typeId)) {
-     return cache.get(typeId);
-   }
-
-   const [appraisers, laws] = await Promise.all([
-     selectAppraisers(ctx.foundryDir, typeId, { io: ctx.io }),
-     getLaws(ctx.foundryDir, ctx.io, { typeId }),
-   ]);
+ function guardAppraiseGather(ctx) {
+   return ctx.cycleId ? null : violation('cycleId is required', []);
+ }

-   const entry = appraisers.length === 0 ? null : { appraisers, laws };
-   cache.set(typeId, entry);
-   return entry;
+ function validateOutputType(cd, cycleId) {
+   const outputType = cd.frontmatter['output-type'];
+   return outputType ?? violation(`cycle ${cycleId} missing output-type field`, []);
  }

- /**
-  * Build and append appraiser tasks for a single artefact.
-  */
- function addTasksForArtefact(tasks, artefact, entry, ctx) {
-   let content = '';
-   if (artefact.state !== 'deleted') {
-     content = ctx.io.readFile(artefact.file);
-   }
+ async function fetchAppraiseArtefacts(ctx, outputType) {
+   const baseBranch = ctx.baseBranch || 'main';
+   const artefacts = await getArtefactFiles(ctx.foundryDir, outputType, ctx.io, { baseBranch });
+   if (artefacts.length === 0) return emptyDispatch(ctx.cycleId);
+   return artefacts;
+ }

-   for (const appraiser of entry.appraisers) {
-     const prompt = buildAppraiserPrompt({
-       appraiser,
-       artefact: { file: artefact.file, content },
-       laws: entry.laws,
-     });
+ function buildGatherResponse(appraisers, outputType, ctx) {
+   const tasks = appraisers.map(appraiser => ({
+     subagent_type: resolveSubagentType(appraiser, ctx),
+     prompt: buildAppraiserPrompt({ appraiser, typeId: outputType }),
+   }));

-     tasks.push({
-       subagent_type: resolveSubagentType(appraiser, ctx),
-       prompt,
-     });
-   }
+   return {
+     action: 'dispatch_multi',
+     tasks,
+     stage: `appraise:${ctx.cycleId}`,
+     cycle: ctx.cycleId,
+   };
  }

  /**
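To make the change in task fan-out concrete, here is a small sketch comparing the two strategies. The appraiser ids and file names are hypothetical, and the task objects are simplified stand-ins for the `{ subagent_type, prompt }` shape in the hunk above.

```javascript
// Hypothetical inputs: three appraisers and four changed artefact files.
const appraisers = [{ id: 'strict' }, { id: 'lenient' }, { id: 'pedant' }];
const artefacts = ['a.md', 'b.md', 'c.md', 'd.md'];

// Before 3.8.3: one task per (artefact, appraiser) pair.
const pairTasks = artefacts.flatMap(file =>
  appraisers.map(appraiser => ({ appraiser: appraiser.id, file })));

// From 3.8.3: one task per appraiser; the subagent discovers files itself.
const perAppraiserTasks = appraisers.map(appraiser => ({ appraiser: appraiser.id }));

console.log(pairTasks.length, perAppraiserTasks.length);
```

In the new `gatherAppraiseContext`, the artefact list is still fetched, but within this hunk it is only used to short-circuit with `emptyDispatch` when nothing has changed.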
@@ -197,9 +165,9 @@ async function resolveStaleAppraiseFeedback(ctx) {
  /**
   * Consolidate appraiser results and finalise the appraise stage.
   *
-  * Called by orchestrator after all appraisers have completed. Parses results,
-  * posts combined feedback, resolves prior appraise feedback, and advances
-  * the cycle to the next stage via finalize.
+  * Called by orchestrator after all appraisers have completed. Parses JSONL
+  * from each appraiser's output, posts combined feedback, resolves prior
+  * appraise feedback, and advances the cycle to the next stage via finalize.
   *
   * @param {object} ctx
   * @param {Array<{ok: boolean, output?: string, error?: string}>} lastResults
@@ -238,20 +206,73 @@ export async function consolidateAppraise(ctx, lastResults) {
  }

  /**
-  * Parse all successful appraiser outputs and de-duplicate the combined issue
-  * list by (file, law-id, issue text).
+  * Parse JSONL from all successful appraiser outputs and de-duplicate the
+  * combined issue list by (file, law-id, issue text).
   */
  function parseConsolidated(successful) {
    const all = [];

    for (const result of successful) {
-     const issues = parseAppraiserOutput(result.output || '');
+     const issues = parseAppraiserJsonl(result.output || '');
      all.push(...issues);
    }

    return deduplicateIssues(all);
  }

+ /**
+  * Parse appraiser JSONL output.
+  *
+  * Each line must be a JSON object with at least `file` and `text` fields.
+  * Extra fields (`law`, `evidence`, `severity`, `location`) are preserved.
+  * The `text` field maps to the issue description used for feedback text.
+  */
+ function parseAppraiserJsonl(output) {
+   const issues = [];
+   const lines = output.trim().split('\n');
+
+   for (const line of lines) {
+     const issue = parseAppraiserLine(line);
+     if (issue) issues.push(issue);
+   }
+
+   return issues;
+ }
+
+ function parseAppraiserLine(line) {
+   const trimmed = line.trim();
+   if (!trimmed) return null;
+
+   const obj = tryJsonParseLine(trimmed);
+   if (!obj) return null;
+
+   return validateJsonlIssue(obj);
+ }
+
+ function tryJsonParseLine(line) {
+   try { return JSON.parse(line); } catch { return null; }
+ }
+
+ function validateJsonlIssue(obj) {
+   if (!hasStringField(obj, 'file')) return null;
+   if (!hasStringField(obj, 'text')) return null;
+
+   return {
+     file: obj.file,
+     law: strOrEmpty(obj.law),
+     issue: obj.text,
+     evidence: strOrEmpty(obj.evidence),
+   };
+ }
+
+ function hasStringField(obj, key) {
+   return typeof obj[key] === 'string' && obj[key].length > 0;
+ }
+
+ function strOrEmpty(value) {
+   return typeof value === 'string' ? value : '';
+ }
+
  /**
   * De-duplicate an issue array by (file, law, issue text).
   */
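The following sketch condenses the parsing helpers from this hunk into one function and runs them on a mixed sample, to show which lines survive. The sample lines are invented, and `parseLine` is a hypothetical name for the condensed helper; the logic is intended to mirror `parseAppraiserLine` and `validateJsonlIssue` as shown above.

```javascript
// Condensed restatement of parseAppraiserLine + validateJsonlIssue.
function parseLine(line) {
  const trimmed = line.trim();
  if (!trimmed) return null;
  let obj;
  try { obj = JSON.parse(trimmed); } catch { return null; }
  if (!obj) return null;
  const isText = v => typeof v === 'string' && v.length > 0;
  if (!isText(obj.file) || !isText(obj.text)) return null;
  return {
    file: obj.file,
    law: typeof obj.law === 'string' ? obj.law : '',
    issue: obj.text, // `text` is renamed to `issue`
    evidence: typeof obj.evidence === 'string' ? obj.evidence : '',
  };
}

const output = [
  '{"file":"a.md","law":"tone","text":"Too formal","evidence":"Dear Sir","severity":"low"}',
  'Here are my findings:', // not JSON, skipped
  '{"file":"a.md"}',       // missing `text`, skipped
  '',
  '{"file":"b.md","text":"Missing title"}',
].join('\n');

const issues = output.trim().split('\n').map(parseLine).filter(Boolean);
console.log(issues.length, Object.keys(issues[0]));
```

Two points follow from the code as written. Malformed or incomplete lines are dropped silently rather than reported. The returned object also has only four keys, so `severity` and `location` do not survive this function even though the doc comment says extra fields are preserved; whether they are carried some other way is not visible in this diff.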
@@ -332,138 +353,41 @@ function buildConsolidateSummary(count) {
  // ---------------------------------------------------------------------------

  /**
-  * Build a subagent prompt for a single (appraiser, artefact) pair.
+  * Build a subagent prompt for an appraiser.
   *
-  * Follows the template from the appraise skill (src/skills/appraise/SKILL.md)
-  * extended to include the file path for deterministic result parsing.
+  * The prompt contains only the appraiser's personality and the artefact type
+  * ID. The subagent discovers artefact files, laws, and file-patterns via tool
+  * calls and returns JSONL — one JSON object per line.
   */
- function buildAppraiserPrompt({ appraiser, artefact, laws }) {
-   const lawSections = laws
-     .map(law => `## ${law.id}\n\n${law.text}`)
-     .join('\n\n');
-
+ function buildAppraiserPrompt({ appraiser, typeId }) {
    const lines = [
      'You are an appraiser. Your personality:',
      '',
      appraiser.personality,
      '',
-     'Evaluate the following artefact against each law below. For each law,',
-     'either:',
-     '- Note no issues (pass)',
-     '- Describe the issue, quoting evidence from the artefact',
+     `Evaluate artefacts of type "${typeId}" against applicable laws.`,
      '',
-     '## Artefact',
+     'Use tools to discover context:',
+     `- foundry_config_artefact_type with typeId "${typeId}" for file-patterns`,
+     `- foundry_config_laws with typeId "${typeId}" for applicable laws (prose only)`,
+     '- foundry_artefacts_list for changed files',
+     '- Read matching files from the worktree',
      '',
-     artefact.content,
+     'For each law, evaluate each relevant file. If a violation is found,',
+     'output a JSONL line:',
      '',
-     '## Laws',
+     '{"file": "<path>", "law": "<law-slug>", "text": "<issue description>", "evidence": "<quote>"}',
      '',
-     lawSections,
+     '`file` and `text` are required. `law` and `evidence` are recommended.',
+     'Optional fields `severity` and `location` are passed through unchanged.',
      '',
-     '## Output',
-     '',
-     'Return a list of issues. For each issue:',
-     `- file: ${artefact.file}`,
-     ' law: <law-id>',
-     ' issue: <description>',
-     ' evidence: <quote from artefact>',
-     '',
-     'If there are no issues, return an empty list.',
+     'Output ONLY JSONL — one JSON object per line. No markdown, no commentary.',
+     'If no issues are found, output nothing.',
    ];

    return lines.join('\n');
  }

- // ---------------------------------------------------------------------------
- // Output parsing
- // ---------------------------------------------------------------------------
-
- /**
-  * Parse a structured issue list from an appraiser subagent output.
-  *
-  * LLM output is free-form text that may contain a YAML list of issues.
-  * Tries js-yaml first; falls back to line-scanning when the output is
-  * not clean YAML (LLMs may include surrounding text, quotes in bare
-  * strings, or other quirks that trip up a strict YAML parser).
-  *
-  * Returns an array of { file, law, issue, evidence } objects.
-  */
- function parseAppraiserOutput(output) {
-   const text = output || '';
-   const yamlBlock = extractYamlBlock(text);
-   const issues = tryYamlParse(yamlBlock);
-   if (issues) return issues;
-
-   return parseFallback(text);
- }
-
- function extractYamlBlock(text) {
-   if (text.startsWith('- file:')) return text;
-   const afterNewline = text.indexOf('\n- file:');
-   if (afterNewline >= 0) return text.slice(afterNewline + 1);
-   return text;
- }
-
- function tryYamlParse(yamlBlock) {
-   try {
-     const parsed = yaml.load(yamlBlock);
-     if (Array.isArray(parsed)) {
-       return parsed
-         .filter(e => e && typeof e === 'object' && e.file && e.law && e.issue)
-         .map(e => ({ file: e.file, law: e.law, issue: e.issue, evidence: e.evidence || '' }));
-     }
-   } catch { /* fall through to fallback */ }
-   return null;
- }
-
- const FALLBACK_FIELDS = new Set(['law', 'issue', 'evidence']);
-
- function isCompleteIssue(obj) {
-   return obj && obj.file && obj.law && obj.issue;
- }
-
- function applyFallbackField(kv, entry, issues) {
-   if (kv.key === 'file') {
-     const e = { file: kv.value, law: '', issue: '', evidence: '' };
-     issues.push(e);
-     return e;
-   }
-   if (entry && FALLBACK_FIELDS.has(kv.key)) {
-     entry[kv.key] = kv.value;
-   }
-   return entry;
- }
-
- function parseFallback(text) {
-   const issues = [];
-   let entry = null;
-
-   for (const line of text.split('\n')) {
-     const kv = parseFallbackLine(line);
-     if (kv) entry = applyFallbackField(kv, entry, issues);
-   }
-
-   return issues.filter(isCompleteIssue);
- }
-
- function parseFallbackLine(line) {
-   const trimmed = line.trim();
-   if (!trimmed) return null;
-
-   const colon = trimmed.indexOf(':');
-   if (colon < 1) return null;
-
-   const key = stripDash(trimmed.slice(0, colon));
-   return {
-     key: key.trim(),
-     value: trimmed.slice(colon + 1).trim(),
-   };
- }
-
- function stripDash(s) {
-   return s.startsWith('- ') ? s.slice(2) : s;
- }
-
  // ---------------------------------------------------------------------------
  // Shared helpers
  // ---------------------------------------------------------------------------
@@ -50,7 +50,30 @@ const _create = makeCreator({
    validator: validate,
  });

+ /**
+  * Assemble the markdown body for example.md from structured arguments.
+  *
+  * The example file is a structure document: markdown with code blocks showing
+  * the expected output format, plus documentation for the forge agent.
+  *
+  * @param {string} exampleContent - Raw markdown for example.md
+  * @returns {string} Trimmed content with trailing newline.
+  */
+ export function assembleExampleMarkdown(exampleContent) {
+   return `${exampleContent.trim()}\n`;
+ }
+
  export async function create(args) {
    const body = assembleArtefactTypeMarkdown(args);
+
+   if (args.example) {
+     const exampleDir = join('foundry', 'artefacts', args.id);
+     await args.io.mkdirp(exampleDir);
+     await args.io.writeFile(
+       join(exampleDir, 'example.md'),
+       assembleExampleMarkdown(args.example),
+     );
+   }
+
    return _create({ ...args, name: args.id, body });
  }
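A minimal sketch of the example-writing path, assuming an in-memory stand-in for `args.io`. The `makeMemoryIo` and `writeExample` helpers are hypothetical names, and the real `io` interface may differ. Paths are joined with `/` here instead of `join` so the snippet needs no imports.

```javascript
// Same one-liner as assembleExampleMarkdown in the hunk above.
function assembleExampleMarkdown(exampleContent) {
  return `${exampleContent.trim()}\n`;
}

// Hypothetical in-memory replacement for args.io.
function makeMemoryIo() {
  const files = new Map();
  return {
    files,
    async mkdirp() {},
    async writeFile(path, content) { files.set(path, content); },
  };
}

// Mirrors the `if (args.example)` branch of create().
async function writeExample(args) {
  if (!args.example) return false; // undefined or '' means no example.md
  const dir = ['foundry', 'artefacts', args.id].join('/');
  await args.io.mkdirp(dir);
  await args.io.writeFile(`${dir}/example.md`, assembleExampleMarkdown(args.example));
  return true;
}

const io = makeMemoryIo();
writeExample({ id: 'haiku', example: '\n# Haiku\n\n', io })
  .then(() => console.log([...io.files.keys()]));
```

In this hunk the write happens before `_create` is called; how a later validation failure interacts with an already written `example.md` is not visible in this diff.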
@@ -166,6 +166,7 @@ function resolveModelId(routeBase, models, defaultModel) {

  function pickModelId(route, frontmatter, defaultModel) {
    const routeBase = baseStage(route);
+   if (routeBase === 'human-appraise') return null;
    const resolved = frontmatter.models ? resolveModelId(routeBase, frontmatter.models, defaultModel) : null;
    return resolved || defaultModel || defaultForStage(routeBase);
  }
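The effect of the new early return can be sketched as follows. `baseStage`, `resolveModelId`, and `defaultForStage` are not shown in this hunk, so the versions below are hypothetical stand-ins (in particular, stripping a `:<cycle>` suffix is an assumption); only `pickModelId` mirrors the diff.

```javascript
// Hypothetical stand-ins for helpers that this hunk does not show.
const baseStage = route => route.split(':')[0]; // assumed: strip ":<cycle>"
const resolveModelId = (routeBase, models) => models[routeBase] || null; // assumed lookup
const defaultForStage = () => 'stage-default-model'; // assumed placeholder

// Mirrors pickModelId after the 3.8.4 change.
function pickModelId(route, frontmatter, defaultModel) {
  const routeBase = baseStage(route);
  if (routeBase === 'human-appraise') return null; // no subagent model scope
  const resolved = frontmatter.models
    ? resolveModelId(routeBase, frontmatter.models, defaultModel)
    : null;
  return resolved || defaultModel || defaultForStage(routeBase);
}

console.log(pickModelId('human-appraise:write-check', {}, 'default-model'));
console.log(pickModelId('appraise:write-check', {}, 'default-model'));
```

Before the change, the first call would have fallen through to `defaultModel`, which is the behaviour the changelog identifies as the cause of the rejected stage token.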
@@ -38,7 +38,7 @@ Do not tell the user to call branch tools directly.

  When invoked with pre-filled fields matching the `foundry_config_create_artefact_type` tool args, skip questions for provided fields. Missing fields trigger clarifying questions.

- Context fields: `{id, name, filePatterns, description, appraisers?}`
+ Context fields: `{id, name, filePatterns, description, example?, appraisers?}`

  When invoked with a context:
  - If all required fields are present, skip the Understand phase and proceed to Plan → Confirm → Build.
@@ -48,6 +48,12 @@ When invoked with a context:

  Ask for each field one question at a time. Prefer multiple choice for `filePatterns`, deriving options from the artefact type name and common conventions (e.g. `haikus/*.md`, `haiku.md`, `output/haiku/*.md`). Ask about `appraisers` (optional) — either provide an existing appraiser ID or skip.

+ After the core fields, ask about the example:
+
+ > Would you like to provide an example artefact? An example shows forge agents the expected output structure — markdown with code blocks, plus any conventions, constraints, or required sections. Give a short example file that demonstrates what a valid output looks like.
+
+ If the user provides an example, capture it verbatim. If the artefact type has no structured output (e.g. free-form prose with no required format), the user may skip this step.
+
  **Naming conflict check**: Read all existing artefact type definitions in `foundry/artefacts/*/definition.md`. Exact id match means a hard conflict — choose a different id. A semantically similar name or description triggers a warning:

  > An artefact type `<existing-id>` already exists that seems similar:
@@ -74,6 +80,7 @@ Present the definition to the user with these structured fields:
  - `name` (string) — human-readable label.
  - `filePatterns` (string[]) — glob patterns for files this type produces.
  - `description` (string) — prose description of what this artefact type is.
+ - `example` (string, optional) — example artefact to guide forge agents on the expected output structure.
  - `appraisers` ({ count?: number, allowed?: string[] }, optional) — appraiser configuration.

  Ask: does this capture the artefact type correctly? Iterate until the user is satisfied.
@@ -86,7 +93,7 @@ Ask: "Proceed with this plan?" — wait for user answer before building. If the

  1. **Validate**: Call `foundry_config_validate_artefact_type({ name: "<id>", body: "<assembled markdown>" })`. Assemble the body from the fields using the frontmatter format the tool produces internally. If the result is `{ ok: false, errors: [...] }`, address each error and re-run until `{ ok: true }`. Common issues: missing required frontmatter keys, references to artefact types or flows that do not exist yet.

- 2. **Create**: Call `foundry_config_create_artefact_type({ id: "<id>", name: "<name>", filePatterns: ["<pattern>"], description: "<description>" })`. The tool re-validates the body (TOCTOU), writes `foundry/artefacts/<id>/definition.md`, and produces one git commit on the current `config/*` branch. Show the user the resulting commit hash.
+ 2. **Create**: Call `foundry_config_create_artefact_type({ id: "<id>", name: "<name>", filePatterns: ["<pattern>"], description: "<description>", example: "<example>" })`. Include `example` only when the user provided one. The tool re-validates the body (TOCTOU), writes `foundry/artefacts/<id>/definition.md` (and `example.md` if provided), and produces one git commit on the current `config/*` branch. Show the user the resulting commit hash.

  If the tool returns `{ ok: false, errors }` because the target file already exists, read the existing file, incorporate the user's requested changes into the current body, propose the merged result for review, then write and commit the updated file.

@@ -1,12 +1,14 @@
  ---
  name: appraise
  type: atomic
- description: Subjective evaluation of an artefact against laws via multiple independent appraisers.
+ description: Subjective evaluation of an artefact against laws via independent appraiser subagents.
  ---

  # Appraise

- You orchestrate subjective appraisal of an artefact by dispatching independent sub-agent appraisers, then consolidating their feedback.
+ **This skill is subagent-only.** It describes the protocol an appraiser subagent follows when dispatched via `task()` from the orchestrate loop. Do NOT load this skill and run appraise inline — the orchestrate skill returns a `dispatch_multi` action with pre-built prompts; call `task()` with each.
+
+ You evaluate artefacts against laws. Your dispatch prompt contains your personality and the artefact type ID. You discover artefact files, laws, and file-patterns via tool calls.

  ## Prerequisites

@@ -18,152 +20,58 @@ Before running this skill, verify that the `foundry/` directory exists in the pr
18
20
 
19
21
  Appraise runs inside an enforced stage. Your **first** and **last** tool calls are fixed:
20
22
 
21
- 1. **First:** `foundry_stage_begin({stage, cycle, token})` — copy the token verbatim from the dispatch prompt.
23
+ 1. **First:** `foundry_stage_begin({stage, cycle, token})` — copy the token verbatim from the dispatch prompt. No other tool call is permitted before this one.
22
24
  2. **Last:** `foundry_stage_end({summary})`.
23
25
 
24
- Appraise makes **no disk writes**. Feedback output flows through `foundry_feedback_add` and `foundry_feedback_resolve`. The orchestrator's internal finalize step flags any unexpected writes as a violation.
26
+ Appraise makes **no disk writes**. Feedback output flows through JSONL returned in your response text. The orchestrator's internal consolidate step parses the JSONL, posts feedback, and resolves prior items.
25
27
 
26
28
  ## Protocol
27
29
 
28
- 1. `foundry_stage_begin(...)`.
29
- 2. Gather context:
30
- - `foundry_workfile_get` — read the `cycle` from frontmatter
31
-
32
- **Check for failed flow state.** If `foundry_workfile_get` returns `{status: "failed", reason: ...}`, STOP. Do not call any other tool. Tell the user:
33
-
34
- > The flow is in a failed state. Reason: `<reason>`.
35
- >
36
- > No further work is permitted. To recover:
37
- >
38
- > 1. `foundry_workfile_delete({confirm: true})` to abandon the cycle.
39
- > 2. Back out to main (`git checkout main`) and delete the work branch.
40
- > 3. Investigate and fix the root cause of the failure before restarting.
41
-
42
- Then return control to the user and stop.
43
- - `foundry_artefacts_list({})` — enumerate the current cycle's branch artefact changes as `[{ file, state }]` entries.
44
- - For each artefact change, gather its type-specific context:
45
- - `foundry_config_laws` with the cycle's output type — applicable laws (global + type-specific)
46
- - `foundry_config_artefact_type` with the type ID — the artefact type definition
47
- - `foundry_appraisers_select` with the type ID — selected appraiser personalities with their raw model IDs
48
-
49
- 3. Dispatch each appraiser as an independent sub-agent (see Dispatch below). If this cycle produced multiple artefacts, appraisers evaluate each.
50
-
51
- 4. Collect results from all appraisers
52
-
53
- 5. Consolidate (this is judgment):
54
- - Union of all issues — if any one appraiser flags it, it's feedback
55
- - De-duplicate: merge overlapping observations into a single feedback item
56
- - Preserve which appraiser(s) raised each issue (for traceability)
57
-
58
- 6. For each consolidated issue: `foundry_feedback_add` with `{ file, text, tag: 'law:<slug>' }`. Tags must match `law:<slug>`, and dedup uses the non-resolved `(file, tag, hash(text))` semantics described in Feedback handling.
59
-
60
- 7. If no appraiser found any issues, the artefact clears appraisal.
61
-
62
- 8. `foundry_stage_end({summary})`.
63
-
64
- ## Feedback handling
65
-
66
- As an appraise stage, you have two feedback responsibilities:
67
-
68
- 1. **Adding new law-violation feedback.** For each unmet law, call
69
- `foundry_feedback_add` with `{ file, text, tag: 'law:<slug>' }`.
70
- The `source` is automatically your stage id (e.g. `appraise:write-check`).
71
- The tool rejects any tag not matching `law:<slug>` during an appraise
72
- stage; do not attempt bare `'appraise'` or `'review'` tags.
73
-
74
- The tool returns `{ ok: true, id, deduped }` on success. `deduped: true`
75
- means an existing non-resolved item with the same `(file, tag,
76
- hash(text))` was found (no new snapshot written); `deduped: false`
77
- means a new item was created. Resolved items are NOT considered for
78
- dedup — a re-added item after a resolution is a legitimate new item
79
- (regression feedback).
30
+ 1. `foundry_stage_begin(...)` with the token from the dispatch prompt.
31
+ 2. `foundry_config_artefact_type` with the type ID — get the artefact type definition and `file-patterns`.
32
+ 3. `foundry_config_laws` with the type ID get all applicable laws (prose only).
33
+ 4. `foundry_artefacts_list` — enumerate the current cycle's branch artefact changes.
34
+ 5. For each artefact file that matches the type's `file-patterns`, read the file from the worktree.
35
+ 6. Evaluate each file against each law. For each law, either:
36
+ - Note no issues (pass)
37
+ - Describe the violation, quoting evidence from the artefact
38
+ 7. Output JSONL. Each line is one JSON object:
80
39
 
81
- 2. **Resolving items you sourced.** Call `foundry_feedback_list` and look
82
- at items whose `source` exactly matches your stage id. For items whose
83
- current state is `actioned` or `wont-fix`:
84
- - Approve: `foundry_feedback_resolve` with `{ id, resolution: 'approved' }`.
85
- `reason` is optional.
86
- - Reject: `foundry_feedback_resolve` with `{ id, resolution: 'rejected', reason: '...' }`.
87
- `reason` is required. A rejection sends the item back to forge for
88
- another attempt (the `rejected` state is a legal forge input per
89
- §5.1 rule 2).
40
+ ```json
41
+ {"file": "<path>", "law": "<law-slug>", "text": "<issue description>", "evidence": "<quote from artefact>"}
42
+ ```
90
43
 
91
- **Reason rules.** `reason` is required on `resolution: 'rejected'` and on
92
- any deadlock-override transition. On `resolution: 'approved'` for a
93
- non-deadlocked item, `reason` is optional.
44
+ `file` and `text` are required. `law` and `evidence` are recommended — `law` tells the orchestrator which law tag to use, `evidence` quotes the offending passage. Optional extra fields (`severity`, `location`) are passed through unchanged.
94
45
 
95
- **Source-authorship rule.** You can only resolve/reject items whose `source`
96
- matches your own stage id — not every appraise stage in the cycle, just yours.
97
- This prevents a second appraise stage from rubber-stamping work it didn't
98
- request. For deadlocked items, only human-appraise has the override authority.
46
+ If there are no issues, output nothing (empty response).
99
47
 
100
- **Future work.** Spec §17 notes a planned cycle-level mode that would let
101
- human-appraise see non-deadlocked unresolved feedback before the orchestrator routes.
102
- Not available in v2.6.0; appraise stages today are the sole resolver of
103
- their own non-deadlocked items.
48
+ Your response text is ONLY JSONL: one JSON object per line. No markdown headings, no code blocks, no commentary, no YAML.
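The field rules above can be checked mechanically. The following is a minimal sketch, not Foundry's own parser: the function name `checkAppraiserOutput` and the returned `{ issues, errors }` shape are hypothetical, and it assumes `file` and `text` must be non-empty strings.

```javascript
// Hypothetical helper, not part of Foundry: checks an appraiser's raw
// response text against the JSONL rules described above.
function checkAppraiserOutput(responseText) {
  const issues = [];
  const errors = [];

  responseText.split('\n').forEach((line, index) => {
    // An empty response (or a blank line) means "no issues".
    if (line.trim() === '') return;

    let item;
    try {
      item = JSON.parse(line);
    } catch {
      // Markdown headings, prose, or YAML end up here.
      errors.push(`line ${index + 1}: not valid JSON`);
      return;
    }
    if (item === null || typeof item !== 'object' || Array.isArray(item)) {
      errors.push(`line ${index + 1}: not a JSON object`);
      return;
    }

    // `file` and `text` are required; `law`, `evidence`, and extra
    // fields such as `severity` or `location` are left untouched.
    let valid = true;
    for (const field of ['file', 'text']) {
      if (typeof item[field] !== 'string' || item[field] === '') {
        errors.push(`line ${index + 1}: missing required field "${field}"`);
        valid = false;
      }
    }
    if (valid) issues.push(item);
  });

  return { issues, errors };
}
```

An empty string yields no issues and no errors, which matches the "output nothing" rule for a clean artefact.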
104
49
 
105
- ## Dispatch
50
+ 8. `foundry_stage_end({summary})`. The summary describes how many issues were found (e.g. "3 issues found" or "No issues found").
106
51
 
107
- Each appraiser is dispatched as an independent sub-agent. The sub-agent receives a prompt containing:
108
- - The appraiser's personality (from their definition)
109
- - The artefact content
110
- - All applicable laws (global + type-specific)
111
- - Instructions to evaluate the artefact against each law and return issues as a structured list
52
+ ## Output examples
112
53
 
113
- ### Model resolution
114
-
115
- `foundry_appraisers_select` returns raw model IDs for each appraiser. Convert each to an agent name: `foundry-<model.replace(/[/.]/g, '-')>` — both `/` and `.` are replaced with `-`. Examples:
116
- - `openai/gpt-4o` → `foundry-openai-gpt-4o`
117
- - `github-copilot/claude-sonnet-4.6` → `foundry-github-copilot-claude-sonnet-4-6`
118
-
119
- - If a model is specified: dispatch with `subagent_type: "foundry-<converted-name>"`. If no agent with that name exists, **hard fail**.
120
- - If no model is specified: dispatch with `subagent_type: "general"` (inherits session model).
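The conversion rule quoted in this removed 3.8.2 text can be sketched as follows. The function name `agentNameForModel` is hypothetical. The check that an agent with the resulting name actually exists (the hard-fail case) is out of scope here.

```javascript
// Hypothetical helper illustrating the model-ID to agent-name rule.
function agentNameForModel(modelId) {
  // No model specified: fall back to the general sub-agent type.
  if (!modelId) return 'general';
  // Both "/" and "." are replaced with "-".
  return `foundry-${modelId.replace(/[/.]/g, '-')}`;
}
```

Applied to the two examples above, `openai/gpt-4o` becomes `foundry-openai-gpt-4o` and `github-copilot/claude-sonnet-4.6` becomes `foundry-github-copilot-claude-sonnet-4-6`.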
121
-
122
- Note: per-appraiser `model` overrides are applied here at dispatch time. The cycle-level `models.appraise` value (if set) is used for routing-time agent-file validation only; this skill does not consult it when iterating appraisers.
123
-
124
- Dispatch all appraisers in parallel (multiple Task calls in a single response).
125
-
126
- ### Sub-agent prompt template
54
+ Good (issues found):
127
55
 
128
56
  ```
129
- You are an appraiser. Your personality:
130
-
131
- <contents of appraiser personality>
132
-
133
- Evaluate the following artefact against each law below. For each law, either:
134
- - Note no issues (pass)
135
- - Describe the issue, quoting evidence from the artefact
136
-
137
- ## Artefact
138
-
139
- <artefact content>
140
-
141
- ## Laws
142
-
143
- <all applicable laws>
144
-
145
- ## Output
146
-
147
- Return a list of issues. For each issue:
148
- - law: <law-id>
149
- - issue: <description>
150
- - evidence: <quote from artefact>
151
-
152
- If there are no issues, return an empty list.
57
+ {"file": "haikus/mountain.md", "law": "syllable-count", "text": "Line 2 has 8 syllables, expected 7", "evidence": "A frog jumps into the pond", "location": "2:1"}
58
+ {"file": "haikus/mountain.md", "law": "nature-imagery", "text": "Contains industrial imagery violating nature-only requirement", "evidence": "The rusty old machine"}
153
59
  ```
154
60
 
155
- ## History
61
+ Good (no issues found — empty response, then stage_end):
156
62
 
157
- Do NOT call `foundry_history_append` or `foundry_git_commit` — `foundry_orchestrate` handles those (the tools are not registered publicly). Return a summary via `foundry_stage_end` (e.g., "3 issues found across 2 appraisers" or "No issues found").
63
+ (no output text)
158
64
 
159
- ### Human override awareness
65
+ ## Feedback handling
160
66
 
161
- When reviewing an artefact, check the feedback history for `#human` tagged items. If a human has already ruled on a topic in a prior iteration, do not re-raise the same issue; the human's decision is final.
67
+ You do NOT call `foundry_feedback_add` or `foundry_feedback_resolve`. The orchestrator's consolidate step reads your JSONL output, de-duplicates across all appraisers, posts feedback items with tag `law:<slug>`, and resolves prior appraise-sourced feedback.
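To make the consolidate step concrete, here is a rough sketch of de-duplicating across appraisers and tagging. It is an illustration only: the real consolidate logic is not shown in this diff, so the function name `consolidate`, the de-duplication key (`file`, `law`, `text`), and the returned item shape are all assumptions.

```javascript
// Hypothetical sketch of a consolidate step; not Foundry's implementation.
// `outputsByAppraiser` is an array with one array of parsed JSONL issues
// per appraiser.
function consolidate(outputsByAppraiser) {
  const seen = new Set();
  const feedback = [];

  for (const issues of outputsByAppraiser) {
    for (const issue of issues) {
      // Assumed de-duplication key: same file, same law, same text.
      const key = JSON.stringify([issue.file, issue.law ?? null, issue.text]);
      if (seen.has(key)) continue;
      seen.add(key);

      feedback.push({
        file: issue.file,
        text: issue.text,
        // Tag format from the text above: law:<slug>.
        tag: issue.law ? `law:${issue.law}` : null,
      });
    }
  }
  return feedback;
}
```

Under these assumptions, an issue raised by a single appraiser is kept, and the same issue raised by several appraisers is posted once.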
162
68
 
163
69
  ## What you do NOT do
164
70
 
165
- - You do not write files — feedback output goes through `foundry_feedback_add` and `foundry_feedback_resolve`.
166
- - You do not revise the artefact.
71
+ - You do not write files — feedback output goes through JSONL, not `foundry_feedback_add`.
72
+ - You do not revise the artefact — that is the forge skill's job.
167
73
  - You do not run deterministic validators — that is the quench skill's job.
168
- - You do not filter out feedback because only one appraiser raised it — one is enough.
169
- - You do not register artefacts; that happens automatically via the orchestrator's internal finalize step.
74
+ - You do not call `foundry_feedback_add`, `foundry_feedback_action`, `foundry_feedback_wontfix`, or `foundry_feedback_resolve`.
75
+ - You do not call `foundry_history_append` or `foundry_git_commit`; `foundry_orchestrate` handles those.
76
+ - You do not register artefacts — that happens automatically.
77
+ - You do not output YAML, markdown, or prose — only JSONL.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@really-knows-ai/foundry",
3
- "version": "3.8.2",
3
+ "version": "3.8.4",
4
4
  "description": "A skill-driven framework for governed artefact generation with AI coding tools. Define your own artefact types, laws, and flows — Foundry handles the forge → quench → appraise pipeline with deterministic routing, quality gates, and iterative refinement.",
5
5
  "type": "module",
6
6
  "main": "dist/.opencode/plugins/foundry.js",