xdrs-core 0.10.0 → 0.11.0

@@ -52,7 +52,7 @@ Provides clear ownership by scope, predictable navigation, and reusable decision
  - Skills use `.xdrs/[scope]/[type]/[subject]/skills/[number]-[skill-name]/assets/`
  - **Scopes:**
  - examples: `business-x`, `business-y`, `team-43`, `_core`
- - `_local` is a reserved scope for XDRs created locally to a specific project or repository. XDRs in `_local` must not be shared with or propagated to other contexts. This scope must always be placed in the lowest position in `.xdrs/index.md` so that its decisions override or extend any decisions from all higher-positioned scopes.
+ - `_local` is a reserved scope for XDRs created locally to a specific project or repository. XDRs in `_local` must not be shared with or propagated to other contexts. This scope must always be placed in the lowest position in `.xdrs/index.md` so that its decisions override or extend any decisions from all higher-positioned scopes. Shared `.xdrs/index.md` files MUST NOT link `_local` canonical type indexes because `_local` stays workspace-local and is not distributed with shared packages. Readers, tools, and agents SHOULD still try to discover existing workspace-local `_local` canonical indexes by default, even when the shared root index does not link them.
  - **Types:** `adrs`, `bdrs`, `edrs`
  - there can exist suffixes to the standard scope names (e.g., `business-x-mobileapp`, `business-y-servicedesk`)
  - **Subjects:** MUST be one of the following depending on the type of the XDR:
@@ -10,7 +10,7 @@ metadata:
 
  ## Overview
 
- Guides the creation of a well-structured XDR by following the standards in `_core-adr-001`, researching existing records for conflicts, checking redundancy across related artifacts, and iterating until the document is concise, decision-focused, and clear about when the decision should be used.
+ Guides the creation of a well-structured XDR by following the standards in `_core-adr-001`, consulting `xdr-standards` for every core element definition, researching existing records for conflicts, checking redundancy across related artifacts, and iterating until the document is concise, decision-focused, and clear about when the decision should be used.
 
  ## Instructions
 
@@ -18,10 +18,13 @@ Guides the creation of a well-structured XDR by following the standards in `_cor
 
  1. Read `.xdrs/index.md` to discover all active scopes and their canonical indexes.
  2. Read `.xdrs/_core/adrs/principles/001-xdr-standards.md` in full to internalize structure rules, mandatory language, and the XDR template.
- 3. Ask the user (or infer from context) the topic of the decision. Do NOT proceed to Phase 2 without a clear topic.
+ 3. Treat `.xdrs/_core/adrs/principles/001-xdr-standards.md` as the canonical source for all core XDR element definitions. Before choosing or writing any core element, consult it for the exact rules for type, scope, subject, ID, numbering, title, placement, and applicable folder structure instead of relying on memory or local convention.
+ 4. Ask the user (or infer from context) the topic of the decision. Do NOT proceed to Phase 2 without a clear topic.
 
  ### Phase 2: Select Type, Scope, and Subject
 
+ Consult `001-xdr-standards` while making each choice in this phase. The summaries below are orientation only; when any detail matters, the standard decides.
+
  **Type** — choose exactly one based on the nature of the decision:
  - **BDR**: business process, product policy, strategic rule, operational procedure
  - **ADR**: system context, integration pattern, overarching architectural choice
@@ -133,6 +136,7 @@ If any check fails, revise and re-run this phase before proceeding.
  ### Constraints
 
  - MUST follow the XDR template from `001-xdr-standards` exactly.
+ - MUST consult `001-xdr-standards` as the canonical source for every core element definition, especially type, scope, subject, ID, numbering, naming, and placement.
  - MUST NOT add personal opinions or general best-practice content not tied to a decision.
  - MUST NOT create an XDR that duplicates a decision already captured in another XDR — extend or reference instead.
  - MUST prefer links and short references over repeating the same decision content across related documents.
@@ -11,17 +11,20 @@ metadata:
 
  ## Overview
 
- Guides the creation of a well-structured skill package by following `_core-adr-003` skill standards, checking existing skills to avoid duplication, and producing a complete SKILL.md ready to activate in VS Code.
+ Guides the creation of a well-structured skill package by following `_core-adr-003` skill standards, consulting `xdr-standards` for every core element definition, checking existing skills to avoid duplication, and producing a complete SKILL.md ready to activate in VS Code.
 
  ## Instructions
 
  ### Phase 1: Understand the Skill Goal
 
  1. Read `.xdrs/_core/adrs/principles/003-skill-standards.md` in full to internalize the SKILL.md format, folder layout, and numbering rules.
- 2. Identify what the skill must do, the concrete outcome it should produce, and the exact conditions under which an agent should activate it. Do NOT proceed without a clear goal, outcome, and activation trigger.
+ 2. Read `.xdrs/_core/adrs/principles/001-xdr-standards.md` in full before defining any core element for the skill package. Treat it as the canonical source for type, scope, subject, numbering expectations, naming constraints, and folder placement rules.
+ 3. Identify what the skill must do, the concrete outcome it should produce, and the exact conditions under which an agent should activate it. Do NOT proceed without a clear goal, outcome, and activation trigger.
 
  ### Phase 2: Select Type, Scope, Subject, and Number
 
+ Consult `001-xdr-standards` while making each choice in this phase. The summaries below are orientation only; when there is any ambiguity or edge case, the standard decides.
+
  **Type** — choose one based on the skill's activity:
  - **EDR skill**: engineering workflows, tool usage, coding procedures, implementation how-tos
  - **ADR skill**: architectural evaluation, pattern compliance, technology selection guidance
@@ -122,6 +125,7 @@ If any check fails, revise before continuing.
  ### Constraints
 
  - MUST follow the agentskills SKILL.md format from `003-skill-standards` exactly.
+ - MUST consult `001-xdr-standards` as the canonical source for every core element definition, especially type, scope, subject, numbering, naming, and placement.
  - MUST NOT create a skill that duplicates an existing one — extend or reference it instead.
  - MUST keep scope `_local` unless the user explicitly states otherwise.
  - MUST include a References section linking to `003-skill-standards`.
@@ -9,9 +9,7 @@ metadata:
 
  ## Overview
 
- Guides the creation of a well-structured article by following `_core-adr-004`, researching the XDRs,
- Research documents, and Skills to synthesize, and producing a concise document that serves as a navigable view without duplicating
- decision content.
+ Guides the creation of a well-structured article by following `_core-adr-004`, consulting `xdr-standards` for every core element definition, researching the XDRs, Research documents, and Skills to synthesize, and producing a concise document that serves as a navigable view without duplicating decision content.
 
  ## Instructions
 
@@ -19,15 +17,21 @@ decision content.
 
  1. Read `.xdrs/_core/adrs/principles/004-article-standards.md` in full to internalize the template,
  placement rules, numbering rules, and the constraint that articles are views, not decisions.
- 2. Identify the topic and intended audience from user input or context. Do NOT proceed without a clear
+ 2. Read `.xdrs/_core/adrs/principles/001-xdr-standards.md` in full before defining the article's core elements. Treat it as the canonical source for how to choose and write type, scope, subject, numbering, naming, and folder placement.
+ 3. Identify the topic and intended audience from user input or context. Do NOT proceed without a clear
  topic.
 
  ### Phase 2: Select Scope, Type, and Subject
 
+ Consult `001-xdr-standards` while making each choice in this phase. The summaries below are orientation only; when any detail is unclear, the standard decides.
+
  **Scope** — use `_local` unless the user explicitly names another scope.
 
  **Type** — match the type of the XDRs the article primarily synthesizes (`adrs`, `bdrs`, or `edrs`).
- If the topic spans multiple types, use `adrs`.
+ If the topic spans multiple types, use `adrs`. Use the same rules as `002-write-xdr` Phase 2:
+ - **BDR**: business process, product policy, strategic rule, operational procedure
+ - **ADR**: system context, integration pattern, overarching architectural choice
+ - **EDR**: specific tool/library, coding practice, testing strategy, project structure, pipelines
 
  **Subject** — pick the subject that best matches the article's topic (see `004-article-standards`).
  If the article spans more than one subject, place it in `principles`.
@@ -111,6 +115,13 @@ Rules to apply while drafting:
  - **Conflicting information found** — note the conflict in the article and always defer to the XDR.
  - **Article would exceed 150 lines** — move detailed content to a new Research, Skill, or XDR and link back.
 
+ ## Constraints
+
+ - MUST consult `001-xdr-standards` as the canonical source for every core element definition, especially type, scope, subject, numbering, naming, and placement.
+ - MUST follow the article template and placement rules from `004-article-standards`.
+ - MUST keep scope `_local` unless the user explicitly states otherwise.
+ - MUST defer to active and applicable XDRs when article synthesis conflicts with them.
+
  ## References
 
  - [_core-adr-004 - Article standards](../../../.xdrs/_core/adrs/principles/004-article-standards.md)
@@ -0,0 +1,37 @@
+ 'use strict';
+
+ const path = require('path');
+ const { copilotCmd, testPrompt } = require('xdrs-core');
+
+ const REPO_ROOT = path.resolve(__dirname, '..', '..', '..', '..', '..', '..');
+
+ jest.setTimeout(60000);
+
+ test.skip('check', () => {
+   const err = testPrompt(
+     {
+       workspaceRoot: REPO_ROOT,
+       workspaceMode: 'in-place',
+       promptCmd: copilotCmd(REPO_ROOT)
+     },
+     'Reply with READY and nothing else.',
+     'Verify that the final output is READY and nothing else.',
+     true
+   );
+
+   expect(err).toBe('');
+ });
+
+ test('005-write-research creates an IMRAD research document in copy mode', () => {
+   const err = testPrompt(
+     {
+       workspaceRoot: REPO_ROOT,
+       workspaceMode: 'copy',
+       promptCmd: copilotCmd(REPO_ROOT)
+     },
+     'Create a very small research document with the following data: We measured the installation time in our monorepo and pnpm is 3.5x faster than Yarn when installing dependencies. We recommend using PNPM in our monorepo to speed up our productivity as it seems very easy to use and have a better internal hoisting mechanism.',
+     'Verify that a research file was created under .xdrs/_local/edrs/devops/researches/, that it contains the sections Abstract, Introduction, Methods, Results, Discussion, Conclusion, and References, and that the content contains all the provided data in input prompt, and doesn\'t contain more than 20% of additional information.'
+   );
+
+   expect(err).toBe('');
+ });
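
Both tests above assert `expect(err).toBe('')`, relying on `testPrompt` returning an empty string on success and a markdown bullet list on failure. As a rough illustration of that contract, the sketch below builds such a list from findings shaped like the judge schema (`target`, `path`, `line`, `message`, `assertionRef`). `formatFindings` and its exact bullet layout are assumptions for illustration only; the package's own formatter is not shown in this diff and may differ.

```javascript
'use strict';

// Hypothetical helper: turns judge findings into a markdown bullet list.
// Field names follow the judge schema in this release; the layout is assumed.
function formatFindings(findings) {
  return findings
    .map((f) => {
      // Prefer "path:line" when a file is cited, otherwise fall back to the target kind.
      const where = f.path ? `${f.path}${f.line ? `:${f.line}` : ''}` : f.target;
      const ref = f.assertionRef ? ` (assertion: "${f.assertionRef}")` : '';
      return `- [${where}] ${f.message}${ref}`;
    })
    .join('\n');
}

// An empty findings list yields '', which is what the tests compare against.
const report = formatFindings([
  {
    target: 'file',
    path: 'hello.md',
    line: 1,
    message: 'File is longer than 100 chars',
    assertionRef: 'should be <100 chars'
  },
  { target: 'output', path: null, line: null, message: 'Output contains extra text', assertionRef: '' }
]);

console.log(report);
```

Returning a plain string keeps the Jest assertion trivial while still surfacing every finding in the failure diff.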
@@ -11,26 +11,32 @@ metadata:
 
  ## Overview
 
- Guides the creation of a well-structured research document by following `_core-adr-006`, checking related XDRs and existing research to avoid duplication, and producing an IMRAD-based study that reads as a standalone technical paper. Treat each section goal in the research template as an acceptance criterion, not as optional wording. Do not assume missing direction, evidence, or intended follow-up; ask the user explicitly before proceeding when those points are not already concrete.
+ Guides the creation of a well-structured research document by following `_core-adr-006`, consulting `xdr-standards` for every core element definition, checking related XDRs and existing research to avoid duplication, and producing an IMRAD-based study that reads as a standalone technical paper. Treat each section goal in the research template as an acceptance criterion, not as optional wording. Do not assume missing direction, evidence, or intended follow-up; ask the user explicitly before proceeding when those points are not already concrete.
 
  ## Instructions
 
  ### Phase 1: Understand the Research Goal
 
  1. Read `.xdrs/_core/adrs/principles/006-research-standards.md` in full to internalize the folder layout, numbering rules, and mandatory template.
- 2. Ask the user to confirm the intended direction of the research before planning the document: what decision, question, or option space the study should support, what boundaries or exclusions apply, and what kind of outcome they expect.
- 3. Ask the user what evidence already exists and what evidence-gathering methods are acceptable if the current evidence is incomplete. Do not invent facts, sources, or confidence that the user did not provide.
- 4. Ask the user what the proposed next step is after the research, such as writing a new XDR, updating an existing XDR, informing a discussion, or documenting trade-offs for later. Use that answer to shape the framing without turning the research into the final decision.
- 5. Identify the problem or question being explored, the relevant system or domain context, the likely technical audience, and why the subject matters in practice.
- 6. Internalize the goal of each required section before drafting: `Abstract` gives a quick technical reader the question, method, main result, and takeaway, `Introduction` frames the investigated problem and context, `Methods` makes the important parts reproducible, `Results` records raw findings with minimal interpretation, `Discussion` interprets the findings, `Conclusion` summarizes the practical takeaway and boundaries, and `References` makes sources traceable.
- 7. Collect the main constraints, known facts, important experiences, gaps, and assumptions that belong in the introduction.
- 8. Do NOT proceed without a clear problem statement, a central question, explicit user direction, an understood next step, and at least one credible source of evidence or a method for generating it. If any of these are ambiguous, stop and ask instead of assuming.
+ 2. Read `.xdrs/_core/adrs/principles/001-xdr-standards.md` in full before defining the research document's core elements. Treat it as the canonical source for how to choose and write type, scope, subject, numbering expectations, naming constraints, and folder placement.
+ 3. Ask the user to confirm the intended direction of the research before planning the document: what decision, question, or option space the study should support, what boundaries or exclusions apply, and what kind of outcome they expect.
+ 4. Ask the user what evidence already exists and what evidence-gathering methods are acceptable if the current evidence is incomplete. Do not invent facts, sources, or confidence that the user did not provide.
+ 5. Ask the user what the proposed next step is after the research, such as writing a new XDR, updating an existing XDR, informing a discussion, or documenting trade-offs for later. Use that answer to shape the framing without turning the research into the final decision.
+ 6. Identify the problem or question being explored, the relevant system or domain context, the likely technical audience, and why the subject matters in practice.
+ 7. Internalize the goal of each required section before drafting: `Abstract` gives a quick technical reader the question, method, main result, and takeaway, `Introduction` frames the investigated problem and context, `Methods` makes the important parts reproducible, `Results` records raw findings with minimal interpretation, `Discussion` interprets the findings, `Conclusion` summarizes the practical takeaway and boundaries, and `References` makes sources traceable.
+ 8. Collect the main constraints, known facts, important experiences, gaps, and assumptions that belong in the introduction.
+ 9. Do NOT proceed without a clear problem statement, a central question, explicit user direction, an understood next step, and at least one credible source of evidence or a method for generating it. If any of these are ambiguous, stop and ask instead of assuming.
 
  ### Phase 2: Select Scope, Type, Subject, and Number
 
+ Consult `001-xdr-standards` while making each choice in this phase. The summaries below are orientation only; when any detail matters, the standard decides.
+
  **Scope** — use `_local` unless the user explicitly names another scope.
 
- **Type** — match the type of decision this research supports (`adrs`, `bdrs`, or `edrs`).
+ **Type** — match the type of decision this research supports (`adrs`, `bdrs`, or `edrs`). Use the same rules as `002-write-xdr` Phase 2:
+ - **BDR**: business process, product policy, strategic rule, operational procedure
+ - **ADR**: system context, integration pattern, overarching architectural choice
+ - **EDR**: specific tool/library, coding practice, testing strategy, project structure, pipelines
 
  **Subject** — pick the most specific subject that matches the problem domain.
 
@@ -253,4 +259,11 @@ If any check fails, revise before continuing.
 
  - [_core-adr-006 - Research standards](../../006-research-standards.md)
  - [_core-adr-001 - XDR standards](../../001-xdr-standards.md)
- - [002-write-xdr skill](../002-write-xdr/SKILL.md)
+ - [002-write-xdr skill](../002-write-xdr/SKILL.md)
+
+ ## Constraints
+
+ - MUST consult `001-xdr-standards` as the canonical source for every core element definition, especially type, scope, subject, numbering, naming, and placement.
+ - MUST follow the research template and section-goal rules from `006-research-standards`.
+ - MUST keep scope `_local` unless the user explicitly states otherwise.
+ - MUST keep the document as research rather than turning it into a final decision.
package/.xdrs/index.md CHANGED
@@ -19,6 +19,4 @@ Decisions about how XDRs work
 
  ### _local (reserved)
 
- Project-local XDRs that must not be shared with other contexts. Always keep this scope last so its decisions override or extend all scopes listed above. Add specific `_local` ADR/BDR/EDR index links here when present.
-
-
+ Project-local XDRs that must not be shared with other contexts. Always keep this scope last so its decisions override or extend all scopes listed above. Keep `_local` canonical indexes in the workspace tree only; do not link them from this shared index. Readers and tools should still try to discover existing `_local` indexes in the current workspace by default.
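
The new wording asks readers and tools to discover `_local` canonical indexes in the workspace even though the shared root index no longer links them. A minimal sketch of such discovery follows, assuming canonical type indexes live at `.xdrs/_local/<type>/index.md`. The `index.md` file name and the `findLocalIndexes` helper are assumptions made for illustration and are not part of the package.

```javascript
'use strict';

const fs = require('fs');
const os = require('os');
const path = require('path');

// The three XDR types named by the standard.
const TYPES = ['adrs', 'bdrs', 'edrs'];

// Hypothetical helper: returns the _local type indexes that exist in a workspace.
// Assumes the layout <workspaceRoot>/.xdrs/_local/<type>/index.md.
function findLocalIndexes(workspaceRoot) {
  return TYPES
    .map((type) => path.join(workspaceRoot, '.xdrs', '_local', type, 'index.md'))
    .filter((candidate) => fs.existsSync(candidate));
}

// Demo in a throwaway workspace that only has a _local EDR index.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'xdrs-local-demo-'));
fs.mkdirSync(path.join(demoRoot, '.xdrs', '_local', 'edrs'), { recursive: true });
fs.writeFileSync(path.join(demoRoot, '.xdrs', '_local', 'edrs', 'index.md'), '# _local EDRs\n');

const found = findLocalIndexes(demoRoot);
console.log(found.map((p) => path.relative(demoRoot, p)));

fs.rmSync(demoRoot, { recursive: true, force: true });
```

Probing fixed locations keeps discovery independent of the shared root index, which is the point of the rule.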
package/README.md CHANGED
@@ -68,6 +68,46 @@ npx -y xdrs-core lint ./some-project
  pnpm exec xdrs-core lint .
  ```
 
+ ## Library Testing
+
+ The package also exposes a reusable behavior-test library for Jest or any other JavaScript test runner.
+
+ Main exports:
+
+ - `testPrompt(config, inputPrompt, judgePrompt, verbose)` runs the task prompt, evaluates the result in a fresh judge session, and returns an empty string on success or a markdown bullet list on failure. The optional `verbose` flag logs the config, prompts, and phase progress to the console.
+ - `runPromptTest(config, inputPrompt, judgePrompt, verbose)` returns the structured result object when you need access to captured output and the agent-reported changed file list. It accepts the same optional `verbose` flag.
+ - `copilotCmd(workspaceRoot)` returns a ready-to-use `promptCmd` template for the Copilot CLI. The library uses that same command template for both the task and judge phases. If `workspaceRoot` is omitted it defaults to the current git repository root.
+ - `config.workspaceRoot`, when set, is the authoritative workspace under test. If omitted, the library uses the current git repository root.
+
+ Execution model:
+
+ - phase 1 runs the task prompt and captures final output text plus the files the agent says it changed
+ - phase 2 runs an independent judge prompt in a fresh invocation of `promptCmd` against the original task prompt, task output, the agent-reported changed file list, and the current workspace state
+ - the judge trusts that reported file list as the authoritative change report and reads file contents from the workspace directly when needed
+ - when `workspaceMode: 'copy'` is used, the temporary workspace honors nested `.gitignore` rules and skips git metadata files during the copy
+
+ `promptCmd` accepts either a string array or a JSON array string and must include a `{PROMPT}` placeholder.
+
+ Example with Jest:
+
+ ```js
+ const { copilotCmd, testPrompt } = require('xdrs-core');
+
+ test('creates hello.md', () => {
+   const err = testPrompt(
+     {
+       workspaceRoot: process.cwd(),
+       promptCmd: copilotCmd(process.cwd()),
+       workspaceMode: 'copy'
+     },
+     "Create a nice markdown file at hello.md saying 'hello!'",
+     'The resulting file should be created at hello.md and have hello as part of its contents, without too much extra info (should be <100 chars)'
+   );
+
+   expect(err).toBe('');
+ });
+ ```
+
  ## Requirements
 
  ### Multi-scope support
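
The README hunk above states that `promptCmd` may be a string array or a JSON array string and must contain a `{PROMPT}` placeholder. The sketch below shows one way such a template could be validated and expanded into an argument vector. `parseTemplate` and `fillPrompt` are hypothetical names; the library's internal parsing is not visible in this hunk and may behave differently.

```javascript
'use strict';

const PLACEHOLDER = '{PROMPT}';

// Hypothetical helper: accepts a string array or a JSON array string.
function parseTemplate(template) {
  const parts = typeof template === 'string' ? JSON.parse(template) : template;
  if (!Array.isArray(parts) || parts.length === 0 || !parts.every((p) => typeof p === 'string')) {
    throw new Error('promptCmd must be a non-empty array of strings');
  }
  if (!parts.some((p) => p.includes(PLACEHOLDER))) {
    throw new Error('promptCmd must include a {PROMPT} placeholder');
  }
  return parts;
}

// Hypothetical helper: substitutes the prompt into every argument that mentions it.
function fillPrompt(parts, prompt) {
  // split/join replaces all occurrences without any regex escaping concerns.
  return parts.map((p) => p.split(PLACEHOLDER).join(prompt));
}

const argv = fillPrompt(
  parseTemplate('["copilot", "--allow-all", "-p", "{PROMPT}"]'),
  'Reply with READY'
);

console.log(argv);
```

Keeping the command as an argument array, rather than one shell string, means the prompt text never needs shell quoting.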
package/lib/index.js ADDED
@@ -0,0 +1,3 @@
+ 'use strict';
+
+ module.exports = require('./testPrompt');
package/lib/lint.js CHANGED
@@ -96,32 +96,30 @@ function lintRootIndex(rootIndexPath, xdrsRoot, actualTypeIndexes, errors) {
      errors.push(`Root index is missing required override text: ${toDisplayPath(rootIndexPath)}`);
    }
 
-   const localLinks = parseLocalLinks(content, path.dirname(rootIndexPath));
-   for (const linkPath of localLinks) {
+   const links = parseLocalLinks(content, path.dirname(rootIndexPath));
+   for (const linkPath of links) {
      if (!fs.existsSync(linkPath)) {
        errors.push(`Broken link in root index: ${displayPath(rootIndexPath, linkPath)}`);
      }
    }
 
-   const linkedTypeIndexes = localLinks.filter((linkPath) => isCanonicalTypeIndex(linkPath, xdrsRoot));
+   const linkedTypeIndexes = links.filter((linkPath) => isCanonicalTypeIndex(linkPath, xdrsRoot));
    const linkedSet = new Set(linkedTypeIndexes.map(normalizePath));
 
-   for (const indexPath of actualTypeIndexes) {
-     if (!linkedSet.has(normalizePath(indexPath))) {
-       errors.push(`Root index is missing canonical index link: ${toDisplayPath(indexPath)}`);
+   for (const indexPath of linkedTypeIndexes) {
+     const scopeName = path.basename(path.dirname(path.dirname(indexPath)));
+     if (scopeName === '_local') {
+       errors.push(`Root index must not link _local canonical index: ${displayPath(rootIndexPath, indexPath)}`);
      }
    }
 
-   let seenLocal = false;
-   for (const indexPath of linkedTypeIndexes) {
+   for (const indexPath of actualTypeIndexes) {
      const scopeName = path.basename(path.dirname(path.dirname(indexPath)));
      if (scopeName === '_local') {
-       seenLocal = true;
        continue;
      }
-     if (seenLocal) {
-       errors.push('Root index must keep all _local scope links after every non-_local scope link');
-       break;
+     if (!linkedSet.has(normalizePath(indexPath))) {
+       errors.push(`Root index is missing canonical index link: ${toDisplayPath(indexPath)}`);
      }
    }
  }
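
The reworked lint checks reduce to two rules: the root index must not link any `_local` canonical index, and every non-`_local` canonical index must still be linked. A self-contained sketch of that logic follows, assuming index paths shaped like `<xdrsRoot>/<scope>/<type>/index.md`. `checkRootIndexLinks`, `scopeOf`, and the sample paths are illustrative and not the package's API.

```javascript
'use strict';

const path = require('path');

// The scope is the directory two levels above the index file, as in the hunk above.
function scopeOf(indexPath) {
  return path.basename(path.dirname(path.dirname(indexPath)));
}

// Hypothetical stand-in for the two loops in lintRootIndex.
function checkRootIndexLinks(linkedTypeIndexes, actualTypeIndexes) {
  const errors = [];
  const linkedSet = new Set(linkedTypeIndexes.map((p) => path.normalize(p)));

  // Rule 1: a shared root index must never link a _local index.
  for (const indexPath of linkedTypeIndexes) {
    if (scopeOf(indexPath) === '_local') {
      errors.push(`Root index must not link _local canonical index: ${indexPath}`);
    }
  }

  // Rule 2: every other canonical index found on disk must be linked.
  for (const indexPath of actualTypeIndexes) {
    if (scopeOf(indexPath) === '_local') {
      continue;
    }
    if (!linkedSet.has(path.normalize(indexPath))) {
      errors.push(`Root index is missing canonical index link: ${indexPath}`);
    }
  }

  return errors;
}

const errors = checkRootIndexLinks(
  ['.xdrs/_core/adrs/index.md', '.xdrs/_local/edrs/index.md'],
  ['.xdrs/_core/adrs/index.md', '.xdrs/team-43/bdrs/index.md', '.xdrs/_local/edrs/index.md']
);

console.log(errors.join('\n'));
```

Note that the earlier ordering check (all `_local` links last) disappears entirely, since `_local` links are now forbidden rather than ordered.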
@@ -0,0 +1,660 @@
+ #!/usr/bin/env node
+ 'use strict';
+
+ const fs = require('fs');
+ const ignore = require('ignore');
+ const os = require('os');
+ const path = require('path');
+ const { spawnSync } = require('child_process');
+
+ const MAX_TASK_OUTPUT_CHARS = 12 * 1024;
+
+ function testPrompt(config, inputPrompt, judgePrompt, verbose) {
+   const result = runPromptTest(config, inputPrompt, judgePrompt, verbose);
+   return result.passed ? '' : formatFailureMarkdown(result.findings);
+ }
+
+ function runPromptTest(config, inputPrompt, judgePrompt, verbose) {
+   if (verbose) {
+     console.log('Running prompt test with config:', JSON.stringify(config, null, 2));
+     console.log('Input Prompt:', inputPrompt);
+     console.log('Judge Prompt:', judgePrompt);
+   }
+   const options = normalizeConfig(config);
+   const originalWorkspace = resolveWorkspaceRoot(options);
+   let tempRoot = null;
+   let effectiveWorkspace = originalWorkspace;
+
+   try {
+     if (options.workspaceMode === 'copy') {
+       tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'xdrs-core-test-'));
+       effectiveWorkspace = copyWorkspace(originalWorkspace, path.join(tempRoot, 'workspace'), verbose);
+     }
+
+     if (verbose) {
+       console.log(`Running prompt test in workspace: ${effectiveWorkspace} (mode: ${options.workspaceMode})`);
+     }
+     const task = runTaskPhase({
+       prompt: ensureNonEmptyString(inputPrompt, 'inputPrompt'),
+       commandTemplate: options.promptCmd,
+       workspacePath: effectiveWorkspace,
+       authoritativeWorkspacePath: originalWorkspace,
+       timeoutMs: options.taskTimeoutMs,
+       env: options.env,
+       verbose
+     });
+
+     if (verbose) {
+       console.log('Task phase completed. Summary:', task.summary);
+       console.log('Agent reported changed files:', task.changedFiles);
+     }
+
+     if (verbose) {
+       console.log('Running judge phase to evaluate the task output against the judge prompt.');
+     }
+     const evaluation = runJudgePhase({
+       originalPrompt: ensureNonEmptyString(inputPrompt, 'inputPrompt'),
+       judgePrompt: ensureNonEmptyString(judgePrompt, 'judgePrompt'),
+       taskOutput: task.summary,
+       agentReportedChanges: task.changedFiles,
+       commandTemplate: options.promptCmd,
+       workspacePath: effectiveWorkspace,
+       authoritativeWorkspacePath: originalWorkspace,
+       timeoutMs: options.judgeTimeoutMs,
+       env: options.env,
+       verbose
+     });
+
+     return {
+       passed: evaluation.pass,
+       findings: evaluation.findings,
+       taskOutput: task.summary,
+       agentReportedChanges: task.changedFiles,
+       judge: evaluation.raw,
+       workspace: {
+         original: originalWorkspace,
+         effective: effectiveWorkspace,
+         mode: options.workspaceMode
+       }
+     };
+   } finally {
+     if (tempRoot && options.workspaceMode === 'copy') {
+       fs.rmSync(tempRoot, { recursive: true, force: true });
+     }
+   }
+ }
+
+ function copilotCmd(workspaceRoot = findGitRoot(process.cwd())) {
+   return [
+     'copilot',
+     `--add-dir=${path.resolve(workspaceRoot)}`,
+     '--allow-all',
+     '-p',
+     '{PROMPT}'
+   ];
+ }
+
+ function ensureNonEmptyString(value, label) {
+   if (typeof value !== 'string' || !value.trim()) {
+     throw new Error(`Expected non-empty ${label}`);
+   }
+   return value.trim();
+ }
+
+ function normalizeConfig(config) {
+   if (!config || typeof config !== 'object' || Array.isArray(config)) {
+     throw new Error('Expected config to be an object.');
+   }
+
+   const workspaceMode = config.workspaceMode || 'copy';
+   if (workspaceMode !== 'copy' && workspaceMode !== 'in-place') {
+     throw new Error(`Invalid workspaceMode value: ${workspaceMode}`);
+   }
+
+   return {
+     promptCmd: parseCommandTemplate(config.promptCmd, 'promptCmd'),
+     workspaceRoot: config.workspaceRoot ? path.resolve(config.workspaceRoot) : null,
+     workspaceMode,
+     env: normalizeEnv(config.env),
+     taskTimeoutMs: readTimeout(config.taskTimeoutMs, 'taskTimeoutMs'),
+     judgeTimeoutMs: readTimeout(config.judgeTimeoutMs, 'judgeTimeoutMs')
+   };
+ }
+
+ function resolveWorkspaceRoot(options) {
+   const resolvedWorkspace = options.workspaceRoot || findGitRoot(process.cwd());
+
+   if (!fs.existsSync(resolvedWorkspace) || !fs.statSync(resolvedWorkspace).isDirectory()) {
+     throw new Error(`Workspace directory not found: ${resolvedWorkspace}`);
+   }
+
+   return resolvedWorkspace;
+ }
+
+ // `verbose` is read from the options object, which is how runPromptTest passes it.
+ function runTaskPhase({ prompt, commandTemplate, workspacePath, authoritativeWorkspacePath, timeoutMs, env, verbose }) {
+   const wrappedPrompt = [
+     'XDRS-CORE TEST PHASE: TASK',
+     '',
+     'Execute the following task in the current workspace.',
+     'Keep all changes inside the workspace.',
+     'Respond with JSON only and no code fences.',
+     'Use exactly this schema: {"summary":"plain text summary","changedFiles":["relative/path.ext"]}.',
+     'The summary must describe the final result only, not hidden reasoning.',
+     '',
+     'BEGIN TASK PROMPT',
+     prompt,
+     'END TASK PROMPT'
+   ].join('\n');
+
+   const result = runPromptCommand({
+     commandTemplate,
+     workspacePath,
+     authoritativeWorkspacePath,
+     prompt: wrappedPrompt,
+     timeoutMs,
+     env,
+     verbose
+   });
+
+   return parseTaskResponse(result.output);
+ }
+
+ // `verbose` is read from the options object, which is how runPromptTest passes it.
+ function runJudgePhase({ originalPrompt, judgePrompt, taskOutput, agentReportedChanges, commandTemplate, workspacePath, authoritativeWorkspacePath, timeoutMs, env, verbose }) {
+   const wrappedPrompt = [
+     'XDRS-CORE TEST PHASE: ASSERTION_EVALUATION',
+     '',
+     'You are evaluating the result of a separate agent task run.',
+     'Treat this as a fresh session. Do not assume any hidden history.',
+     'Use the original task prompt, the judge prompt, the final task output, the reported changed file paths, and the current workspace state to decide whether the result passes.',
+     'Trust the reported changed file path list as the authoritative change report for this task run.',
+     'Read files from the workspace directly when you need their contents.',
+     'Inspect files in the workspace directly when needed.',
+     'Respond with JSON only and no code fences.',
+     'Use exactly this schema: {"pass":true,"findings":[]} or {"pass":false,"findings":[{"target":"file","path":"relative/path.ext","line":1,"message":"explanation","assertionRef":"exact relevant phrase from the judge prompt"}]}.',
+     'Use target="output" when the issue is in the final task output and target="workspace" when it is not tied to a specific file.',
+     'Include 1-based line numbers when you cite a file or the output text. Include the exact judge-prompt phrase that triggered each finding in assertionRef.',
+     '',
+     'BEGIN ORIGINAL TASK PROMPT',
+     originalPrompt,
+     'END ORIGINAL TASK PROMPT',
+     '',
+     'BEGIN JUDGE PROMPT',
+     judgePrompt,
+     'END JUDGE PROMPT',
+     '',
+     'BEGIN TASK OUTPUT',
+     truncateText(taskOutput || '(empty)', MAX_TASK_OUTPUT_CHARS),
+     'END TASK OUTPUT',
+     '',
+     'BEGIN AGENT REPORTED CHANGES JSON',
+     JSON.stringify(agentReportedChanges, null, 2),
+     'END AGENT REPORTED CHANGES JSON'
+   ].join('\n');
+
+   const result = runPromptCommand({
+     commandTemplate,
+     workspacePath,
+     authoritativeWorkspacePath,
+     prompt: wrappedPrompt,
+     timeoutMs,
+     env,
+     verbose
+   });
+
+   return normalizeJudgeResponse(result.output);
+ }
+
+ function parseTaskResponse(output) {
+   const trimmed = String(output || '').trim();
+   if (!trimmed) {
+     throw new Error('The task command returned empty output.');
+   }
+
+   try {
+     const parsed = parseJsonObject(trimmed);
+     return {
+       summary: typeof parsed.summary === 'string' && parsed.summary.trim()
+         ? parsed.summary.trim()
+         : trimmed,
+       changedFiles: normalizeStringArray(parsed.changedFiles)
+     };
+   } catch (error) {
+     return {
+       summary: trimmed,
+       changedFiles: []
+     };
+   }
+ }
+
+ function normalizeJudgeResponse(output) {
+   let parsed;
+
+   try {
+     parsed = parseJsonObject(output);
+   } catch (error) {
+     throw new Error(`Judge returned invalid JSON: ${truncateText(String(output || '').trim(), 1000)}`);
+   }
+
+   if (!parsed || typeof parsed !== 'object' || Array.isArray(parsed)) {
+     throw new Error('Judge response must be a JSON object.');
+   }
+
+   if (typeof parsed.pass !== 'boolean') {
+     throw new Error('Judge response must include a boolean pass field.');
+   }
+
+   let findings = [];
+   if (Array.isArray(parsed.findings)) {
+     findings = parsed.findings.map(normalizeFinding).filter(Boolean);
+   } else if (Array.isArray(parsed.reasons)) {
+     findings = parsed.reasons
+       .filter((reason) => typeof reason === 'string' && reason.trim())
+       .map((reason) => ({
+         target: 'workspace',
+         message: reason.trim(),
+         path: null,
+         line: null,
+         assertionRef: ''
+       }));
+   }
+
+   if (!parsed.pass && findings.length === 0) {
+     findings = [{
263
+ target: 'workspace',
264
+ message: 'Judge reported failure without detailed findings.',
265
+ path: null,
266
+ line: null,
267
+ assertionRef: ''
268
+ }];
269
+ }
270
+
271
+ return {
272
+ pass: parsed.pass,
273
+ findings,
274
+ raw: parsed
275
+ };
276
+ }
277
+
278
+ function normalizeFinding(finding) {
279
+ if (!finding) {
280
+ return null;
281
+ }
282
+
283
+ if (typeof finding === 'string') {
284
+ const message = finding.trim();
285
+ return message ? {
286
+ target: 'workspace',
287
+ message,
288
+ path: null,
289
+ line: null,
290
+ assertionRef: ''
291
+ } : null;
292
+ }
293
+
294
+ if (typeof finding !== 'object' || Array.isArray(finding)) {
295
+ return null;
296
+ }
297
+
298
+ const message = typeof finding.message === 'string' ? finding.message.trim() : '';
299
+ if (!message) {
300
+ return null;
301
+ }
302
+
303
+ const pathValue = typeof finding.path === 'string' && finding.path.trim() ? finding.path.trim() : null;
304
+ const lineValue = normalizeLineNumber(finding.line);
305
+ const target = finding.target === 'file' || finding.target === 'output' || finding.target === 'workspace'
306
+ ? finding.target
307
+ : (pathValue ? 'file' : 'workspace');
308
+
309
+ return {
310
+ target,
311
+ path: pathValue,
312
+ line: lineValue,
313
+ message,
314
+ assertionRef: typeof finding.assertionRef === 'string' ? finding.assertionRef.trim() : ''
315
+ };
316
+ }
317
+
318
+ function runPromptCommand({ commandTemplate, workspacePath, authoritativeWorkspacePath, prompt, timeoutMs, env, verbose }) {
319
+ const command = rewriteWorkspaceCommand(commandTemplate.map((entry) => entry
320
+ .replace('{PROMPT}', () => prompt)
321
+ .replace('{WORKSPACE_ROOT}', () => workspacePath)), workspacePath, authoritativeWorkspacePath);
322
+
323
+ const [file, ...args] = command;
324
+
325
+ if(verbose) {
326
+ console.log(`Running prompt cmd: ${file} ${args.join(' ')} in workspace: ${workspacePath}`);
327
+ }
328
+
329
+ const result = spawnSync(file, args, {
330
+ encoding: 'utf8',
331
+ cwd: workspacePath,
332
+ timeout: timeoutMs || undefined,
333
+ maxBuffer: 10 * 1024 * 1024,
334
+ env: {
335
+ ...process.env,
336
+ ...env
337
+ }
338
+ });
339
+
340
+ if(verbose) {
341
+ console.log(`Prompt command output: ${result.stdout || result.stderr}`);
342
+ }
343
+
344
+
345
+ if (result.error) {
346
+ if (result.error.code === 'ENOENT') {
347
+ throw new Error(`Command not found: ${file}`);
348
+ }
349
+ throw new Error(`Failed to execute ${file}: ${result.error.message}`);
350
+ }
351
+
352
+ if (result.status !== 0) {
353
+ const details = truncateText((result.stderr || result.stdout || '').trim(), 2000);
354
+ throw new Error(`${file} exited with status ${result.status}${details ? `: ${details}` : ''}`);
355
+ }
356
+
357
+ const output = (result.stdout || '').trim() || (result.stderr || '').trim();
358
+ if (!output) {
359
+ throw new Error(`${file} returned empty output.`);
360
+ }
361
+
362
+ if(verbose) {
363
+ console.log(`Prompt command output: ${output}`);
364
+ }
365
+
366
+ return { output };
367
+ }
368
+
369
+ function rewriteWorkspaceCommand(command, workspacePath, authoritativeWorkspacePath) {
370
+ if (!authoritativeWorkspacePath || path.resolve(workspacePath) === path.resolve(authoritativeWorkspacePath)) {
371
+ return command;
372
+ }
373
+
374
+ const normalizedAuthoritativeWorkspacePath = path.resolve(authoritativeWorkspacePath);
375
+ return command.map((entry, index, allEntries) => {
376
+ if (entry === '--add-dir' && typeof allEntries[index + 1] === 'string') {
377
+ return entry;
378
+ }
379
+
380
+ if (index > 0 && allEntries[index - 1] === '--add-dir' && path.resolve(entry) === normalizedAuthoritativeWorkspacePath) {
381
+ return workspacePath;
382
+ }
383
+
384
+ if (!entry.startsWith('--add-dir=')) {
385
+ return entry;
386
+ }
387
+
388
+ const addDirPath = entry.slice('--add-dir='.length);
389
+ if (path.resolve(addDirPath) !== normalizedAuthoritativeWorkspacePath) {
390
+ return entry;
391
+ }
392
+
393
+ return `--add-dir=${workspacePath}`;
394
+ });
395
+ }
396
+
397
+ function parseCommandTemplate(value, label) {
398
+ if (Array.isArray(value)) {
399
+ return normalizeCommandArray(value, label);
400
+ }
401
+
402
+ if (typeof value !== 'string' || !value.trim()) {
403
+ throw new Error(`Expected ${label} to be a non-empty JSON array string or string array.`);
404
+ }
405
+
406
+ let parsed;
407
+ try {
408
+ parsed = JSON.parse(value);
409
+ } catch (error) {
410
+ throw new Error(`${label} must be a JSON array string or a string array.`);
411
+ }
412
+
413
+ return normalizeCommandArray(parsed, label);
414
+ }
415
+
416
+ function normalizeCommandArray(value, label) {
417
+ if (!Array.isArray(value) || value.length === 0 || value.some((entry) => typeof entry !== 'string' || !entry)) {
418
+ throw new Error(`${label} must be a non-empty array of strings.`);
419
+ }
420
+
421
+ if (!value.some((entry) => entry.includes('{PROMPT}'))) {
422
+ throw new Error(`${label} must include a {PROMPT} placeholder.`);
423
+ }
424
+
425
+ return [...value];
426
+ }
427
+
428
+ function normalizeEnv(env) {
429
+ if (env == null) {
430
+ return {};
431
+ }
432
+
433
+ if (!env || typeof env !== 'object' || Array.isArray(env)) {
434
+ throw new Error('Expected env to be an object when provided.');
435
+ }
436
+
437
+ return { ...env };
438
+ }
439
+
440
+ function readTimeout(value, label) {
441
+ if (value == null) {
442
+ return 0;
443
+ }
444
+
445
+ const parsed = Number.parseInt(value, 10);
446
+ if (!Number.isFinite(parsed) || parsed < 0) {
447
+ throw new Error(`Invalid ${label} value: ${value}`);
448
+ }
449
+ return parsed;
450
+ }
451
+
452
+ function parseJsonObject(value) {
453
+ const trimmed = String(value || '').trim();
454
+ try {
455
+ return JSON.parse(trimmed);
456
+ } catch (error) {
457
+ const firstBrace = trimmed.indexOf('{');
458
+ const lastBrace = trimmed.lastIndexOf('}');
459
+ if (firstBrace !== -1 && lastBrace !== -1 && lastBrace > firstBrace) {
460
+ return JSON.parse(trimmed.slice(firstBrace, lastBrace + 1));
461
+ }
462
+ throw error;
463
+ }
464
+ }
465
+
466
+ function normalizeStringArray(values) {
467
+ if (!Array.isArray(values)) {
468
+ return [];
469
+ }
470
+
471
+ return [...new Set(values
472
+ .filter((value) => typeof value === 'string' && value.trim())
473
+ .map((value) => value.trim()))].sort((left, right) => left.localeCompare(right));
474
+ }
475
+
476
+ function normalizeLineNumber(value) {
477
+ const parsed = Number.parseInt(value, 10);
478
+ return Number.isFinite(parsed) && parsed > 0 ? parsed : null;
479
+ }
480
+
481
+ function formatFailureMarkdown(findings) {
482
+ const normalizedFindings = Array.isArray(findings) && findings.length > 0
483
+ ? findings
484
+ : [{ target: 'workspace', message: 'The prompt test failed without detailed findings.' }];
485
+
486
+ return normalizedFindings.map((finding) => {
487
+ const location = formatFindingLocation(finding);
488
+ const assertion = finding.assertionRef ? ` Assertion: "${finding.assertionRef}".` : '';
489
+ return `- [${location}] ${finding.message}${assertion}`;
490
+ }).join('\n');
491
+ }
492
+
493
+ function formatFindingLocation(finding) {
494
+ if (finding.target === 'output') {
495
+ return finding.line ? `output:${finding.line}` : 'output';
496
+ }
497
+
498
+ if (finding.path) {
499
+ return finding.line ? `${finding.path}:${finding.line}` : finding.path;
500
+ }
501
+
502
+ return 'workspace';
503
+ }
504
+
505
+ function findGitRoot(startPath) {
506
+ let currentPath = path.resolve(startPath);
507
+
508
+ while (true) {
509
+ if (fs.existsSync(path.join(currentPath, '.git'))) {
510
+ return currentPath;
511
+ }
512
+
513
+ const parentPath = path.dirname(currentPath);
514
+ if (parentPath === currentPath) {
515
+ return path.resolve(startPath);
516
+ }
517
+
518
+ currentPath = parentPath;
519
+ }
520
+ }
521
+
522
+ function copyWorkspace(sourcePath, targetPath, verbose) {
523
+ if(verbose) {
524
+ console.log(`Copying workspace from ${sourcePath} to ${targetPath}`);
525
+ }
526
+ fs.mkdirSync(targetPath, { recursive: true });
527
+ copyWorkspaceDirectory({
528
+ sourceDir: sourcePath,
529
+ targetDir: targetPath,
530
+ rootPath: sourcePath,
531
+ ignoreContexts: [],
532
+ activeRealDirectories: new Set()
533
+ });
534
+ return targetPath;
535
+ }
536
+
537
+ function copyWorkspaceDirectory({ sourceDir, targetDir, rootPath, ignoreContexts, activeRealDirectories }) {
538
+ const realSourceDir = fs.realpathSync(sourceDir);
539
+ if (activeRealDirectories.has(realSourceDir)) {
540
+ return;
541
+ }
542
+ activeRealDirectories.add(realSourceDir);
543
+
544
+ try {
545
+ const currentRelativeDir = toWorkspaceRelativePath(path.relative(rootPath, sourceDir));
546
+ const nextIgnoreContexts = loadGitignoreContext(sourceDir, currentRelativeDir, ignoreContexts);
547
+ const entries = fs.readdirSync(sourceDir, { withFileTypes: true });
548
+
549
+ for (const entry of entries) {
550
+ if (!shouldCopyWorkspaceEntry(entry.name)) {
551
+ continue;
552
+ }
553
+
554
+ const sourceEntryPath = path.join(sourceDir, entry.name);
555
+ const entryStats = entry.isSymbolicLink() ? fs.statSync(sourceEntryPath) : null;
556
+ const isDirectory = entry.isDirectory() || (entryStats && entryStats.isDirectory());
557
+ const isFile = entry.isFile() || (entryStats && entryStats.isFile());
558
+
559
+ if (!isDirectory && !isFile) {
560
+ continue;
561
+ }
562
+
563
+ const entryRelativePath = toWorkspaceRelativePath(path.relative(rootPath, sourceEntryPath));
564
+ const matchPath = isDirectory ? `${entryRelativePath}/` : entryRelativePath;
565
+ if (isGitignoredPath(matchPath, nextIgnoreContexts)) {
566
+ continue;
567
+ }
568
+
569
+ const targetEntryPath = path.join(targetDir, entry.name);
570
+ if (isDirectory) {
571
+ fs.mkdirSync(targetEntryPath, { recursive: true });
572
+ copyWorkspaceDirectory({
573
+ sourceDir: sourceEntryPath,
574
+ targetDir: targetEntryPath,
575
+ rootPath,
576
+ ignoreContexts: nextIgnoreContexts,
577
+ activeRealDirectories
578
+ });
579
+ continue;
580
+ }
581
+
582
+ fs.copyFileSync(sourceEntryPath, targetEntryPath);
583
+ fs.chmodSync(targetEntryPath, (entryStats || fs.statSync(sourceEntryPath)).mode);
584
+ }
585
+ } finally {
586
+ activeRealDirectories.delete(realSourceDir);
587
+ }
588
+ }
589
+
590
+ function loadGitignoreContext(sourceDir, currentRelativeDir, ignoreContexts) {
591
+ const gitignorePath = path.join(sourceDir, '.gitignore');
592
+ if (!fs.existsSync(gitignorePath) || !fs.statSync(gitignorePath).isFile()) {
593
+ return ignoreContexts;
594
+ }
595
+
596
+ const matcher = ignore();
597
+ matcher.add(fs.readFileSync(gitignorePath, 'utf8'));
598
+ return [...ignoreContexts, { basePath: currentRelativeDir, matcher }];
599
+ }
600
+
601
+ function isGitignoredPath(matchPath, ignoreContexts) {
602
+ let ignored = false;
603
+
604
+ for (const context of ignoreContexts) {
605
+ const relativeToContext = toContextRelativePath(matchPath, context.basePath);
606
+ if (relativeToContext == null) {
607
+ continue;
608
+ }
609
+
610
+ const result = context.matcher.checkIgnore(relativeToContext);
611
+ if (result.unignored) {
612
+ ignored = false;
613
+ }
614
+ if (result.ignored) {
615
+ ignored = true;
616
+ }
617
+ }
618
+
619
+ return ignored;
620
+ }
621
+
622
+ function toContextRelativePath(matchPath, basePath) {
623
+ const isDirectory = matchPath.endsWith('/');
624
+ const barePath = isDirectory ? matchPath.slice(0, -1) : matchPath;
625
+ const relativePath = basePath ? path.posix.relative(basePath, barePath) : barePath;
626
+
627
+ if (!relativePath || relativePath === '.' || relativePath === '..' || relativePath.startsWith('../')) {
628
+ return null;
629
+ }
630
+
631
+ return isDirectory ? `${relativePath}/` : relativePath;
632
+ }
633
+
634
+ function toWorkspaceRelativePath(relativePath) {
635
+ return relativePath ? relativePath.split(path.sep).join(path.posix.sep) : '';
636
+ }
637
+
638
+ function shouldCopyWorkspaceEntry(entryName) {
639
+ return !COPY_WORKSPACE_EXCLUDES.has(entryName);
640
+ }
641
+
642
+ const COPY_WORKSPACE_EXCLUDES = new Set([
643
+ '.git',
644
+ '.gitattributes',
645
+ '.gitignore',
646
+ '.gitmodules'
647
+ ]);
648
+
649
+ function truncateText(value, maxLength) {
650
+ if (value.length <= maxLength) {
651
+ return value;
652
+ }
653
+ return `${value.slice(0, maxLength)}...`;
654
+ }
655
+
656
+ module.exports = {
657
+ testPrompt,
658
+ runPromptTest,
659
+ copilotCmd
660
+ };
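The `parseJsonObject` helper in the file above first attempts a strict `JSON.parse` and, if that fails, retries on the substring between the first `{` and the last `}`. The following is a minimal standalone sketch of that fallback, showing which agent outputs it tolerates. The name `extractJsonObject` and the sample strings are assumptions made for illustration; they are not exports or fixtures of the package.

```javascript
// Hypothetical standalone copy of the fallback logic used by parseJsonObject.
function extractJsonObject(value) {
  const trimmed = String(value || '').trim();
  try {
    // Fast path: the whole output is valid JSON.
    return JSON.parse(trimmed);
  } catch (error) {
    // Fallback: keep only the outermost {...} span and retry.
    const firstBrace = trimmed.indexOf('{');
    const lastBrace = trimmed.lastIndexOf('}');
    if (firstBrace !== -1 && lastBrace > firstBrace) {
      return JSON.parse(trimmed.slice(firstBrace, lastBrace + 1));
    }
    // No object-like span found: surface the original parse error.
    throw error;
  }
}

// An agent that wraps its verdict in prose is still accepted.
const wrapped = 'Verdict follows: {"pass":false,"findings":["summary.txt is missing"]} End of verdict.';
const verdict = extractJsonObject(wrapped);
console.log(verdict.pass, verdict.findings.length);
```

Because the fallback slices from the first `{` to the last `}`, output containing two separate JSON objects would still fail to parse, which in the judge phase surfaces as the "Judge returned invalid JSON" error.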
@@ -0,0 +1,133 @@
1
+ 'use strict';
2
+
3
+ const fs = require('fs');
4
+ const os = require('os');
5
+ const path = require('path');
6
+
7
+ const { copilotCmd, testPrompt } = require('./testPrompt');
8
+
9
+ let TMP_ROOT;
10
+ const COPILOT_DIR = path.join(__dirname, 'tests');
11
+
12
+ beforeAll(() => {
13
+ TMP_ROOT = fs.mkdtempSync(path.join(os.tmpdir(), 'xdrs-core-fixtures-'));
14
+ });
15
+
16
+ afterAll(() => {
17
+ fs.rmSync(TMP_ROOT, { recursive: true, force: true });
18
+ });
19
+
20
+ test('passes a prompt test with copied workspace isolation', () => {
21
+ const workspaceRoot = createWorkspace('customer-pass');
22
+ const err = testPrompt(
23
+ createConfig(workspaceRoot),
24
+ 'create a research about our customer base. We have 30% of customer > 50 years; 90% > 20',
25
+ 'The resulting file should be created at customer-research.md and should not generate facts that are not present in the original prompt'
26
+ );
27
+
28
+ expect(err).toBe('');
29
+ expect(fs.existsSync(path.join(workspaceRoot, 'customer-research.md'))).toBe(false);
30
+ });
31
+
32
+ test('passes when ignored files and git metadata stay out of the copied workspace', () => {
33
+ const workspaceRoot = createWorkspace('ignore-pass', { withIgnoredEntries: true });
34
+ const err = testPrompt(
35
+ createConfig(workspaceRoot),
36
+ 'create a note named summary.txt saying behavior ok',
37
+ 'Verify if ignored/seed.txt, .git/config, and nested/.git/config are not available in the copied workspace and are not reported as changes'
38
+ );
39
+
40
+ expect(err).toBe('');
41
+ expect(fs.existsSync(path.join(workspaceRoot, 'summary.txt'))).toBe(false);
42
+ assertFileExists(path.join(workspaceRoot, 'ignored', 'seed.txt'));
43
+ assertFileExists(path.join(workspaceRoot, '.git', 'config'));
44
+ assertFileExists(path.join(workspaceRoot, 'nested', '.git', 'config'));
45
+ });
46
+
47
+ test('returns markdown findings when the judge rejects the result', () => {
48
+ const workspaceRoot = createWorkspace('failure-case');
49
+ const err = testPrompt(
50
+ createConfig(workspaceRoot),
51
+ 'create a research about our customer base. We have 30% of customer > 50 years; 90% > 20',
52
+ 'Verify if summary.txt exists and the final output mentions summary.txt'
53
+ );
54
+
55
+ expect(err).toContain('- [summary.txt] summary.txt should exist.');
56
+ expect(err).toContain('Assertion: "summary.txt exists".');
57
+ expect(err).toContain('- [output:1] The final output should mention summary.txt.');
58
+ });
59
+
60
+ test('does not create a temp workspace in in-place mode', () => {
61
+ const workspaceRoot = createWorkspace('in-place');
62
+ const mkdtempSpy = jest.spyOn(fs, 'mkdtempSync');
63
+
64
+ try {
65
+ const err = testPrompt(
66
+ createConfig(workspaceRoot, { workspaceMode: 'in-place' }),
67
+ 'create a note named summary.txt saying behavior ok',
68
+ 'Verify if summary.txt exists and the final output mentions summary.txt'
69
+ );
70
+
71
+ expect(err).toBe('');
72
+ expect(mkdtempSpy).not.toHaveBeenCalled();
73
+ expect(fs.existsSync(path.join(workspaceRoot, 'summary.txt'))).toBe(true);
74
+ } finally {
75
+ mkdtempSpy.mockRestore();
76
+ }
77
+ });
78
+
79
+ test('copilotCmd defaults to the git repository root', () => {
80
+ const command = copilotCmd();
81
+ const addDirArgument = command.find((entry) => entry.startsWith('--add-dir='));
82
+
83
+ expect(addDirArgument).toBe(`--add-dir=${path.resolve(__dirname, '..')}`);
84
+ const promptIndex = command.indexOf('-p');
85
+ expect(command[promptIndex + 1]).toBe('{PROMPT}');
86
+ });
87
+
88
+ test('judge phase reuses promptCmd even when judgeCmd is provided', () => {
89
+ const workspaceRoot = createWorkspace('judge-cmd-ignored');
90
+ const err = testPrompt(
91
+ createConfig(workspaceRoot, {
92
+ judgeCmd: ['missing-command', '{PROMPT}']
93
+ }),
94
+ 'create a note named summary.txt saying behavior ok',
95
+ 'Verify if summary.txt exists and the final output mentions summary.txt'
96
+ );
97
+
98
+ expect(err).toBe('');
99
+ });
100
+
101
+ function createConfig(workspaceRoot, overrides = {}) {
102
+ return {
103
+ promptCmd: copilotCmd(workspaceRoot),
104
+ workspaceRoot,
105
+ workspaceMode: 'copy',
106
+ env: {
107
+ PATH: `${COPILOT_DIR}:${process.env.PATH}`
108
+ },
109
+ ...overrides
110
+ };
111
+ }
112
+
113
+ function createWorkspace(name, options = {}) {
114
+ const workspaceRoot = path.join(TMP_ROOT, name);
115
+ fs.mkdirSync(workspaceRoot, { recursive: true });
116
+ fs.writeFileSync(path.join(workspaceRoot, 'seed.txt'), 'seed\n', 'utf8');
117
+
118
+ if (options.withIgnoredEntries) {
119
+ fs.writeFileSync(path.join(workspaceRoot, '.gitignore'), 'ignored/\n', 'utf8');
120
+ fs.mkdirSync(path.join(workspaceRoot, 'ignored'), { recursive: true });
121
+ fs.writeFileSync(path.join(workspaceRoot, 'ignored', 'seed.txt'), 'ignored seed\n', 'utf8');
122
+ fs.mkdirSync(path.join(workspaceRoot, '.git'), { recursive: true });
123
+ fs.writeFileSync(path.join(workspaceRoot, '.git', 'config'), 'git config\n', 'utf8');
124
+ fs.mkdirSync(path.join(workspaceRoot, 'nested', '.git'), { recursive: true });
125
+ fs.writeFileSync(path.join(workspaceRoot, 'nested', '.git', 'config'), 'nested git config\n', 'utf8');
126
+ }
127
+
128
+ return workspaceRoot;
129
+ }
130
+
131
+ function assertFileExists(filePath) {
132
+ expect(fs.existsSync(filePath)).toBe(true);
133
+ }
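The configs built by `createConfig` pass command templates as string arrays containing a `{PROMPT}` placeholder, and `runPromptCommand` also substitutes `{WORKSPACE_ROOT}`. Below is a rough sketch of how such a template might be expanded before it is handed to a process spawner. The helper name `expandCommandTemplate`, the `my-agent` executable, and the `/tmp/ws` path are hypothetical and only serve the illustration.

```javascript
// Hypothetical helper: expands {PROMPT} and {WORKSPACE_ROOT} in a command template.
function expandCommandTemplate(template, { prompt, workspaceRoot }) {
  if (!template.some((entry) => entry.includes('{PROMPT}'))) {
    throw new Error('Command template must include a {PROMPT} placeholder.');
  }
  // Function replacers insert the value verbatim; a plain string replacement
  // would interpret sequences such as "$&" that appear inside the prompt.
  return template.map((entry) => entry
    .replace('{PROMPT}', () => prompt)
    .replace('{WORKSPACE_ROOT}', () => workspaceRoot));
}

const expanded = expandCommandTemplate(
  ['my-agent', '--add-dir={WORKSPACE_ROOT}', '-p', '{PROMPT}'],
  { prompt: 'report costs as $& totals', workspaceRoot: '/tmp/ws' }
);
console.log(expanded);
```

Keeping the template as an array, rather than a single shell string, means the prompt travels as one argument and never passes through shell quoting.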
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "xdrs-core",
3
- "version": "0.10.0",
3
+ "version": "0.11.0",
4
4
  "description": "A standard way to organize Decision Records (XDRs) across scopes, subjects, and teams so that AI agents can reliably query and follow them.",
5
5
  "repository": {
6
6
  "type": "git",
@@ -13,16 +13,26 @@
13
13
  "ai-agents"
14
14
  ],
15
15
  "license": "MIT",
16
+ "main": "lib/index.js",
17
+ "types": "lib/index.d.ts",
18
+ "exports": {
19
+ ".": "./lib/index.js"
20
+ },
16
21
  "files": [
17
22
  ".xdrs/_core/**",
18
23
  ".xdrs/index.md",
19
24
  "package.json",
20
25
  "AGENTS.md",
21
26
  "bin/filedist.js",
22
- "lib/**/*.js"
27
+ "lib/**/*.js"
23
28
  ],
29
+ "devDependencies": {
30
+ "jest": "^29.7.0"
31
+ },
24
32
  "dependencies": {
25
- "filedist": "^0.26.0"
33
+ "filedist": "^0.26.0",
34
+ "ignore": "^7.0.5",
35
+ "minimatch": "^10.2.5"
26
36
  },
27
37
  "filedist": {
28
38
  "sets": [
@@ -31,6 +41,10 @@
31
41
  "files": [
32
42
  "AGENTS.md",
33
43
  ".xdrs/_core/**"
44
+ ],
45
+ "exclude": [
46
+ "**/*.test.js",
47
+ "**/*.test.int.js"
34
48
  ]
35
49
  },
36
50
  "output": {