xdrs-core 0.10.0 → 0.11.0
- package/.xdrs/_core/adrs/principles/001-xdr-standards.md +1 -1
- package/.xdrs/_core/adrs/principles/skills/002-write-xdr/SKILL.md +6 -2
- package/.xdrs/_core/adrs/principles/skills/003-write-skill/SKILL.md +6 -2
- package/.xdrs/_core/adrs/principles/skills/004-write-article/SKILL.md +16 -5
- package/.xdrs/_core/adrs/principles/skills/005-write-research/005-write-research.test.int.js +37 -0
- package/.xdrs/_core/adrs/principles/skills/005-write-research/SKILL.md +23 -10
- package/.xdrs/index.md +1 -3
- package/README.md +40 -0
- package/lib/index.js +3 -0
- package/lib/lint.js +10 -12
- package/lib/testPrompt.js +660 -0
- package/lib/testPrompt.test.js +133 -0
- package/package.json +17 -3
package/.xdrs/_core/adrs/principles/001-xdr-standards.md
CHANGED
@@ -52,7 +52,7 @@ Provides clear ownership by scope, predictable navigation, and reusable decision
 - Skills use `.xdrs/[scope]/[type]/[subject]/skills/[number]-[skill-name]/assets/`
 - **Scopes:**
 - examples: `business-x`, `business-y`, `team-43`, `_core`
-- `_local` is a reserved scope for XDRs created locally to a specific project or repository. XDRs in `_local` must not be shared with or propagated to other contexts. This scope must always be placed in the lowest position in `.xdrs/index.md` so that its decisions override or extend any decisions from all higher-positioned scopes.
+- `_local` is a reserved scope for XDRs created locally to a specific project or repository. XDRs in `_local` must not be shared with or propagated to other contexts. This scope must always be placed in the lowest position in `.xdrs/index.md` so that its decisions override or extend any decisions from all higher-positioned scopes. Shared `.xdrs/index.md` files MUST NOT link `_local` canonical type indexes because `_local` stays workspace-local and is not distributed with shared packages. Readers, tools, and agents SHOULD still try to discover existing workspace-local `_local` canonical indexes by default, even when the shared root index does not link them.
 - **Types:** `adrs`, `bdrs`, `edrs`
 - there can exist sufixes to the standard scope names (e.g: `business-x-mobileapp`, `business-y-servicedesk`)
 - **Subjects:** MUST be one of the following depending on the type of the XDR:
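The "MUST NOT link `_local` canonical type indexes" rule added above lends itself to a mechanical check. The following is a minimal sketch, not the package's own linter: `findLocalLinks` is a hypothetical helper name, and it assumes each canonical index sits at `.xdrs/<scope>/<type>/index.md`, so the scope directory is two levels above the index file.

```javascript
const path = require('path');

// Hypothetical helper (not part of xdrs-core): returns the linked index
// paths that belong to the reserved `_local` scope.
// Assumption: a canonical index lives at <xdrsRoot>/<scope>/<type>/index.md.
function findLocalLinks(linkedIndexPaths) {
  return linkedIndexPaths.filter((indexPath) => {
    // Two directory levels above index.md is the scope directory.
    const scopeName = path.basename(path.dirname(path.dirname(indexPath)));
    return scopeName === '_local';
  });
}

// Example input: links that a shared root index might contain.
const linked = [
  '.xdrs/_core/adrs/index.md',
  '.xdrs/_local/adrs/index.md'
];

// Any entry returned here would violate the rule for a shared index.
console.log(findLocalLinks(linked));
```

A shared root index would pass this check only when the returned list is empty.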
package/.xdrs/_core/adrs/principles/skills/002-write-xdr/SKILL.md
CHANGED
@@ -10,7 +10,7 @@ metadata:

 ## Overview

-Guides the creation of a well-structured XDR by following the standards in `_core-adr-001`, researching existing records for conflicts, checking redundancy across related artifacts, and iterating until the document is concise, decision-focused, and clear about when the decision should be used.
+Guides the creation of a well-structured XDR by following the standards in `_core-adr-001`, consulting `xdr-standards` for every core element definition, researching existing records for conflicts, checking redundancy across related artifacts, and iterating until the document is concise, decision-focused, and clear about when the decision should be used.

 ## Instructions

@@ -18,10 +18,13 @@ Guides the creation of a well-structured XDR by following the standards in `_cor

 1. Read `.xdrs/index.md` to discover all active scopes and their canonical indexes.
 2. Read `.xdrs/_core/adrs/principles/001-xdr-standards.md` in full to internalize structure rules, mandatory language, and the XDR template.
-3.
+3. Treat `.xdrs/_core/adrs/principles/001-xdr-standards.md` as the canonical source for all core XDR element definitions. Before choosing or writing any core element, consult it for the exact rules for type, scope, subject, ID, numbering, title, placement, and applicable folder structure instead of relying on memory or local convention.
+4. Ask the user (or infer from context) the topic of the decision. Do NOT proceed to Phase 2 without a clear topic.

 ### Phase 2: Select Type, Scope, and Subject

+Consult `001-xdr-standards` while making each choice in this phase. The summaries below are orientation only; when any detail matters, the standard decides.
+
 **Type** — choose exactly one based on the nature of the decision:
 - **BDR**: business process, product policy, strategic rule, operational procedure
 - **ADR**: system context, integration pattern, overarching architectural choice
@@ -133,6 +136,7 @@ If any check fails, revise and re-run this phase before proceeding.
 ### Constraints

 - MUST follow the XDR template from `001-xdr-standards` exactly.
+- MUST consult `001-xdr-standards` as the canonical source for every core element definition, especially type, scope, subject, ID, numbering, naming, and placement.
 - MUST NOT add personal opinions or general best-practice content not tied to a decision.
 - MUST NOT create an XDR that duplicates a decision already captured in another XDR — extend or reference instead.
 - MUST prefer links and short references over repeating the same decision content across related documents.
package/.xdrs/_core/adrs/principles/skills/003-write-skill/SKILL.md
CHANGED
@@ -11,17 +11,20 @@ metadata:

 ## Overview

-Guides the creation of a well-structured skill package by following `_core-adr-003` skill standards, checking existing skills to avoid duplication, and producing a complete SKILL.md ready to activate in VS Code.
+Guides the creation of a well-structured skill package by following `_core-adr-003` skill standards, consulting `xdr-standards` for every core element definition, checking existing skills to avoid duplication, and producing a complete SKILL.md ready to activate in VS Code.

 ## Instructions

 ### Phase 1: Understand the Skill Goal

 1. Read `.xdrs/_core/adrs/principles/003-skill-standards.md` in full to internalize the SKILL.md format, folder layout, and numbering rules.
-2.
+2. Read `.xdrs/_core/adrs/principles/001-xdr-standards.md` in full before defining any core element for the skill package. Treat it as the canonical source for type, scope, subject, numbering expectations, naming constraints, and folder placement rules.
+3. Identify what the skill must do, the concrete outcome it should produce, and the exact conditions under which an agent should activate it. Do NOT proceed without a clear goal, outcome, and activation trigger.

 ### Phase 2: Select Type, Scope, Subject, and Number

+Consult `001-xdr-standards` while making each choice in this phase. The summaries below are orientation only; when there is any ambiguity or edge case, the standard decides.
+
 **Type** — choose one based on the skill's activity:
 - **EDR skill**: engineering workflows, tool usage, coding procedures, implementation how-tos
 - **ADR skill**: architectural evaluation, pattern compliance, technology selection guidance
@@ -122,6 +125,7 @@ If any check fails, revise before continuing.
 ### Constraints

 - MUST follow the agentskills SKILL.md format from `003-skill-standards` exactly.
+- MUST consult `001-xdr-standards` as the canonical source for every core element definition, especially type, scope, subject, numbering, naming, and placement.
 - MUST NOT create a skill that duplicates an existing one — extend or reference it instead.
 - MUST keep scope `_local` unless the user explicitly states otherwise.
 - MUST include a References section linking to `003-skill-standards`.
package/.xdrs/_core/adrs/principles/skills/004-write-article/SKILL.md
CHANGED
@@ -9,9 +9,7 @@ metadata:

 ## Overview

-Guides the creation of a well-structured article by following `_core-adr-004`, researching the XDRs,
-Research documents, and Skills to synthesize, and producing a concise document that serves as a navigable view without duplicating
-decision content.
+Guides the creation of a well-structured article by following `_core-adr-004`, consulting `xdr-standards` for every core element definition, researching the XDRs, Research documents, and Skills to synthesize, and producing a concise document that serves as a navigable view without duplicating decision content.

 ## Instructions

@@ -19,15 +17,21 @@ decision content.

 1. Read `.xdrs/_core/adrs/principles/004-article-standards.md` in full to internalize the template,
 placement rules, numbering rules, and the constraint that articles are views, not decisions.
-2.
+2. Read `.xdrs/_core/adrs/principles/001-xdr-standards.md` in full before defining the article's core elements. Treat it as the canonical source for how to choose and write type, scope, subject, numbering, naming, and folder placement.
+3. Identify the topic and intended audience from user input or context. Do NOT proceed without a clear
 topic.

 ### Phase 2: Select Scope, Type, and Subject

+Consult `001-xdr-standards` while making each choice in this phase. The summaries below are orientation only; when any detail is unclear, the standard decides.
+
 **Scope** — use `_local` unless the user explicitly names another scope.

 **Type** — match the type of the XDRs the article primarily synthesizes (`adrs`, `bdrs`, or `edrs`).
-If the topic spans multiple types, use `adrs`.
+If the topic spans multiple types, use `adrs`. Use the same rules as `002-write-xdr` Phase 2:
+- **BDR**: business process, product policy, strategic rule, operational procedure
+- **ADR**: system context, integration pattern, overarching architectural choice
+- **EDR**: specific tool/library, coding practice, testing strategy, project structure, pipelines

 **Subject** — pick the subject that best matches the article's topic (see `004-article-standards`).
 If the article spans more than one subject, place it in `principles`.
@@ -111,6 +115,13 @@ Rules to apply while drafting:
 - **Conflicting information found** — note the conflict in the article and always defer to the XDR.
 - **Article would exceed 150 lines** — move detailed content to a new Research, Skill, or XDR and link back.

+## Constraints
+
+- MUST consult `001-xdr-standards` as the canonical source for every core element definition, especially type, scope, subject, numbering, naming, and placement.
+- MUST follow the article template and placement rules from `004-article-standards`.
+- MUST keep scope `_local` unless the user explicitly states otherwise.
+- MUST defer to active and applicable XDRs when article synthesis conflicts with them.
+
 ## References

 - [_core-adr-004 - Article standards](../../../.xdrs/_core/adrs/principles/004-article-standards.md)
package/.xdrs/_core/adrs/principles/skills/005-write-research/005-write-research.test.int.js
ADDED
@@ -0,0 +1,37 @@
+'use strict';
+
+const path = require('path');
+const { copilotCmd, testPrompt } = require('xdrs-core');
+
+const REPO_ROOT = path.resolve(__dirname, '..', '..', '..', '..', '..', '..');
+
+jest.setTimeout(60000);
+
+test.skip('check', () => {
+  const err = testPrompt(
+    {
+      workspaceRoot: REPO_ROOT,
+      workspaceMode: 'in-place',
+      promptCmd: copilotCmd(REPO_ROOT)
+    },
+    'Reply with READY and nothing else.',
+    'Verify that the final output is READY and nothing else.',
+    true
+  );
+
+  expect(err).toBe('');
+});
+
+test('005-write-research creates an IMRAD research document in copy mode', () => {
+  const err = testPrompt(
+    {
+      workspaceRoot: REPO_ROOT,
+      workspaceMode: 'copy',
+      promptCmd: copilotCmd(REPO_ROOT)
+    },
+    'Create a very small research document with the following data: We measured the installation time in our monorepo and pnpm is 3.5x faster than Yarn when installing dependencies. We recommend using PNPM in our monorepo to speed up our productivity as it seems very easy to use and have a better internal hoisting mechanism.',
+    'Verify that a research file was created under .xdrs/_local/edrs/devops/researches/, that it contains the sections Abstract, Introduction, Methods, Results, Discussion, Conclusion, and References, and that the content contains all the provided data in input prompt, and doesn\'t contain more than 20% of additional information.'
+  );
+
+  expect(err).toBe('');
+});
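The `promptCmd` value used in the test above is a command template: per the package README it may be a string array or a JSON array string, and it must contain a `{PROMPT}` placeholder. The sketch below shows one way such a template could be expanded into an argument list. It is an illustration only: `expandTemplate` is a hypothetical helper, and plain substring replacement is an assumption, not the library's confirmed implementation.

```javascript
// Hypothetical helper (not the xdrs-core implementation): accepts a string
// array or a JSON array string and substitutes the prompt for {PROMPT}.
function expandTemplate(template, prompt) {
  // A string input is assumed to be a JSON-encoded array.
  const parts = typeof template === 'string' ? JSON.parse(template) : template;
  if (!Array.isArray(parts) || !parts.every((part) => typeof part === 'string')) {
    throw new Error('Expected an array of strings.');
  }
  if (!parts.some((part) => part.includes('{PROMPT}'))) {
    throw new Error('Template must include a {PROMPT} placeholder.');
  }
  // Replace every occurrence of the placeholder in every argument.
  return parts.map((part) => part.split('{PROMPT}').join(prompt));
}

// The flags mirror the template that copilotCmd builds in this release.
const argv = expandTemplate(
  ['copilot', '--allow-all', '-p', '{PROMPT}'],
  'Reply with READY and nothing else.'
);
console.log(argv);
```

Keeping the template as an argument array, rather than a single shell string, means the prompt never needs shell quoting.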
package/.xdrs/_core/adrs/principles/skills/005-write-research/SKILL.md
CHANGED
@@ -11,26 +11,32 @@ metadata:

 ## Overview

-Guides the creation of a well-structured research document by following `_core-adr-006`, checking related XDRs and existing research to avoid duplication, and producing an IMRAD-based study that reads as a standalone technical paper. Treat each section goal in the research template as an acceptance criterion, not as optional wording. Do not assume missing direction, evidence, or intended follow-up; ask the user explicitly before proceeding when those points are not already concrete.
+Guides the creation of a well-structured research document by following `_core-adr-006`, consulting `xdr-standards` for every core element definition, checking related XDRs and existing research to avoid duplication, and producing an IMRAD-based study that reads as a standalone technical paper. Treat each section goal in the research template as an acceptance criterion, not as optional wording. Do not assume missing direction, evidence, or intended follow-up; ask the user explicitly before proceeding when those points are not already concrete.

 ## Instructions

 ### Phase 1: Understand the Research Goal

 1. Read `.xdrs/_core/adrs/principles/006-research-standards.md` in full to internalize the folder layout, numbering rules, and mandatory template.
-2.
-3. Ask the user
-4. Ask the user what
-5.
-6.
-7.
-8.
+2. Read `.xdrs/_core/adrs/principles/001-xdr-standards.md` in full before defining the research document's core elements. Treat it as the canonical source for how to choose and write type, scope, subject, numbering expectations, naming constraints, and folder placement.
+3. Ask the user to confirm the intended direction of the research before planning the document: what decision, question, or option space the study should support, what boundaries or exclusions apply, and what kind of outcome they expect.
+4. Ask the user what evidence already exists and what evidence-gathering methods are acceptable if the current evidence is incomplete. Do not invent facts, sources, or confidence that the user did not provide.
+5. Ask the user what the proposed next step is after the research, such as writing a new XDR, updating an existing XDR, informing a discussion, or documenting trade-offs for later. Use that answer to shape the framing without turning the research into the final decision.
+6. Identify the problem or question being explored, the relevant system or domain context, the likely technical audience, and why the subject matters in practice.
+7. Internalize the goal of each required section before drafting: `Abstract` gives a quick technical reader the question, method, main result, and takeaway, `Introduction` frames the investigated problem and context, `Methods` makes the important parts reproducible, `Results` records raw findings with minimal interpretation, `Discussion` interprets the findings, `Conclusion` summarizes the practical takeaway and boundaries, and `References` makes sources traceable.
+8. Collect the main constraints, known facts, important experiences, gaps, and assumptions that belong in the introduction.
+9. Do NOT proceed without a clear problem statement, a central question, explicit user direction, an understood next step, and at least one credible source of evidence or a method for generating it. If any of these are ambiguous, stop and ask instead of assuming.

 ### Phase 2: Select Scope, Type, Subject, and Number

+Consult `001-xdr-standards` while making each choice in this phase. The summaries below are orientation only; when any detail matters, the standard decides.
+
 **Scope** — use `_local` unless the user explicitly names another scope.

-**Type** — match the type of decision this research supports (`adrs`, `bdrs`, or `edrs`).
+**Type** — match the type of decision this research supports (`adrs`, `bdrs`, or `edrs`). Use the same rules as `002-write-xdr` Phase 2:
+- **BDR**: business process, product policy, strategic rule, operational procedure
+- **ADR**: system context, integration pattern, overarching architectural choice
+- **EDR**: specific tool/library, coding practice, testing strategy, project structure, pipelines

 **Subject** — pick the most specific subject that matches the problem domain.

@@ -253,4 +259,11 @@ If any check fails, revise before continuing.

 - [_core-adr-006 - Research standards](../../006-research-standards.md)
 - [_core-adr-001 - XDR standards](../../001-xdr-standards.md)
-- [002-write-xdr skill](../002-write-xdr/SKILL.md)
+- [002-write-xdr skill](../002-write-xdr/SKILL.md)
+
+## Constraints
+
+- MUST consult `001-xdr-standards` as the canonical source for every core element definition, especially type, scope, subject, numbering, naming, and placement.
+- MUST follow the research template and section-goal rules from `006-research-standards`.
+- MUST keep scope `_local` unless the user explicitly states otherwise.
+- MUST keep the document as research rather than turning it into a final decision.
package/.xdrs/index.md
CHANGED
@@ -19,6 +19,4 @@ Decisions about how XDRs work

 ### _local (reserved)

-Project-local XDRs that must not be shared with other contexts. Always keep this scope last so its decisions override or extend all scopes listed above.
-
-
+Project-local XDRs that must not be shared with other contexts. Always keep this scope last so its decisions override or extend all scopes listed above. Keep `_local` canonical indexes in the workspace tree only; do not link them from this shared index. Readers and tools should still try to discover existing `_local` indexes in the current workspace by default.
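Because the shared index no longer links `_local`, a reader or tool has to probe the workspace for those indexes itself. The following is a minimal sketch of such a probe. `discoverLocalIndexes` is a hypothetical helper, and the file name `index.md` under `.xdrs/_local/<type>/` is an assumption about the canonical index location rather than something this diff states.

```javascript
const fs = require('fs');
const path = require('path');

// The three XDR types named in the standard.
const TYPES = ['adrs', 'bdrs', 'edrs'];

// Hypothetical helper (not part of xdrs-core): lists the `_local` canonical
// indexes that exist in the workspace, whether or not the shared root index
// links them. Assumes each index is named index.md.
function discoverLocalIndexes(workspaceRoot) {
  return TYPES
    .map((type) => path.join(workspaceRoot, '.xdrs', '_local', type, 'index.md'))
    .filter((candidate) => fs.existsSync(candidate));
}

// Probe the current working directory; prints an empty array when the
// workspace has no `_local` indexes.
console.log(discoverLocalIndexes(process.cwd()));
```

The probe only reads the local tree, so it works the same whether or not the workspace was populated from a shared package.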
package/README.md
CHANGED
@@ -68,6 +68,46 @@ npx -y xdrs-core lint ./some-project
 pnpm exec xdrs-core lint .
 ```

+## Library Testing
+
+The package also exposes a reusable behavior-test library for Jest or any other JavaScript test runner.
+
+Main exports:
+
+- `testPrompt(config, inputPrompt, judgePrompt)` runs the task prompt, evaluates the result in a fresh judge session, and returns an empty string on success or a markdown bullet list on failure.
+- `runPromptTest(config, inputPrompt, judgePrompt)` returns the structured result object when you need access to captured output and the agent-reported changed file list.
+- `copilotCmd(workspaceRoot)` returns a ready-to-use `promptCmd` template for the Copilot CLI. The library uses that same command template for both the task and judge phases. If `workspaceRoot` is omitted it defaults to the current git repository root.
+- `config.workspaceRoot`, when set, is the authoritative workspace under test. If omitted, the library uses the current git repository root.
+
+Execution model:
+
+- phase 1 runs the task prompt and captures final output text plus the files the agent says it changed
+- phase 2 runs an independent judge prompt in a fresh invocation of `promptCmd` against the original task prompt, task output, the agent-reported changed file list, and the current workspace state
+- the judge trusts that reported file list as the authoritative change report and reads file contents from the workspace directly when needed
+- when `workspaceMode: 'copy'` is used, the temporary workspace honors nested `.gitignore` rules and skips git metadata files during the copy
+
+`promptCmd` accepts either a string array or a JSON array string and must include a `{PROMPT}` placeholder.
+
+Example with Jest:
+
+```js
+const { copilotCmd, testPrompt } = require('xdrs-core');
+
+test('creates hello.md', () => {
+  const err = testPrompt(
+    {
+      workspaceRoot: process.cwd(),
+      promptCmd: copilotCmd(process.cwd()),
+      workspaceMode: 'copy'
+    },
+    "Create a nice markdown file at hello.md saying 'hello!'",
+    'The resulting file should be created at hello.md and have hello as part of its contents, without too much extra info (should be <100 chars)'
+  );
+
+  expect(err).toBe('');
+});
+```
+
 ## Requirements

 ### Multi-scope support
package/lib/index.js
ADDED
package/lib/lint.js
CHANGED
@@ -96,32 +96,30 @@ function lintRootIndex(rootIndexPath, xdrsRoot, actualTypeIndexes, errors) {
     errors.push(`Root index is missing required override text: ${toDisplayPath(rootIndexPath)}`);
   }

-  const
-  for (const linkPath of
+  const links = parseLocalLinks(content, path.dirname(rootIndexPath));
+  for (const linkPath of links) {
     if (!fs.existsSync(linkPath)) {
       errors.push(`Broken link in root index: ${displayPath(rootIndexPath, linkPath)}`);
     }
   }

-  const linkedTypeIndexes =
+  const linkedTypeIndexes = links.filter((linkPath) => isCanonicalTypeIndex(linkPath, xdrsRoot));
   const linkedSet = new Set(linkedTypeIndexes.map(normalizePath));

-  for (const indexPath of
-
-
+  for (const indexPath of linkedTypeIndexes) {
+    const scopeName = path.basename(path.dirname(path.dirname(indexPath)));
+    if (scopeName === '_local') {
+      errors.push(`Root index must not link _local canonical index: ${displayPath(rootIndexPath, indexPath)}`);
     }
   }

-
-  for (const indexPath of linkedTypeIndexes) {
+  for (const indexPath of actualTypeIndexes) {
     const scopeName = path.basename(path.dirname(path.dirname(indexPath)));
     if (scopeName === '_local') {
-      seenLocal = true;
       continue;
     }
-    if (
-      errors.push(
-      break;
+    if (!linkedSet.has(normalizePath(indexPath))) {
+      errors.push(`Root index is missing canonical index link: ${toDisplayPath(indexPath)}`);
     }
   }
 }
package/lib/testPrompt.js
ADDED
@@ -0,0 +1,660 @@
+#!/usr/bin/env node
+'use strict';
+
+const fs = require('fs');
+const ignore = require('ignore');
+const os = require('os');
+const path = require('path');
+const { spawnSync } = require('child_process');
+
+const MAX_TASK_OUTPUT_CHARS = 12 * 1024;
+
+function testPrompt(config, inputPrompt, judgePrompt, verbose) {
+  const result = runPromptTest(config, inputPrompt, judgePrompt, verbose);
+  return result.passed ? '' : formatFailureMarkdown(result.findings);
+}
+
+function runPromptTest(config, inputPrompt, judgePrompt, verbose) {
+  if(verbose) {
+    console.log('Running prompt test with config:', JSON.stringify(config, null, 2));
+    console.log('Input Prompt:', inputPrompt);
+    console.log('Judge Prompt:', judgePrompt);
+  }
+  const options = normalizeConfig(config);
+  const originalWorkspace = resolveWorkspaceRoot(options);
+  let tempRoot = null;
+  let effectiveWorkspace = originalWorkspace;
+
+  try {
+    if (options.workspaceMode === 'copy') {
+      tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'xdrs-core-test-'));
+      effectiveWorkspace = copyWorkspace(originalWorkspace, path.join(tempRoot, 'workspace'), verbose);
+    }
+
+    if(verbose) {
+      console.log(`Running prompt test in workspace: ${effectiveWorkspace} (mode: ${options.workspaceMode})`);
+    }
+    const task = runTaskPhase({
+      prompt: ensureNonEmptyString(inputPrompt, 'inputPrompt'),
+      commandTemplate: options.promptCmd,
+      workspacePath: effectiveWorkspace,
+      authoritativeWorkspacePath: originalWorkspace,
+      timeoutMs: options.taskTimeoutMs,
+      env: options.env,
+      verbose
+    });
+
+    if(verbose) {
+      console.log('Task phase completed. Summary:', task.summary);
+      console.log('Agent reported changed files:', task.changedFiles);
+    }
+
+    if(verbose) {
+      console.log('Running judge phase to evaluate the task output against the judge prompt.');
+    }
+    const evaluation = runJudgePhase({
+      originalPrompt: ensureNonEmptyString(inputPrompt, 'inputPrompt'),
+      judgePrompt: ensureNonEmptyString(judgePrompt, 'judgePrompt'),
+      taskOutput: task.summary,
+      agentReportedChanges: task.changedFiles,
+      commandTemplate: options.promptCmd,
+      workspacePath: effectiveWorkspace,
+      authoritativeWorkspacePath: originalWorkspace,
+      timeoutMs: options.judgeTimeoutMs,
+      env: options.env,
+      verbose
+    });
+
+    return {
+      passed: evaluation.pass,
+      findings: evaluation.findings,
+      taskOutput: task.summary,
+      agentReportedChanges: task.changedFiles,
+      judge: evaluation.raw,
+      workspace: {
+        original: originalWorkspace,
+        effective: effectiveWorkspace,
+        mode: options.workspaceMode
+      }
+    };
+  } finally {
+    if (tempRoot && options.workspaceMode === 'copy') {
+      fs.rmSync(tempRoot, { recursive: true, force: true });
+    }
+  }
+}
+
+function copilotCmd(workspaceRoot = findGitRoot(process.cwd())) {
+  return [
+    'copilot',
+    `--add-dir=${path.resolve(workspaceRoot)}`,
+    '--allow-all',
+    '-p',
+    '{PROMPT}'
+  ];
+}
+
+function ensureNonEmptyString(value, label) {
+  if (typeof value !== 'string' || !value.trim()) {
+    throw new Error(`Expected non-empty ${label}`);
+  }
+  return value.trim();
+}
+
+function normalizeConfig(config) {
+  if (!config || typeof config !== 'object' || Array.isArray(config)) {
+    throw new Error('Expected config to be an object.');
+  }
+
+  const workspaceMode = config.workspaceMode || 'copy';
+  if (workspaceMode !== 'copy' && workspaceMode !== 'in-place') {
+    throw new Error(`Invalid workspaceMode value: ${workspaceMode}`);
+  }
+
+  return {
+    promptCmd: parseCommandTemplate(config.promptCmd, 'promptCmd'),
+    workspaceRoot: config.workspaceRoot ? path.resolve(config.workspaceRoot) : null,
+    workspaceMode,
+    env: normalizeEnv(config.env),
+    taskTimeoutMs: readTimeout(config.taskTimeoutMs, 'taskTimeoutMs'),
+    judgeTimeoutMs: readTimeout(config.judgeTimeoutMs, 'judgeTimeoutMs')
+  };
+}
+
+function resolveWorkspaceRoot(options) {
+  const resolvedWorkspace = options.workspaceRoot || findGitRoot(process.cwd());
+
+  if (!fs.existsSync(resolvedWorkspace) || !fs.statSync(resolvedWorkspace).isDirectory()) {
+    throw new Error(`Workspace directory not found: ${resolvedWorkspace}`);
+  }
+
+  return resolvedWorkspace;
+}
+
+function runTaskPhase({ prompt, commandTemplate, workspacePath, authoritativeWorkspacePath, timeoutMs, env }, verbose) {
+  const wrappedPrompt = [
+    'XDRS-CORE TEST PHASE: TASK',
+    '',
+    'Execute the following task in the current workspace.',
+    'Keep all changes inside the workspace.',
+    'Respond with JSON only and no code fences.',
+    'Use exactly this schema: {"summary":"plain text summary","changedFiles":["relative/path.ext"]}.',
+    'The summary must describe the final result only, not hidden reasoning.',
+    '',
+    'BEGIN TASK PROMPT',
+    prompt,
+    'END TASK PROMPT'
+  ].join('\n');
+
+  const result = runPromptCommand({
+    commandTemplate,
+    workspacePath,
+    authoritativeWorkspacePath,
+    prompt: wrappedPrompt,
+    timeoutMs,
+    env,
+    verbose
+  });
+
+  return parseTaskResponse(result.output);
+}
+
+function runJudgePhase({ originalPrompt, judgePrompt, taskOutput, agentReportedChanges, commandTemplate, workspacePath, authoritativeWorkspacePath, timeoutMs, env }, verbose) {
+  const wrappedPrompt = [
+    'XDRS-CORE TEST PHASE: ASSERTION_EVALUATION',
+    '',
+    'You are evaluating the result of a separate agent task run.',
+    'Treat this as a fresh session. Do not assume any hidden history.',
+    'Use the original task prompt, the judge prompt, the final task output, the reported changed file paths, and the current workspace state to decide whether the result passes.',
+    'Trust the reported changed file path list as the authoritative change report for this task run.',
+    'Read files from the workspace directly when you need their contents.',
+    'Inspect files in the workspace directly when needed.',
+    'Respond with JSON only and no code fences.',
+    'Use exactly this schema: {"pass":true,"findings":[]} or {"pass":false,"findings":[{"target":"file","path":"relative/path.ext","line":1,"message":"explanation","assertionRef":"exact relevant phrase from the judge prompt"}]}.',
+    'Use target="output" when the issue is in the final task output and target="workspace" when it is not tied to a specific file.',
+    'Include 1-based line numbers when you cite a file or the output text. Include the exact judge-prompt phrase that triggered each finding in assertionRef.',
+    '',
+    'BEGIN ORIGINAL TASK PROMPT',
+    originalPrompt,
+    'END ORIGINAL TASK PROMPT',
+    '',
+    'BEGIN JUDGE PROMPT',
+    judgePrompt,
+    'END JUDGE PROMPT',
+    '',
+    'BEGIN TASK OUTPUT',
+    truncateText(taskOutput || '(empty)', MAX_TASK_OUTPUT_CHARS),
+    'END TASK OUTPUT',
+    '',
+    'BEGIN AGENT REPORTED CHANGES JSON',
+    JSON.stringify(agentReportedChanges, null, 2),
+    'END AGENT REPORTED CHANGES JSON'
+  ].join('\n');
+
+  const result = runPromptCommand({
+    commandTemplate,
+    workspacePath,
+    authoritativeWorkspacePath,
+    prompt: wrappedPrompt,
+    timeoutMs,
+    env,
+    verbose
+  });
+
+  return normalizeJudgeResponse(result.output);
+}
+
+function parseTaskResponse(output) {
+  const trimmed = String(output || '').trim();
+  if (!trimmed) {
+    throw new Error('The task command returned empty output.');
+  }
+
+  try {
+    const parsed = parseJsonObject(trimmed);
+    return {
+      summary: typeof parsed.summary === 'string' && parsed.summary.trim()
+        ? parsed.summary.trim()
+        : trimmed,
+      changedFiles: normalizeStringArray(parsed.changedFiles)
+    };
+  } catch (error) {
+    return {
+      summary: trimmed,
+      changedFiles: []
+    };
+  }
+}
+
+function normalizeJudgeResponse(output) {
|
|
230
|
+
let parsed;
|
|
231
|
+
|
|
232
|
+
try {
|
|
233
|
+
parsed = parseJsonObject(output);
|
|
234
|
+
} catch (error) {
|
|
235
|
+
throw new Error(`Judge returned invalid JSON: ${truncateText(String(output || '').trim(), 1000)}`);
|
|
236
|
+
}
|
|
237
|
+
|
|
238
|
+
if (!parsed || typeof parsed !== 'object' || Array.isArray(parsed)) {
|
|
239
|
+
throw new Error('Judge response must be a JSON object.');
|
|
240
|
+
}
|
|
241
|
+
|
|
242
|
+
if (typeof parsed.pass !== 'boolean') {
|
|
243
|
+
throw new Error('Judge response must include a boolean pass field.');
|
|
244
|
+
}
|
|
245
|
+
|
|
246
|
+
let findings = [];
|
|
247
|
+
if (Array.isArray(parsed.findings)) {
|
|
248
|
+
findings = parsed.findings.map(normalizeFinding).filter(Boolean);
|
|
249
|
+
} else if (Array.isArray(parsed.reasons)) {
|
|
250
|
+
findings = parsed.reasons
|
|
251
|
+
.filter((reason) => typeof reason === 'string' && reason.trim())
|
|
252
|
+
.map((reason) => ({
|
|
253
|
+
target: 'workspace',
|
|
254
|
+
message: reason.trim(),
|
|
255
|
+
path: null,
|
|
256
|
+
line: null,
|
|
257
|
+
assertionRef: ''
|
|
258
|
+
}));
|
|
259
|
+
}
|
|
260
|
+
|
|
261
|
+
if (!parsed.pass && findings.length === 0) {
|
|
262
|
+
findings = [{
|
|
263
|
+
target: 'workspace',
|
|
264
|
+
message: 'Judge reported failure without detailed findings.',
|
|
265
|
+
path: null,
|
|
266
|
+
line: null,
|
|
267
|
+
assertionRef: ''
|
|
268
|
+
}];
|
|
269
|
+
}
|
|
270
|
+
|
|
271
|
+
return {
|
|
272
|
+
pass: parsed.pass,
|
|
273
|
+
findings,
|
|
274
|
+
raw: parsed
|
|
275
|
+
};
|
|
276
|
+
}
|
|
277
|
+
|
|
278
|
+
function normalizeFinding(finding) {
|
|
279
|
+
if (!finding) {
|
|
280
|
+
return null;
|
|
281
|
+
}
|
|
282
|
+
|
|
283
|
+
if (typeof finding === 'string') {
|
|
284
|
+
const message = finding.trim();
|
|
285
|
+
return message ? {
|
|
286
|
+
target: 'workspace',
|
|
287
|
+
message,
|
|
288
|
+
path: null,
|
|
289
|
+
line: null,
|
|
290
|
+
assertionRef: ''
|
|
291
|
+
} : null;
|
|
292
|
+
}
|
|
293
|
+
|
|
294
|
+
if (typeof finding !== 'object' || Array.isArray(finding)) {
|
|
295
|
+
return null;
|
|
296
|
+
}
|
|
297
|
+
|
|
298
|
+
const message = typeof finding.message === 'string' ? finding.message.trim() : '';
|
|
299
|
+
if (!message) {
|
|
300
|
+
return null;
|
|
301
|
+
}
|
|
302
|
+
|
|
303
|
+
const pathValue = typeof finding.path === 'string' && finding.path.trim() ? finding.path.trim() : null;
|
|
304
|
+
const lineValue = normalizeLineNumber(finding.line);
|
|
305
|
+
const target = finding.target === 'file' || finding.target === 'output' || finding.target === 'workspace'
|
|
306
|
+
? finding.target
|
|
307
|
+
: (pathValue ? 'file' : 'workspace');
|
|
308
|
+
|
|
309
|
+
return {
|
|
310
|
+
target,
|
|
311
|
+
path: pathValue,
|
|
312
|
+
line: lineValue,
|
|
313
|
+
message,
|
|
314
|
+
assertionRef: typeof finding.assertionRef === 'string' ? finding.assertionRef.trim() : ''
|
|
315
|
+
};
|
|
316
|
+
}
|
|
317
|
+
|
|
318
|
+
function runPromptCommand({ commandTemplate, workspacePath, authoritativeWorkspacePath, prompt, timeoutMs, env }, verbose) {
|
|
319
|
+
const command = rewriteWorkspaceCommand(commandTemplate.map((entry) => entry
|
|
320
|
+
.replace('{PROMPT}', prompt)
|
|
321
|
+
.replace('{WORKSPACE_ROOT}', workspacePath)), workspacePath, authoritativeWorkspacePath);
|
|
322
|
+
|
|
323
|
+
const [file, ...args] = command;
|
|
324
|
+
|
|
325
|
+
if(verbose) {
|
|
326
|
+
console.log(`Running prompt cmd: ${file} ${args.join(' ')} in workspace: ${workspacePath}`);
|
|
327
|
+
}
|
|
328
|
+
|
|
329
|
+
const result = spawnSync(file, args, {
|
|
330
|
+
encoding: 'utf8',
|
|
331
|
+
cwd: workspacePath,
|
|
332
|
+
timeout: timeoutMs || undefined,
|
|
333
|
+
maxBuffer: 10 * 1024 * 1024,
|
|
334
|
+
env: {
|
|
335
|
+
...process.env,
|
|
336
|
+
...env
|
|
337
|
+
}
|
|
338
|
+
});
|
|
339
|
+
|
|
340
|
+
if(verbose) {
|
|
341
|
+
console.log(`Prompt command output: ${result.stdout || result.stderr}`);
|
|
342
|
+
}
|
|
343
|
+
|
|
344
|
+
|
|
345
|
+
if (result.error) {
|
|
346
|
+
if (result.error.code === 'ENOENT') {
|
|
347
|
+
throw new Error(`Command not found: ${file}`);
|
|
348
|
+
}
|
|
349
|
+
throw new Error(`Failed to execute ${file}: ${result.error.message}`);
|
|
350
|
+
}
|
|
351
|
+
|
|
352
|
+
if (result.status !== 0) {
|
|
353
|
+
const details = truncateText((result.stderr || result.stdout || '').trim(), 2000);
|
|
354
|
+
throw new Error(`${file} exited with status ${result.status}${details ? `: ${details}` : ''}`);
|
|
355
|
+
}
|
|
356
|
+
|
|
357
|
+
const output = (result.stdout || '').trim() || (result.stderr || '').trim();
|
|
358
|
+
if (!output) {
|
|
359
|
+
throw new Error(`${file} returned empty output.`);
|
|
360
|
+
}
|
|
361
|
+
|
|
362
|
+
if(verbose) {
|
|
363
|
+
console.log(`Prompt command output: ${output}`);
|
|
364
|
+
}
|
|
365
|
+
|
|
366
|
+
return { output };
|
|
367
|
+
}
|
|
368
|
+
|
|
369
|
+
function rewriteWorkspaceCommand(command, workspacePath, authoritativeWorkspacePath) {
|
|
370
|
+
if (!authoritativeWorkspacePath || path.resolve(workspacePath) === path.resolve(authoritativeWorkspacePath)) {
|
|
371
|
+
return command;
|
|
372
|
+
}
|
|
373
|
+
|
|
374
|
+
const normalizedAuthoritativeWorkspacePath = path.resolve(authoritativeWorkspacePath);
|
|
375
|
+
return command.map((entry, index, allEntries) => {
|
|
376
|
+
if (entry === '--add-dir' && typeof allEntries[index + 1] === 'string') {
|
|
377
|
+
return entry;
|
|
378
|
+
}
|
|
379
|
+
|
|
380
|
+
if (index > 0 && allEntries[index - 1] === '--add-dir' && path.resolve(entry) === normalizedAuthoritativeWorkspacePath) {
|
|
381
|
+
return workspacePath;
|
|
382
|
+
}
|
|
383
|
+
|
|
384
|
+
if (!entry.startsWith('--add-dir=')) {
|
|
385
|
+
return entry;
|
|
386
|
+
}
|
|
387
|
+
|
|
388
|
+
const addDirPath = entry.slice('--add-dir='.length);
|
|
389
|
+
if (path.resolve(addDirPath) !== normalizedAuthoritativeWorkspacePath) {
|
|
390
|
+
return entry;
|
|
391
|
+
}
|
|
392
|
+
|
|
393
|
+
return `--add-dir=${workspacePath}`;
|
|
394
|
+
});
|
|
395
|
+
}
|
|
396
|
+
|
|
397
|
+
function parseCommandTemplate(value, label) {
|
|
398
|
+
if (Array.isArray(value)) {
|
|
399
|
+
return normalizeCommandArray(value, label);
|
|
400
|
+
}
|
|
401
|
+
|
|
402
|
+
if (typeof value !== 'string' || !value.trim()) {
|
|
403
|
+
throw new Error(`Expected ${label} to be a non-empty JSON array string or string array.`);
|
|
404
|
+
}
|
|
405
|
+
|
|
406
|
+
let parsed;
|
|
407
|
+
try {
|
|
408
|
+
parsed = JSON.parse(value);
|
|
409
|
+
} catch (error) {
|
|
410
|
+
throw new Error(`${label} must be a JSON array string or a string array.`);
|
|
411
|
+
}
|
|
412
|
+
|
|
413
|
+
return normalizeCommandArray(parsed, label);
|
|
414
|
+
}
|
|
415
|
+
|
|
416
|
+
function normalizeCommandArray(value, label) {
|
|
417
|
+
if (!Array.isArray(value) || value.length === 0 || value.some((entry) => typeof entry !== 'string' || !entry)) {
|
|
418
|
+
throw new Error(`${label} must be a non-empty array of strings.`);
|
|
419
|
+
}
|
|
420
|
+
|
|
421
|
+
if (!value.some((entry) => entry.includes('{PROMPT}'))) {
|
|
422
|
+
throw new Error(`${label} must include a {PROMPT} placeholder.`);
|
|
423
|
+
}
|
|
424
|
+
|
|
425
|
+
return [...value];
|
|
426
|
+
}
|
|
427
|
+
|
|
428
|
+
function normalizeEnv(env) {
|
|
429
|
+
if (env == null) {
|
|
430
|
+
return {};
|
|
431
|
+
}
|
|
432
|
+
|
|
433
|
+
if (!env || typeof env !== 'object' || Array.isArray(env)) {
|
|
434
|
+
throw new Error('Expected env to be an object when provided.');
|
|
435
|
+
}
|
|
436
|
+
|
|
437
|
+
return { ...env };
|
|
438
|
+
}
|
|
439
|
+
|
|
440
|
+
function readTimeout(value, label) {
|
|
441
|
+
if (value == null) {
|
|
442
|
+
return 0;
|
|
443
|
+
}
|
|
444
|
+
|
|
445
|
+
const parsed = Number.parseInt(value, 10);
|
|
446
|
+
if (!Number.isFinite(parsed) || parsed < 0) {
|
|
447
|
+
throw new Error(`Invalid ${label} value: ${value}`);
|
|
448
|
+
}
|
|
449
|
+
return parsed;
|
|
450
|
+
}
|
|
451
|
+
|
|
452
|
+
function parseJsonObject(value) {
|
|
453
|
+
const trimmed = String(value || '').trim();
|
|
454
|
+
try {
|
|
455
|
+
return JSON.parse(trimmed);
|
|
456
|
+
} catch (error) {
|
|
457
|
+
const firstBrace = trimmed.indexOf('{');
|
|
458
|
+
const lastBrace = trimmed.lastIndexOf('}');
|
|
459
|
+
if (firstBrace !== -1 && lastBrace !== -1 && lastBrace > firstBrace) {
|
|
460
|
+
return JSON.parse(trimmed.slice(firstBrace, lastBrace + 1));
|
|
461
|
+
}
|
|
462
|
+
throw error;
|
|
463
|
+
}
|
|
464
|
+
}
|
|
465
|
+
|
|
466
|
+
function normalizeStringArray(values) {
|
|
467
|
+
if (!Array.isArray(values)) {
|
|
468
|
+
return [];
|
|
469
|
+
}
|
|
470
|
+
|
|
471
|
+
return [...new Set(values
|
|
472
|
+
.filter((value) => typeof value === 'string' && value.trim())
|
|
473
|
+
.map((value) => value.trim()))].sort((left, right) => left.localeCompare(right));
|
|
474
|
+
}
|
|
475
|
+
|
|
476
|
+
function normalizeLineNumber(value) {
|
|
477
|
+
const parsed = Number.parseInt(value, 10);
|
|
478
|
+
return Number.isFinite(parsed) && parsed > 0 ? parsed : null;
|
|
479
|
+
}
|
|
480
|
+
|
|
481
|
+
function formatFailureMarkdown(findings) {
|
|
482
|
+
const normalizedFindings = Array.isArray(findings) && findings.length > 0
|
|
483
|
+
? findings
|
|
484
|
+
: [{ target: 'workspace', message: 'The prompt test failed without detailed findings.' }];
|
|
485
|
+
|
|
486
|
+
return normalizedFindings.map((finding) => {
|
|
487
|
+
const location = formatFindingLocation(finding);
|
|
488
|
+
const assertion = finding.assertionRef ? ` Assertion: "${finding.assertionRef}".` : '';
|
|
489
|
+
return `- [${location}] ${finding.message}${assertion}`;
|
|
490
|
+
}).join('\n');
|
|
491
|
+
}
|
|
492
|
+
|
|
493
|
+
function formatFindingLocation(finding) {
|
|
494
|
+
if (finding.target === 'output') {
|
|
495
|
+
return finding.line ? `output:${finding.line}` : 'output';
|
|
496
|
+
}
|
|
497
|
+
|
|
498
|
+
if (finding.path) {
|
|
499
|
+
return finding.line ? `${finding.path}:${finding.line}` : finding.path;
|
|
500
|
+
}
|
|
501
|
+
|
|
502
|
+
return 'workspace';
|
|
503
|
+
}
|
|
504
|
+
|
|
505
|
+
function findGitRoot(startPath) {
|
|
506
|
+
let currentPath = path.resolve(startPath);
|
|
507
|
+
|
|
508
|
+
while (true) {
|
|
509
|
+
if (fs.existsSync(path.join(currentPath, '.git'))) {
|
|
510
|
+
return currentPath;
|
|
511
|
+
}
|
|
512
|
+
|
|
513
|
+
const parentPath = path.dirname(currentPath);
|
|
514
|
+
if (parentPath === currentPath) {
|
|
515
|
+
return path.resolve(startPath);
|
|
516
|
+
}
|
|
517
|
+
|
|
518
|
+
currentPath = parentPath;
|
|
519
|
+
}
|
|
520
|
+
}
|
|
521
|
+
|
|
522
|
+
function copyWorkspace(sourcePath, targetPath, verbose) {
|
|
523
|
+
if(verbose) {
|
|
524
|
+
console.log(`Copying workspace from ${sourcePath} to ${targetPath}`);
|
|
525
|
+
}
|
|
526
|
+
fs.mkdirSync(targetPath, { recursive: true });
|
|
527
|
+
copyWorkspaceDirectory({
|
|
528
|
+
sourceDir: sourcePath,
|
|
529
|
+
targetDir: targetPath,
|
|
530
|
+
rootPath: sourcePath,
|
|
531
|
+
ignoreContexts: [],
|
|
532
|
+
activeRealDirectories: new Set()
|
|
533
|
+
});
|
|
534
|
+
return targetPath;
|
|
535
|
+
}
|
|
536
|
+
|
|
537
|
+
function copyWorkspaceDirectory({ sourceDir, targetDir, rootPath, ignoreContexts, activeRealDirectories }) {
|
|
538
|
+
const realSourceDir = fs.realpathSync(sourceDir);
|
|
539
|
+
if (activeRealDirectories.has(realSourceDir)) {
|
|
540
|
+
return;
|
|
541
|
+
}
|
|
542
|
+
activeRealDirectories.add(realSourceDir);
|
|
543
|
+
|
|
544
|
+
try {
|
|
545
|
+
const currentRelativeDir = toWorkspaceRelativePath(path.relative(rootPath, sourceDir));
|
|
546
|
+
const nextIgnoreContexts = loadGitignoreContext(sourceDir, currentRelativeDir, ignoreContexts);
|
|
547
|
+
const entries = fs.readdirSync(sourceDir, { withFileTypes: true });
|
|
548
|
+
|
|
549
|
+
for (const entry of entries) {
|
|
550
|
+
if (!shouldCopyWorkspaceEntry(entry.name)) {
|
|
551
|
+
continue;
|
|
552
|
+
}
|
|
553
|
+
|
|
554
|
+
const sourceEntryPath = path.join(sourceDir, entry.name);
|
|
555
|
+
const entryStats = entry.isSymbolicLink() ? fs.statSync(sourceEntryPath) : null;
|
|
556
|
+
const isDirectory = entry.isDirectory() || (entryStats && entryStats.isDirectory());
|
|
557
|
+
const isFile = entry.isFile() || (entryStats && entryStats.isFile());
|
|
558
|
+
|
|
559
|
+
if (!isDirectory && !isFile) {
|
|
560
|
+
continue;
|
|
561
|
+
}
|
|
562
|
+
|
|
563
|
+
const entryRelativePath = toWorkspaceRelativePath(path.relative(rootPath, sourceEntryPath));
|
|
564
|
+
const matchPath = isDirectory ? `${entryRelativePath}/` : entryRelativePath;
|
|
565
|
+
if (isGitignoredPath(matchPath, nextIgnoreContexts)) {
|
|
566
|
+
continue;
|
|
567
|
+
}
|
|
568
|
+
|
|
569
|
+
const targetEntryPath = path.join(targetDir, entry.name);
|
|
570
|
+
if (isDirectory) {
|
|
571
|
+
fs.mkdirSync(targetEntryPath, { recursive: true });
|
|
572
|
+
copyWorkspaceDirectory({
|
|
573
|
+
sourceDir: sourceEntryPath,
|
|
574
|
+
targetDir: targetEntryPath,
|
|
575
|
+
rootPath,
|
|
576
|
+
ignoreContexts: nextIgnoreContexts,
|
|
577
|
+
activeRealDirectories
|
|
578
|
+
});
|
|
579
|
+
continue;
|
|
580
|
+
}
|
|
581
|
+
|
|
582
|
+
fs.copyFileSync(sourceEntryPath, targetEntryPath);
|
|
583
|
+
fs.chmodSync(targetEntryPath, (entryStats || fs.statSync(sourceEntryPath)).mode);
|
|
584
|
+
}
|
|
585
|
+
} finally {
|
|
586
|
+
activeRealDirectories.delete(realSourceDir);
|
|
587
|
+
}
|
|
588
|
+
}
|
|
589
|
+
|
|
590
|
+
function loadGitignoreContext(sourceDir, currentRelativeDir, ignoreContexts) {
|
|
591
|
+
const gitignorePath = path.join(sourceDir, '.gitignore');
|
|
592
|
+
if (!fs.existsSync(gitignorePath) || !fs.statSync(gitignorePath).isFile()) {
|
|
593
|
+
return ignoreContexts;
|
|
594
|
+
}
|
|
595
|
+
|
|
596
|
+
const matcher = ignore();
|
|
597
|
+
matcher.add(fs.readFileSync(gitignorePath, 'utf8'));
|
|
598
|
+
return [...ignoreContexts, { basePath: currentRelativeDir, matcher }];
|
|
599
|
+
}
|
|
600
|
+
|
|
601
|
+
function isGitignoredPath(matchPath, ignoreContexts) {
|
|
602
|
+
let ignored = false;
|
|
603
|
+
|
|
604
|
+
for (const context of ignoreContexts) {
|
|
605
|
+
const relativeToContext = toContextRelativePath(matchPath, context.basePath);
|
|
606
|
+
if (relativeToContext == null) {
|
|
607
|
+
continue;
|
|
608
|
+
}
|
|
609
|
+
|
|
610
|
+
const result = context.matcher.checkIgnore(relativeToContext);
|
|
611
|
+
if (result.unignored) {
|
|
612
|
+
ignored = false;
|
|
613
|
+
}
|
|
614
|
+
if (result.ignored) {
|
|
615
|
+
ignored = true;
|
|
616
|
+
}
|
|
617
|
+
}
|
|
618
|
+
|
|
619
|
+
return ignored;
|
|
620
|
+
}
|
|
621
|
+
|
|
622
|
+
function toContextRelativePath(matchPath, basePath) {
|
|
623
|
+
const isDirectory = matchPath.endsWith('/');
|
|
624
|
+
const barePath = isDirectory ? matchPath.slice(0, -1) : matchPath;
|
|
625
|
+
const relativePath = basePath ? path.posix.relative(basePath, barePath) : barePath;
|
|
626
|
+
|
|
627
|
+
if (!relativePath || relativePath === '.' || relativePath === '..' || relativePath.startsWith('../')) {
|
|
628
|
+
return null;
|
|
629
|
+
}
|
|
630
|
+
|
|
631
|
+
return isDirectory ? `${relativePath}/` : relativePath;
|
|
632
|
+
}
|
|
633
|
+
|
|
634
|
+
function toWorkspaceRelativePath(relativePath) {
|
|
635
|
+
return relativePath ? relativePath.split(path.sep).join(path.posix.sep) : '';
|
|
636
|
+
}
|
|
637
|
+
|
|
638
|
+
function shouldCopyWorkspaceEntry(entryName) {
|
|
639
|
+
return !COPY_WORKSPACE_EXCLUDES.has(entryName);
|
|
640
|
+
}
|
|
641
|
+
|
|
642
|
+
const COPY_WORKSPACE_EXCLUDES = new Set([
|
|
643
|
+
'.git',
|
|
644
|
+
'.gitattributes',
|
|
645
|
+
'.gitignore',
|
|
646
|
+
'.gitmodules'
|
|
647
|
+
]);
|
|
648
|
+
|
|
649
|
+
function truncateText(value, maxLength) {
|
|
650
|
+
if (value.length <= maxLength) {
|
|
651
|
+
return value;
|
|
652
|
+
}
|
|
653
|
+
return `${value.slice(0, maxLength)}...`;
|
|
654
|
+
}
|
|
655
|
+
|
|
656
|
+
module.exports = {
|
|
657
|
+
testPrompt,
|
|
658
|
+
runPromptTest,
|
|
659
|
+
copilotCmd
|
|
660
|
+
};
|
|
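Note on `parseJsonObject` in `lib/testPrompt.js`: the helper first tries `JSON.parse` on the trimmed command output and, if that fails, retries on the slice between the first `{` and the last `}`. The following is a minimal standalone sketch of that fallback, not the packaged code itself. The name `parseLenientJson` and the sample string are illustrative assumptions.

```javascript
'use strict';

// Standalone sketch mirroring the fallback used by parseJsonObject.
// parseLenientJson is a hypothetical name chosen for this example.
function parseLenientJson(value) {
  const trimmed = String(value || '').trim();
  try {
    // Fast path: the whole output is valid JSON.
    return JSON.parse(trimmed);
  } catch (error) {
    // Fallback: keep only the text from the first "{" to the last "}".
    const firstBrace = trimmed.indexOf('{');
    const lastBrace = trimmed.lastIndexOf('}');
    if (firstBrace !== -1 && lastBrace !== -1 && lastBrace > firstBrace) {
      return JSON.parse(trimmed.slice(firstBrace, lastBrace + 1));
    }
    throw error;
  }
}

// Illustrative agent output with prose around the JSON object (assumed sample).
const wrapped = 'Here is the verdict: {"pass":false,"findings":["summary.txt is missing"]} Done.';
const verdict = parseLenientJson(wrapped);
console.log(verdict.pass, verdict.findings.length);
```

The fallback tolerates leading and trailing chatter around a single object. Output that contains two separate objects still fails, because the slice then spans both objects and the text between them.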
@@ -0,0 +1,133 @@
|
|
|
1
|
+
'use strict';
|
|
2
|
+
|
|
3
|
+
const fs = require('fs');
|
|
4
|
+
const os = require('os');
|
|
5
|
+
const path = require('path');
|
|
6
|
+
|
|
7
|
+
const { copilotCmd, testPrompt } = require('./testPrompt');
|
|
8
|
+
|
|
9
|
+
let TMP_ROOT;
|
|
10
|
+
const COPILOT_DIR = path.join(__dirname, 'tests');
|
|
11
|
+
|
|
12
|
+
beforeAll(() => {
|
|
13
|
+
TMP_ROOT = fs.mkdtempSync(path.join(os.tmpdir(), 'xdrs-core-fixtures-'));
|
|
14
|
+
});
|
|
15
|
+
|
|
16
|
+
afterAll(() => {
|
|
17
|
+
fs.rmSync(TMP_ROOT, { recursive: true, force: true });
|
|
18
|
+
});
|
|
19
|
+
|
|
20
|
+
test('passes a prompt test with copied workspace isolation', () => {
|
|
21
|
+
const workspaceRoot = createWorkspace('customer-pass');
|
|
22
|
+
const err = testPrompt(
|
|
23
|
+
createConfig(workspaceRoot),
|
|
24
|
+
'create a research about our customer base. We have 30% of customer > 50 years; 90% > 20',
|
|
25
|
+
'The resulting file should be created at customer-research.md and should not generate facts that are not present in the original prompt'
|
|
26
|
+
);
|
|
27
|
+
|
|
28
|
+
expect(err).toBe('');
|
|
29
|
+
expect(fs.existsSync(path.join(workspaceRoot, 'customer-research.md'))).toBe(false);
|
|
30
|
+
});
|
|
31
|
+
|
|
32
|
+
test('passes when ignored files and git metadata stay out of the copied workspace', () => {
|
|
33
|
+
const workspaceRoot = createWorkspace('ignore-pass', { withIgnoredEntries: true });
|
|
34
|
+
const err = testPrompt(
|
|
35
|
+
createConfig(workspaceRoot),
|
|
36
|
+
'create a note named summary.txt saying behavior ok',
|
|
37
|
+
'Verify if ignored/seed.txt, .git/config, and nested/.git/config are not available in the copied workspace and are not reported as changes'
|
|
38
|
+
);
|
|
39
|
+
|
|
40
|
+
expect(err).toBe('');
|
|
41
|
+
expect(fs.existsSync(path.join(workspaceRoot, 'summary.txt'))).toBe(false);
|
|
42
|
+
assertFileExists(path.join(workspaceRoot, 'ignored', 'seed.txt'));
|
|
43
|
+
assertFileExists(path.join(workspaceRoot, '.git', 'config'));
|
|
44
|
+
assertFileExists(path.join(workspaceRoot, 'nested', '.git', 'config'));
|
|
45
|
+
});
|
|
46
|
+
|
|
47
|
+
test('returns markdown findings when the judge rejects the result', () => {
|
|
48
|
+
const workspaceRoot = createWorkspace('failure-case');
|
|
49
|
+
const err = testPrompt(
|
|
50
|
+
createConfig(workspaceRoot),
|
|
51
|
+
'create a research about our customer base. We have 30% of customer > 50 years; 90% > 20',
|
|
52
|
+
'Verify if summary.txt exists and the final output mentions summary.txt'
|
|
53
|
+
);
|
|
54
|
+
|
|
55
|
+
expect(err).toContain('- [summary.txt] summary.txt should exist.');
|
|
56
|
+
expect(err).toContain('Assertion: "summary.txt exists".');
|
|
57
|
+
expect(err).toContain('- [output:1] The final output should mention summary.txt.');
|
|
58
|
+
});
|
|
59
|
+
|
|
60
|
+
test('does not create a temp workspace in in-place mode', () => {
|
|
61
|
+
const workspaceRoot = createWorkspace('in-place');
|
|
62
|
+
const mkdtempSpy = jest.spyOn(fs, 'mkdtempSync');
|
|
63
|
+
|
|
64
|
+
try {
|
|
65
|
+
const err = testPrompt(
|
|
66
|
+
createConfig(workspaceRoot, { workspaceMode: 'in-place' }),
|
|
67
|
+
'create a note named summary.txt saying behavior ok',
|
|
68
|
+
'Verify if summary.txt exists and the final output mentions summary.txt'
|
|
69
|
+
);
|
|
70
|
+
|
|
71
|
+
expect(err).toBe('');
|
|
72
|
+
expect(mkdtempSpy).not.toHaveBeenCalled();
|
|
73
|
+
expect(fs.existsSync(path.join(workspaceRoot, 'summary.txt'))).toBe(true);
|
|
74
|
+
} finally {
|
|
75
|
+
mkdtempSpy.mockRestore();
|
|
76
|
+
}
|
|
77
|
+
});
|
|
78
|
+
|
|
79
|
+
test('copilotCmd defaults to the git repository root', () => {
|
|
80
|
+
const command = copilotCmd();
|
|
81
|
+
const addDirArgument = command.find((entry) => entry.startsWith('--add-dir='));
|
|
82
|
+
|
|
83
|
+
expect(addDirArgument).toBe(`--add-dir=${path.resolve(__dirname, '..')}`);
|
|
84
|
+
const promptIndex = command.indexOf('-p');
|
|
85
|
+
expect(command[promptIndex + 1]).toBe('{PROMPT}');
|
|
86
|
+
});
|
|
87
|
+
|
|
88
|
+
test('judge phase reuses promptCmd even when judgeCmd is provided', () => {
|
|
89
|
+
const workspaceRoot = createWorkspace('judge-cmd-ignored');
|
|
90
|
+
const err = testPrompt(
|
|
91
|
+
createConfig(workspaceRoot, {
|
|
92
|
+
judgeCmd: ['missing-command', '{PROMPT}']
|
|
93
|
+
}),
|
|
94
|
+
'create a note named summary.txt saying behavior ok',
|
|
95
|
+
'Verify if summary.txt exists and the final output mentions summary.txt'
|
|
96
|
+
);
|
|
97
|
+
|
|
98
|
+
expect(err).toBe('');
|
|
99
|
+
});
|
|
100
|
+
|
|
101
|
+
function createConfig(workspaceRoot, overrides = {}) {
|
|
102
|
+
return {
|
|
103
|
+
promptCmd: copilotCmd(workspaceRoot),
|
|
104
|
+
workspaceRoot,
|
|
105
|
+
workspaceMode: 'copy',
|
|
106
|
+
env: {
|
|
107
|
+
PATH: `${COPILOT_DIR}:${process.env.PATH}`
|
|
108
|
+
},
|
|
109
|
+
...overrides
|
|
110
|
+
};
|
|
111
|
+
}
|
|
112
|
+
|
|
113
|
+
function createWorkspace(name, options = {}) {
|
|
114
|
+
const workspaceRoot = path.join(TMP_ROOT, name);
|
|
115
|
+
fs.mkdirSync(workspaceRoot, { recursive: true });
|
|
116
|
+
fs.writeFileSync(path.join(workspaceRoot, 'seed.txt'), 'seed\n', 'utf8');
|
|
117
|
+
|
|
118
|
+
if (options.withIgnoredEntries) {
|
|
119
|
+
fs.writeFileSync(path.join(workspaceRoot, '.gitignore'), 'ignored/\n', 'utf8');
|
|
120
|
+
fs.mkdirSync(path.join(workspaceRoot, 'ignored'), { recursive: true });
|
|
121
|
+
fs.writeFileSync(path.join(workspaceRoot, 'ignored', 'seed.txt'), 'ignored seed\n', 'utf8');
|
|
122
|
+
fs.mkdirSync(path.join(workspaceRoot, '.git'), { recursive: true });
|
|
123
|
+
fs.writeFileSync(path.join(workspaceRoot, '.git', 'config'), 'git config\n', 'utf8');
|
|
124
|
+
fs.mkdirSync(path.join(workspaceRoot, 'nested', '.git'), { recursive: true });
|
|
125
|
+
fs.writeFileSync(path.join(workspaceRoot, 'nested', '.git', 'config'), 'nested git config\n', 'utf8');
|
|
126
|
+
}
|
|
127
|
+
|
|
128
|
+
return workspaceRoot;
|
|
129
|
+
}
|
|
130
|
+
|
|
131
|
+
function assertFileExists(filePath) {
|
|
132
|
+
expect(fs.existsSync(filePath)).toBe(true);
|
|
133
|
+
}
|
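Note on the failure-case test in `lib/testPrompt.test.js`: it expects report lines such as `- [summary.txt] summary.txt should exist.` and `Assertion: "summary.txt exists".`. The sketch below re-implements the formatting logic of `formatFailureMarkdown` and `formatFindingLocation` from `lib/testPrompt.js` so that it runs on its own. The sample findings are assumptions modeled on those test expectations, not captured judge output.

```javascript
'use strict';

// Standalone copy of the location formatting shown in lib/testPrompt.js.
function formatFindingLocation(finding) {
  if (finding.target === 'output') {
    return finding.line ? `output:${finding.line}` : 'output';
  }
  if (finding.path) {
    return finding.line ? `${finding.path}:${finding.line}` : finding.path;
  }
  return 'workspace';
}

// Standalone copy of the markdown list formatting shown in lib/testPrompt.js.
function formatFailureMarkdown(findings) {
  const list = Array.isArray(findings) && findings.length > 0
    ? findings
    : [{ target: 'workspace', message: 'The prompt test failed without detailed findings.' }];

  return list.map((finding) => {
    const assertion = finding.assertionRef ? ` Assertion: "${finding.assertionRef}".` : '';
    return `- [${formatFindingLocation(finding)}] ${finding.message}${assertion}`;
  }).join('\n');
}

// Hypothetical normalized findings (assumed data, shaped like normalizeFinding output).
const sampleFindings = [
  { target: 'file', path: 'summary.txt', line: null, message: 'summary.txt should exist.', assertionRef: 'summary.txt exists' },
  { target: 'output', path: null, line: 1, message: 'The final output should mention summary.txt.', assertionRef: '' }
];

const report = formatFailureMarkdown(sampleFindings);
console.log(report);
```

An empty or missing findings list produces a single generic `workspace` line, so a failed run yields a non-empty report in either case.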
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "xdrs-core",
|
|
3
|
-
"version": "0.
|
|
3
|
+
"version": "0.11.0",
|
|
4
4
|
"description": "A standard way to organize Decision Records (XDRs) across scopes, subjects, and teams so that AI agents can reliably query and follow them.",
|
|
5
5
|
"repository": {
|
|
6
6
|
"type": "git",
|
|
@@ -13,16 +13,26 @@
|
|
|
13
13
|
"ai-agents"
|
|
14
14
|
],
|
|
15
15
|
"license": "MIT",
|
|
16
|
+
"main": "lib/index.js",
|
|
17
|
+
"types": "lib/index.d.ts",
|
|
18
|
+
"exports": {
|
|
19
|
+
".": "./lib/index.js"
|
|
20
|
+
},
|
|
16
21
|
"files": [
|
|
17
22
|
".xdrs/_core/**",
|
|
18
23
|
".xdrs/index.md",
|
|
19
24
|
"package.json",
|
|
20
25
|
"AGENTS.md",
|
|
21
26
|
"bin/filedist.js",
|
|
22
|
-
|
|
27
|
+
"lib/**/*.js"
|
|
23
28
|
],
|
|
29
|
+
"devDependencies": {
|
|
30
|
+
"jest": "^29.7.0"
|
|
31
|
+
},
|
|
24
32
|
"dependencies": {
|
|
25
|
-
"filedist": "^0.26.0"
|
|
33
|
+
"filedist": "^0.26.0",
|
|
34
|
+
"ignore": "^7.0.5",
|
|
35
|
+
"minimatch": "^10.2.5"
|
|
26
36
|
},
|
|
27
37
|
"filedist": {
|
|
28
38
|
"sets": [
|
|
@@ -31,6 +41,10 @@
|
|
|
31
41
|
"files": [
|
|
32
42
|
"AGENTS.md",
|
|
33
43
|
".xdrs/_core/**"
|
|
44
|
+
],
|
|
45
|
+
"exclude": [
|
|
46
|
+
"**/*.test.js",
|
|
47
|
+
"**/*.test.int.js"
|
|
34
48
|
]
|
|
35
49
|
},
|
|
36
50
|
"output": {
|