snapeval 2.1.0 → 2.1.2
- package/README.md +144 -104
- package/bin/snapeval.ts +9 -0
- package/dist/bin/snapeval.js +8 -0
- package/dist/bin/snapeval.js.map +1 -1
- package/dist/src/adapters/copilot-sdk-client.js +3 -1
- package/dist/src/adapters/copilot-sdk-client.js.map +1 -1
- package/dist/src/adapters/report/terminal.js +2 -2
- package/dist/src/adapters/report/terminal.js.map +1 -1
- package/dist/src/commands/eval.js +46 -6
- package/dist/src/commands/eval.js.map +1 -1
- package/dist/src/engine/grader.js +68 -11
- package/dist/src/engine/grader.js.map +1 -1
- package/dist/src/engine/runner.d.ts +1 -0
- package/dist/src/engine/runner.js +1 -0
- package/dist/src/engine/runner.js.map +1 -1
- package/dist/src/types.d.ts +2 -0
- package/package.json +1 -1
- package/plugin.json +1 -1
- package/skills/snapeval/SKILL.md +103 -25
- package/src/adapters/copilot-sdk-client.ts +3 -1
- package/src/adapters/report/terminal.ts +2 -3
- package/src/commands/eval.ts +56 -6
- package/src/engine/grader.ts +78 -14
- package/src/engine/runner.ts +2 -0
- package/src/types.ts +2 -0
package/README.md
CHANGED
|
@@ -1,131 +1,178 @@
|
|
|
1
1
|
# snapeval
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
Harness-agnostic eval runner for [agentskills.io](https://agentskills.io) skills.
|
|
4
4
|
|
|
5
5
|
[](https://github.com/matantsach/snapeval/actions/workflows/ci.yml)
|
|
6
6
|
[](https://www.npmjs.com/package/snapeval)
|
|
7
7
|
[](https://opensource.org/licenses/MIT)
|
|
8
8
|
|
|
9
|
-
snapeval
|
|
9
|
+
snapeval runs every eval case **with and without** your skill, grades assertions, and computes a benchmark delta — so you can see exactly what value your skill adds.
|
|
10
10
|
|
|
11
|
-
|
|
11
|
+
```
|
|
12
|
+
snapeval — greeter
|
|
13
|
+
Baseline = without SKILL.md (raw AI response)
|
|
14
|
+
────────────────────────────────────────────────────────────
|
|
15
|
+
#1 formal greeting for Eleanor
|
|
16
|
+
Skill: 100% | Baseline: 33% | 5.2s
|
|
17
|
+
#2 casual greeting for Marcus
|
|
18
|
+
Skill: 100% ↑ was 67% | Baseline: 67% | 2.7s
|
|
19
|
+
#3 pirate greeting for Zoe
|
|
20
|
+
Skill: 100% | Baseline: 67% | 2.5s
|
|
21
|
+
────────────────────────────────────────────────────────────
|
|
22
|
+
Summary:
|
|
23
|
+
Skill pass rate: 100.0%
|
|
24
|
+
Baseline pass rate: 55.6%
|
|
25
|
+
Improvement: +44.4%
|
|
26
|
+
```
|
|
12
27
|
|
|
13
|
-
|
|
14
|
-
- **Zero assertions** — No test logic to write. The AI generates realistic, messy prompts that mirror how real users actually type.
|
|
15
|
-
- **Semantic comparison** — Tiered pipeline: schema check (free) → LLM judge with order-swap debiasing (when needed). Most checks cost $0.
|
|
16
|
-
- **Free inference** — Uses gpt-5-mini via Copilot CLI and GitHub Models API.
|
|
17
|
-
- **Platform-agnostic** — Adapter-based architecture. Copilot CLI first, others coming.
|
|
28
|
+
## How it works
|
|
18
29
|
|
|
19
|
-
|
|
30
|
+
1. You write a `SKILL.md` and an `evals.json` with test cases and assertions
|
|
31
|
+
2. snapeval runs each eval **twice** — once with your skill loaded, once without (baseline)
|
|
32
|
+
3. Assertions are graded by an LLM judge (semantic) and/or shell scripts (deterministic)
|
|
33
|
+
4. A benchmark shows where your skill adds value vs. where the raw AI already handles it
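The arithmetic behind step 4 can be sketched as follows. This is a minimal illustration rather than snapeval's implementation: `EvalScore`, `mean`, and `benchmarkDelta` are hypothetical names, and the sketch assumes every eval has the same number of assertions, so averaging per-eval pass rates reproduces the sample summary shown earlier in the README.

```typescript
// Hypothetical shape: one entry per eval case, pass rates in the range 0..1.
interface EvalScore {
  id: number;
  skillPassRate: number; // graded with SKILL.md loaded
  baselinePassRate: number; // graded without the skill
}

// Arithmetic mean; returns 0 for an empty list to avoid dividing by zero.
function mean(values: number[]): number {
  if (values.length === 0) return 0;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Delta = mean skill pass rate minus mean baseline pass rate.
function benchmarkDelta(scores: EvalScore[]): { skill: number; baseline: number; delta: number } {
  const skill = mean(scores.map((s) => s.skillPassRate));
  const baseline = mean(scores.map((s) => s.baselinePassRate));
  return { skill, baseline, delta: skill - baseline };
}

// The three greeter evals from the sample output: 100% vs. 33%, 67%, 67%.
const sampleScores: EvalScore[] = [
  { id: 1, skillPassRate: 1, baselinePassRate: 1 / 3 },
  { id: 2, skillPassRate: 1, baselinePassRate: 2 / 3 },
  { id: 3, skillPassRate: 1, baselinePassRate: 2 / 3 },
];

const sampleSummary = benchmarkDelta(sampleScores);
console.log(`Skill pass rate:    ${(sampleSummary.skill * 100).toFixed(1)}%`);
console.log(`Baseline pass rate: ${(sampleSummary.baseline * 100).toFixed(1)}%`);
console.log(`Improvement:        +${(sampleSummary.delta * 100).toFixed(1)}%`);
```

With these inputs the baseline works out to 5/9 (about 55.6%) and the delta to 4/9 (about 44.4%), which is consistent with the sample summary.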
|
|
20
34
|
|
|
21
|
-
|
|
35
|
+
## Quick start
|
|
22
36
|
|
|
23
|
-
|
|
37
|
+
### As a Copilot plugin
|
|
24
38
|
|
|
25
39
|
```bash
|
|
26
|
-
copilot plugin
|
|
27
|
-
copilot plugin install snapeval@snapeval-marketplace
|
|
40
|
+
copilot plugin install matantsach/snapeval
|
|
28
41
|
```
|
|
29
42
|
|
|
30
|
-
|
|
43
|
+
Then in Copilot CLI, just say `evaluate my skill` — the snapeval skill handles the rest.
|
|
44
|
+
|
|
45
|
+
### Standalone CLI
|
|
31
46
|
|
|
32
47
|
```bash
|
|
33
|
-
|
|
48
|
+
git clone https://github.com/matantsach/snapeval.git
|
|
49
|
+
cd snapeval && npm install
|
|
50
|
+
npx tsx bin/snapeval.ts eval <skill-dir>
|
|
34
51
|
```
|
|
35
52
|
|
|
36
|
-
|
|
53
|
+
## Eval format
|
|
37
54
|
|
|
38
|
-
```
|
|
39
|
-
|
|
55
|
+
```
|
|
56
|
+
my-skill/
|
|
57
|
+
├── SKILL.md
|
|
58
|
+
└── evals/
|
|
59
|
+
├── evals.json
|
|
60
|
+
└── scripts/ ← optional deterministic checks
|
|
61
|
+
└── validate.sh
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
**evals.json:**
|
|
65
|
+
|
|
66
|
+
```json
|
|
67
|
+
{
|
|
68
|
+
"skill_name": "greeter",
|
|
69
|
+
"evals": [
|
|
70
|
+
{
|
|
71
|
+
"id": 1,
|
|
72
|
+
"label": "formal greeting for Eleanor",
|
|
73
|
+
"prompt": "Can you give me a formal greeting for Eleanor?",
|
|
74
|
+
"expected_output": "Returns the formal greeting addressed to Eleanor.",
|
|
75
|
+
"assertions": [
|
|
76
|
+
"Output contains the name Eleanor",
|
|
77
|
+
"Output uses a formal tone",
|
|
78
|
+
"script:validate.sh"
|
|
79
|
+
]
|
|
80
|
+
}
|
|
81
|
+
]
|
|
82
|
+
}
|
|
40
83
|
```
|
|
41
84
|
|
|
42
|
-
|
|
85
|
+
| Field | Required | Description |
|
|
86
|
+
|-------|----------|-------------|
|
|
87
|
+
| `id` | yes | Unique numeric identifier |
|
|
88
|
+
| `prompt` | yes | The user prompt sent to the harness |
|
|
89
|
+
| `expected_output` | yes | Human description of the expected behavior |
|
|
90
|
+
| `label` | no | Human-readable name shown in terminal output |
|
|
91
|
+
| `slug` | no | Filesystem-safe name for the eval directory |
|
|
92
|
+
| `assertions` | no | List of assertions to grade (LLM semantic or `script:` prefixed) |
|
|
93
|
+
| `files` | no | Input files to attach to the prompt |
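The field table can be read as a type plus a small required-field check. The sketch below is an assumption-laden illustration: `EvalCase` and `validateEvalCase` are hypothetical names (not copied from snapeval's `src/types.ts`), and typing `files` as a list of path strings is a guess based on the table's description.

```typescript
// Hypothetical type mirroring the field table above.
interface EvalCase {
  id: number;
  prompt: string;
  expected_output: string;
  label?: string;
  slug?: string;
  assertions?: string[];
  files?: string[]; // assumed to be a list of file paths
}

// Accepts any object with EvalCase's keys, but treats every value as unchecked.
type UncheckedEvalCase = Partial<Record<keyof EvalCase, unknown>>;

// Returns a list of problems; an empty list means all required fields are present.
function validateEvalCase(raw: UncheckedEvalCase): string[] {
  const problems: string[] = [];
  if (typeof raw.id !== "number") problems.push('"id" must be a number');
  if (typeof raw.prompt !== "string" || raw.prompt.length === 0) {
    problems.push('"prompt" is required');
  }
  if (typeof raw.expected_output !== "string") {
    problems.push('"expected_output" is required');
  }
  if (raw.assertions !== undefined && !Array.isArray(raw.assertions)) {
    problems.push('"assertions" must be a list');
  }
  return problems;
}

const formalGreeting: EvalCase = {
  id: 1,
  label: "formal greeting for Eleanor",
  prompt: "Can you give me a formal greeting for Eleanor?",
  expected_output: "Returns the formal greeting addressed to Eleanor.",
  assertions: ["Output contains the name Eleanor", "script:validate.sh"],
};

console.log(validateEvalCase(formalGreeting)); // no problems expected for a complete case
```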
|
|
43
94
|
|
|
44
|
-
|
|
95
|
+
### Assertions
|
|
96
|
+
|
|
97
|
+
**Semantic** — graded by an LLM. Write specific, verifiable statements:
|
|
45
98
|
|
|
46
99
|
```
|
|
47
|
-
|
|
48
|
-
|
|
49
|
-
> check if I broke anything in my-skill
|
|
50
|
-
> approve scenario 3
|
|
100
|
+
"Output contains a YAML block with an 'id' field for each issue"
|
|
101
|
+
"Response declines because the pipeline already has unclaimed issues"
|
|
51
102
|
```
|
|
52
103
|
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
### What happens when you evaluate
|
|
104
|
+
**Script** — prefix with `script:`. Scripts live in `evals/scripts/`, receive the output directory as `$1`, and pass on exit code 0:
|
|
56
105
|
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
4. **Capture** — snapeval writes `evals.json` and runs the scenarios against your skill, saving baseline snapshots
|
|
106
|
+
```
|
|
107
|
+
"script:validate-json-structure.sh"
|
|
108
|
+
```
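The two assertion kinds can be told apart by the `script:` prefix alone. The following sketch shows one way to classify them; `parseAssertion`, `ParsedAssertion`, and `scriptPassed` are hypothetical names, and this is not necessarily how snapeval's grader splits them.

```typescript
const SCRIPT_PREFIX = "script:";

type ParsedAssertion =
  | { kind: "script"; scriptName: string } // resolved relative to evals/scripts/
  | { kind: "semantic"; text: string }; // handed to the LLM judge

// Classifies an assertion string by its prefix.
function parseAssertion(assertion: string): ParsedAssertion {
  if (assertion.startsWith(SCRIPT_PREFIX)) {
    return { kind: "script", scriptName: assertion.slice(SCRIPT_PREFIX.length).trim() };
  }
  return { kind: "semantic", text: assertion };
}

// Per the README, a script assertion passes when the script exits with code 0.
function scriptPassed(exitCode: number): boolean {
  return exitCode === 0;
}

for (const raw of ["Output uses a formal tone", "script:validate-json-structure.sh"]) {
  console.log(parseAssertion(raw));
}
```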
|
|
61
109
|
|
|
62
|
-
|
|
110
|
+
## CLI reference
|
|
63
111
|
|
|
64
|
-
|
|
112
|
+
### `eval`
|
|
65
113
|
|
|
66
|
-
|
|
114
|
+
Run evals, grade assertions, compute benchmark.
|
|
67
115
|
|
|
68
|
-
```
|
|
69
|
-
snapeval
|
|
70
|
-
snapeval capture [skill-dir] Run scenarios and save baseline snapshots
|
|
71
|
-
snapeval check [skill-dir] Compare current output against baselines
|
|
72
|
-
snapeval approve [skill-dir] Approve regressed scenarios as new baselines
|
|
73
|
-
snapeval report [skill-dir] Write results with optional HTML viewer
|
|
74
|
-
snapeval ideate [skill-dir] Open the interactive scenario ideation viewer
|
|
116
|
+
```bash
|
|
117
|
+
npx snapeval eval [skill-dir] [options]
|
|
75
118
|
```
|
|
76
119
|
|
|
77
120
|
| Flag | Description | Default |
|
|
78
121
|
|------|-------------|---------|
|
|
79
|
-
| `--
|
|
80
|
-
| `--inference <name>` | Inference adapter | `auto` |
|
|
81
|
-
| `--
|
|
82
|
-
| `--runs <n>` |
|
|
83
|
-
| `--
|
|
84
|
-
| `--
|
|
85
|
-
| `--
|
|
122
|
+
| `--harness <name>` | Harness adapter | `copilot-sdk` |
|
|
123
|
+
| `--inference <name>` | Inference adapter for grading | `auto` |
|
|
124
|
+
| `--workspace <path>` | Output directory | `../{skill_name}-workspace` |
|
|
125
|
+
| `--runs <n>` | Harness invocations per eval for statistical averaging | `1` |
|
|
126
|
+
| `--concurrency <n>` | Parallel eval cases (1-10) | `1` |
|
|
127
|
+
| `--only <ids>` | Run specific eval IDs (e.g. `--only 1,3,5`) | all |
|
|
128
|
+
| `--threshold <rate>` | Minimum pass rate 0-1 for exit code 0 | none |
|
|
129
|
+
| `--old-skill <path>` | Compare against old skill version | none |
|
|
86
130
|
| `--verbose` | Verbose output | off |
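As a rough illustration of how the `--only` and `--concurrency` values in the table might be interpreted, here is a small sketch. `parseOnlyIds` and `isValidConcurrency` are hypothetical helpers; whether snapeval rejects or clamps out-of-range concurrency values is not stated in the README, so the sketch only reports validity.

```typescript
// Parses a value such as "1,3,5" into [1, 3, 5].
// Throws if any comma-separated part is not a non-negative integer.
function parseOnlyIds(value: string): number[] {
  return value.split(",").map((part) => {
    const trimmed = part.trim();
    if (!/^\d+$/.test(trimmed)) {
      throw new Error(`Invalid eval ID in --only: "${part}"`);
    }
    return Number(trimmed);
  });
}

// The table documents a 1-10 range for --concurrency.
function isValidConcurrency(n: number): boolean {
  return Number.isInteger(n) && n >= 1 && n <= 10;
}

console.log(parseOnlyIds("1,3,5"));
console.log(isValidConcurrency(4));
```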
|
|
87
131
|
|
|
88
|
-
|
|
132
|
+
### `review`
|
|
89
133
|
|
|
90
|
-
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
↓
|
|
95
|
-
Schema match? → PASS (free, instant)
|
|
96
|
-
LLM Judge agrees? → PASS/REGRESSED
|
|
134
|
+
Run eval + generate HTML report + open in browser.
|
|
135
|
+
|
|
136
|
+
```bash
|
|
137
|
+
npx snapeval review [skill-dir] [options]
|
|
97
138
|
```
|
|
98
139
|
|
|
99
|
-
|
|
140
|
+
Same flags as `eval`, plus `--no-open` to skip opening the browser.
|
|
100
141
|
|
|
101
|
-
|
|
102
|
-
|------|--------|------|-----------|
|
|
103
|
-
| 1 | Schema check | Free | Structural skeleton matches |
|
|
104
|
-
| 2 | LLM judge (order-swap) | Cheap | Schema differs, needs semantic comparison |
|
|
142
|
+
### Exit codes
|
|
105
143
|
|
|
106
|
-
|
|
144
|
+
| Code | Meaning |
|
|
145
|
+
|------|---------|
|
|
146
|
+
| 0 | Success |
|
|
147
|
+
| 1 | Threshold not met (eval ran but pass rate below `--threshold`) |
|
|
148
|
+
| 2 | Config/input error (bad JSON, missing fields, invalid flags) |
|
|
149
|
+
| 3 | File not found (missing skill dir, evals.json, or script) |
|
|
150
|
+
| 4 | Runtime error (harness failure, grading failure, timeout) |
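The exit-code table, together with the `--threshold` flag, can be sketched as a small mapping. The names `ExitCode` and `exitCodeForRun` are hypothetical; the only behavior taken from the README is that the threshold is a minimum pass rate, so a run that meets it exactly is treated as passing here.

```typescript
// Codes as documented in the table above.
const ExitCode = {
  Success: 0,
  ThresholdNotMet: 1,
  ConfigError: 2,
  FileNotFound: 3,
  RuntimeError: 4,
} as const;

// Decides the exit code for a run that completed without errors.
// Without a threshold, a completed run counts as a success.
function exitCodeForRun(passRate: number, threshold?: number): number {
  if (threshold === undefined) return ExitCode.Success;
  return passRate >= threshold ? ExitCode.Success : ExitCode.ThresholdNotMet;
}

console.log(exitCodeForRun(0.75, 0.8)); // below the minimum
console.log(exitCodeForRun(0.8, 0.8)); // meets the minimum
```

In a CI job, a non-zero result from the first call is what would fail the step.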
|
|
107
151
|
|
|
108
|
-
##
|
|
152
|
+
## Output artifacts
|
|
109
153
|
|
|
110
|
-
|
|
154
|
+
Each run creates an iteration directory:
|
|
111
155
|
|
|
112
156
|
```
|
|
113
|
-
|
|
114
|
-
|
|
115
|
-
|
|
116
|
-
├──
|
|
117
|
-
|
|
118
|
-
|
|
119
|
-
|
|
120
|
-
|
|
121
|
-
|
|
157
|
+
workspace/
|
|
158
|
+
└── iteration-1/
|
|
159
|
+
├── benchmark.json ← aggregate stats with delta
|
|
160
|
+
├── SKILL.md.snapshot ← copy of skill used
|
|
161
|
+
└── eval-{slug}/
|
|
162
|
+
├── with_skill/
|
|
163
|
+
│ ├── outputs/output.txt
|
|
164
|
+
│ ├── timing.json
|
|
165
|
+
│ ├── grading.json
|
|
166
|
+
│ └── transcript.log
|
|
167
|
+
└── without_skill/
|
|
168
|
+
├── outputs/output.txt
|
|
122
169
|
├── timing.json
|
|
123
|
-
└──
|
|
170
|
+
└── grading.json
|
|
124
171
|
```
|
|
125
172
|
|
|
126
|
-
|
|
173
|
+
**benchmark.json** includes metadata: `eval_count`, `eval_ids`, `skill_name`, `runs_per_eval`, `timestamp`.
|
|
127
174
|
|
|
128
|
-
|
|
175
|
+
## CI integration
|
|
129
176
|
|
|
130
177
|
```yaml
|
|
131
178
|
name: Skill Evaluation
|
|
@@ -140,22 +187,10 @@ jobs:
|
|
|
140
187
|
with:
|
|
141
188
|
node-version: 22
|
|
142
189
|
- run: npm ci
|
|
143
|
-
- run: npx snapeval
|
|
190
|
+
- run: npx snapeval eval skills/my-skill --threshold 0.8 --runs 3
|
|
144
191
|
```
|
|
145
192
|
|
|
146
|
-
|
|
147
|
-
|
|
148
|
-
```bash
|
|
149
|
-
git clone https://github.com/matantsach/snapeval.git
|
|
150
|
-
cd snapeval && npm install
|
|
151
|
-
npx tsx bin/snapeval.ts check <skill-path>
|
|
152
|
-
```
|
|
153
|
-
|
|
154
|
-
Or load as a local plugin:
|
|
155
|
-
|
|
156
|
-
```bash
|
|
157
|
-
copilot plugin install ./path/to/snapeval
|
|
158
|
-
```
|
|
193
|
+
Exit code 1 when pass rate falls below threshold — blocks the PR.
|
|
159
194
|
|
|
160
195
|
## Configuration
|
|
161
196
|
|
|
@@ -163,32 +198,37 @@ Create `snapeval.config.json` in your skill or project root:
|
|
|
163
198
|
|
|
164
199
|
```json
|
|
165
200
|
{
|
|
166
|
-
"
|
|
201
|
+
"harness": "copilot-sdk",
|
|
167
202
|
"inference": "auto",
|
|
168
|
-
"
|
|
169
|
-
"
|
|
203
|
+
"workspace": "../{skill_name}-workspace",
|
|
204
|
+
"runs": 1,
|
|
205
|
+
"concurrency": 1
|
|
170
206
|
}
|
|
171
207
|
```
|
|
172
208
|
|
|
173
|
-
|
|
209
|
+
Resolution order: defaults → project config → skill-dir config → CLI flags.
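The resolution order can be pictured as layered merging in which later layers win and unset values are skipped. This is a minimal sketch under that assumption; `SnapevalConfig`, `ConfigLayer`, and `resolveLayers` are hypothetical names and do not describe the actual `resolveConfig` implementation.

```typescript
interface SnapevalConfig {
  harness: string;
  inference: string;
  workspace: string;
  runs: number;
  concurrency: number;
}

// A layer may omit any key or leave it explicitly undefined (e.g. a CLI flag that was not passed).
type ConfigLayer = { [K in keyof SnapevalConfig]?: SnapevalConfig[K] | undefined };

// Defaults taken from the README's flag table and sample config.
const DEFAULTS: SnapevalConfig = {
  harness: "copilot-sdk",
  inference: "auto",
  workspace: "../{skill_name}-workspace",
  runs: 1,
  concurrency: 1,
};

// Applies layers in order; a later layer overrides an earlier one only where it sets a value.
function resolveLayers(...layers: ConfigLayer[]): SnapevalConfig {
  const resolved: SnapevalConfig = { ...DEFAULTS };
  for (const layer of layers) {
    for (const key of Object.keys(layer) as Array<keyof SnapevalConfig>) {
      const value = layer[key];
      if (value !== undefined) {
        (resolved as Record<keyof SnapevalConfig, unknown>)[key] = value;
      }
    }
  }
  return resolved;
}

const projectConfig: ConfigLayer = { runs: 3 };
const skillDirConfig: ConfigLayer = { concurrency: 4 };
const cliFlags: ConfigLayer = { runs: undefined, harness: "copilot-cli" };

console.log(resolveLayers(projectConfig, skillDirConfig, cliFlags));
```

Here `runs` stays at the project value of 3 because the CLI layer leaves it unset, while `harness` is overridden by the CLI layer.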
|
|
174
210
|
|
|
175
|
-
##
|
|
211
|
+
## Harness adapters
|
|
176
212
|
|
|
177
|
-
|
|
213
|
+
| Adapter | Description | Default |
|
|
214
|
+
|---------|-------------|---------|
|
|
215
|
+
| `copilot-sdk` | Programmatic via `@github/copilot-sdk` with native skill loading | yes |
|
|
216
|
+
| `copilot-cli` | Shells out to `copilot` CLI binary | no |
|
|
178
217
|
|
|
179
|
-
|
|
180
|
-
- **CLI** (`npx snapeval`) — Headless backend for CI and power users.
|
|
181
|
-
- **GitHub Action** — CI wrapper (planned).
|
|
218
|
+
The SDK harness loads skills natively via `skillDirectories`, captures full transcripts, and extracts real token counts from `assistant.usage` events.
|
|
182
219
|
|
|
183
|
-
|
|
220
|
+
## Inference adapters
|
|
184
221
|
|
|
185
|
-
|
|
186
|
-
|
|
187
|
-
|
|
222
|
+
| Adapter | Description |
|
|
223
|
+
|---------|-------------|
|
|
224
|
+
| `auto` | Copilot CLI if available, else GitHub Models API |
|
|
225
|
+
| `copilot` | Copilot CLI (`copilot` binary) |
|
|
226
|
+
| `copilot-sdk` | `@github/copilot-sdk` programmatic |
|
|
227
|
+
| `github-models` | GitHub Models API (requires `GITHUB_TOKEN`) |
|
|
188
228
|
|
|
189
229
|
## Contributing
|
|
190
230
|
|
|
191
|
-
See [CONTRIBUTING.md](CONTRIBUTING.md)
|
|
231
|
+
See [CONTRIBUTING.md](CONTRIBUTING.md).
|
|
192
232
|
|
|
193
233
|
## License
|
|
194
234
|
|
package/bin/snapeval.ts
CHANGED
|
@@ -1,4 +1,13 @@
|
|
|
1
1
|
#!/usr/bin/env tsx
|
|
2
|
+
|
|
3
|
+
// Suppress Node.js ExperimentalWarning (e.g., SQLite) from polluting output
|
|
4
|
+
const _origEmit = process.emit;
|
|
5
|
+
// @ts-ignore — override to filter warnings
|
|
6
|
+
process.emit = function (event: string, ...args: any[]) {
|
|
7
|
+
if (event === 'warning' && args[0]?.name === 'ExperimentalWarning') return false;
|
|
8
|
+
return _origEmit.apply(process, [event, ...args] as any);
|
|
9
|
+
};
|
|
10
|
+
|
|
2
11
|
import { Command } from 'commander';
|
|
3
12
|
import { resolveConfig } from '../src/config.js';
|
|
4
13
|
import { resolveInference } from '../src/adapters/inference/resolve.js';
|
package/dist/bin/snapeval.js
CHANGED
|
@@ -1,4 +1,12 @@
|
|
|
1
1
|
#!/usr/bin/env tsx
|
|
2
|
+
// Suppress Node.js ExperimentalWarning (e.g., SQLite) from polluting output
|
|
3
|
+
const _origEmit = process.emit;
|
|
4
|
+
// @ts-ignore — override to filter warnings
|
|
5
|
+
process.emit = function (event, ...args) {
|
|
6
|
+
if (event === 'warning' && args[0]?.name === 'ExperimentalWarning')
|
|
7
|
+
return false;
|
|
8
|
+
return _origEmit.apply(process, [event, ...args]);
|
|
9
|
+
};
|
|
2
10
|
import { Command } from 'commander';
|
|
3
11
|
import { resolveConfig } from '../src/config.js';
|
|
4
12
|
import { resolveInference } from '../src/adapters/inference/resolve.js';
|
package/dist/bin/snapeval.js.map
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"snapeval.js","sourceRoot":"","sources":["../../bin/snapeval.ts"],"names":[],"mappings":";
|
|
1
|
+
{"version":3,"file":"snapeval.js","sourceRoot":"","sources":["../../bin/snapeval.ts"],"names":[],"mappings":";AAEA,4EAA4E;AAC5E,MAAM,SAAS,GAAG,OAAO,CAAC,IAAI,CAAC;AAC/B,2CAA2C;AAC3C,OAAO,CAAC,IAAI,GAAG,UAAU,KAAa,EAAE,GAAG,IAAW;IACpD,IAAI,KAAK,KAAK,SAAS,IAAI,IAAI,CAAC,CAAC,CAAC,EAAE,IAAI,KAAK,qBAAqB;QAAE,OAAO,KAAK,CAAC;IACjF,OAAO,SAAS,CAAC,KAAK,CAAC,OAAO,EAAE,CAAC,KAAK,EAAE,GAAG,IAAI,CAAQ,CAAC,CAAC;AAC3D,CAAC,CAAC;AAEF,OAAO,EAAE,OAAO,EAAE,MAAM,WAAW,CAAC;AACpC,OAAO,EAAE,aAAa,EAAE,MAAM,kBAAkB,CAAC;AACjD,OAAO,EAAE,gBAAgB,EAAE,MAAM,sCAAsC,CAAC;AACxE,OAAO,EAAE,cAAc,EAAE,MAAM,oCAAoC,CAAC;AACpE,OAAO,EAAE,WAAW,EAAE,MAAM,yBAAyB,CAAC;AACtD,OAAO,EAAE,aAAa,EAAE,MAAM,2BAA2B,CAAC;AAC1D,OAAO,EAAE,gBAAgB,EAAE,MAAM,oCAAoC,CAAC;AACtE,OAAO,EAAE,aAAa,EAAE,MAAM,kBAAkB,CAAC;AACjD,OAAO,EAAE,UAAU,EAAE,MAAM,uCAAuC,CAAC;AACnE,OAAO,KAAK,IAAI,MAAM,WAAW,CAAC;AAElC,MAAM,OAAO,GAAG,IAAI,OAAO,EAAE,CAAC;AAE9B,OAAO;KACJ,IAAI,CAAC,UAAU,CAAC;KAChB,WAAW,CAAC,wDAAwD,CAAC;KACrE,OAAO,CAAC,OAAO,CAAC,CAAC;AAEpB,eAAe;AACf,OAAO;KACJ,OAAO,CAAC,MAAM,CAAC;KACf,WAAW,CAAC,qEAAqE,CAAC;KAClF,MAAM,CAAC,qBAAqB,EAAE,gBAAgB,CAAC;KAC/C,MAAM,CAAC,yBAAyB,EAAE,0BAA0B,CAAC;KAC7D,MAAM,CAAC,oBAAoB,EAAE,qBAAqB,CAAC;KACnD,MAAM,CAAC,YAAY,EAAE,4CAA4C,EAAE,GAAG,CAAC;KACvE,MAAM,CAAC,mBAAmB,EAAE,gDAAgD,EAAE,GAAG,CAAC;KAClF,MAAM,CAAC,cAAc,EAAE,iEAAiE,CAAC;KACzF,MAAM,CAAC,oBAAoB,EAAE,6EAA6E,CAAC;KAC3G,MAAM,CAAC,oBAAoB,EAAE,uDAAuD,CAAC;KACrF,MAAM,CAAC,WAAW,EAAE,gBAAgB,CAAC;KACrC,QAAQ,CAAC,aAAa,EAAE,yBAAyB,EAAE,OAAO,CAAC,GAAG,EAAE,CAAC;KACjE,MAAM,CAAC,KAAK,EAAE,QAAgB,EAAE,IAAsC,EAAE,EAAE;IACzE,IAAI,CAAC;QACH,MAAM,SAAS,GAAG,IAAI,CAAC,OAAO,CAAC,QAAQ,CAAC,CAAC;QACzC,MAAM,MAAM,GAAG,aAAa,CAC1B;YACE,OAAO,EAAE,IAAI,CAAC,OAAiB;YAC/B,SAAS,EAAE,IAAI,CAAC,SAAmB;YACnC,SAAS,EAAE,IAAI,CAAC,SAAmB;YACnC,IAAI,EAAE,IAAI,CAAC,IAAI,CAAC,CAAC,CAAC,QAAQ,CAAC,IAAI,CAAC,IAAc,EAAE,EAAE,CAAC,CAAC,CAAC,CAAC,SAAS;YAC/D,WAAW,EAAE,IAAI,CAAC,WAAW,CAAC,CAAC,CAAC,QAAQ,CAAC,IAAI,CAAC,WAAqB,EAAE,EAAE,CAAC,CAAC,CAAC,CAAC,SAAS;SACrF,EACD,OAAO,CAAC,GAAG,EAAE,EAAE,SAAS,CACzB,CA
AC;QACF,MAAM,OAAO,GAAG,cAAc,CAAC,MAAM,CAAC,OAAO,CAAC,CAAC;QAC/C,MAAM,SAAS,GAAG,gBAAgB,CAAC,MAAM,CAAC,SAAS,CAAC,CAAC;QAErD,MAAM,IAAI,GAAG,IAAI,CAAC,IAAI;YACpB,CAAC,CAAE,IAAI,CAAC,IAAe,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,QAAQ,CAAC,CAAC,CAAC,IAAI,EAAE,EAAE,EAAE,CAAC,CAAC;YACrE,CAAC,CAAC,SAAS,CAAC;QACd,MAAM,SAAS,GAAG,IAAI,CAAC,SAAS;YAC9B,CAAC,CAAC,UAAU,CAAC,IAAI,CAAC,SAAmB,CAAC;YACtC,CAAC,CAAC,SAAS,CAAC;QAEd,MAAM,OAAO,GAAG,MAAM,WAAW,CAAC,SAAS,EAAE,OAAO,EAAE,SAAS,EAAE;YAC/D,SAAS,EAAE,MAAM,CAAC,SAAS;YAC3B,IAAI,EAAE,MAAM,CAAC,IAAI;YACjB,WAAW,EAAE,MAAM,CAAC,WAAW;YAC/B,IAAI;YACJ,SAAS;YACT,QAAQ,EAAE,IAAI,CAAC,QAA8B;SAC9C,CAAC,CAAC;QAEH,MAAM,QAAQ,GAAG,IAAI,gBAAgB,EAAE,CAAC;QACxC,MAAM,QAAQ,CAAC,MAAM,CAAC,OAAO,CAAC,CAAC;QAC/B,OAAO,CAAC,GAAG,CAAC,cAAc,OAAO,CAAC,YAAY,EAAE,CAAC,CAAC;QAClD,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;IAClB,CAAC;IAAC,OAAO,GAAQ,EAAE,CAAC;QAClB,iEAAiE;QACjE,IAAI,GAAG,CAAC,OAAO,EAAE,CAAC;YAChB,MAAM,QAAQ,GAAG,IAAI,gBAAgB,EAAE,CAAC;YACxC,MAAM,QAAQ,CAAC,MAAM,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC;YACnC,OAAO,CAAC,GAAG,CAAC,cAAc,GAAG,CAAC,OAAO,CAAC,YAAY,EAAE,CAAC,CAAC;QACxD,CAAC;QACD,WAAW,CAAC,GAAG,CAAC,CAAC;IACnB,CAAC;AACH,CAAC,CAAC,CAAC;AAEL,iBAAiB;AACjB,OAAO;KACJ,OAAO,CAAC,QAAQ,CAAC;KACjB,WAAW,CAAC,mDAAmD,CAAC;KAChE,MAAM,CAAC,qBAAqB,EAAE,gBAAgB,CAAC;KAC/C,MAAM,CAAC,yBAAyB,EAAE,0BAA0B,CAAC;KAC7D,MAAM,CAAC,oBAAoB,EAAE,qBAAqB,CAAC;KACnD,MAAM,CAAC,YAAY,EAAE,4CAA4C,EAAE,GAAG,CAAC;KACvE,MAAM,CAAC,mBAAmB,EAAE,gDAAgD,EAAE,GAAG,CAAC;KAClF,MAAM,CAAC,oBAAoB,EAAE,uDAAuD,CAAC;KACrF,MAAM,CAAC,WAAW,EAAE,qBAAqB,CAAC;KAC1C,MAAM,CAAC,WAAW,EAAE,gBAAgB,CAAC;KACrC,QAAQ,CAAC,aAAa,EAAE,yBAAyB,EAAE,OAAO,CAAC,GAAG,EAAE,CAAC;KACjE,MAAM,CAAC,KAAK,EAAE,QAAgB,EAAE,IAAsC,EAAE,EAAE;IACzE,IAAI,CAAC;QACH,MAAM,SAAS,GAAG,IAAI,CAAC,OAAO,CAAC,QAAQ,CAAC,CAAC;QACzC,MAAM,MAAM,GAAG,aAAa,CAC1B;YACE,OAAO,EAAE,IAAI,CAAC,OAAiB;YAC/B,SAAS,EAAE,IAAI,CAAC,SAAmB;YACnC,SAAS,EAAE,IAAI,CAAC,SAAmB;YACnC,IAAI,EAAE,IAAI,CAAC,IAAI,CAAC,CAAC,CAAC,QAAQ,CAAC,IAAI,CAAC,IAAc,EAAE,EAAE,CAAC,CAAC,CAAC,CAAC,
SAAS;YAC/D,WAAW,EAAE,IAAI,CAAC,WAAW,CAAC,CAAC,CAAC,QAAQ,CAAC,IAAI,CAAC,WAAqB,EAAE,EAAE,CAAC,CAAC,CAAC,CAAC,SAAS;SACrF,EACD,OAAO,CAAC,GAAG,EAAE,EAAE,SAAS,CACzB,CAAC;QACF,MAAM,OAAO,GAAG,cAAc,CAAC,MAAM,CAAC,OAAO,CAAC,CAAC;QAC/C,MAAM,SAAS,GAAG,gBAAgB,CAAC,MAAM,CAAC,SAAS,CAAC,CAAC;QAErD,MAAM,aAAa,CAAC,SAAS,EAAE,OAAO,EAAE,SAAS,EAAE;YACjD,SAAS,EAAE,MAAM,CAAC,SAAS;YAC3B,IAAI,EAAE,MAAM,CAAC,IAAI;YACjB,WAAW,EAAE,MAAM,CAAC,WAAW;YAC/B,QAAQ,EAAE,IAAI,CAAC,QAA8B;YAC7C,MAAM,EAAE,IAAI,CAAC,IAAI,KAAK,KAAK;SAC5B,CAAC,CAAC;QACH,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;IAClB,CAAC;IAAC,OAAO,GAAG,EAAE,CAAC;QAAC,WAAW,CAAC,GAAG,CAAC,CAAC;IAAC,CAAC;AACrC,CAAC,CAAC,CAAC;AAEL,uDAAuD;AACvD,OAAO,CAAC,EAAE,CAAC,MAAM,EAAE,GAAG,EAAE,GAAG,UAAU,EAAE,CAAC,KAAK,CAAC,GAAG,EAAE,GAAE,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;AAE5D,SAAS,WAAW,CAAC,GAAY;IAC/B,IAAI,GAAG,YAAY,aAAa,EAAE,CAAC;QACjC,OAAO,CAAC,KAAK,CAAC,UAAU,GAAG,CAAC,OAAO,EAAE,CAAC,CAAC;QACvC,OAAO,CAAC,IAAI,CAAC,GAAG,CAAC,QAAQ,IAAI,CAAC,CAAC,CAAC;IAClC,CAAC;IACD,IAAI,GAAG,YAAY,KAAK,EAAE,CAAC;QACzB,OAAO,CAAC,KAAK,CAAC,UAAU,GAAG,CAAC,OAAO,EAAE,CAAC,CAAC;QACvC,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;IAClB,CAAC;IACD,OAAO,CAAC,KAAK,CAAC,4BAA4B,CAAC,CAAC;IAC5C,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC;AAED,OAAO,CAAC,KAAK,CAAC,OAAO,CAAC,IAAI,CAAC,CAAC"}
|
package/dist/src/adapters/copilot-sdk-client.js
CHANGED
|
@@ -25,7 +25,9 @@ export async function getClient() {
|
|
|
25
25
|
if (!CopilotClient) {
|
|
26
26
|
throw new Error('Could not find CopilotClient export in @github/copilot-sdk. The package may have changed its API.');
|
|
27
27
|
}
|
|
28
|
-
|
|
28
|
+
// Suppress ExperimentalWarning (e.g., SQLite) in the spawned CLI subprocess
|
|
29
|
+
const env = { ...process.env, NODE_OPTIONS: [process.env.NODE_OPTIONS, '--no-warnings'].filter(Boolean).join(' ') };
|
|
30
|
+
clientInstance = new CopilotClient({ logLevel: 'none', env });
|
|
29
31
|
await clientInstance.start();
|
|
30
32
|
clientStarted = true;
|
|
31
33
|
return clientInstance;
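The `NODE_OPTIONS` expression in this hunk appends `--no-warnings` to whatever the parent process already had set. Pulled out into a standalone helper (the name `withNoWarnings` is hypothetical), the behavior looks like this:

```typescript
// Appends --no-warnings to an existing NODE_OPTIONS value.
// filter(Boolean) drops undefined and empty strings, so no stray leading space is produced.
function withNoWarnings(existing: string | undefined): string {
  return [existing, "--no-warnings"].filter(Boolean).join(" ");
}

console.log(withNoWarnings(undefined));
console.log(withNoWarnings("--max-old-space-size=4096"));
```

Note that the flag is appended even if the existing value already contains it; the sketch makes no attempt to deduplicate, and neither does the expression in the hunk.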
|
package/dist/src/adapters/copilot-sdk-client.js.map
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"copilot-sdk-client.js","sourceRoot":"","sources":["../../../src/adapters/copilot-sdk-client.ts"],"names":[],"mappings":"AAAA;;;;;GAKG;AAEH,OAAO,KAAK,EAAE,MAAM,SAAS,CAAC;AAC9B,OAAO,KAAK,IAAI,MAAM,WAAW,CAAC;AAElC,iEAAiE;AACjE,4DAA4D;AAC5D,IAAI,cAAc,GAAQ,IAAI,CAAC;AAC/B,IAAI,aAAa,GAAG,KAAK,CAAC;AAE1B,MAAM,CAAC,KAAK,UAAU,SAAS;IAC7B,IAAI,cAAc,IAAI,aAAa;QAAE,OAAO,cAAc,CAAC;IAE3D,IAAI,GAAQ,CAAC;IACb,IAAI,CAAC;QACH,+DAA+D;QAC/D,GAAG,GAAG,MAAM,MAAM,CAAC,qBAAqB,CAAC,CAAC;IAC5C,CAAC;IAAC,MAAM,CAAC;QACP,MAAM,IAAI,KAAK,CACb,mGAAmG,CACpG,CAAC;IACJ,CAAC;IAED,MAAM,aAAa,GAAG,GAAG,CAAC,aAAa,IAAI,GAAG,CAAC,OAAO,EAAE,aAAa,CAAC;IACtE,IAAI,CAAC,aAAa,EAAE,CAAC;QACnB,MAAM,IAAI,KAAK,CACb,mGAAmG,CACpG,CAAC;IACJ,CAAC;IAED,cAAc,GAAG,IAAI,aAAa,CAAC,EAAE,QAAQ,EAAE,MAAM,EAAE,CAAC,CAAC;
|
|
1
|
+
{"version":3,"file":"copilot-sdk-client.js","sourceRoot":"","sources":["../../../src/adapters/copilot-sdk-client.ts"],"names":[],"mappings":"AAAA;;;;;GAKG;AAEH,OAAO,KAAK,EAAE,MAAM,SAAS,CAAC;AAC9B,OAAO,KAAK,IAAI,MAAM,WAAW,CAAC;AAElC,iEAAiE;AACjE,4DAA4D;AAC5D,IAAI,cAAc,GAAQ,IAAI,CAAC;AAC/B,IAAI,aAAa,GAAG,KAAK,CAAC;AAE1B,MAAM,CAAC,KAAK,UAAU,SAAS;IAC7B,IAAI,cAAc,IAAI,aAAa;QAAE,OAAO,cAAc,CAAC;IAE3D,IAAI,GAAQ,CAAC;IACb,IAAI,CAAC;QACH,+DAA+D;QAC/D,GAAG,GAAG,MAAM,MAAM,CAAC,qBAAqB,CAAC,CAAC;IAC5C,CAAC;IAAC,MAAM,CAAC;QACP,MAAM,IAAI,KAAK,CACb,mGAAmG,CACpG,CAAC;IACJ,CAAC;IAED,MAAM,aAAa,GAAG,GAAG,CAAC,aAAa,IAAI,GAAG,CAAC,OAAO,EAAE,aAAa,CAAC;IACtE,IAAI,CAAC,aAAa,EAAE,CAAC;QACnB,MAAM,IAAI,KAAK,CACb,mGAAmG,CACpG,CAAC;IACJ,CAAC;IAED,4EAA4E;IAC5E,MAAM,GAAG,GAAG,EAAE,GAAG,OAAO,CAAC,GAAG,EAAE,YAAY,EAAE,CAAC,OAAO,CAAC,GAAG,CAAC,YAAY,EAAE,eAAe,CAAC,CAAC,MAAM,CAAC,OAAO,CAAC,CAAC,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC;IACpH,cAAc,GAAG,IAAI,aAAa,CAAC,EAAE,QAAQ,EAAE,MAAM,EAAE,GAAG,EAAE,CAAC,CAAC;IAC9D,MAAM,cAAc,CAAC,KAAK,EAAE,CAAC;IAC7B,aAAa,GAAG,IAAI,CAAC;IACrB,OAAO,cAAc,CAAC;AACxB,CAAC;AAED,MAAM,CAAC,KAAK,UAAU,UAAU;IAC9B,IAAI,cAAc,IAAI,aAAa,EAAE,CAAC;QACpC,MAAM,cAAc,CAAC,IAAI,EAAE,CAAC;QAC5B,aAAa,GAAG,KAAK,CAAC;QACtB,cAAc,GAAG,IAAI,CAAC;IACxB,CAAC;AACH,CAAC;AAED,MAAM,UAAU,cAAc;IAC5B,iEAAiE;IACjE,mEAAmE;IACnE,IAAI,GAAG,GAAG,OAAO,CAAC,GAAG,EAAE,CAAC;IACxB,OAAO,IAAI,EAAE,CAAC;QACZ,MAAM,SAAS,GAAG,IAAI,CAAC,IAAI,CAAC,GAAG,EAAE,cAAc,EAAE,SAAS,EAAE,aAAa,EAAE,cAAc,CAAC,CAAC;QAC3F,IAAI,EAAE,CAAC,UAAU,CAAC,SAAS,CAAC;YAAE,OAAO,IAAI,CAAC;QAC1C,MAAM,MAAM,GAAG,IAAI,CAAC,OAAO,CAAC,GAAG,CAAC,CAAC;QACjC,IAAI,MAAM,KAAK,GAAG;YAAE,MAAM;QAC1B,GAAG,GAAG,MAAM,CAAC;IACf,CAAC;IACD,OAAO,KAAK,CAAC;AACf,CAAC"}
|
package/dist/src/adapters/report/terminal.js
CHANGED
|
@@ -29,10 +29,10 @@ function loadPreviousIteration(iterationDir) {
|
|
|
29
29
|
}
|
|
30
30
|
}
|
|
31
31
|
function evalLabel(run) {
|
|
32
|
-
|
|
32
|
+
if (run.label)
|
|
33
|
+
return run.label;
|
|
33
34
|
if (run.slug && run.slug !== `${run.evalId}`)
|
|
34
35
|
return run.slug;
|
|
35
|
-
// Truncate prompt but show first meaningful line
|
|
36
36
|
const firstLine = run.prompt.split('\n')[0].slice(0, 60);
|
|
37
37
|
return firstLine;
|
|
38
38
|
}
|
package/dist/src/adapters/report/terminal.js.map
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"terminal.js","sourceRoot":"","sources":["../../../../src/adapters/report/terminal.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,MAAM,SAAS,CAAC;AAC9B,OAAO,KAAK,IAAI,MAAM,WAAW,CAAC;AAClC,OAAO,KAAK,MAAM,OAAO,CAAC;AAQ1B,SAAS,qBAAqB,CAAC,YAAoB;IACjD,MAAM,YAAY,GAAG,IAAI,CAAC,OAAO,CAAC,YAAY,CAAC,CAAC;IAChD,MAAM,WAAW,GAAG,IAAI,CAAC,QAAQ,CAAC,YAAY,CAAC,CAAC;IAChD,MAAM,UAAU,GAAG,QAAQ,CAAC,WAAW,CAAC,OAAO,CAAC,YAAY,EAAE,EAAE,CAAC,EAAE,EAAE,CAAC,CAAC;IACvE,IAAI,KAAK,CAAC,UAAU,CAAC,IAAI,UAAU,IAAI,CAAC;QAAE,OAAO,IAAI,CAAC;IACtD,MAAM,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC,YAAY,EAAE,aAAa,UAAU,GAAG,CAAC,EAAE,CAAC,CAAC;IACvE,MAAM,iBAAiB,GAAG,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,gBAAgB,CAAC,CAAC;IAC/D,IAAI,CAAC,EAAE,CAAC,UAAU,CAAC,iBAAiB,CAAC;QAAE,OAAO,IAAI,CAAC;IACnD,IAAI,CAAC;QACH,MAAM,SAAS,GAAG,IAAI,CAAC,KAAK,CAAC,EAAE,CAAC,YAAY,CAAC,iBAAiB,EAAE,OAAO,CAAC,CAAC,CAAC;QAC1E,MAAM,QAAQ,GAAG,IAAI,GAAG,EAAuE,CAAC;QAChG,MAAM,QAAQ,GAAG,EAAE,CAAC,WAAW,CAAC,OAAO,CAAC,CAAC,MAAM,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,UAAU,CAAC,OAAO,CAAC,CAAC,CAAC;QAC5E,KAAK,MAAM,OAAO,IAAI,QAAQ,EAAE,CAAC;YAC/B,MAAM,MAAM,GAAG,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,OAAO,EAAE,YAAY,EAAE,cAAc,CAAC,CAAC;YACzE,MAAM,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,OAAO,EAAE,eAAe,EAAE,cAAc,CAAC,CAAC;YAC7E,MAAM,EAAE,GAAG,EAAE,CAAC,UAAU,CAAC,MAAM,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,KAAK,CAAC,EAAE,CAAC,YAAY,CAAC,MAAM,EAAE,OAAO,CAAC,CAAC,CAAC,CAAC,CAAC,SAAS,CAAC;YAC5F,MAAM,GAAG,GAAG,EAAE,CAAC,UAAU,CAAC,OAAO,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,KAAK,CAAC,EAAE,CAAC,YAAY,CAAC,OAAO,EAAE,OAAO,CAAC,CAAC,CAAC,CAAC,CAAC,SAAS,CAAC;YAC/F,QAAQ,CAAC,GAAG,CAAC,OAAO,EAAE,EAAE,SAAS,EAAE,EAAE,EAAE,YAAY,EAAE,GAAG,EAAE,CAAC,CAAC;QAC9D,CAAC;QACD,OAAO,EAAE,SAAS,EAAE,QAAQ,EAAE,CAAC;IACjC,CAAC;IAAC,MAAM,CAAC;QACP,OAAO,IAAI,CAAC;IACd,CAAC;AACH,CAAC;AAED,SAAS,SAAS,CAAC,
|
|
1
|
+
{"version":3,"file":"terminal.js","sourceRoot":"","sources":["../../../../src/adapters/report/terminal.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,MAAM,SAAS,CAAC;AAC9B,OAAO,KAAK,IAAI,MAAM,WAAW,CAAC;AAClC,OAAO,KAAK,MAAM,OAAO,CAAC;AAQ1B,SAAS,qBAAqB,CAAC,YAAoB;IACjD,MAAM,YAAY,GAAG,IAAI,CAAC,OAAO,CAAC,YAAY,CAAC,CAAC;IAChD,MAAM,WAAW,GAAG,IAAI,CAAC,QAAQ,CAAC,YAAY,CAAC,CAAC;IAChD,MAAM,UAAU,GAAG,QAAQ,CAAC,WAAW,CAAC,OAAO,CAAC,YAAY,EAAE,EAAE,CAAC,EAAE,EAAE,CAAC,CAAC;IACvE,IAAI,KAAK,CAAC,UAAU,CAAC,IAAI,UAAU,IAAI,CAAC;QAAE,OAAO,IAAI,CAAC;IACtD,MAAM,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC,YAAY,EAAE,aAAa,UAAU,GAAG,CAAC,EAAE,CAAC,CAAC;IACvE,MAAM,iBAAiB,GAAG,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,gBAAgB,CAAC,CAAC;IAC/D,IAAI,CAAC,EAAE,CAAC,UAAU,CAAC,iBAAiB,CAAC;QAAE,OAAO,IAAI,CAAC;IACnD,IAAI,CAAC;QACH,MAAM,SAAS,GAAG,IAAI,CAAC,KAAK,CAAC,EAAE,CAAC,YAAY,CAAC,iBAAiB,EAAE,OAAO,CAAC,CAAC,CAAC;QAC1E,MAAM,QAAQ,GAAG,IAAI,GAAG,EAAuE,CAAC;QAChG,MAAM,QAAQ,GAAG,EAAE,CAAC,WAAW,CAAC,OAAO,CAAC,CAAC,MAAM,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,UAAU,CAAC,OAAO,CAAC,CAAC,CAAC;QAC5E,KAAK,MAAM,OAAO,IAAI,QAAQ,EAAE,CAAC;YAC/B,MAAM,MAAM,GAAG,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,OAAO,EAAE,YAAY,EAAE,cAAc,CAAC,CAAC;YACzE,MAAM,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,OAAO,EAAE,eAAe,EAAE,cAAc,CAAC,CAAC;YAC7E,MAAM,EAAE,GAAG,EAAE,CAAC,UAAU,CAAC,MAAM,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,KAAK,CAAC,EAAE,CAAC,YAAY,CAAC,MAAM,EAAE,OAAO,CAAC,CAAC,CAAC,CAAC,CAAC,SAAS,CAAC;YAC5F,MAAM,GAAG,GAAG,EAAE,CAAC,UAAU,CAAC,OAAO,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,KAAK,CAAC,EAAE,CAAC,YAAY,CAAC,OAAO,EAAE,OAAO,CAAC,CAAC,CAAC,CAAC,CAAC,SAAS,CAAC;YAC/F,QAAQ,CAAC,GAAG,CAAC,OAAO,EAAE,EAAE,SAAS,EAAE,EAAE,EAAE,YAAY,EAAE,GAAG,EAAE,CAAC,CAAC;QAC9D,CAAC;QACD,OAAO,EAAE,SAAS,EAAE,QAAQ,EAAE,CAAC;IACjC,CAAC;IAAC,MAAM,CAAC;QACP,OAAO,IAAI,CAAC;IACd,CAAC;AACH,CAAC;AAED,SAAS,SAAS,CAAC,GAAqE;IACtF,IAAI,GAAG,CAAC,KAAK;QAAE,OAAO,GAAG,CAAC,KAAK,CAAC;IAChC,IAAI,GAAG,CAAC,IAAI,IAAI,GAAG,CAAC,IAAI,KAAK,GAAG,GAAG,CAAC,MAAM,EAAE;QAAE,OAAO,GAAG,CAAC,IAAI,CAAC;IAC9D,MAAM,SAAS,GAAG,GAAG,CAAC,MAAM,CAAC,KA
AK,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC;IACzD,OAAO,SAAS,CAAC;AACnB,CAAC;AAED,MAAM,OAAO,gBAAgB;IAClB,IAAI,GAAG,UAAU,CAAC;IAE3B,KAAK,CAAC,MAAM,CAAC,OAAoB;QAC/B,MAAM,EAAE,SAAS,EAAE,QAAQ,EAAE,SAAS,EAAE,GAAG,OAAO,CAAC;QAEnD,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,IAAI,CAAC,gBAAgB,SAAS,EAAE,CAAC,CAAC,CAAC;QACrD,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,GAAG,CAAC,+CAA+C,CAAC,CAAC,CAAC;QACxE,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,GAAG,CAAC,GAAG,CAAC,MAAM,CAAC,EAAE,CAAC,CAAC,CAAC,CAAC;QAEvC,MAAM,IAAI,GAAG,qBAAqB,CAAC,OAAO,CAAC,YAAY,CAAC,CAAC;QAEzD,KAAK,MAAM,GAAG,IAAI,QAAQ,EAAE,CAAC;YAC3B,MAAM,SAAS,GAAG,GAAG,CAAC,SAAS,CAAC,OAAO,CAAC;YACxC,MAAM,MAAM,GAAG,SAAS,EAAE,OAAO,CAAC,SAAS,CAAC;YAC5C,MAAM,OAAO,GAAG,GAAG,CAAC,YAAY,CAAC,OAAO,EAAE,OAAO,CAAC,SAAS,CAAC;YAC5D,MAAM,OAAO,GAAG,MAAM,KAAK,SAAS,CAAC,CAAC,CAAC,GAAG,CAAC,MAAM,GAAG,GAAG,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,KAAK,CAAC;YAC/E,MAAM,QAAQ,GAAG,OAAO,KAAK,SAAS,CAAC,CAAC,CAAC,GAAG,CAAC,OAAO,GAAG,GAAG,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,KAAK,CAAC;YAClF,MAAM,OAAO,GAAG,MAAM,KAAK,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,KAAK,CAAC,CAAC,CAAC,MAAM,KAAK,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,CAAC,KAAK,CAAC,MAAM,CAAC;YACrF,MAAM,SAAS,GAAG,CAAC,GAAG,CAAC,SAAS,CAAC,MAAM,CAAC,WAAW,GAAG,IAAI,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,CAAC;YAEvE,8CAA8C;YAC9C,IAAI,YAAY,GAAG,EAAE,CAAC;YACtB,IAAI,IAAI,EAAE,CAAC;gBACT,MAAM,WAAW,GAAG,IAAI,CAAC,QAAQ,CAAC,GAAG,CAAC,QAAQ,GAAG,CAAC,IAAI,EAAE,CAAC,CAAC;gBAC1D,MAAM,QAAQ,GAAG,WAAW,EAAE,SAAS,EAAE,OAAO,CAAC,SAAS,CAAC;gBAC3D,IAAI,QAAQ,KAAK,SAAS,IAAI,MAAM,KAAK,SAAS,EAAE,CAAC;oBACnD,MAAM,MAAM,GAAG,MAAM,GAAG,QAAQ,CAAC;oBACjC,IAAI,MAAM,KAAK,CAAC,EAAE,CAAC;wBACjB,MAAM,KAAK,GAAG,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC;wBAC7D,YAAY,GAAG,IAAI,KAAK,QAAQ,CAAC,QAAQ,GAAG,GAAG,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,GAAG,CAAC;oBACjE,CAAC;gBACH,CAAC;YACH,CAAC;YAED,OAAO,CAAC,GAAG,CAAC,KAAK,KAAK,CAAC,IAAI,CAAC,IAAI,GAAG,CAAC,MAAM,EAAE,CAAC,IAAI,SAAS,CAAC,
GAAG,CAAC,EAAE,CAAC,CAAC;YACnE,OAAO,CAAC,GAAG,CAAC,cAAc,OAAO,CAAC,OAAO,CAAC,GAAG,YAAY,gBAAgB,QAAQ,MAAM,SAAS,GAAG,CAAC,CAAC;YAErG,gCAAgC;YAChC,IAAI,SAAS,EAAE,CAAC;gBACd,MAAM,MAAM,GAAG,SAAS,CAAC,iBAAiB,CAAC,MAAM,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,CAAC,MAAM,CAAC,CAAC;gBACpE,KAAK,MAAM,CAAC,IAAI,MAAM,EAAE,CAAC;oBACvB,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,GAAG,CAAC,aAAa,CAAC,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC;oBAC9C,IAAI,CAAC,CAAC,QAAQ,EAAE,CAAC;wBACf,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,GAAG,CAAC,aAAa,CAAC,CAAC,QAAQ,CAAC,KAAK,CAAC,CAAC,EAAE,GAAG,CAAC,EAAE,CAAC,CAAC,CAAC;oBAClE,CAAC;gBACH,CAAC;YACH,CAAC;QACH,CAAC;QAED,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,GAAG,CAAC,GAAG,CAAC,MAAM,CAAC,EAAE,CAAC,CAAC,CAAC,CAAC;QAEvC,MAAM,EAAE,GAAG,SAAS,CAAC,WAAW,CAAC,UAAU,CAAC;QAC5C,MAAM,GAAG,GAAG,SAAS,CAAC,WAAW,CAAC,aAAa,CAAC;QAChD,MAAM,KAAK,GAAG,SAAS,CAAC,WAAW,CAAC,KAAK,CAAC;QAC1C,MAAM,UAAU,GAAG,KAAK,CAAC,SAAS,GAAG,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,KAAK,CAAC,CAAC,CAAC,KAAK,CAAC,SAAS,GAAG,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,CAAC,KAAK,CAAC,GAAG,CAAC;QAEnG,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,IAAI,CAAC,UAAU,CAAC,CAAC,CAAC;QACpC,OAAO,CAAC,GAAG,CAAC,yBAAyB,CAAC,EAAE,CAAC,SAAS,CAAC,IAAI,GAAG,GAAG,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC;QAC9E,OAAO,CAAC,GAAG,CAAC,yBAAyB,CAAC,GAAG,CAAC,SAAS,CAAC,IAAI,GAAG,GAAG,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC;QAC/E,OAAO,CAAC,GAAG,CAAC,yBAAyB,UAAU,CAAC,GAAG,KAAK,CAAC,SAAS,GAAG,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,GAAG,CAAC,KAAK,CAAC,SAAS,GAAG,GAAG,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,GAAG,CAAC,EAAE,CAAC,CAAC;QAE9H,IAAI,IAAI,EAAE,CAAC;YACT,MAAM,QAAQ,GAAG,IAAI,CAAC,SAAS,CAAC,WAAW,CAAC,UAAU,CAAC,SAAS,CAAC,IAAI,CAAC;YACtE,MAAM,QAAQ,GAAG,EAAE,CAAC,SAAS,CAAC,IAAI,CAAC;YACnC,MAAM,MAAM,GAAG,QAAQ,GAAG,QAAQ,CAAC;YACnC,MAAM,WAAW,GAAG,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,KAAK,CAAC,CAAC,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,CAAC,KAAK,CAAC,GAAG,CAAC;YAClF,OAAO,CAAC,GAAG,CAAC,yBAAyB,WAAW,CAAC,GAAG,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,GAAG,CAAC
,MAAM,GAAG,GAAG,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,GAAG,CAAC,SAAS,CAAC,QAAQ,GAAG,GAAG,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC;YAEnJ,gCAAgC;YAChC,MAAM,aAAa,GAAG,IAAI,CAAC,QAAQ,CAAC,IAAI,CAAC;YACzC,MAAM,aAAa,GAAG,QAAQ,CAAC,MAAM,CAAC;YACtC,IAAI,aAAa,KAAK,aAAa,EAAE,CAAC;gBACpC,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,GAAG,CAAC,6BAA6B,aAAa,MAAM,aAAa,SAAS,CAAC,CAAC,CAAC;YACjG,CAAC;QACH,CAAC;IACH,CAAC;CACF"}
|
|
@@ -18,6 +18,31 @@ async function runWithConcurrency(tasks, limit) {
|
|
|
18
18
|
return results;
|
|
19
19
|
}
|
|
20
20
|
const MAX_CONCURRENCY = 10;
|
|
21
|
+
/**
|
|
22
|
+
* Average pass rates across multiple grading runs.
|
|
23
|
+
* Uses the last run's assertion_results for display, but averages the
|
|
24
|
+
* pass_rate across all runs so --runs N provides statistical significance.
|
|
25
|
+
*/
|
|
26
|
+
function averageGradings(gradings) {
|
|
27
|
+
const valid = gradings.filter((g) => g !== null);
|
|
28
|
+
if (valid.length === 0)
|
|
29
|
+
return undefined;
|
|
30
|
+
if (valid.length === 1)
|
|
31
|
+
return valid[0];
|
|
32
|
+
const avgPassRate = valid.reduce((sum, g) => sum + g.summary.pass_rate, 0) / valid.length;
|
|
33
|
+
const avgPassed = valid.reduce((sum, g) => sum + g.summary.passed, 0) / valid.length;
|
|
34
|
+
const avgFailed = valid.reduce((sum, g) => sum + g.summary.failed, 0) / valid.length;
|
|
35
|
+
const last = valid[valid.length - 1];
|
|
36
|
+
return {
|
|
37
|
+
assertion_results: last.assertion_results,
|
|
38
|
+
summary: {
|
|
39
|
+
passed: Math.round(avgPassed),
|
|
40
|
+
failed: Math.round(avgFailed),
|
|
41
|
+
total: last.summary.total,
|
|
42
|
+
pass_rate: avgPassRate,
|
|
43
|
+
},
|
|
44
|
+
};
|
|
45
|
+
}
|
|
21
46
|
function validateEvalsFile(evalsFile, evalsPath) {
|
|
22
47
|
if (!evalsFile.skill_name || typeof evalsFile.skill_name !== 'string') {
|
|
23
48
|
throw new SnapevalError(`Invalid evals.json at ${evalsPath}: missing or invalid "skill_name" field.`);
|
|
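To make the effect of the new `averageGradings` helper concrete, the sketch below restates its logic in condensed form and runs it on two grading results. The input objects are hypothetical and the shape of the `assertion_results` entries is an assumption; only the `summary` fields are taken from the hunk above. It suggests one consequence worth noting: because `passed` and `failed` are rounded independently, they may not sum to `total`.

```javascript
// Condensed restatement of the averageGradings logic shown in the hunk above.
function averageGradings(gradings) {
  const valid = gradings.filter((g) => g !== null);
  if (valid.length === 0) return undefined;
  if (valid.length === 1) return valid[0];
  const mean = (pick) => valid.reduce((sum, g) => sum + pick(g), 0) / valid.length;
  const last = valid[valid.length - 1];
  return {
    assertion_results: last.assertion_results, // display data comes from the last run only
    summary: {
      passed: Math.round(mean((g) => g.summary.passed)),
      failed: Math.round(mean((g) => g.summary.failed)),
      total: last.summary.total,
      pass_rate: mean((g) => g.summary.pass_rate),
    },
  };
}

// Hypothetical inputs: two completed runs plus one run whose grading is null.
const runA = {
  assertion_results: [{ text: 'hypothetical assertion', passed: true }],
  summary: { passed: 3, failed: 1, total: 4, pass_rate: 0.75 },
};
const runB = {
  assertion_results: [{ text: 'hypothetical assertion', passed: false }],
  summary: { passed: 2, failed: 2, total: 4, pass_rate: 0.5 },
};

const averaged = averageGradings([runA, null, runB]);
const noneValid = averageGradings([null, null]);

console.log(averaged.summary);
console.log(noneValid);
```

With these inputs the averaged `pass_rate` is 0.625, while the rounded counts come out as 3 passed and 2 failed against a `total` of 4, so readers of `grading.json` should probably treat `pass_rate` as the authoritative figure.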
@@ -63,6 +88,9 @@ export async function evalCommand(skillPath, harness, inference, options) {
|
|
|
63
88
|
}
|
|
64
89
|
evalsFile = { ...evalsFile, evals: filtered };
|
|
65
90
|
}
|
|
91
|
+
if (options.threshold !== undefined && (options.threshold < 0 || options.threshold > 1)) {
|
|
92
|
+
throw new SnapevalError(`Threshold must be between 0 and 1 (e.g., 0.8 for 80%). Got: ${options.threshold}`);
|
|
93
|
+
}
|
|
66
94
|
const ws = new WorkspaceManager(skillPath, options.workspace);
|
|
67
95
|
const iterationDir = ws.createIteration();
|
|
68
96
|
// Track which SKILL.md was used for this iteration
|
|
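The range check added in this hunk can be exercised in isolation. The sketch below is a minimal stand-in, assuming a hypothetical `checkThreshold` helper and a plain `Error` in place of the package's `SnapevalError`; the condition and message are copied from the hunk.

```javascript
// Hypothetical helper wrapping the condition added to evalCommand.
function checkThreshold(threshold) {
  if (threshold !== undefined && (threshold < 0 || threshold > 1)) {
    // The real code throws SnapevalError; a plain Error keeps this sketch self-contained.
    throw new Error(`Threshold must be between 0 and 1 (e.g., 0.8 for 80%). Got: ${threshold}`);
  }
}

// Returns true when the value is rejected by the check.
function rejects(value) {
  try {
    checkThreshold(value);
    return false;
  } catch {
    return true;
  }
}

console.log(rejects(0.8));       // accepted: a fraction
console.log(rejects(80));        // rejected: looks like a percentage
console.log(rejects(undefined)); // accepted: no threshold set
console.log(rejects(NaN));       // accepted: both comparisons are false for NaN
```

The check catches the common mistake of passing a percentage such as `80`. As written, the condition does not reject `NaN`, so a non-numeric value would need to be caught elsewhere, for example where the option is parsed.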
@@ -95,20 +123,31 @@ export async function evalCommand(skillPath, harness, inference, options) {
|
|
|
95
123
|
if (!lastRun) {
|
|
96
124
|
throw new SnapevalError(`No runs completed for eval ${evalCase.id}`);
|
|
97
125
|
}
|
|
98
|
-
//
|
|
99
|
-
|
|
100
|
-
const
|
|
126
|
+
// Average pass rates across all runs for statistical significance
|
|
127
|
+
const withSkillGrading = averageGradings(allGradings.map(g => g.withSkill));
|
|
128
|
+
const withoutSkillGrading = averageGradings(allGradings.map(g => g.withoutSkill));
|
|
129
|
+
// When runs > 1, overwrite grading.json with averaged results so
|
|
130
|
+
// artifacts match the benchmark (not just the last run's raw data)
|
|
131
|
+
if (runs > 1) {
|
|
132
|
+
if (withSkillGrading) {
|
|
133
|
+
fs.writeFileSync(path.join(evalDir, 'with_skill', 'grading.json'), JSON.stringify(withSkillGrading, null, 2));
|
|
134
|
+
}
|
|
135
|
+
if (withoutSkillGrading) {
|
|
136
|
+
fs.writeFileSync(path.join(evalDir, baselineVariant, 'grading.json'), JSON.stringify(withoutSkillGrading, null, 2));
|
|
137
|
+
}
|
|
138
|
+
}
|
|
101
139
|
return {
|
|
102
140
|
evalId: evalCase.id,
|
|
103
141
|
slug,
|
|
142
|
+
label: evalCase.label,
|
|
104
143
|
prompt: evalCase.prompt,
|
|
105
144
|
withSkill: {
|
|
106
145
|
output: lastRun.withSkill.output,
|
|
107
|
-
grading:
|
|
146
|
+
grading: withSkillGrading,
|
|
108
147
|
},
|
|
109
148
|
withoutSkill: {
|
|
110
149
|
output: lastRun.withoutSkill.output,
|
|
111
|
-
grading:
|
|
150
|
+
grading: withoutSkillGrading,
|
|
112
151
|
},
|
|
113
152
|
};
|
|
114
153
|
});
|
|
@@ -121,10 +160,11 @@ export async function evalCommand(skillPath, harness, inference, options) {
|
|
|
121
160
|
eval_count: evalRuns.length,
|
|
122
161
|
eval_ids: evalRuns.map((r) => r.evalId),
|
|
123
162
|
skill_name: evalsFile.skill_name,
|
|
163
|
+
runs_per_eval: runs,
|
|
124
164
|
timestamp: new Date().toISOString(),
|
|
125
165
|
},
|
|
126
166
|
};
|
|
127
|
-
fs.writeFileSync(path.join(iterationDir, 'benchmark.json'), JSON.stringify(benchmarkWithMeta,
|
|
167
|
+
fs.writeFileSync(path.join(iterationDir, 'benchmark.json'), JSON.stringify(benchmarkWithMeta, (_key, value) => typeof value === 'number' ? Math.round(value * 10000) / 10000 : value, 2));
|
|
128
168
|
// Check threshold if set (for CI gating)
|
|
129
169
|
if (options.threshold !== undefined) {
|
|
130
170
|
const passRate = benchmark.run_summary.with_skill.pass_rate.mean;
|
|
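The new `JSON.stringify` call for `benchmark.json` passes a replacer that rounds every numeric value to four decimal places. The sketch below applies the same replacer to a small object; the object and its field values are hypothetical and only illustrate the rounding.

```javascript
// Same replacer as in the hunk above: round numbers to 4 decimal places, pass everything else through.
const roundTo4 = (_key, value) =>
  typeof value === 'number' ? Math.round(value * 10000) / 10000 : value;

// Hypothetical benchmark-like object.
const sample = {
  pass_rate: { mean: 2 / 3 },
  metadata: { runs_per_eval: 3, timestamp: '2024-01-01T00:00:00.000Z' },
};

const serialized = JSON.stringify(sample, roundTo4, 2);
const parsed = JSON.parse(serialized);

console.log(serialized);
```

Here `mean` is written as `0.6667` instead of the full floating-point expansion of two thirds. Integers such as `runs_per_eval` and strings such as the timestamp come through unchanged, since rounding an integer to four decimals is a no-op and the replacer only touches values of type `number`.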
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"eval.js","sourceRoot":"","sources":["../../../src/commands/eval.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,MAAM,SAAS,CAAC;AAC9B,OAAO,KAAK,IAAI,MAAM,WAAW,CAAC;AASlC,OAAO,EAAE,gBAAgB,EAAE,MAAM,wBAAwB,CAAC;AAC1D,OAAO,EAAE,OAAO,EAAE,MAAM,qBAAqB,CAAC;AAC9C,OAAO,EAAE,eAAe,EAAE,MAAM,qBAAqB,CAAC;AACtD,OAAO,EAAE,gBAAgB,EAAE,MAAM,yBAAyB,CAAC;AAC3D,OAAO,EAAE,aAAa,EAAE,iBAAiB,EAAE,cAAc,EAAE,MAAM,cAAc,CAAC;AAEhF,KAAK,UAAU,kBAAkB,CAC/B,KAA2B,EAC3B,KAAa;IAEb,MAAM,OAAO,GAAQ,IAAI,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,CAAC;IAC7C,IAAI,KAAK,GAAG,CAAC,CAAC;IACd,KAAK,UAAU,MAAM;QACnB,OAAO,KAAK,GAAG,KAAK,CAAC,MAAM,EAAE,CAAC;YAC5B,MAAM,CAAC,GAAG,KAAK,EAAE,CAAC;YAClB,OAAO,CAAC,CAAC,CAAC,GAAG,MAAM,KAAK,CAAC,CAAC,CAAC,EAAE,CAAC;QAChC,CAAC;IACH,CAAC;IACD,MAAM,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,IAAI,CAAC,EAAE,MAAM,EAAE,IAAI,CAAC,GAAG,CAAC,KAAK,EAAE,KAAK,CAAC,MAAM,CAAC,EAAE,EAAE,MAAM,CAAC,CAAC,CAAC;IACjF,OAAO,OAAO,CAAC;AACjB,CAAC;AAED,MAAM,eAAe,GAAG,EAAE,CAAC;AAE3B,SAAS,iBAAiB,CAAC,SAAoB,EAAE,SAAiB;IAChE,IAAI,CAAC,SAAS,CAAC,UAAU,IAAI,OAAO,SAAS,CAAC,UAAU,KAAK,QAAQ,EAAE,CAAC;QACtE,MAAM,IAAI,aAAa,CAAC,yBAAyB,SAAS,0CAA0C,CAAC,CAAC;IACxG,CAAC;IACD,IAAI,CAAC,KAAK,CAAC,OAAO,CAAC,SAAS,CAAC,KAAK,CAAC,EAAE,CAAC;QACpC,MAAM,IAAI,aAAa,CAAC,yBAAyB,SAAS,6BAA6B,CAAC,CAAC;IAC3F,CAAC;IACD,KAAK,MAAM,CAAC,CAAC,EAAE,QAAQ,CAAC,IAAI,SAAS,CAAC,KAAK,CAAC,OAAO,EAAE,EAAE,CAAC;QACtD,MAAM,MAAM,GAAG,yBAAyB,SAAS,WAAW,CAAC,GAAG,CAAC;QACjE,IAAI,OAAO,QAAQ,CAAC,EAAE,KAAK,QAAQ,EAAE,CAAC;YACpC,MAAM,IAAI,aAAa,CAAC,GAAG,MAAM,8CAA8C,CAAC,CAAC;QACnF,CAAC;QACD,IAAI,OAAO,QAAQ,CAAC,MAAM,KAAK,QAAQ,EAAE,CAAC;YACxC,MAAM,IAAI,aAAa,CAAC,GAAG,MAAM,QAAQ,QAAQ,CAAC,EAAE,2BAA2B,CAAC,CAAC;QACnF,CAAC;QACD,IAAI,OAAO,QAAQ,CAAC,eAAe,KAAK,QAAQ,EAAE,CAAC;YACjD,MAAM,IAAI,aAAa,CAAC,GAAG,MAAM,QAAQ,QAAQ,CAAC,EAAE,oCAAoC,CAAC,CAAC;QAC5F,CAAC;QACD,IAAI,QAAQ,CAAC,UAAU,KAAK,SAAS,IAAI,CAAC,KAAK,CAAC,OAAO,CAAC,QAAQ,CAAC,UAAU,CAAC,EAAE,CAAC;YAC7E,MAAM,IAAI,aAAa,CAAC,GAAG,MAAM,QAAQ,QAAQ,CAAC,EAAE,6CAA6C,CAAC,CAAC;QACrG,CAAC;IACH,CAAC;AACH,CAAC;AAED,MAAM,C
AAC,KAAK,UAAU,WAAW,CAC/B,SAAiB,EACjB,OAAgB,EAChB,SAA2B,EAC3B,OAA4H;IAE5H,MAAM,SAAS,GAAG,IAAI,CAAC,IAAI,CAAC,SAAS,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;IAC9D,IAAI,CAAC,EAAE,CAAC,UAAU,CAAC,SAAS,CAAC,EAAE,CAAC;QAC9B,MAAM,IAAI,iBAAiB,CAAC,SAAS,EAAE,mDAAmD,CAAC,CAAC;IAC9F,CAAC;IAED,IAAI,SAAoB,CAAC;IACzB,IAAI,CAAC;QACH,SAAS,GAAG,IAAI,CAAC,KAAK,CAAC,EAAE,CAAC,YAAY,CAAC,SAAS,EAAE,OAAO,CAAC,CAAC,CAAC;IAC9D,CAAC;IAAC,MAAM,CAAC;QACP,MAAM,IAAI,aAAa,CAAC,mBAAmB,SAAS,mEAAmE,CAAC,CAAC;IAC3H,CAAC;IACD,iBAAiB,CAAC,SAAS,EAAE,SAAS,CAAC,CAAC;IAExC,oDAAoD;IACpD,IAAI,OAAO,CAAC,IAAI,IAAI,OAAO,CAAC,IAAI,CAAC,MAAM,GAAG,CAAC,EAAE,CAAC;QAC5C,MAAM,GAAG,GAAG,IAAI,GAAG,CAAC,OAAO,CAAC,IAAI,CAAC,CAAC;QAClC,MAAM,QAAQ,GAAG,SAAS,CAAC,KAAK,CAAC,MAAM,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QAC9D,IAAI,QAAQ,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;YAC1B,MAAM,IAAI,aAAa,CAAC,8BAA8B,OAAO,CAAC,IAAI,CAAC,IAAI,CAAC,GAAG,CAAC,oBAAoB,SAAS,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,IAAI,CAAC,IAAI,CAAC,EAAE,CAAC,CAAC;QACjJ,CAAC;QACD,SAAS,GAAG,EAAE,GAAG,SAAS,EAAE,KAAK,EAAE,QAAQ,EAAE,CAAC;IAChD,CAAC;IAED,MAAM,EAAE,GAAG,IAAI,gBAAgB,CAAC,SAAS,EAAE,OAAO,CAAC,SAAS,CAAC,CAAC;IAC9D,MAAM,YAAY,GAAG,EAAE,CAAC,eAAe,EAAE,CAAC;IAE1C,mDAAmD;IACnD,MAAM,WAAW,GAAG,IAAI,CAAC,IAAI,CAAC,SAAS,EAAE,UAAU,CAAC,CAAC;IACrD,IAAI,EAAE,CAAC,UAAU,CAAC,WAAW,CAAC,EAAE,CAAC;QAC/B,EAAE,CAAC,YAAY,CAAC,WAAW,EAAE,IAAI,CAAC,IAAI,CAAC,YAAY,EAAE,mBAAmB,CAAC,CAAC,CAAC;IAC7E,CAAC;IACD,MAAM,IAAI,GAAG,OAAO,CAAC,IAAI,IAAI,CAAC,CAAC;IAC/B,MAAM,WAAW,GAAG,IAAI,CAAC,GAAG,CAAC,IAAI,CAAC,GAAG,CAAC,OAAO,CAAC,WAAW,IAAI,CAAC,EAAE,CAAC,CAAC,EAAE,eAAe,CAAC,CAAC;IACrF,MAAM,eAAe,GAAG,OAAO,CAAC,QAAQ,CAAC,CAAC,CAAC,WAAW,CAAC,CAAC,CAAC,eAAe,CAAC;IACzE,MAAM,UAAU,GAAG,IAAI,CAAC,IAAI,CAAC,SAAS,EAAE,OAAO,EAAE,SAAS,CAAC,CAAC;IAE5D,8DAA8D;IAC9D,MAAM,QAAQ,GAAG,SAAS,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,QAAQ,EAAE,EAAE;QAChD,MAAM,IAAI,GAAG,gBAAgB,CAAC,WAAW,CAAC,QAAQ,CAAC,CAAC,OAAO,CAAC,OAAO,EAAE,EAAE,CAAC,CAAC;QACzE,OAAO,EAAE,QAAQ,EAAE,IAA
I,EAAE,OAAO,EAAE,EAAE,CAAC,aAAa,CAAC,YAAY,EAAE,IAAI,EAAE,eAAe,CAAC,EAAE,CAAC;IAC5F,CAAC,CAAC,CAAC;IAEH,MAAM,KAAK,GAAG,QAAQ,CAAC,GAAG,CAAC,CAAC,EAAE,QAAQ,EAAE,IAAI,EAAE,OAAO,EAAE,EAAE,EAAE,CAAC,KAAK,IAA4B,EAAE;QAC7F,MAAM,UAAU,GAAG,QAAQ,CAAC,UAAU,IAAI,EAAE,CAAC;QAC7C,MAAM,WAAW,GAA8E,EAAE,CAAC;QAClG,IAAI,OAAO,GAA+C,IAAI,CAAC;QAE/D,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,EAAE,CAAC,EAAE,EAAE,CAAC;YAC9B,OAAO,GAAG,MAAM,OAAO,CAAC,QAAQ,EAAE,SAAS,EAAE,OAAO,EAAE,OAAO,EAAE,OAAO,CAAC,QAAQ,CAAC,CAAC;YAEjF,qCAAqC;YACrC,MAAM,CAAC,SAAS,EAAE,UAAU,CAAC,GAAG,MAAM,OAAO,CAAC,GAAG,CAAC;gBAChD,eAAe,CACb,UAAU,EACV,OAAO,CAAC,SAAS,CAAC,MAAM,EACxB,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,YAAY,CAAC,EAChC,SAAS,EACT,EAAE,CAAC,UAAU,CAAC,UAAU,CAAC,CAAC,CAAC,CAAC,UAAU,CAAC,CAAC,CAAC,SAAS,CACnD;gBACD,eAAe,CACb,UAAU,EACV,OAAO,CAAC,YAAY,CAAC,MAAM,EAC3B,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,eAAe,CAAC,EACnC,SAAS,EACT,EAAE,CAAC,UAAU,CAAC,UAAU,CAAC,CAAC,CAAC,CAAC,UAAU,CAAC,CAAC,CAAC,SAAS,CACnD;aACF,CAAC,CAAC;YACH,WAAW,CAAC,IAAI,CAAC,EAAE,SAAS,EAAE,SAAS,EAAE,YAAY,EAAE,UAAU,EAAE,CAAC,CAAC;QACvE,CAAC;QAED,IAAI,CAAC,OAAO,EAAE,CAAC;YACb,MAAM,IAAI,aAAa,CAAC,8BAA8B,QAAQ,CAAC,EAAE,EAAE,CAAC,CAAC;QACvE,CAAC;QAED,
|
|
1
|
+
{"version":3,"file":"eval.js","sourceRoot":"","sources":["../../../src/commands/eval.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,MAAM,SAAS,CAAC;AAC9B,OAAO,KAAK,IAAI,MAAM,WAAW,CAAC;AASlC,OAAO,EAAE,gBAAgB,EAAE,MAAM,wBAAwB,CAAC;AAC1D,OAAO,EAAE,OAAO,EAAE,MAAM,qBAAqB,CAAC;AAC9C,OAAO,EAAE,eAAe,EAAE,MAAM,qBAAqB,CAAC;AACtD,OAAO,EAAE,gBAAgB,EAAE,MAAM,yBAAyB,CAAC;AAC3D,OAAO,EAAE,aAAa,EAAE,iBAAiB,EAAE,cAAc,EAAE,MAAM,cAAc,CAAC;AAEhF,KAAK,UAAU,kBAAkB,CAC/B,KAA2B,EAC3B,KAAa;IAEb,MAAM,OAAO,GAAQ,IAAI,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,CAAC;IAC7C,IAAI,KAAK,GAAG,CAAC,CAAC;IACd,KAAK,UAAU,MAAM;QACnB,OAAO,KAAK,GAAG,KAAK,CAAC,MAAM,EAAE,CAAC;YAC5B,MAAM,CAAC,GAAG,KAAK,EAAE,CAAC;YAClB,OAAO,CAAC,CAAC,CAAC,GAAG,MAAM,KAAK,CAAC,CAAC,CAAC,EAAE,CAAC;QAChC,CAAC;IACH,CAAC;IACD,MAAM,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,IAAI,CAAC,EAAE,MAAM,EAAE,IAAI,CAAC,GAAG,CAAC,KAAK,EAAE,KAAK,CAAC,MAAM,CAAC,EAAE,EAAE,MAAM,CAAC,CAAC,CAAC;IACjF,OAAO,OAAO,CAAC;AACjB,CAAC;AAED,MAAM,eAAe,GAAG,EAAE,CAAC;AAE3B;;;;GAIG;AACH,SAAS,eAAe,CAAC,QAAkC;IACzD,MAAM,KAAK,GAAG,QAAQ,CAAC,MAAM,CAAC,CAAC,CAAC,EAAsB,EAAE,CAAC,CAAC,KAAK,IAAI,CAAC,CAAC;IACrE,IAAI,KAAK,CAAC,MAAM,KAAK,CAAC;QAAE,OAAO,SAAS,CAAC;IACzC,IAAI,KAAK,CAAC,MAAM,KAAK,CAAC;QAAE,OAAO,KAAK,CAAC,CAAC,CAAC,CAAC;IAExC,MAAM,WAAW,GAAG,KAAK,CAAC,MAAM,CAAC,CAAC,GAAG,EAAE,CAAC,EAAE,EAAE,CAAC,GAAG,GAAG,CAAC,CAAC,OAAO,CAAC,SAAS,EAAE,CAAC,CAAC,GAAG,KAAK,CAAC,MAAM,CAAC;IAC1F,MAAM,SAAS,GAAG,KAAK,CAAC,MAAM,CAAC,CAAC,GAAG,EAAE,CAAC,EAAE,EAAE,CAAC,GAAG,GAAG,CAAC,CAAC,OAAO,CAAC,MAAM,EAAE,CAAC,CAAC,GAAG,KAAK,CAAC,MAAM,CAAC;IACrF,MAAM,SAAS,GAAG,KAAK,CAAC,MAAM,CAAC,CAAC,GAAG,EAAE,CAAC,EAAE,EAAE,CAAC,GAAG,GAAG,CAAC,CAAC,OAAO,CAAC,MAAM,EAAE,CAAC,CAAC,GAAG,KAAK,CAAC,MAAM,CAAC;IACrF,MAAM,IAAI,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC;IAErC,OAAO;QACL,iBAAiB,EAAE,IAAI,CAAC,iBAAiB;QACzC,OAAO,EAAE;YACP,MAAM,EAAE,IAAI,CAAC,KAAK,CAAC,SAAS,CAAC;YAC7B,MAAM,EAAE,IAAI,CAAC,KAAK,CAAC,SAAS,CAAC;YAC7B,KAAK,EAAE,IAAI,CAAC,OAAO,CAAC,KAAK;YACzB,SAAS,EAAE,WAAW;SACvB;KACF,CAAC;AACJ,CAAC;AAED,SAAS,iBAAiB,CAAC,SAAo
B,EAAE,SAAiB;IAChE,IAAI,CAAC,SAAS,CAAC,UAAU,IAAI,OAAO,SAAS,CAAC,UAAU,KAAK,QAAQ,EAAE,CAAC;QACtE,MAAM,IAAI,aAAa,CAAC,yBAAyB,SAAS,0CAA0C,CAAC,CAAC;IACxG,CAAC;IACD,IAAI,CAAC,KAAK,CAAC,OAAO,CAAC,SAAS,CAAC,KAAK,CAAC,EAAE,CAAC;QACpC,MAAM,IAAI,aAAa,CAAC,yBAAyB,SAAS,6BAA6B,CAAC,CAAC;IAC3F,CAAC;IACD,KAAK,MAAM,CAAC,CAAC,EAAE,QAAQ,CAAC,IAAI,SAAS,CAAC,KAAK,CAAC,OAAO,EAAE,EAAE,CAAC;QACtD,MAAM,MAAM,GAAG,yBAAyB,SAAS,WAAW,CAAC,GAAG,CAAC;QACjE,IAAI,OAAO,QAAQ,CAAC,EAAE,KAAK,QAAQ,EAAE,CAAC;YACpC,MAAM,IAAI,aAAa,CAAC,GAAG,MAAM,8CAA8C,CAAC,CAAC;QACnF,CAAC;QACD,IAAI,OAAO,QAAQ,CAAC,MAAM,KAAK,QAAQ,EAAE,CAAC;YACxC,MAAM,IAAI,aAAa,CAAC,GAAG,MAAM,QAAQ,QAAQ,CAAC,EAAE,2BAA2B,CAAC,CAAC;QACnF,CAAC;QACD,IAAI,OAAO,QAAQ,CAAC,eAAe,KAAK,QAAQ,EAAE,CAAC;YACjD,MAAM,IAAI,aAAa,CAAC,GAAG,MAAM,QAAQ,QAAQ,CAAC,EAAE,oCAAoC,CAAC,CAAC;QAC5F,CAAC;QACD,IAAI,QAAQ,CAAC,UAAU,KAAK,SAAS,IAAI,CAAC,KAAK,CAAC,OAAO,CAAC,QAAQ,CAAC,UAAU,CAAC,EAAE,CAAC;YAC7E,MAAM,IAAI,aAAa,CAAC,GAAG,MAAM,QAAQ,QAAQ,CAAC,EAAE,6CAA6C,CAAC,CAAC;QACrG,CAAC;IACH,CAAC;AACH,CAAC;AAED,MAAM,CAAC,KAAK,UAAU,WAAW,CAC/B,SAAiB,EACjB,OAAgB,EAChB,SAA2B,EAC3B,OAA4H;IAE5H,MAAM,SAAS,GAAG,IAAI,CAAC,IAAI,CAAC,SAAS,EAAE,OAAO,EAAE,YAAY,CAAC,CAAC;IAC9D,IAAI,CAAC,EAAE,CAAC,UAAU,CAAC,SAAS,CAAC,EAAE,CAAC;QAC9B,MAAM,IAAI,iBAAiB,CAAC,SAAS,EAAE,mDAAmD,CAAC,CAAC;IAC9F,CAAC;IAED,IAAI,SAAoB,CAAC;IACzB,IAAI,CAAC;QACH,SAAS,GAAG,IAAI,CAAC,KAAK,CAAC,EAAE,CAAC,YAAY,CAAC,SAAS,EAAE,OAAO,CAAC,CAAC,CAAC;IAC9D,CAAC;IAAC,MAAM,CAAC;QACP,MAAM,IAAI,aAAa,CAAC,mBAAmB,SAAS,mEAAmE,CAAC,CAAC;IAC3H,CAAC;IACD,iBAAiB,CAAC,SAAS,EAAE,SAAS,CAAC,CAAC;IAExC,oDAAoD;IACpD,IAAI,OAAO,CAAC,IAAI,IAAI,OAAO,CAAC,IAAI,CAAC,MAAM,GAAG,CAAC,EAAE,CAAC;QAC5C,MAAM,GAAG,GAAG,IAAI,GAAG,CAAC,OAAO,CAAC,IAAI,CAAC,CAAC;QAClC,MAAM,QAAQ,GAAG,SAAS,CAAC,KAAK,CAAC,MAAM,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QAC9D,IAAI,QAAQ,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;YAC1B,MAAM,IAAI,aAAa,CAAC,8BAA8B,OAAO,CAAC,IAAI,CAAC,IAAI,CAAC,GAAG,CAAC,oBAAoB,SAAS,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,C
AAC,EAAE,CAAC,CAAC,IAAI,CAAC,IAAI,CAAC,EAAE,CAAC,CAAC;QACjJ,CAAC;QACD,SAAS,GAAG,EAAE,GAAG,SAAS,EAAE,KAAK,EAAE,QAAQ,EAAE,CAAC;IAChD,CAAC;IAED,IAAI,OAAO,CAAC,SAAS,KAAK,SAAS,IAAI,CAAC,OAAO,CAAC,SAAS,GAAG,CAAC,IAAI,OAAO,CAAC,SAAS,GAAG,CAAC,CAAC,EAAE,CAAC;QACxF,MAAM,IAAI,aAAa,CAAC,+DAA+D,OAAO,CAAC,SAAS,EAAE,CAAC,CAAC;IAC9G,CAAC;IAED,MAAM,EAAE,GAAG,IAAI,gBAAgB,CAAC,SAAS,EAAE,OAAO,CAAC,SAAS,CAAC,CAAC;IAC9D,MAAM,YAAY,GAAG,EAAE,CAAC,eAAe,EAAE,CAAC;IAE1C,mDAAmD;IACnD,MAAM,WAAW,GAAG,IAAI,CAAC,IAAI,CAAC,SAAS,EAAE,UAAU,CAAC,CAAC;IACrD,IAAI,EAAE,CAAC,UAAU,CAAC,WAAW,CAAC,EAAE,CAAC;QAC/B,EAAE,CAAC,YAAY,CAAC,WAAW,EAAE,IAAI,CAAC,IAAI,CAAC,YAAY,EAAE,mBAAmB,CAAC,CAAC,CAAC;IAC7E,CAAC;IACD,MAAM,IAAI,GAAG,OAAO,CAAC,IAAI,IAAI,CAAC,CAAC;IAC/B,MAAM,WAAW,GAAG,IAAI,CAAC,GAAG,CAAC,IAAI,CAAC,GAAG,CAAC,OAAO,CAAC,WAAW,IAAI,CAAC,EAAE,CAAC,CAAC,EAAE,eAAe,CAAC,CAAC;IACrF,MAAM,eAAe,GAAG,OAAO,CAAC,QAAQ,CAAC,CAAC,CAAC,WAAW,CAAC,CAAC,CAAC,eAAe,CAAC;IACzE,MAAM,UAAU,GAAG,IAAI,CAAC,IAAI,CAAC,SAAS,EAAE,OAAO,EAAE,SAAS,CAAC,CAAC;IAE5D,8DAA8D;IAC9D,MAAM,QAAQ,GAAG,SAAS,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,QAAQ,EAAE,EAAE;QAChD,MAAM,IAAI,GAAG,gBAAgB,CAAC,WAAW,CAAC,QAAQ,CAAC,CAAC,OAAO,CAAC,OAAO,EAAE,EAAE,CAAC,CAAC;QACzE,OAAO,EAAE,QAAQ,EAAE,IAAI,EAAE,OAAO,EAAE,EAAE,CAAC,aAAa,CAAC,YAAY,EAAE,IAAI,EAAE,eAAe,CAAC,EAAE,CAAC;IAC5F,CAAC,CAAC,CAAC;IAEH,MAAM,KAAK,GAAG,QAAQ,CAAC,GAAG,CAAC,CAAC,EAAE,QAAQ,EAAE,IAAI,EAAE,OAAO,EAAE,EAAE,EAAE,CAAC,KAAK,IAA4B,EAAE;QAC7F,MAAM,UAAU,GAAG,QAAQ,CAAC,UAAU,IAAI,EAAE,CAAC;QAC7C,MAAM,WAAW,GAA8E,EAAE,CAAC;QAClG,IAAI,OAAO,GAA+C,IAAI,CAAC;QAE/D,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,EAAE,CAAC,EAAE,EAAE,CAAC;YAC9B,OAAO,GAAG,MAAM,OAAO,CAAC,QAAQ,EAAE,SAAS,EAAE,OAAO,EAAE,OAAO,EAAE,OAAO,CAAC,QAAQ,CAAC,CAAC;YAEjF,qCAAqC;YACrC,MAAM,CAAC,SAAS,EAAE,UAAU,CAAC,GAAG,MAAM,OAAO,CAAC,GAAG,CAAC;gBAChD,eAAe,CACb,UAAU,EACV,OAAO,CAAC,SAAS,CAAC,MAAM,EACxB,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,YAAY,CAAC,EAChC,SAAS,EACT,EAAE,CAAC,UAAU,CAAC,UAAU,CAAC,CAAC,CAAC,CAAC,UAAU,CAAC,CAAC,CAAC,SAAS,CACnD;gBACD,eAAe,CACb,UAAU,EACV,OAA
O,CAAC,YAAY,CAAC,MAAM,EAC3B,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,eAAe,CAAC,EACnC,SAAS,EACT,EAAE,CAAC,UAAU,CAAC,UAAU,CAAC,CAAC,CAAC,CAAC,UAAU,CAAC,CAAC,CAAC,SAAS,CACnD;aACF,CAAC,CAAC;YACH,WAAW,CAAC,IAAI,CAAC,EAAE,SAAS,EAAE,SAAS,EAAE,YAAY,EAAE,UAAU,EAAE,CAAC,CAAC;QACvE,CAAC;QAED,IAAI,CAAC,OAAO,EAAE,CAAC;YACb,MAAM,IAAI,aAAa,CAAC,8BAA8B,QAAQ,CAAC,EAAE,EAAE,CAAC,CAAC;QACvE,CAAC;QAED,kEAAkE;QAClE,MAAM,gBAAgB,GAAG,eAAe,CAAC,WAAW,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,SAAS,CAAC,CAAC,CAAC;QAC5E,MAAM,mBAAmB,GAAG,eAAe,CAAC,WAAW,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,YAAY,CAAC,CAAC,CAAC;QAElF,iEAAiE;QACjE,mEAAmE;QACnE,IAAI,IAAI,GAAG,CAAC,EAAE,CAAC;YACb,IAAI,gBAAgB,EAAE,CAAC;gBACrB,EAAE,CAAC,aAAa,CACd,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,YAAY,EAAE,cAAc,CAAC,EAChD,IAAI,CAAC,SAAS,CAAC,gBAAgB,EAAE,IAAI,EAAE,CAAC,CAAC,CAC1C,CAAC;YACJ,CAAC;YACD,IAAI,mBAAmB,EAAE,CAAC;gBACxB,EAAE,CAAC,aAAa,CACd,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,eAAe,EAAE,cAAc,CAAC,EACnD,IAAI,CAAC,SAAS,CAAC,mBAAmB,EAAE,IAAI,EAAE,CAAC,CAAC,CAC7C,CAAC;YACJ,CAAC;QACH,CAAC;QAED,OAAO;YACL,MAAM,EAAE,QAAQ,CAAC,EAAE;YACnB,IAAI;YACJ,KAAK,EAAE,QAAQ,CAAC,KAAK;YACrB,MAAM,EAAE,QAAQ,CAAC,MAAM;YACvB,SAAS,EAAE;gBACT,MAAM,EAAE,OAAO,CAAC,SAAS,CAAC,MAAM;gBAChC,OAAO,EAAE,gBAAgB;aAC1B;YACD,YAAY,EAAE;gBACZ,MAAM,EAAE,OAAO,CAAC,YAAY,CAAC,MAAM;gBACnC,OAAO,EAAE,mBAAmB;aAC7B;SACF,CAAC;IACJ,CAAC,CAAC,CAAC;IAEH,MAAM,QAAQ,GAAG,MAAM,kBAAkB,CAAC,KAAK,EAAE,WAAW,CAAC,CAAC;IAC9D,MAAM,SAAS,GAAG,gBAAgB,CAAC,QAAQ,CAAC,CAAC;IAE7C,wDAAwD;IACxD,MAAM,iBAAiB,GAAG;QACxB,GAAG,SAAS;QACZ,QAAQ,EAAE;YACR,UAAU,EAAE,QAAQ,CAAC,MAAM;YAC3B,QAAQ,EAAE,QAAQ,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,MAAM,CAAC;YACvC,UAAU,EAAE,SAAS,CAAC,UAAU;YAChC,aAAa,EAAE,IAAI;YACnB,SAAS,EAAE,IAAI,IAAI,EAAE,CAAC,WAAW,EAAE;SACpC;KACF,CAAC;IAEF,EAAE,CAAC,aAAa,CACd,IAAI,CAAC,IAAI,CAAC,YAAY,EAAE,gBAAgB,CAAC,EACzC,IAAI,CAAC,SAAS,CAAC,iBAAiB,EAAE,CAAC,IAAI,EAAE,KAAK,EAAE,EAAE,CAChD,OAAO,KAAK,KAAK,QAAQ,CAAC,CAAC,CAAC,IAAI,CAAC,KAAK,CAAC,KAAK,GAAG,KAAK,CAAC,GAAG,KAAK,CAAC,CAAC,CAAC,KAAK,EAAE,CAAC,CA
AC,CAC5E,CAAC;IAEF,yCAAyC;IACzC,IAAI,OAAO,CAAC,SAAS,KAAK,SAAS,EAAE,CAAC;QACpC,MAAM,QAAQ,GAAG,SAAS,CAAC,WAAW,CAAC,UAAU,CAAC,SAAS,CAAC,IAAI,CAAC;QACjE,IAAI,QAAQ,GAAG,OAAO,CAAC,SAAS,EAAE,CAAC;YACjC,yEAAyE;YACzE,MAAM,OAAO,GAAG,EAAE,SAAS,EAAE,SAAS,CAAC,UAAU,EAAE,QAAQ,EAAE,SAAS,EAAE,YAAY,EAAE,CAAC;YACvF,MAAM,MAAM,CAAC,MAAM,CAAC,IAAI,cAAc,CAAC,QAAQ,EAAE,OAAO,CAAC,SAAS,CAAC,EAAE,EAAE,OAAO,EAAE,CAAC,CAAC;QACpF,CAAC;IACH,CAAC;IAED,OAAO;QACL,SAAS,EAAE,SAAS,CAAC,UAAU;QAC/B,QAAQ;QACR,SAAS;QACT,YAAY;KACb,CAAC;AACJ,CAAC"}
|