agent-bober 0.5.3 → 0.6.1
- package/README.md +13 -7
- package/agents/bober-evaluator.md +10 -0
- package/dist/contracts/eval-result.d.ts +339 -0
- package/dist/contracts/eval-result.d.ts.map +1 -1
- package/dist/contracts/eval-result.js +36 -0
- package/dist/contracts/eval-result.js.map +1 -1
- package/dist/evaluators/builtin/playwright.d.ts.map +1 -1
- package/dist/evaluators/builtin/playwright.js +50 -15
- package/dist/evaluators/builtin/playwright.js.map +1 -1
- package/dist/index.d.ts +5 -1
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +4 -0
- package/dist/index.js.map +1 -1
- package/dist/orchestrator/agent-loader.d.ts +26 -0
- package/dist/orchestrator/agent-loader.d.ts.map +1 -0
- package/dist/orchestrator/agent-loader.js +125 -0
- package/dist/orchestrator/agent-loader.js.map +1 -0
- package/dist/orchestrator/agentic-loop.d.ts +53 -0
- package/dist/orchestrator/agentic-loop.d.ts.map +1 -0
- package/dist/orchestrator/agentic-loop.js +145 -0
- package/dist/orchestrator/agentic-loop.js.map +1 -0
- package/dist/orchestrator/evaluator-agent.d.ts +4 -1
- package/dist/orchestrator/evaluator-agent.d.ts.map +1 -1
- package/dist/orchestrator/evaluator-agent.js +107 -84
- package/dist/orchestrator/evaluator-agent.js.map +1 -1
- package/dist/orchestrator/generator-agent.d.ts +14 -2
- package/dist/orchestrator/generator-agent.d.ts.map +1 -1
- package/dist/orchestrator/generator-agent.js +96 -73
- package/dist/orchestrator/generator-agent.js.map +1 -1
- package/dist/orchestrator/model-resolver.d.ts +9 -0
- package/dist/orchestrator/model-resolver.d.ts.map +1 -0
- package/dist/orchestrator/model-resolver.js +21 -0
- package/dist/orchestrator/model-resolver.js.map +1 -0
- package/dist/orchestrator/pipeline.d.ts.map +1 -1
- package/dist/orchestrator/pipeline.js +21 -4
- package/dist/orchestrator/pipeline.js.map +1 -1
- package/dist/orchestrator/planner-agent.d.ts +3 -2
- package/dist/orchestrator/planner-agent.d.ts.map +1 -1
- package/dist/orchestrator/planner-agent.js +39 -75
- package/dist/orchestrator/planner-agent.js.map +1 -1
- package/dist/orchestrator/tools/handlers.d.ts +9 -0
- package/dist/orchestrator/tools/handlers.d.ts.map +1 -0
- package/dist/orchestrator/tools/handlers.js +279 -0
- package/dist/orchestrator/tools/handlers.js.map +1 -0
- package/dist/orchestrator/tools/index.d.ts +21 -0
- package/dist/orchestrator/tools/index.d.ts.map +1 -0
- package/dist/orchestrator/tools/index.js +33 -0
- package/dist/orchestrator/tools/index.js.map +1 -0
- package/dist/orchestrator/tools/schemas.d.ts +16 -0
- package/dist/orchestrator/tools/schemas.d.ts.map +1 -0
- package/dist/orchestrator/tools/schemas.js +138 -0
- package/dist/orchestrator/tools/schemas.js.map +1 -0
- package/package.json +1 -1
package/README.md
CHANGED
@@ -451,15 +451,20 @@ To debug failing E2E tests:
 
 This architecture implements the patterns described in Anthropic's [**"Harness design for long-running application development"**](https://www.anthropic.com/engineering/harness-design-long-running-apps) by Prithvi Rajasekaran. The key insight from that research: separating code generation from code evaluation creates a feedback loop that catches errors early and dramatically improves output quality. In their tests, a solo agent produced broken output in 20 minutes, while the full harness produced a polished, working application — demonstrating that multi-agent orchestration with honest evaluation is worth the investment.
 
-
-
-
+### Agentic Tool-Use Architecture
+
+Each agent runs as a **multi-turn agentic loop** with tool access via the Anthropic SDK. System prompts are loaded from the detailed agent definitions in `agents/bober-*.md` (300-600 lines of role-specific instructions, anti-leniency protocols, and evaluation criteria).
+
+- **Planner** (Claude Opus): Explores the codebase via read-only tools (`read_file`, `glob`, `grep`), then produces sprint-decomposed plans. Thinks about scope, dependencies, and risk.
+- **Generator** (Claude Sonnet): Full tool access (`bash`, `read_file`, `write_file`, `edit_file`, `glob`, `grep`). Reads existing code, writes implementation, runs tests, and commits — all autonomously within the sprint contract boundaries.
+- **Evaluator** (Claude Sonnet): Read-only + bash tools (`bash`, `read_file`, `glob`, `grep` — deliberately NO write/edit). Independently verifies by running the dev server, taking Playwright screenshots, executing tests, and inspecting code. Cannot fix bugs — only report them with precise feedback.
 
 The separation ensures that:
-1. The Generator cannot "mark its own homework"
+1. The Generator cannot "mark its own homework" — an independent evaluation step with its own tool access catches issues through actual runtime verification, not just reading the generator's self-report.
 2. Sprint contracts provide clear scope boundaries, preventing feature creep.
-3. Automated checks
+3. Automated checks (programmatic evaluators) + agent-based qualitative evaluation run after every sprint.
 4. Context resets between sprints keep the Generator focused and prevent context degradation.
+5. The Evaluator's anti-leniency protocol ensures passing on the first iteration is rare for non-trivial work.
 
 ### State Management
 
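The role split described in the README hunk above can be pictured as a small lookup table. The following is a minimal sketch, not the package's implementation: `Role`, `ToolName`, `TOOLS_BY_ROLE`, and `canUse` are hypothetical names chosen for illustration, and only the tool names and their assignment to roles come from the README text.

```typescript
// Hypothetical sketch: these names are assumptions, not the package's API.
type Role = "planner" | "generator" | "evaluator";
type ToolName =
  | "bash"
  | "read_file"
  | "write_file"
  | "edit_file"
  | "glob"
  | "grep";

// Tool sets as listed in the README text for each agent role.
const TOOLS_BY_ROLE: Record<Role, readonly ToolName[]> = {
  planner: ["read_file", "glob", "grep"],
  generator: ["bash", "read_file", "write_file", "edit_file", "glob", "grep"],
  evaluator: ["bash", "read_file", "glob", "grep"],
};

// Returns true when the given role is allowed to call the given tool.
function canUse(role: Role, tool: ToolName): boolean {
  return TOOLS_BY_ROLE[role].includes(tool);
}

console.log(canUse("evaluator", "write_file"));
console.log(canUse("generator", "write_file"));
```

Keeping the evaluator's set free of `write_file` and `edit_file` is what makes "cannot fix bugs, only report them" a property of the tool table rather than of the prompt alone.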
@@ -522,10 +527,11 @@ agent-bober/
 config/ Config schema, loader, defaults
 contracts/ Sprint contract and eval result types
 evaluators/ Built-in evaluator plugins
-orchestrator/
+orchestrator/ Agent runners, agentic loop, tool infrastructure
+tools/ Tool schemas, sandboxed handlers, role-based sets
 state/ State management for .bober/ directory
 utils/ Shared utilities
-agents/ Agent system prompts (.md files)
+agents/ Agent system prompts (.md files, loaded at runtime)
 skills/ Claude Code slash command definitions
 templates/ Project templates and scaffolds
 hooks/ Claude Code hooks
package/agents/bober-evaluator.md
CHANGED
@@ -6,6 +6,16 @@ tools:
 - Bash
 - Grep
 - Glob
+- mcp__plugin_playwright_playwright__browser_navigate
+- mcp__plugin_playwright_playwright__browser_snapshot
+- mcp__plugin_playwright_playwright__browser_take_screenshot
+- mcp__plugin_playwright_playwright__browser_click
+- mcp__plugin_playwright_playwright__browser_fill_form
+- mcp__plugin_playwright_playwright__browser_evaluate
+- mcp__plugin_playwright_playwright__browser_console_messages
+- mcp__plugin_playwright_playwright__browser_network_requests
+- mcp__plugin_playwright_playwright__browser_tabs
+- mcp__plugin_playwright_playwright__browser_close
 model: sonnet
 ---
 
package/dist/contracts/eval-result.d.ts
CHANGED
@@ -24,6 +24,69 @@ export declare const EvalDetailSchema: z.ZodObject<{
     line?: number | undefined;
 }>;
 export type EvalDetail = z.infer<typeof EvalDetailSchema>;
+export declare const CriterionResultSchema: z.ZodObject<{
+    criterionId: z.ZodString;
+    description: z.ZodString;
+    required: z.ZodBoolean;
+    result: z.ZodEnum<["pass", "fail", "skipped"]>;
+    evidence: z.ZodOptional<z.ZodString>;
+    feedback: z.ZodOptional<z.ZodString>;
+}, "strip", z.ZodTypeAny, {
+    description: string;
+    required: boolean;
+    criterionId: string;
+    result: "pass" | "fail" | "skipped";
+    feedback?: string | undefined;
+    evidence?: string | undefined;
+}, {
+    description: string;
+    required: boolean;
+    criterionId: string;
+    result: "pass" | "fail" | "skipped";
+    feedback?: string | undefined;
+    evidence?: string | undefined;
+}>;
+export type CriterionResult = z.infer<typeof CriterionResultSchema>;
+export declare const RegressionSchema: z.ZodObject<{
+    description: z.ZodString;
+    evidence: z.ZodString;
+    severity: z.ZodEnum<["critical", "major", "minor"]>;
+}, "strip", z.ZodTypeAny, {
+    description: string;
+    severity: "critical" | "major" | "minor";
+    evidence: string;
+}, {
+    description: string;
+    severity: "critical" | "major" | "minor";
+    evidence: string;
+}>;
+export type Regression = z.infer<typeof RegressionSchema>;
+export declare const GeneratorFeedbackItemSchema: z.ZodObject<{
+    priority: z.ZodEnum<["critical", "high", "medium", "low"]>;
+    category: z.ZodEnum<["bug", "missing-feature", "regression", "quality", "performance"]>;
+    file: z.ZodOptional<z.ZodString>;
+    line: z.ZodOptional<z.ZodNumber>;
+    description: z.ZodString;
+    expected: z.ZodOptional<z.ZodString>;
+    reproduction: z.ZodOptional<z.ZodString>;
+}, "strip", z.ZodTypeAny, {
+    description: string;
+    priority: "medium" | "critical" | "high" | "low";
+    category: "bug" | "missing-feature" | "regression" | "quality" | "performance";
+    expected?: string | undefined;
+    file?: string | undefined;
+    line?: number | undefined;
+    reproduction?: string | undefined;
+}, {
+    description: string;
+    priority: "medium" | "critical" | "high" | "low";
+    category: "bug" | "missing-feature" | "regression" | "quality" | "performance";
+    expected?: string | undefined;
+    file?: string | undefined;
+    line?: number | undefined;
+    reproduction?: string | undefined;
+}>;
+export type GeneratorFeedbackItem = z.infer<typeof GeneratorFeedbackItemSchema>;
 export declare const EvalResultSchema: z.ZodObject<{
     evaluator: z.ZodString;
     passed: z.ZodBoolean;
@@ -53,6 +116,69 @@ export declare const EvalResultSchema: z.ZodObject<{
     summary: z.ZodString;
     feedback: z.ZodString;
     timestamp: z.ZodString;
+    iteration: z.ZodOptional<z.ZodNumber>;
+    contractId: z.ZodOptional<z.ZodString>;
+    criteriaResults: z.ZodOptional<z.ZodArray<z.ZodObject<{
+        criterionId: z.ZodString;
+        description: z.ZodString;
+        required: z.ZodBoolean;
+        result: z.ZodEnum<["pass", "fail", "skipped"]>;
+        evidence: z.ZodOptional<z.ZodString>;
+        feedback: z.ZodOptional<z.ZodString>;
+    }, "strip", z.ZodTypeAny, {
+        description: string;
+        required: boolean;
+        criterionId: string;
+        result: "pass" | "fail" | "skipped";
+        feedback?: string | undefined;
+        evidence?: string | undefined;
+    }, {
+        description: string;
+        required: boolean;
+        criterionId: string;
+        result: "pass" | "fail" | "skipped";
+        feedback?: string | undefined;
+        evidence?: string | undefined;
+    }>, "many">>;
+    regressions: z.ZodOptional<z.ZodArray<z.ZodObject<{
+        description: z.ZodString;
+        evidence: z.ZodString;
+        severity: z.ZodEnum<["critical", "major", "minor"]>;
+    }, "strip", z.ZodTypeAny, {
+        description: string;
+        severity: "critical" | "major" | "minor";
+        evidence: string;
+    }, {
+        description: string;
+        severity: "critical" | "major" | "minor";
+        evidence: string;
+    }>, "many">>;
+    designScore: z.ZodOptional<z.ZodNumber>;
+    generatorFeedback: z.ZodOptional<z.ZodArray<z.ZodObject<{
+        priority: z.ZodEnum<["critical", "high", "medium", "low"]>;
+        category: z.ZodEnum<["bug", "missing-feature", "regression", "quality", "performance"]>;
+        file: z.ZodOptional<z.ZodString>;
+        line: z.ZodOptional<z.ZodNumber>;
+        description: z.ZodString;
+        expected: z.ZodOptional<z.ZodString>;
+        reproduction: z.ZodOptional<z.ZodString>;
+    }, "strip", z.ZodTypeAny, {
+        description: string;
+        priority: "medium" | "critical" | "high" | "low";
+        category: "bug" | "missing-feature" | "regression" | "quality" | "performance";
+        expected?: string | undefined;
+        file?: string | undefined;
+        line?: number | undefined;
+        reproduction?: string | undefined;
+    }, {
+        description: string;
+        priority: "medium" | "critical" | "high" | "low";
+        category: "bug" | "missing-feature" | "regression" | "quality" | "performance";
+        expected?: string | undefined;
+        file?: string | undefined;
+        line?: number | undefined;
+        reproduction?: string | undefined;
+    }>, "many">>;
 }, "strip", z.ZodTypeAny, {
     evaluator: string;
     passed: boolean;
@@ -68,6 +194,31 @@ export declare const EvalResultSchema: z.ZodObject<{
     summary: string;
     feedback: string;
     score?: number | undefined;
+    iteration?: number | undefined;
+    contractId?: string | undefined;
+    criteriaResults?: {
+        description: string;
+        required: boolean;
+        criterionId: string;
+        result: "pass" | "fail" | "skipped";
+        feedback?: string | undefined;
+        evidence?: string | undefined;
+    }[] | undefined;
+    regressions?: {
+        description: string;
+        severity: "critical" | "major" | "minor";
+        evidence: string;
+    }[] | undefined;
+    designScore?: number | undefined;
+    generatorFeedback?: {
+        description: string;
+        priority: "medium" | "critical" | "high" | "low";
+        category: "bug" | "missing-feature" | "regression" | "quality" | "performance";
+        expected?: string | undefined;
+        file?: string | undefined;
+        line?: number | undefined;
+        reproduction?: string | undefined;
+    }[] | undefined;
 }, {
     evaluator: string;
     passed: boolean;
@@ -83,6 +234,31 @@ export declare const EvalResultSchema: z.ZodObject<{
     summary: string;
     feedback: string;
     score?: number | undefined;
+    iteration?: number | undefined;
+    contractId?: string | undefined;
+    criteriaResults?: {
+        description: string;
+        required: boolean;
+        criterionId: string;
+        result: "pass" | "fail" | "skipped";
+        feedback?: string | undefined;
+        evidence?: string | undefined;
+    }[] | undefined;
+    regressions?: {
+        description: string;
+        severity: "critical" | "major" | "minor";
+        evidence: string;
+    }[] | undefined;
+    designScore?: number | undefined;
+    generatorFeedback?: {
+        description: string;
+        priority: "medium" | "critical" | "high" | "low";
+        category: "bug" | "missing-feature" | "regression" | "quality" | "performance";
+        expected?: string | undefined;
+        file?: string | undefined;
+        line?: number | undefined;
+        reproduction?: string | undefined;
+    }[] | undefined;
 }>;
 export type EvalResult = z.infer<typeof EvalResultSchema>;
 export declare const SprintEvaluationSchema: z.ZodObject<{
@@ -117,6 +293,69 @@ export declare const SprintEvaluationSchema: z.ZodObject<{
         summary: z.ZodString;
         feedback: z.ZodString;
         timestamp: z.ZodString;
+        iteration: z.ZodOptional<z.ZodNumber>;
+        contractId: z.ZodOptional<z.ZodString>;
+        criteriaResults: z.ZodOptional<z.ZodArray<z.ZodObject<{
+            criterionId: z.ZodString;
+            description: z.ZodString;
+            required: z.ZodBoolean;
+            result: z.ZodEnum<["pass", "fail", "skipped"]>;
+            evidence: z.ZodOptional<z.ZodString>;
+            feedback: z.ZodOptional<z.ZodString>;
+        }, "strip", z.ZodTypeAny, {
+            description: string;
+            required: boolean;
+            criterionId: string;
+            result: "pass" | "fail" | "skipped";
+            feedback?: string | undefined;
+            evidence?: string | undefined;
+        }, {
+            description: string;
+            required: boolean;
+            criterionId: string;
+            result: "pass" | "fail" | "skipped";
+            feedback?: string | undefined;
+            evidence?: string | undefined;
+        }>, "many">>;
+        regressions: z.ZodOptional<z.ZodArray<z.ZodObject<{
+            description: z.ZodString;
+            evidence: z.ZodString;
+            severity: z.ZodEnum<["critical", "major", "minor"]>;
+        }, "strip", z.ZodTypeAny, {
+            description: string;
+            severity: "critical" | "major" | "minor";
+            evidence: string;
+        }, {
+            description: string;
+            severity: "critical" | "major" | "minor";
+            evidence: string;
+        }>, "many">>;
+        designScore: z.ZodOptional<z.ZodNumber>;
+        generatorFeedback: z.ZodOptional<z.ZodArray<z.ZodObject<{
+            priority: z.ZodEnum<["critical", "high", "medium", "low"]>;
+            category: z.ZodEnum<["bug", "missing-feature", "regression", "quality", "performance"]>;
+            file: z.ZodOptional<z.ZodString>;
+            line: z.ZodOptional<z.ZodNumber>;
+            description: z.ZodString;
+            expected: z.ZodOptional<z.ZodString>;
+            reproduction: z.ZodOptional<z.ZodString>;
+        }, "strip", z.ZodTypeAny, {
+            description: string;
+            priority: "medium" | "critical" | "high" | "low";
+            category: "bug" | "missing-feature" | "regression" | "quality" | "performance";
+            expected?: string | undefined;
+            file?: string | undefined;
+            line?: number | undefined;
+            reproduction?: string | undefined;
+        }, {
+            description: string;
+            priority: "medium" | "critical" | "high" | "low";
+            category: "bug" | "missing-feature" | "regression" | "quality" | "performance";
+            expected?: string | undefined;
+            file?: string | undefined;
+            line?: number | undefined;
+            reproduction?: string | undefined;
+        }>, "many">>;
     }, "strip", z.ZodTypeAny, {
         evaluator: string;
         passed: boolean;
@@ -132,6 +371,31 @@ export declare const SprintEvaluationSchema: z.ZodObject<{
         summary: string;
         feedback: string;
         score?: number | undefined;
+        iteration?: number | undefined;
+        contractId?: string | undefined;
+        criteriaResults?: {
+            description: string;
+            required: boolean;
+            criterionId: string;
+            result: "pass" | "fail" | "skipped";
+            feedback?: string | undefined;
+            evidence?: string | undefined;
+        }[] | undefined;
+        regressions?: {
+            description: string;
+            severity: "critical" | "major" | "minor";
+            evidence: string;
+        }[] | undefined;
+        designScore?: number | undefined;
+        generatorFeedback?: {
+            description: string;
+            priority: "medium" | "critical" | "high" | "low";
+            category: "bug" | "missing-feature" | "regression" | "quality" | "performance";
+            expected?: string | undefined;
+            file?: string | undefined;
+            line?: number | undefined;
+            reproduction?: string | undefined;
+        }[] | undefined;
     }, {
         evaluator: string;
         passed: boolean;
@@ -147,6 +411,31 @@ export declare const SprintEvaluationSchema: z.ZodObject<{
         summary: string;
         feedback: string;
         score?: number | undefined;
+        iteration?: number | undefined;
+        contractId?: string | undefined;
+        criteriaResults?: {
+            description: string;
+            required: boolean;
+            criterionId: string;
+            result: "pass" | "fail" | "skipped";
+            feedback?: string | undefined;
+            evidence?: string | undefined;
+        }[] | undefined;
+        regressions?: {
+            description: string;
+            severity: "critical" | "major" | "minor";
+            evidence: string;
+        }[] | undefined;
+        designScore?: number | undefined;
+        generatorFeedback?: {
+            description: string;
+            priority: "medium" | "critical" | "high" | "low";
+            category: "bug" | "missing-feature" | "regression" | "quality" | "performance";
+            expected?: string | undefined;
+            file?: string | undefined;
+            line?: number | undefined;
+            reproduction?: string | undefined;
+        }[] | undefined;
     }>, "many">;
     overallPassed: z.ZodBoolean;
     aggregateFeedback: z.ZodString;
@@ -168,6 +457,31 @@ export declare const SprintEvaluationSchema: z.ZodObject<{
         summary: string;
         feedback: string;
         score?: number | undefined;
+        iteration?: number | undefined;
+        contractId?: string | undefined;
+        criteriaResults?: {
+            description: string;
+            required: boolean;
+            criterionId: string;
+            result: "pass" | "fail" | "skipped";
+            feedback?: string | undefined;
+            evidence?: string | undefined;
+        }[] | undefined;
+        regressions?: {
+            description: string;
+            severity: "critical" | "major" | "minor";
+            evidence: string;
+        }[] | undefined;
+        designScore?: number | undefined;
+        generatorFeedback?: {
+            description: string;
+            priority: "medium" | "critical" | "high" | "low";
+            category: "bug" | "missing-feature" | "regression" | "quality" | "performance";
+            expected?: string | undefined;
+            file?: string | undefined;
+            line?: number | undefined;
+            reproduction?: string | undefined;
+        }[] | undefined;
     }[];
     overallPassed: boolean;
     aggregateFeedback: string;
@@ -189,6 +503,31 @@ export declare const SprintEvaluationSchema: z.ZodObject<{
         summary: string;
         feedback: string;
         score?: number | undefined;
+        iteration?: number | undefined;
+        contractId?: string | undefined;
+        criteriaResults?: {
+            description: string;
+            required: boolean;
+            criterionId: string;
+            result: "pass" | "fail" | "skipped";
+            feedback?: string | undefined;
+            evidence?: string | undefined;
+        }[] | undefined;
+        regressions?: {
+            description: string;
+            severity: "critical" | "major" | "minor";
+            evidence: string;
+        }[] | undefined;
+        designScore?: number | undefined;
+        generatorFeedback?: {
+            description: string;
+            priority: "medium" | "critical" | "high" | "low";
+            category: "bug" | "missing-feature" | "regression" | "quality" | "performance";
+            expected?: string | undefined;
+            file?: string | undefined;
+            line?: number | undefined;
+            reproduction?: string | undefined;
+        }[] | undefined;
     }[];
     overallPassed: boolean;
     aggregateFeedback: string;
package/dist/contracts/eval-result.d.ts.map
CHANGED
@@ -1 +1 @@
-
{"version":3,"file":"eval-result.d.ts","sourceRoot":"","sources":["../../src/contracts/eval-result.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAIxB,eAAO,MAAM,cAAc,yCAAuC,CAAC;AACnE,MAAM,MAAM,QAAQ,GAAG,CAAC,CAAC,KAAK,CAAC,OAAO,cAAc,CAAC,CAAC;AAItD,eAAO,MAAM,gBAAgB;;;;;;;;;;;;;;;;;;;;;EAO3B,CAAC;AACH,MAAM,MAAM,UAAU,GAAG,CAAC,CAAC,KAAK,CAAC,OAAO,gBAAgB,CAAC,CAAC;AAI1D,eAAO,MAAM,gBAAgB
+
{"version":3,"file":"eval-result.d.ts","sourceRoot":"","sources":["../../src/contracts/eval-result.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAIxB,eAAO,MAAM,cAAc,yCAAuC,CAAC;AACnE,MAAM,MAAM,QAAQ,GAAG,CAAC,CAAC,KAAK,CAAC,OAAO,cAAc,CAAC,CAAC;AAItD,eAAO,MAAM,gBAAgB;;;;;;;;;;;;;;;;;;;;;EAO3B,CAAC;AACH,MAAM,MAAM,UAAU,GAAG,CAAC,CAAC,KAAK,CAAC,OAAO,gBAAgB,CAAC,CAAC;AAI1D,eAAO,MAAM,qBAAqB;;;;;;;;;;;;;;;;;;;;;EAOhC,CAAC;AACH,MAAM,MAAM,eAAe,GAAG,CAAC,CAAC,KAAK,CAAC,OAAO,qBAAqB,CAAC,CAAC;AAEpE,eAAO,MAAM,gBAAgB;;;;;;;;;;;;EAI3B,CAAC;AACH,MAAM,MAAM,UAAU,GAAG,CAAC,CAAC,KAAK,CAAC,OAAO,gBAAgB,CAAC,CAAC;AAE1D,eAAO,MAAM,2BAA2B;;;;;;;;;;;;;;;;;;;;;;;;EActC,CAAC;AACH,MAAM,MAAM,qBAAqB,GAAG,CAAC,CAAC,KAAK,CACzC,OAAO,2BAA2B,CACnC,CAAC;AAIF,eAAO,MAAM,gBAAgB;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;EAe3B,CAAC;AACH,MAAM,MAAM,UAAU,GAAG,CAAC,CAAC,KAAK,CAAC,OAAO,gBAAgB,CAAC,CAAC;AAI1D,eAAO,MAAM,sBAAsB;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;EAMjC,CAAC;AACH,MAAM,MAAM,gBAAgB,GAAG,CAAC,CAAC,KAAK,CAAC,OAAO,sBAAsB,CAAC,CAAC;AAItE;;GAEG;AACH,wBAAgB,gBAAgB,CAC9B,QAAQ,EAAE,MAAM,EAChB,KAAK,EAAE,MAAM,EACb,OAAO,EAAE,UAAU,EAAE,GACpB,gBAAgB,CAsBlB;AAED;;GAEG;AACH,wBAAgB,cAAc,CAAC,UAAU,EAAE,gBAAgB,GAAG,MAAM,CA2CnE"}
package/dist/contracts/eval-result.js
CHANGED
@@ -10,6 +10,35 @@ export const EvalDetailSchema = z.object({
     line: z.number().int().optional(),
     severity: SeveritySchema,
 });
+// ── Structured feedback types (for enriched results) ────────────────
+export const CriterionResultSchema = z.object({
+    criterionId: z.string(),
+    description: z.string(),
+    required: z.boolean(),
+    result: z.enum(["pass", "fail", "skipped"]),
+    evidence: z.string().optional(),
+    feedback: z.string().optional(),
+});
+export const RegressionSchema = z.object({
+    description: z.string(),
+    evidence: z.string(),
+    severity: z.enum(["critical", "major", "minor"]),
+});
+export const GeneratorFeedbackItemSchema = z.object({
+    priority: z.enum(["critical", "high", "medium", "low"]),
+    category: z.enum([
+        "bug",
+        "missing-feature",
+        "regression",
+        "quality",
+        "performance",
+    ]),
+    file: z.string().optional(),
+    line: z.number().optional(),
+    description: z.string(),
+    expected: z.string().optional(),
+    reproduction: z.string().optional(),
+});
 // ── Eval Result ─────────────────────────────────────────────────────
 export const EvalResultSchema = z.object({
     evaluator: z.string().min(1),
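To make the shape of `GeneratorFeedbackItemSchema` concrete, here is a minimal sketch of how a consumer might order feedback items before handing them to the generator. It is an assumption-laden illustration: the interface mirrors the schema fields by hand (so the block needs no `zod` import), and `PRIORITY_RANK` and `sortByPriority` are hypothetical helpers that do not appear in this diff.

```typescript
// Hand-written mirror of the fields in GeneratorFeedbackItemSchema.
type Priority = "critical" | "high" | "medium" | "low";
type Category =
  | "bug"
  | "missing-feature"
  | "regression"
  | "quality"
  | "performance";

interface GeneratorFeedbackItem {
  priority: Priority;
  category: Category;
  file?: string;
  line?: number;
  description: string;
  expected?: string;
  reproduction?: string;
}

// Hypothetical ranking: a lower number sorts earlier.
const PRIORITY_RANK: Record<Priority, number> = {
  critical: 0,
  high: 1,
  medium: 2,
  low: 3,
};

// Returns a new array ordered from most to least urgent.
function sortByPriority(
  items: readonly GeneratorFeedbackItem[],
): GeneratorFeedbackItem[] {
  // Copy first so the caller's array is not mutated by sort().
  return [...items].sort(
    (a, b) => PRIORITY_RANK[a.priority] - PRIORITY_RANK[b.priority],
  );
}

const ordered = sortByPriority([
  { priority: "low", category: "quality", description: "Rename helper" },
  { priority: "critical", category: "bug", description: "Login crashes" },
  { priority: "medium", category: "performance", description: "Slow list" },
]);
console.log(ordered.map((item) => item.priority).join(", "));
```

An explicit rank table is used because the enum order in the schema (`critical`, `high`, `medium`, `low`) is not preserved in the inferred union type, which lists `"medium"` first.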
@@ -19,6 +48,13 @@ export const EvalResultSchema = z.object({
     summary: z.string(),
     feedback: z.string(),
     timestamp: z.string().datetime(),
+    // Enriched fields (optional, populated by agent evaluator)
+    iteration: z.number().int().min(1).optional(),
+    contractId: z.string().optional(),
+    criteriaResults: z.array(CriterionResultSchema).optional(),
+    regressions: z.array(RegressionSchema).optional(),
+    designScore: z.number().min(0).max(100).optional(),
+    generatorFeedback: z.array(GeneratorFeedbackItemSchema).optional(),
 });
 // ── Sprint Evaluation ───────────────────────────────────────────────
 export const SprintEvaluationSchema = z.object({
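Because every enriched field on `EvalResultSchema` is optional, a consumer has to handle results that carry none of them. The sketch below shows one plausible way to read `criteriaResults` and `regressions` defensively. It is not the package's pass/fail logic, which this diff does not show: the interfaces are hand-written mirrors of the schema fields, and `failedRequiredCriteria` and `hasCriticalRegression` are hypothetical helper names.

```typescript
// Hand-written mirrors of CriterionResultSchema and RegressionSchema.
interface CriterionResult {
  criterionId: string;
  description: string;
  required: boolean;
  result: "pass" | "fail" | "skipped";
  evidence?: string;
  feedback?: string;
}

interface Regression {
  description: string;
  evidence: string;
  severity: "critical" | "major" | "minor";
}

// Only the optional enriched fields this sketch reads.
interface EnrichedFields {
  iteration?: number;
  contractId?: string;
  criteriaResults?: CriterionResult[];
  regressions?: Regression[];
  designScore?: number;
}

// IDs of criteria that are marked required and did not pass.
function failedRequiredCriteria(result: EnrichedFields): string[] {
  return (result.criteriaResults ?? [])
    .filter((c) => c.required && c.result === "fail")
    .map((c) => c.criterionId);
}

// True when at least one reported regression is critical.
function hasCriticalRegression(result: EnrichedFields): boolean {
  return (result.regressions ?? []).some((r) => r.severity === "critical");
}

const sample: EnrichedFields = {
  iteration: 1,
  criteriaResults: [
    { criterionId: "c1", description: "Builds", required: true, result: "pass" },
    { criterionId: "c2", description: "Login works", required: true, result: "fail" },
    { criterionId: "c3", description: "Dark mode", required: false, result: "fail" },
  ],
  regressions: [
    { description: "Footer overlaps", evidence: "screenshot", severity: "minor" },
  ],
};
console.log(failedRequiredCriteria(sample), hasCriticalRegression(sample));
```

Defaulting missing arrays to `[]` keeps results from programmatic evaluators, which the schema comment suggests do not populate these fields, on the same code path as agent-evaluator results.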
package/dist/contracts/eval-result.js.map
CHANGED
@@ -1 +1 @@
-
{"version":3,"file":"eval-result.js","sourceRoot":"","sources":["../../src/contracts/eval-result.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAExB,uEAAuE;AAEvE,MAAM,CAAC,MAAM,cAAc,GAAG,CAAC,CAAC,IAAI,CAAC,CAAC,OAAO,EAAE,SAAS,EAAE,MAAM,CAAC,CAAC,CAAC;AAGnE,uEAAuE;AAEvE,MAAM,CAAC,MAAM,gBAAgB,GAAG,CAAC,CAAC,MAAM,CAAC;IACvC,SAAS,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC;IAC5B,MAAM,EAAE,CAAC,CAAC,OAAO,EAAE;IACnB,OAAO,EAAE,CAAC,CAAC,MAAM,EAAE;IACnB,IAAI,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE;IAC3B,IAAI,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,EAAE,CAAC,QAAQ,EAAE;IACjC,QAAQ,EAAE,cAAc;CACzB,CAAC,CAAC;AAGH,uEAAuE;AAEvE,MAAM,CAAC,MAAM,gBAAgB,GAAG,CAAC,CAAC,MAAM,CAAC;IACvC,SAAS,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC;IAC5B,MAAM,EAAE,CAAC,CAAC,OAAO,EAAE;IACnB,KAAK,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,QAAQ,EAAE;IAC5C,OAAO,EAAE,CAAC,CAAC,KAAK,CAAC,gBAAgB,CAAC;IAClC,OAAO,EAAE,CAAC,CAAC,MAAM,EAAE;IACnB,QAAQ,EAAE,CAAC,CAAC,MAAM,EAAE;IACpB,SAAS,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE;
+
{"version":3,"file":"eval-result.js","sourceRoot":"","sources":["../../src/contracts/eval-result.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAExB,uEAAuE;AAEvE,MAAM,CAAC,MAAM,cAAc,GAAG,CAAC,CAAC,IAAI,CAAC,CAAC,OAAO,EAAE,SAAS,EAAE,MAAM,CAAC,CAAC,CAAC;AAGnE,uEAAuE;AAEvE,MAAM,CAAC,MAAM,gBAAgB,GAAG,CAAC,CAAC,MAAM,CAAC;IACvC,SAAS,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC;IAC5B,MAAM,EAAE,CAAC,CAAC,OAAO,EAAE;IACnB,OAAO,EAAE,CAAC,CAAC,MAAM,EAAE;IACnB,IAAI,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE;IAC3B,IAAI,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,EAAE,CAAC,QAAQ,EAAE;IACjC,QAAQ,EAAE,cAAc;CACzB,CAAC,CAAC;AAGH,uEAAuE;AAEvE,MAAM,CAAC,MAAM,qBAAqB,GAAG,CAAC,CAAC,MAAM,CAAC;IAC5C,WAAW,EAAE,CAAC,CAAC,MAAM,EAAE;IACvB,WAAW,EAAE,CAAC,CAAC,MAAM,EAAE;IACvB,QAAQ,EAAE,CAAC,CAAC,OAAO,EAAE;IACrB,MAAM,EAAE,CAAC,CAAC,IAAI,CAAC,CAAC,MAAM,EAAE,MAAM,EAAE,SAAS,CAAC,CAAC;IAC3C,QAAQ,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE;IAC/B,QAAQ,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE;CAChC,CAAC,CAAC;AAGH,MAAM,CAAC,MAAM,gBAAgB,GAAG,CAAC,CAAC,MAAM,CAAC;IACvC,WAAW,EAAE,CAAC,CAAC,MAAM,EAAE;IACvB,QAAQ,EAAE,CAAC,CAAC,MAAM,EAAE;IACpB,QAAQ,EAAE,CAAC,CAAC,IAAI,CAAC,CAAC,UAAU,EAAE,OAAO,EAAE,OAAO,CAAC,CAAC;CACjD,CAAC,CAAC;AAGH,MAAM,CAAC,MAAM,2BAA2B,GAAG,CAAC,CAAC,MAAM,CAAC;IAClD,QAAQ,EAAE,CAAC,CAAC,IAAI,CAAC,CAAC,UAAU,EAAE,MAAM,EAAE,QAAQ,EAAE,KAAK,CAAC,CAAC;IACvD,QAAQ,EAAE,CAAC,CAAC,IAAI,CAAC;QACf,KAAK;QACL,iBAAiB;QACjB,YAAY;QACZ,SAAS;QACT,aAAa;KACd,CAAC;IACF,IAAI,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE;IAC3B,IAAI,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE;IAC3B,WAAW,EAAE,CAAC,CAAC,MAAM,EAAE;IACvB,QAAQ,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE;IAC/B,YAAY,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE;CACpC,CAAC,CAAC;AAKH,uEAAuE;AAEvE,MAAM,CAAC,MAAM,gBAAgB,GAAG,CAAC,CAAC,MAAM,CAAC;IACvC,SAAS,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC;IAC5B,MAAM,EAAE,CAAC,CAAC,OAAO,EAAE;IACnB,KAAK,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,QAAQ,EAAE;IAC5C,OAAO,EAAE,CAAC,CAAC,KAAK,CAAC
,gBAAgB,CAAC;IAClC,OAAO,EAAE,CAAC,CAAC,MAAM,EAAE;IACnB,QAAQ,EAAE,CAAC,CAAC,MAAM,EAAE;IACpB,SAAS,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE;IAChC,2DAA2D;IAC3D,SAAS,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,QAAQ,EAAE;IAC7C,UAAU,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,EAAE;IACjC,eAAe,EAAE,CAAC,CAAC,KAAK,CAAC,qBAAqB,CAAC,CAAC,QAAQ,EAAE;IAC1D,WAAW,EAAE,CAAC,CAAC,KAAK,CAAC,gBAAgB,CAAC,CAAC,QAAQ,EAAE;IACjD,WAAW,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,QAAQ,EAAE;IAClD,iBAAiB,EAAE,CAAC,CAAC,KAAK,CAAC,2BAA2B,CAAC,CAAC,QAAQ,EAAE;CACnE,CAAC,CAAC;AAGH,uEAAuE;AAEvE,MAAM,CAAC,MAAM,sBAAsB,GAAG,CAAC,CAAC,MAAM,CAAC;IAC7C,QAAQ,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC;IAC3B,KAAK,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,GAAG,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC;IAC9B,OAAO,EAAE,CAAC,CAAC,KAAK,CAAC,gBAAgB,CAAC;IAClC,aAAa,EAAE,CAAC,CAAC,OAAO,EAAE;IAC1B,iBAAiB,EAAE,CAAC,CAAC,MAAM,EAAE;CAC9B,CAAC,CAAC;AAGH,uEAAuE;AAEvE;;GAEG;AACH,MAAM,UAAU,gBAAgB,CAC9B,QAAgB,EAChB,KAAa,EACb,OAAqB;IAErB,MAAM,aAAa,GAAG,OAAO,CAAC,KAAK,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,MAAM,CAAC,CAAC;IAErD,MAAM,aAAa,GAAa,EAAE,CAAC;IAEnC,KAAK,MAAM,MAAM,IAAI,OAAO,EAAE,CAAC;QAC7B,IAAI,CAAC,MAAM,CAAC,MAAM,EAAE,CAAC;YACnB,aAAa,CAAC,IAAI,CAAC,IAAI,MAAM,CAAC,SAAS,aAAa,MAAM,CAAC,QAAQ,EAAE,CAAC,CAAC;QACzE,CAAC;aAAM,CAAC;YACN,aAAa,CAAC,IAAI,CAAC,IAAI,MAAM,CAAC,SAAS,aAAa,MAAM,CAAC,OAAO,EAAE,CAAC,CAAC;QACxE,CAAC;IACH,CAAC;IAED,MAAM,iBAAiB,GAAG,aAAa,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC;IAEnD,OAAO;QACL,QAAQ;QACR,KAAK;QACL,OAAO;QACP,aAAa;QACb,iBAAiB;KAClB,CAAC;AACJ,CAAC;AAED;;GAEG;AACH,MAAM,UAAU,cAAc,CAAC,UAA4B;IACzD,MAAM,KAAK,GAAa,EAAE,CAAC;IAC3B,MAAM,WAAW,GAAG,UAAU,CAAC,aAAa,CAAC,CAAC,CAAC,MAAM,CAAC,CAAC,CAAC,MAAM,CAAC;IAE/D,KAAK,CAAC,IAAI,CACR,UAAU,UAAU,CAAC,QAAQ,YAAY,UAAU,CAAC,KAAK,KAAK,WAAW,EAAE,CAC5E,CAAC;IACF,KAAK,CAAC,IAAI,CAAC,GAAG,CAAC,MAAM,CAAC,EAAE,CAAC,CAAC,CAAC;IAE3B,KAAK,MAAM,MAAM,IAAI,UAAU,CAAC,OAAO,EAAE,CAAC;QACxC,MAAM,YAAY,GAAG,MAAM,CAAC,MAAM,CAAC,CAAC,CAAC,MAAM,CAAC,CAAC,CAAC,
MAAM,CAAC;QACrD,KAAK,CAAC,IAAI,CACR,MAAM,YAAY,KAAK,MAAM,CAAC,SAAS,GAAG,MAAM,CAAC,KAAK,KAAK,SAAS,CAAC,CAAC,CAAC,YAAY,MAAM,CAAC,KAAK,OAAO,CAAC,CAAC,CAAC,EAAE,EAAE,CAC9G,CAAC;QACF,KAAK,CAAC,IAAI,CAAC,cAAc,MAAM,CAAC,OAAO,EAAE,CAAC,CAAC;QAE3C,MAAM,QAAQ,GAAG,MAAM,CAAC,OAAO,CAAC,MAAM,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,CAAC,MAAM,CAAC,CAAC;QACzD,IAAI,QAAQ,CAAC,MAAM,GAAG,CAAC,EAAE,CAAC;YACxB,KAAK,CAAC,IAAI,CAAC,aAAa,QAAQ,CAAC,MAAM,IAAI,CAAC,CAAC;YAC7C,KAAK,MAAM,MAAM,IAAI,QAAQ,EAAE,CAAC;gBAC9B,MAAM,QAAQ,GACZ,MAAM,CAAC,IAAI;oBACT,CAAC,CAAC,OAAO,MAAM,CAAC,IAAI,GAAG,MAAM,CAAC,IAAI,KAAK,SAAS,CAAC,CAAC,CAAC,IAAI,MAAM,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC,EAAE,EAAE;oBAC3E,CAAC,CAAC,EAAE,CAAC;gBACT,KAAK,CAAC,IAAI,CACR,UAAU,MAAM,CAAC,QAAQ,CAAC,WAAW,EAAE,KAAK,MAAM,CAAC,OAAO,GAAG,QAAQ,EAAE,CACxE,CAAC;YACJ,CAAC;QACH,CAAC;QAED,IAAI,CAAC,MAAM,CAAC,MAAM,IAAI,MAAM,CAAC,QAAQ,EAAE,CAAC;YACtC,KAAK,CAAC,IAAI,CAAC,eAAe,MAAM,CAAC,QAAQ,EAAE,CAAC,CAAC;QAC/C,CAAC;IACH,CAAC;IAED,KAAK,CAAC,IAAI,CAAC,IAAI,GAAG,GAAG,CAAC,MAAM,CAAC,EAAE,CAAC,CAAC,CAAC;IAClC,KAAK,CAAC,IAAI,CACR,UAAU,CAAC,aAAa;QACtB,CAAC,CAAC,yCAAyC;QAC3C,CAAC,CAAC,wCAAwC,CAC7C,CAAC;IAEF,OAAO,KAAK,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC;AAC1B,CAAC"}
@@ -1 +1 @@
-{"version":3,"file":"playwright.d.ts","sourceRoot":"","sources":["../../../src/evaluators/builtin/playwright.ts"],"names":[],"mappings":"AAIA,OAAO,KAAK,EACV,eAAe,EACf,WAAW,EACX,UAAU,EAEV,WAAW,EACZ,MAAM,wBAAwB,CAAC;AA8QhC,qBAAa,mBAAoB,YAAW,eAAe;IACzD,QAAQ,CAAC,IAAI,oBAAoB;IACjC,QAAQ,CAAC,WAAW,2DAA2D;IAEzE,MAAM,CAAC,WAAW,EAAE,MAAM,EAAE,OAAO,EAAE,WAAW,GAAG,OAAO,CAAC,OAAO,CAAC;IAInE,QAAQ,CAAC,OAAO,EAAE,WAAW,GAAG,OAAO,CAAC,UAAU,CAAC;
+{"version":3,"file":"playwright.d.ts","sourceRoot":"","sources":["../../../src/evaluators/builtin/playwright.ts"],"names":[],"mappings":"AAIA,OAAO,KAAK,EACV,eAAe,EACf,WAAW,EACX,UAAU,EAEV,WAAW,EACZ,MAAM,wBAAwB,CAAC;AA8QhC,qBAAa,mBAAoB,YAAW,eAAe;IACzD,QAAQ,CAAC,IAAI,oBAAoB;IACjC,QAAQ,CAAC,WAAW,2DAA2D;IAEzE,MAAM,CAAC,WAAW,EAAE,MAAM,EAAE,OAAO,EAAE,WAAW,GAAG,OAAO,CAAC,OAAO,CAAC;IAInE,QAAQ,CAAC,OAAO,EAAE,WAAW,GAAG,OAAO,CAAC,UAAU,CAAC;IAgHzD;;;OAGG;YACW,kBAAkB;YAYlB,aAAa;IA2E3B;;;;OAIG;YACW,mBAAmB;YAsBnB,eAAe;IAkF7B,OAAO,CAAC,aAAa;IA8BrB,OAAO,CAAC,WAAW;CAmCpB;AAED;;GAEG;AACH,wBAAgB,yBAAyB,IAAI,eAAe,CAE3D"}
@@ -199,14 +199,27 @@ export class PlaywrightEvaluator {
         const timestamp = new Date().toISOString();
         const timeout = strategy.config?.timeout ?? DEFAULT_TIMEOUT_MS;
         // Check if Playwright is installed at all
+        // When strategy is required, missing prerequisites = FAIL (not skip)
+        const isRequired = strategy.required;
         if (!(await hasPlaywright(projectRoot))) {
             return {
                 evaluator: this.name,
-                passed:
-                score: undefined,
-                details:
-
-
+                passed: !isRequired,
+                score: isRequired ? 0 : undefined,
+                details: isRequired
+                    ? [
+                        {
+                            criterion: "Playwright installation",
+                            passed: false,
+                            message: "Playwright is configured as a required evaluation strategy but is not installed.",
+                            severity: "error",
+                        },
+                    ]
+                    : [],
+                summary: isRequired
+                    ? "FAIL: Playwright is required but not installed."
+                    : "Playwright not installed. Skipped.",
+                feedback: "Playwright is not installed in this project. Run /bober-playwright setup to initialize.",
                 timestamp,
             };
         }
@@ -214,11 +227,22 @@ export class PlaywrightEvaluator {
         if (!(await hasPlaywrightConfig(projectRoot))) {
             return {
                 evaluator: this.name,
-                passed:
-                score: undefined,
-                details:
-
-
+                passed: !isRequired,
+                score: isRequired ? 0 : undefined,
+                details: isRequired
+                    ? [
+                        {
+                            criterion: "Playwright config",
+                            passed: false,
+                            message: "Playwright is configured as a required strategy but no playwright.config.ts was found.",
+                            severity: "error",
+                        },
+                    ]
+                    : [],
+                summary: isRequired
+                    ? "FAIL: Playwright is required but config is missing."
+                    : "Playwright config not found. Skipped.",
+                feedback: "No playwright.config.ts found. Run /bober-playwright setup to create one.",
                 timestamp,
             };
         }
@@ -226,11 +250,22 @@ export class PlaywrightEvaluator {
         if (!(await hasE2eTests(projectRoot))) {
             return {
                 evaluator: this.name,
-                passed:
-                score: undefined,
-                details:
-
-
+                passed: !isRequired,
+                score: isRequired ? 0 : undefined,
+                details: isRequired
+                    ? [
+                        {
+                            criterion: "E2E test files",
+                            passed: false,
+                            message: "Playwright is configured as a required strategy but no E2E test files exist in e2e/ directory.",
+                            severity: "error",
+                        },
+                    ]
+                    : [],
+                summary: isRequired
+                    ? "FAIL: Playwright is required but no E2E tests exist."
+                    : "No E2E test files found. Skipped.",
+                feedback: "No test files found in e2e/ directory. The generator must create Playwright tests.",
                 timestamp,
             };
         }
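Reviewer note: the three prerequisite checks in this file (Playwright installed, config present, E2E tests present) now share one rule: a missing prerequisite is a skip when the strategy is optional and a failure with score 0 when `strategy.required` is set. The sketch below restates that rule as a single helper to make the behavior easier to check. It is an illustration only: `missingPrerequisite`, `PrerequisiteDetail`, and `PrerequisiteResult` are hypothetical names that do not appear in the package, and the field types are guessed from the object literals in the diff rather than taken from the package's contracts.

```typescript
// Hypothetical shapes, inferred from the object literals in the diff above.
interface PrerequisiteDetail {
  criterion: string;
  passed: boolean;
  message: string;
  severity: string;
}

interface PrerequisiteResult {
  evaluator: string;
  passed: boolean;
  score: number | undefined;
  details: PrerequisiteDetail[];
  summary: string;
  feedback: string;
  timestamp: string;
}

// Hypothetical helper: builds the early-return value for one missing prerequisite.
// Optional strategy -> skipped (passed, no score, no details).
// Required strategy -> failed (score 0, one error-level detail).
function missingPrerequisite(
  evaluator: string,
  isRequired: boolean | undefined,
  criterion: string,
  message: string,
  failSummary: string,
  skipSummary: string,
  feedback: string,
  timestamp: string,
): PrerequisiteResult {
  return {
    evaluator,
    passed: !isRequired,
    score: isRequired ? 0 : undefined,
    details: isRequired
      ? [{ criterion, passed: false, message, severity: "error" }]
      : [],
    summary: isRequired ? failSummary : skipSummary,
    feedback,
    timestamp,
  };
}

// Usage: the same call with the strategy marked required, then left optional.
const requiredResult = missingPrerequisite(
  "playwright",
  true,
  "Playwright installation",
  "Playwright is configured as a required evaluation strategy but is not installed.",
  "FAIL: Playwright is required but not installed.",
  "Playwright not installed. Skipped.",
  "Playwright is not installed in this project. Run /bober-playwright setup to initialize.",
  new Date().toISOString(),
);

const optionalResult = missingPrerequisite(
  "playwright",
  undefined,
  "Playwright installation",
  "Playwright is configured as a required evaluation strategy but is not installed.",
  "FAIL: Playwright is required but not installed.",
  "Playwright not installed. Skipped.",
  "Playwright is not installed in this project. Run /bober-playwright setup to initialize.",
  new Date().toISOString(),
);
```

Note that an unset `required` flag behaves like `false` here, as in the diff: `!undefined` is `true`, so the check is treated as a skip.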