agent-regression-lab 0.3.0 → 0.5.0
- package/README.md +25 -4
- package/bin/agentlab.js +2 -0
- package/dist/config.js +13 -9
- package/dist/index.js +14 -0
- package/dist/init.js +88 -0
- package/dist/tools.js +18 -2
- package/dist/ui/App.js +49 -7
- package/dist/ui-assets/client.css +1108 -116
- package/dist/ui-assets/client.js +863 -426
- package/docs/coding-agents.md +74 -0
- package/docs/superpowers/plans/2026-04-13-phase-2-lite-phase-3-plan.md +160 -0
- package/docs/superpowers/plans/2026-04-13-phase-one-npm-tools-plan.md +502 -0
- package/docs/superpowers/plans/2026-04-16-regression-atlas-ui-redesign.md +1010 -0
- package/docs/superpowers/specs/2026-04-13-phase-2-lite-phase-3-design.md +164 -0
- package/docs/superpowers/specs/2026-04-16-regression-atlas-ui-redesign-design.md +417 -0
- package/docs/tools.md +34 -3
- package/docs/troubleshooting.md +55 -0
- package/examples/coding-tools/README.md +21 -0
- package/examples/coding-tools/index.js +11 -0
- package/examples/coding-tools/package.json +8 -0
- package/examples/support-tools/README.md +21 -0
- package/examples/support-tools/index.js +8 -0
- package/examples/support-tools/package.json +8 -0
- package/package.json +6 -4
@@ -0,0 +1,1010 @@
# Regression Atlas UI Redesign Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Replace the current generic run-inspection UI with the approved Regression Atlas experience: scenario-graph-led home, industrial-lab shell, cinematic case surfaces, compare chamber, and archive fallback.

**Architecture:** Keep the existing server routes and API payloads, but refactor the UI into explicit view-model and presentation layers. Build the new product shell around the existing `App.tsx` entry, derive atlas data from `/api/runs`, and reuse existing detail/compare payloads for the case and compare surfaces before considering backend changes.

**Tech Stack:** React 19, TypeScript, esbuild, CSS, node:test, react-dom/server

---

## File Structure

### Existing files to modify

- `src/ui/App.tsx` - replace the current page assembly with the atlas shell and route-specific screens
- `src/ui/styles.css` - replace the current parchment-like dashboard theme with the industrial-lab visual system
- `tests/launch/ui-smoke.test.ts` - update smoke coverage for the new shell, atlas, case surface, and compare chamber

### Files to create

- `src/ui/types.ts` - shared UI data types moved out of `App.tsx`
- `src/ui/viewModels.ts` - atlas grouping, scenario summaries, compare staging helpers, formatting helpers
- `src/ui/components.tsx` - reusable presentation components such as verdict totems, evidence drawer sections, archive rows, and compare staging chips

### Responsibility boundaries

- `App.tsx` should own route resolution, fetch lifecycle, and page composition only
- `types.ts` should define all UI-facing payload types in one place
- `viewModels.ts` should transform API payloads into atlas, archive, and evidence structures without React dependencies
- `components.tsx` should hold presentation primitives reused across atlas, cases, compare, and archive surfaces
- `styles.css` should define the entire material system, shell layout, motion, and responsive rules

---

### Task 1: Extract Shared UI Types And Atlas View Models

**Files:**
- Create: `src/ui/types.ts`
- Create: `src/ui/viewModels.ts`
- Modify: `src/ui/App.tsx`
- Modify: `tests/launch/ui-smoke.test.ts`

- [ ] **Step 1: Write the failing tests for atlas derivation and shared exports**

Add these tests to `tests/launch/ui-smoke.test.ts`:

```ts
import { buildAtlasGroups, createCompareStage, summarizeRuns } from "../../src/ui/viewModels.js";

test("atlas groups scenarios by suite and keeps latest run first", () => {
  const atlas = buildAtlasGroups([
    {
      id: "run_new",
      scenarioId: "support.refund-correct-order",
      suite: "support",
      agentVersionId: "agent_1",
      provider: "mock",
      status: "fail",
      score: 0,
      durationMs: 22,
      totalSteps: 4,
      startedAt: "2026-04-10T00:01:00.000Z",
    },
    {
      id: "run_old",
      scenarioId: "support.refund-correct-order",
      suite: "support",
      agentVersionId: "agent_1",
      provider: "mock",
      status: "pass",
      score: 100,
      durationMs: 11,
      totalSteps: 3,
      startedAt: "2026-04-10T00:00:00.000Z",
    },
    {
      id: "run_ops",
      scenarioId: "ops.payments-api-alert",
      suite: "ops",
      agentVersionId: "agent_2",
      provider: "http",
      status: "error",
      score: 0,
      durationMs: 15,
      totalSteps: 2,
      startedAt: "2026-04-10T00:02:00.000Z",
    },
  ]);

  assert.equal(atlas.length, 2);
  assert.equal(atlas[0]?.suite, "ops");
  assert.equal(atlas[1]?.scenarios[0]?.latestRun.id, "run_new");
  assert.equal(atlas[1]?.scenarios[0]?.statusCounts.fail, 1);
  assert.equal(atlas[1]?.scenarios[0]?.statusCounts.pass, 1);
});

test("compare staging keeps only compatible atlas picks", () => {
  const stage = createCompareStage();
  const staged = stage.pinRun("support.refund-correct-order", "run_a").pinRun("support.refund-correct-order", "run_b");

  assert.deepEqual(staged.runPair, { baseline: "run_a", candidate: "run_b" });
  assert.equal(staged.isRunCompareReady, true);
});
```

- [ ] **Step 2: Run the targeted test to verify it fails**

Run: `npx tsx --test tests/launch/ui-smoke.test.ts`

Expected: FAIL with a module resolution error for `../../src/ui/viewModels.js` (the file does not exist yet) or, if a stub already exists, with missing exports.

- [ ] **Step 3: Create the shared type file**

Create `src/ui/types.ts`:

```ts
export type RunListItem = {
  id: string;
  scenarioId: string;
  suite: string;
  suiteBatchId?: string;
  variantSetName?: string;
  variantLabel?: string;
  agentVersionId: string;
  agentLabel?: string;
  provider?: string;
  modelId?: string;
  status: "pass" | "fail" | "error";
  score: number;
  durationMs: number;
  totalSteps: number;
  startedAt: string;
};

export type RunDetail = {
  run: {
    id: string;
    scenarioId: string;
    status: string;
    score: number;
    durationMs: number;
    totalSteps: number;
    terminationReason: string;
    finalOutput: string;
    startedAt: string;
    variantSetName?: string;
    variantLabel?: string;
    promptVersion?: string;
    modelVersion?: string;
    toolSchemaVersion?: string;
    configLabel?: string;
    configHash?: string;
    runtimeProfileName?: string;
    suiteDefinitionName?: string;
  };
  agentVersion?: {
    provider?: string;
    modelId?: string;
    label: string;
    command?: string;
    args?: string[];
    variantSetName?: string;
    variantLabel?: string;
    promptVersion?: string;
    modelVersion?: string;
    toolSchemaVersion?: string;
    configLabel?: string;
    configHash?: string;
    runtimeProfileName?: string;
    suiteDefinitionName?: string;
  };
  evaluatorResults: Array<{ evaluatorId: string; status: string; message: string }>;
  toolCalls: Array<{ id: string; toolName: string; input: unknown; output?: unknown; status: string }>;
  traceEvents: Array<{ eventId: string; stepIndex: number; source: string; type: string; payload: Record<string, unknown> }>;
  errorDetail?: string;
};
```
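
`RunListItem.status` is a closed union, while `RunDetail.run.status` is typed as a plain `string`. The sketch below shows one way to bridge the two without a type cast. It assumes that any status outside the known union should render as a neutral verdict; `RunStatus`, `VerdictStatus`, `isRunStatus`, and `toVerdictStatus` are hypothetical names that do not appear in the plan.

```typescript
// Hypothetical helpers, not part of the plan: narrow a loosely typed status
// string to the verdict values the UI knows how to render.
type RunStatus = "pass" | "fail" | "error";
type VerdictStatus = RunStatus | "neutral";

const RUN_STATUSES: readonly string[] = ["pass", "fail", "error"];

// Type guard: true only for the three statuses declared on RunListItem.
function isRunStatus(value: string): value is RunStatus {
  return RUN_STATUSES.includes(value);
}

// Assumption: anything outside the known union falls back to "neutral".
function toVerdictStatus(value: string): VerdictStatus {
  return isRunStatus(value) ? value : "neutral";
}
```

A guard like this could stand in for an `as "pass" | "fail" | "error"` cast wherever a detail payload's status feeds a component that expects the union.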

- [ ] **Step 4: Implement atlas and compare-staging view models**

Create `src/ui/viewModels.ts`:

```ts
import type { RunListItem } from "./types.js";

export type AtlasScenario = {
  scenarioId: string;
  suite: string;
  latestRun: RunListItem;
  runs: RunListItem[];
  statusCounts: { pass: number; fail: number; error: number };
};

export type AtlasGroup = {
  suite: string;
  scenarios: AtlasScenario[];
  latestStartedAt: string;
};

export function buildAtlasGroups(runs: RunListItem[]): AtlasGroup[] {
  const byScenario = new Map<string, RunListItem[]>();
  for (const run of runs) {
    const key = `${run.suite}::${run.scenarioId}`;
    // Keep each scenario's runs sorted newest first so index 0 is the latest run.
    byScenario.set(key, [...(byScenario.get(key) ?? []), run].sort((a, b) => b.startedAt.localeCompare(a.startedAt)));
  }

  const scenarios = [...byScenario.values()].map((scenarioRuns) => {
    const latestRun = scenarioRuns[0]!;
    return {
      scenarioId: latestRun.scenarioId,
      suite: latestRun.suite,
      latestRun,
      runs: scenarioRuns,
      statusCounts: scenarioRuns.reduce(
        (counts, run) => ({ ...counts, [run.status]: counts[run.status] + 1 }),
        { pass: 0, fail: 0, error: 0 },
      ),
    };
  });

  const bySuite = new Map<string, AtlasScenario[]>();
  for (const scenario of scenarios) {
    bySuite.set(scenario.suite, [...(bySuite.get(scenario.suite) ?? []), scenario]);
  }

  return [...bySuite.entries()]
    .map(([suite, suiteScenarios]) => ({
      suite,
      // sort() mutates suiteScenarios in place, so index 0 below is the newest scenario.
      scenarios: suiteScenarios.sort((a, b) => b.latestRun.startedAt.localeCompare(a.latestRun.startedAt)),
      latestStartedAt: suiteScenarios[0]!.latestRun.startedAt,
    }))
    .sort((a, b) => b.latestStartedAt.localeCompare(a.latestStartedAt));
}

export function createCompareStage() {
  const createStage = (baseline = "", candidate = "") => ({
    runPair: { baseline, candidate },
    isRunCompareReady: Boolean(baseline && candidate),
    // The scenario id is currently unused: the first pin fills the baseline,
    // and every later pin replaces the candidate.
    pinRun(_scenarioId: string, runId: string) {
      if (!baseline) {
        return createStage(runId, candidate);
      }
      return createStage(baseline, runId);
    },
  });

  return createStage();
}

// Assumes `runs` is ordered newest first: index 0 supplies the latest suite and provider.
export function summarizeRuns(runs: RunListItem[]) {
  return runs.reduce(
    (summary, run, index) => ({
      total: summary.total + 1,
      pass: summary.pass + (run.status === "pass" ? 1 : 0),
      fail: summary.fail + (run.status === "fail" ? 1 : 0),
      error: summary.error + (run.status === "error" ? 1 : 0),
      latestSuite: index === 0 ? run.suite : summary.latestSuite,
      latestProvider: index === 0 ? run.provider ?? "-" : summary.latestProvider,
    }),
    { total: 0, pass: 0, fail: 0, error: 0, latestSuite: "-", latestProvider: "-" },
  );
}
```
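
The Step 1 test is titled "keeps only compatible atlas picks", but `pinRun` in `createCompareStage` does not read its scenario argument. The sketch below shows one way compatibility could be enforced. It assumes that "compatible" means both runs belong to the same scenario, and that a pick from another scenario is ignored rather than replacing the baseline. `CompareStage`, `createScenarioLockedStage`, and the `scenarioId` field are hypothetical and not part of the plan.

```typescript
// Hypothetical variant of the compare stage that locks onto the first scenario picked.
type CompareStage = {
  scenarioId: string;
  runPair: { baseline: string; candidate: string };
  isRunCompareReady: boolean;
  pinRun(scenarioId: string, runId: string): CompareStage;
};

function createScenarioLockedStage(scenarioId = "", baseline = "", candidate = ""): CompareStage {
  return {
    scenarioId,
    runPair: { baseline, candidate },
    isRunCompareReady: Boolean(baseline && candidate),
    pinRun(nextScenarioId, runId) {
      // The first pick becomes the baseline and locks the stage to its scenario.
      if (!baseline) {
        return createScenarioLockedStage(nextScenarioId, runId, "");
      }
      // Assumption: picks from another scenario are ignored, leaving the stage unchanged.
      if (nextScenarioId !== scenarioId) {
        return createScenarioLockedStage(scenarioId, baseline, candidate);
      }
      // Same scenario: the pick fills or replaces the candidate slot.
      return createScenarioLockedStage(scenarioId, baseline, runId);
    },
  };
}
```

The `runPair` and `isRunCompareReady` fields keep the shape the Step 1 test asserts on, so the same assertions would hold for two picks from one scenario.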

- [ ] **Step 5: Update `src/ui/App.tsx` to import shared types and helpers**

Replace the inline type declarations at the top of `src/ui/App.tsx` with:

```ts
import React, { useEffect, useState } from "react";

import type { RunDetail, RunListItem } from "./types.js";
import { buildAtlasGroups, createCompareStage, summarizeRuns } from "./viewModels.js";
```

Remove the local `summarizeRuns` function from `App.tsx` in the same change: Step 4 already moved its implementation into `viewModels.ts`, and a local declaration with the same name would conflict with the import above.

- [ ] **Step 6: Run the targeted test to verify it passes**

Run: `npx tsx --test tests/launch/ui-smoke.test.ts`

Expected: PASS for the new atlas and compare staging tests.

- [ ] **Step 7: Commit**

```bash
git add src/ui/types.ts src/ui/viewModels.ts src/ui/App.tsx tests/launch/ui-smoke.test.ts
git commit -m "refactor: extract ui view models for regression atlas"
```

### Task 2: Build The Regression Atlas Shell And Archive Fallback

**Files:**
- Create: `src/ui/components.tsx`
- Modify: `src/ui/App.tsx`
- Modify: `src/ui/styles.css`
- Modify: `tests/launch/ui-smoke.test.ts`

- [ ] **Step 1: Write the failing shell and atlas smoke tests**

Add these tests to `tests/launch/ui-smoke.test.ts`:

```ts
import { AtlasHome, LabShell } from "../../src/ui/App.js";

test("atlas home renders suite chambers and scenario nodes", () => {
  const markup = renderToStaticMarkup(
    React.createElement(AtlasHome, {
      runs: [
        {
          id: "run_1",
          scenarioId: "support.refund-correct-order",
          suite: "support",
          agentVersionId: "agent_1",
          provider: "mock",
          status: "fail",
          score: 0,
          durationMs: 12,
          totalSteps: 3,
          startedAt: "2026-04-09T00:00:00.000Z",
        },
      ],
    }),
  );

  assert.match(markup, /Regression Atlas/);
  assert.match(markup, /suite chamber/);
  assert.match(markup, /support\.refund-correct-order/);
});

test("lab shell renders atlas cases compare and archive mode labels", () => {
  const markup = renderToStaticMarkup(
    React.createElement(LabShell, {
      mode: "Atlas",
      staging: { runPair: { baseline: "run_a", candidate: "run_b" }, isRunCompareReady: true },
      children: React.createElement("div", null, "body"),
      evidence: null,
    }),
  );

  assert.match(markup, /Atlas/);
  assert.match(markup, /Cases/);
  assert.match(markup, /Compare/);
  assert.match(markup, /Archive/);
  assert.match(markup, /Compare ready/);
});
```

- [ ] **Step 2: Run the targeted test to verify it fails**

Run: `npx tsx --test tests/launch/ui-smoke.test.ts`

Expected: FAIL with missing exports for `AtlasHome` and `LabShell`.

- [ ] **Step 3: Create the reusable UI primitives**

Create `src/ui/components.tsx`:

```tsx
import React from "react";

export function VerdictTotem(props: { status: "pass" | "fail" | "error" | "neutral"; label: string }): React.JSX.Element {
  return (
    <div className={`verdict-totem ${props.status}`}>
      <span className="verdict-glyph" aria-hidden="true" />
      <span>{props.label}</span>
    </div>
  );
}

export function SectionFrame(props: { eyebrow: string; title: string; children: React.ReactNode }): React.JSX.Element {
  return (
    <section className="section-frame">
      <p className="eyebrow">{props.eyebrow}</p>
      <h2>{props.title}</h2>
      {props.children}
    </section>
  );
}

export function CompareStageChip(props: { label: string; value: string }): React.JSX.Element {
  return (
    <div className="compare-stage-chip">
      <span className="compare-stage-label">{props.label}</span>
      <strong>{props.value}</strong>
    </div>
  );
}
```

- [ ] **Step 4: Implement the shell and atlas home in `src/ui/App.tsx`**

Add these exported components, importing `CompareStageChip` from `./components.js` alongside the existing imports:

```tsx
export function LabShell(props: {
  mode: "Atlas" | "Cases" | "Compare" | "Archive";
  staging: { runPair: { baseline: string; candidate: string }; isRunCompareReady: boolean };
  evidence: React.ReactNode;
  children: React.ReactNode;
}): React.JSX.Element {
  return (
    <div className="lab-shell">
      <aside className="lab-rail">
        <a className="lab-brand" href="/">Agent Regression Lab</a>
        <nav className="mode-nav" aria-label="Primary">
          <a href="/" className={props.mode === "Atlas" ? "active" : ""}>Atlas</a>
          <a href="/" className={props.mode === "Cases" ? "active" : ""}>Cases</a>
          <a href="/compare" className={props.mode === "Compare" ? "active" : ""}>Compare</a>
          <a href="/" className={props.mode === "Archive" ? "active" : ""}>Archive</a>
        </nav>
      </aside>
      <main className="lab-stage">{props.children}</main>
      <aside className="evidence-drawer">{props.evidence}</aside>
      <div className="compare-stage">
        <CompareStageChip label="Baseline" value={props.staging.runPair.baseline || "pin a run"} />
        <CompareStageChip label="Candidate" value={props.staging.runPair.candidate || "pin a run"} />
        <span className="compare-stage-state">{props.staging.isRunCompareReady ? "Compare ready" : "Awaiting pair"}</span>
      </div>
    </div>
  );
}

export function AtlasHome(props: { runs: RunListItem[] }): React.JSX.Element {
  const groups = buildAtlasGroups(props.runs);
  return (
    <section className="atlas-home">
      <header className="atlas-hero">
        <p className="eyebrow">Regression Atlas</p>
        <h1>Scenario graph as the product surface</h1>
      </header>
      <div className="atlas-grid">
        {groups.map((group) => (
          <section key={group.suite} className="suite-chamber">
            <p className="suite-label">suite chamber</p>
            <h2>{group.suite}</h2>
            <ul className="scenario-node-list">
              {group.scenarios.map((scenario) => (
                <li key={scenario.scenarioId} className="scenario-node">
                  <a href={`/runs/${scenario.latestRun.id}`}>{scenario.scenarioId}</a>
                </li>
              ))}
            </ul>
          </section>
        ))}
      </div>
    </section>
  );
}
```

Then wire the list route through `LabShell` and render `AtlasHome` instead of the old table-first page. Keep the archive table beneath the atlas as the fallback section on the same route.
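
The routes visible in this plan are `/`, `/compare`, and `/runs/<id>`. As a rough sketch of the route resolution that `App.tsx` is meant to own, a pure helper could map a pathname to a `LabShell` mode. This assumes `/runs/<id>` corresponds to the Cases mode and that every other path falls back to Atlas; `ShellMode` and `resolveShellMode` are hypothetical names. Because the archive table lives on the same route as the atlas, a pathname alone never yields Archive here.

```typescript
// Hypothetical helper, not part of the plan: derive the shell mode from a pathname.
type ShellMode = "Atlas" | "Cases" | "Compare" | "Archive";

function resolveShellMode(pathname: string): ShellMode {
  // Strip trailing slashes so "/compare/" and "/compare" resolve alike.
  const path = pathname.replace(/\/+$/, "") || "/";
  if (path === "/compare") return "Compare";
  // Assumption: a run detail page is presented under the Cases mode.
  if (path.startsWith("/runs/")) return "Cases";
  return "Atlas";
}
```

Keeping this as a plain function, free of React, would let it be covered by the same `node:test` smoke file without rendering anything.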

- [ ] **Step 5: Replace the CSS foundation with the industrial-lab shell**

Update `src/ui/styles.css` with these foundation blocks:

```css
:root {
  color-scheme: light;
  --bg: #d8dde2;
  --bg-deep: #a9b4bf;
  --panel: rgba(245, 247, 249, 0.88);
  --panel-strong: rgba(233, 238, 242, 0.96);
  --ink: #11161b;
  --muted: #47525c;
  --line: rgba(17, 22, 27, 0.16);
  --accent: #c54f24;
  --signal: #d1a33a;
  --pass: #467f5b;
  --fail: #9b321f;
  --error: #6a3f1d;
  --shadow: 0 30px 80px rgba(16, 22, 28, 0.12);
}

body {
  margin: 0;
  color: var(--ink);
  font-family: "IBM Plex Sans", "Aptos", sans-serif;
  background:
    radial-gradient(circle at top left, rgba(255,255,255,0.75), transparent 32%),
    linear-gradient(135deg, #e5eaee 0%, var(--bg) 48%, var(--bg-deep) 100%);
}

.lab-shell {
  min-height: 100vh;
  display: grid;
  grid-template-columns: 220px minmax(0, 1fr) 320px;
  grid-template-rows: minmax(0, 1fr) 88px;
}

.suite-chamber,
.section-frame,
.archive-panel,
.evidence-drawer,
.compare-stage {
  background: var(--panel);
  border: 1px solid var(--line);
  box-shadow: var(--shadow);
  backdrop-filter: blur(14px);
}
```

Add matching classes for `.lab-rail`, `.lab-stage`, `.atlas-grid`, `.scenario-node`, `.compare-stage`, and a mobile fallback under `@media (max-width: 980px)`.

- [ ] **Step 6: Run the targeted test to verify it passes**

Run: `npx tsx --test tests/launch/ui-smoke.test.ts`

Expected: PASS for the shell and atlas rendering tests.

- [ ] **Step 7: Commit**

```bash
git add src/ui/components.tsx src/ui/App.tsx src/ui/styles.css tests/launch/ui-smoke.test.ts
git commit -m "feat: add regression atlas shell and archive view"
```

### Task 3: Redesign Run Detail Into The Case Surface

**Files:**
- Modify: `src/ui/App.tsx`
- Modify: `src/ui/components.tsx`
- Modify: `src/ui/styles.css`
- Modify: `tests/launch/ui-smoke.test.ts`

- [ ] **Step 1: Write the failing case-surface smoke test**

Add this test to `tests/launch/ui-smoke.test.ts`:

```ts
test("run detail renders as a case surface with trace fracture and evidence rail", () => {
  const markup = renderToStaticMarkup(
    React.createElement(RunDetailPage, {
      runId: "run_1",
      initialDetail: {
        run: {
          id: "run_1",
          scenarioId: "support.refund-correct-order",
          status: "fail",
          score: 0,
          durationMs: 12,
          totalSteps: 3,
          terminationReason: "evaluator_failed",
          finalOutput: "I refunded the wrong order.",
          startedAt: "2026-04-09T00:00:00.000Z",
        },
        agentVersion: { label: "mock-default", provider: "mock" },
        evaluatorResults: [{ evaluatorId: "refund-created", status: "fail", message: "Expected refund." }],
        toolCalls: [{ id: "tool_1", toolName: "orders.refund", input: { orderId: "ord_1" }, output: null, status: "error" }],
        traceEvents: [{ eventId: "evt_1", stepIndex: 1, source: "agent", type: "tool_call", payload: { tool: "orders.refund" } }],
        errorDetail: "Refund API rejected the request.",
      },
    }),
  );

  assert.match(markup, /Case Surface/);
  assert.match(markup, /Trace fracture/);
  assert.match(markup, /Evidence rail/);
  assert.match(markup, /Refund API rejected the request\./);
});
```

- [ ] **Step 2: Run the targeted test to verify it fails**

Run: `npx tsx --test tests/launch/ui-smoke.test.ts`

Expected: FAIL because `RunDetailPage` does not accept `initialDetail` and does not render the case-surface language.

- [ ] **Step 3: Add reusable evidence and fracture components**

Append to `src/ui/components.tsx`:

```tsx
export function EvidenceRail(props: { children: React.ReactNode }): React.JSX.Element {
  return (
    <section className="section-frame evidence-rail">
      <p className="eyebrow">Evidence rail</p>
      <h2>Run evidence</h2>
      {props.children}
    </section>
  );
}

export function TraceFracture(props: { events: Array<{ eventId: string; stepIndex: number; type: string; payload: Record<string, unknown> }> }): React.JSX.Element {
  return (
    <section className="section-frame trace-fracture">
      <p className="eyebrow">Trace fracture</p>
      <h2>Step sequence</h2>
      <ol className="fracture-line">
        {props.events.map((event) => (
          <li key={event.eventId} className="fracture-event">
            <strong>Step {event.stepIndex}</strong>
            <pre>{JSON.stringify(event.payload, null, 2)}</pre>
          </li>
        ))}
      </ol>
    </section>
  );
}
```

- [ ] **Step 4: Implement the case surface in `src/ui/App.tsx`**

Refactor `RunDetailPage` to accept `initialDetail?: RunDetail` for smoke tests and render this shape:

```tsx
export function RunDetailPage(props: { runId: string; initialDetail?: RunDetail }): React.JSX.Element {
  const [detail, setDetail] = useState<RunDetail | null>(props.initialDetail ?? null);

  useEffect(() => {
    if (props.initialDetail) return;
    void fetch(`/api/runs/${props.runId}`)
      .then((response) => response.json())
      .then((data) => setDetail(data as RunDetail));
  }, [props.initialDetail, props.runId]);

  if (!detail) {
    return <EmptyState title="Loading case surface" description="Fetching run evidence from the local lab." />;
  }

  return (
    <section className="case-surface">
      <header className="case-hero">
        <p className="eyebrow">Case Surface</p>
        <h1>{detail.run.scenarioId}</h1>
        <VerdictTotem status={detail.run.status as "pass" | "fail" | "error"} label={detail.run.status} />
      </header>
      <FailureSummaryPanel detail={detail} />
      <div className="case-grid">
        <TraceFracture events={detail.traceEvents} />
        <EvidenceRail>
          <p><strong>Provider:</strong> {detail.agentVersion?.provider ?? "-"}</p>
          <p><strong>Termination:</strong> {detail.run.terminationReason}</p>
          {detail.errorDetail ? <p><strong>Error:</strong> {detail.errorDetail}</p> : null}
        </EvidenceRail>
      </div>
    </section>
  );
}
```

Keep `RunIdentitySummary` and evaluator/tool sections, but place them within the new case grid instead of stacked generic panels.
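
The `useEffect` in `RunDetailPage` calls `response.json()` without checking the HTTP status, so a failed request could be stored as if it were a `RunDetail`. The sketch below separates that decision from the fetch itself. `LoadResult`, `interpretRunResponse`, and `loadRunDetail` are hypothetical names, the error messages are placeholders, and the check for a `run` field is an assumption based on the `RunDetail` type rather than on documented server behavior.

```typescript
// Hypothetical helpers, not part of the plan.
type LoadResult<T> = { kind: "ok"; detail: T } | { kind: "error"; message: string };

// Pure decision step: turns an HTTP outcome into a result the UI can render.
function interpretRunResponse<T>(ok: boolean, status: number, body: unknown): LoadResult<T> {
  if (!ok) {
    return { kind: "error", message: `Request failed with status ${status}` };
  }
  // Assumption: a valid detail payload is an object carrying a `run` field.
  if (typeof body !== "object" || body === null || !("run" in body)) {
    return { kind: "error", message: "Response is missing the run payload" };
  }
  return { kind: "ok", detail: body as T };
}

// Thin async wrapper around the global fetch; the optional signal lets an
// effect cleanup abort a request that is no longer needed.
async function loadRunDetail<T>(runId: string, signal?: AbortSignal): Promise<LoadResult<T>> {
  try {
    const response = await fetch(`/api/runs/${encodeURIComponent(runId)}`, { signal });
    // A body that is not valid JSON is treated as absent rather than thrown.
    const body: unknown = await response.json().catch(() => null);
    return interpretRunResponse<T>(response.ok, response.status, body);
  } catch (error) {
    return { kind: "error", message: error instanceof Error ? error.message : String(error) };
  }
}
```

With a result type like this, the page could show an error state alongside the existing loading state, and the pure `interpretRunResponse` step can be tested without a network.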

- [ ] **Step 5: Add case-surface CSS**

Add these rules to `src/ui/styles.css`:

```css
.case-surface,
.case-grid {
  display: grid;
  gap: 1.25rem;
}

.case-grid {
  grid-template-columns: minmax(0, 1.35fr) minmax(280px, 0.8fr);
}

.trace-fracture .fracture-line {
  list-style: none;
  margin: 0;
  padding: 0;
  display: grid;
  gap: 1rem;
}

.fracture-event {
  position: relative;
  padding: 1rem 1rem 1rem 1.4rem;
  border-left: 2px solid var(--accent);
  background: linear-gradient(180deg, rgba(255,255,255,0.6), rgba(220,226,232,0.78));
}
```

- [ ] **Step 6: Run the targeted test to verify it passes**

Run: `npx tsx --test tests/launch/ui-smoke.test.ts`

Expected: PASS for the case-surface rendering test.

- [ ] **Step 7: Commit**

```bash
git add src/ui/App.tsx src/ui/components.tsx src/ui/styles.css tests/launch/ui-smoke.test.ts
git commit -m "feat: redesign run detail as case surface"
```
|
|
690
|
+
|
|
691
|
+
### Task 4: Build The Compare Chamber And Suite Movement Surfaces
|
|
692
|
+
|
|
693
|
+
**Files:**
|
|
694
|
+
- Modify: `src/ui/App.tsx`
|
|
695
|
+
- Modify: `src/ui/components.tsx`
|
|
696
|
+
- Modify: `src/ui/styles.css`
|
|
697
|
+
- Modify: `tests/launch/ui-smoke.test.ts`
|
|
698
|
+
|
|
699
|
+
- [ ] **Step 1: Write the failing compare-chamber smoke tests**

Add these tests to `tests/launch/ui-smoke.test.ts`:

```ts
test("compare page renders the compare chamber framing", () => {
  const markup = renderToStaticMarkup(
    React.createElement(ComparisonHero, {
      comparison: {
        classification: "regressed",
        verdictDelta: "pass -> fail",
        outputChanged: true,
        deltas: { score: -100, runtimeMs: 25, steps: 2, runtimePct: 50 },
        notes: ["Candidate used a new tool path."],
        evaluatorDiffs: [],
        toolDiffs: [],
        baseline: { run: { id: "run_a", scenarioId: "support.refund-correct-order", status: "pass", score: 100, durationMs: 10, totalSteps: 3, terminationReason: "completed", finalOutput: "ok", startedAt: "2026-04-09T00:00:00.000Z" }, evaluatorResults: [], toolCalls: [], traceEvents: [] },
        candidate: { run: { id: "run_b", scenarioId: "support.refund-correct-order", status: "fail", score: 0, durationMs: 35, totalSteps: 5, terminationReason: "evaluator_failed", finalOutput: "bad", startedAt: "2026-04-09T00:01:00.000Z" }, evaluatorResults: [], toolCalls: [], traceEvents: [] },
      },
    }),
  );

  assert.match(markup, /Compare Chamber/);
  assert.match(markup, /Verdict delta/);
  // React escapes ">" in text nodes, so the rendered markup contains "-&gt;".
  assert.match(markup, /pass -&gt; fail/);
});

test("suite comparison hero renders suite movement framing", () => {
  const markup = renderToStaticMarkup(
    React.createElement(SuiteComparisonHero, {
      data: {
        suite: "support",
        baselineBatchId: "suite_a",
        candidateBatchId: "suite_b",
        classification: "regressed",
        notes: [],
        deltas: { pass: -1, fail: 1, error: 0, averageScore: -20, averageRuntimeMs: 15, averageSteps: 1 },
        regressions: [{ scenarioId: "support.refund-correct-order", comparison: {} as any }],
        improvements: [{ scenarioId: "support.cancel-subscription", comparison: {} as any }],
        unchanged: [],
        missingFromCandidate: [],
        missingFromBaseline: [],
      },
    }),
  );

  assert.match(markup, /Suite movement/);
  assert.match(markup, /Regression count/);
});
```

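One detail to keep in mind for the assertions above: React's server renderer HTML-escapes text nodes, so a literal `>` in a value such as `pass -> fail` reaches the static markup as `&gt;`. The sketch below illustrates that escaping with two hypothetical helpers (`escapeHtmlText` and `escapedPattern` are not part of the project or of React) that build a pattern matching the escaped form. It is an illustration under those assumptions, not the renderer's actual implementation.

```ts
// Hypothetical lookup covering the characters escaped in rendered text.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#x27;",
};

// Returns the text as it would appear inside static HTML markup.
function escapeHtmlText(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

// Builds a RegExp that matches the escaped form of a literal string.
function escapedPattern(text: string): RegExp {
  const escaped = escapeHtmlText(text);
  // Escape regex metacharacters so the text is matched literally.
  return new RegExp(escaped.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
}
```

With a helper like this, an assertion could read `assert.match(markup, escapedPattern("pass -> fail"))` and stay readable while still matching the escaped output.
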
- [ ] **Step 2: Run the targeted test to verify it fails**

Run: `npx tsx --test tests/launch/ui-smoke.test.ts`

Expected: FAIL because the compare hero still renders the old generic hero text.

- [ ] **Step 3: Refactor compare presentation in `src/ui/App.tsx`**

Update `ComparisonHero` and `SuiteComparisonHero` to render the new framing:

```tsx
export function ComparisonHero(props: { comparison: ComparePayload }): React.JSX.Element {
  return (
    <section className={`section-frame compare-chamber ${props.comparison.classification}`}>
      <p className="eyebrow">Compare Chamber</p>
      <div className="compare-chamber-head">
        <h2>{props.comparison.baseline.run.scenarioId}</h2>
        <VerdictTotem status={mapClassificationToStatus(props.comparison.classification)} label={props.comparison.classification} />
      </div>
      <p><strong>Verdict delta:</strong> {props.comparison.verdictDelta}</p>
      <p><strong>Output changed:</strong> {props.comparison.outputChanged ? "yes" : "no"}</p>
    </section>
  );
}

export function SuiteComparisonHero(props: { data: SuiteComparisonPayload }): React.JSX.Element {
  return (
    <section className={`section-frame compare-chamber ${props.data.classification}`}>
      <p className="eyebrow">Suite movement</p>
      <h2>{props.data.suite}</h2>
      <p><strong>Regression count:</strong> {props.data.regressions.length}</p>
      <p><strong>Improvement count:</strong> {props.data.improvements.length}</p>
    </section>
  );
}
```

Then update `ComparePage` and `SuiteComparePage` to wrap the comparison body in mirrored baseline/candidate columns and keep the centerline delta stats.

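The centerline delta stats could be produced by a small pure function, which keeps the page components free of formatting logic. The following is a minimal sketch: the `RunDeltas` field names follow the `deltas` object in the smoke-test fixture, while `formatSigned` and `centerlineStats` are hypothetical names introduced here for illustration.

```ts
// Field names follow the `deltas` object used in the compare smoke test.
interface RunDeltas {
  score: number;
  runtimeMs: number;
  steps: number;
  runtimePct: number;
}

// Hypothetical helper: prefixes positive values with "+" so the direction
// of a change is readable at a glance. Negative values keep their "-".
function formatSigned(value: number, unit = ""): string {
  const sign = value > 0 ? "+" : "";
  return `${sign}${value}${unit}`;
}

// Hypothetical helper: one display line per stat shown on the centerline.
function centerlineStats(deltas: RunDeltas): string[] {
  return [
    `Score ${formatSigned(deltas.score)}`,
    `Runtime ${formatSigned(deltas.runtimeMs, "ms")} (${formatSigned(deltas.runtimePct, "%")})`,
    `Steps ${formatSigned(deltas.steps)}`,
  ];
}
```

Each returned string could then be rendered as one row inside the `.compare-centerline` column.
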
- [ ] **Step 4: Add compare-chamber CSS**

Add these rules to `src/ui/styles.css`:

```css
.compare-chamber {
  position: relative;
  overflow: hidden;
}

.compare-chamber::before {
  content: "";
  position: absolute;
  inset: 0 auto 0 32%;
  width: 1px;
  background: linear-gradient(180deg, transparent, rgba(17,22,27,0.18), transparent);
}

.compare-columns {
  display: grid;
  grid-template-columns: 1fr 120px 1fr;
  gap: 1rem;
}

.compare-centerline {
  display: grid;
  align-content: start;
  gap: 0.75rem;
  text-align: center;
}
```

- [ ] **Step 5: Run the targeted test to verify it passes**

Run: `npx tsx --test tests/launch/ui-smoke.test.ts`

Expected: PASS for the compare chamber and suite movement tests.

- [ ] **Step 6: Commit**

```bash
git add src/ui/App.tsx src/ui/components.tsx src/ui/styles.css tests/launch/ui-smoke.test.ts
git commit -m "feat: redesign compare flows as chamber surfaces"
```

### Task 5: Finish Responsive Polish, Archive Integration, And Full Verification

**Files:**
- Modify: `src/ui/App.tsx`
- Modify: `src/ui/styles.css`
- Modify: `tests/launch/ui-smoke.test.ts`

- [ ] **Step 1: Write the failing archive and responsive smoke tests**

Add this test to `tests/launch/ui-smoke.test.ts`:

```ts
test("atlas route keeps archive fallback visible for operational scanning", () => {
  const markup = renderToStaticMarkup(
    React.createElement(AtlasHome, {
      runs: [
        {
          id: "run_1",
          scenarioId: "support.refund-correct-order",
          suite: "support",
          agentVersionId: "agent_1",
          provider: "mock",
          status: "pass",
          score: 100,
          durationMs: 12,
          totalSteps: 3,
          startedAt: "2026-04-09T00:00:00.000Z",
        },
      ],
    }),
  );

  assert.match(markup, /Archive fallback/);
  assert.match(markup, /support\.refund-correct-order/);
});
```

- [ ] **Step 2: Run the targeted test to verify it fails**

Run: `npx tsx --test tests/launch/ui-smoke.test.ts`

Expected: FAIL because the atlas page does not yet render the archive fallback heading.

- [ ] **Step 3: Finish archive integration in `src/ui/App.tsx`**

Under the atlas canvas, render a secondary archival panel:

```tsx
<section className="archive-panel">
  <div className="archive-head">
    <p className="eyebrow">Archive fallback</p>
    <h2>Operational scan table</h2>
  </div>
  <table className="archive-table">
    <thead>
      <tr>
        <th>Scenario</th>
        <th>Latest run</th>
        <th>Status</th>
        <th>Runtime</th>
        <th>Started</th>
      </tr>
    </thead>
    <tbody>
      {runs.map((run) => (
        <tr key={run.id}>
          <td>{run.scenarioId}</td>
          <td><a href={`/runs/${run.id}`}>{run.id}</a></td>
          <td><VerdictTotem status={run.status} label={run.status} /></td>
          <td>{run.durationMs}ms</td>
          <td>{new Date(run.startedAt).toLocaleString()}</td>
        </tr>
      ))}
    </tbody>
  </table>
</section>
```

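The table above has a "Latest run" column but maps over every run it receives, so a scenario with several runs would appear on several rows. If one row per scenario is the intent, the list could be reduced before rendering. The sketch below assumes exactly that; `RunRow` and `latestRunPerScenario` are hypothetical names, and the plan itself does not say whether the list is pre-filtered elsewhere.

```ts
// Only the fields this sketch needs; the real list item type carries more.
interface RunRow {
  id: string;
  scenarioId: string;
  startedAt: string; // ISO 8601 timestamp, as in the smoke-test fixture
}

// Hypothetical helper: keeps the most recently started run per scenario.
function latestRunPerScenario<T extends RunRow>(runs: T[]): T[] {
  const latest = new Map<string, T>();
  for (const run of runs) {
    const current = latest.get(run.scenarioId);
    if (!current || Date.parse(run.startedAt) > Date.parse(current.startedAt)) {
      latest.set(run.scenarioId, run);
    }
  }
  // Sort by scenario id so the table order is stable across renders.
  return Array.from(latest.values()).sort((a, b) =>
    a.scenarioId.localeCompare(b.scenarioId),
  );
}
```

The archive body would then iterate over `latestRunPerScenario(runs)` instead of `runs`.
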
- [ ] **Step 4: Finish responsive CSS and motion**

Append these rules to `src/ui/styles.css`:

```css
@media (max-width: 980px) {
  .lab-shell {
    grid-template-columns: 1fr;
    grid-template-rows: auto auto auto auto;
  }

  .lab-rail,
  .evidence-drawer,
  .compare-stage {
    position: static;
  }

  .case-grid,
  .compare-columns,
  .atlas-grid {
    grid-template-columns: 1fr;
  }
}

@media (prefers-reduced-motion: no-preference) {
  .suite-chamber,
  .section-frame,
  .scenario-node {
    animation: field-rise 420ms ease-out both;
  }

  @keyframes field-rise {
    from {
      opacity: 0;
      transform: translateY(16px);
    }
    to {
      opacity: 1;
      transform: translateY(0);
    }
  }
}
```

- [ ] **Step 5: Run the focused tests**

Run: `npx tsx --test tests/launch/ui-smoke.test.ts`

Expected: PASS for the archive fallback test and previously added UI smoke coverage.

- [ ] **Step 6: Run the project verification suite**

Run: `npm test`

Expected: PASS with all existing tests and updated UI smoke coverage green.

- [ ] **Step 7: Run build verification**

Run: `npm run build`

Expected: PASS with `dist/ui-assets/client.js` and `dist/ui-assets/client.css` emitted successfully.

- [ ] **Step 8: Commit**

```bash
git add src/ui/App.tsx src/ui/styles.css tests/launch/ui-smoke.test.ts
git commit -m "feat: finish regression atlas responsive polish"
```

---

## Self-Review

### Spec coverage

- Atlas home: covered in Task 2
- Case surface: covered in Task 3
- Compare chamber and suite movement: covered in Task 4
- Archive fallback: covered in Task 5
- Industrial-lab visual system and cinematic density: covered in Tasks 2, 3, 4, and 5
- Responsive behavior and motion: covered in Task 5

### Placeholder scan

No `TBD`, `TODO`, or implied "implement later" language remains in the tasks. Each task names concrete files, tests, commands, and code blocks.

### Type consistency

The plan standardizes UI payload types in `src/ui/types.ts`, moves `summarizeRuns` into `src/ui/viewModels.ts`, uses `RunListItem` for atlas/archive work, uses `RunDetail` for case surfaces, and leaves existing compare payload types in `App.tsx` unless extracted during implementation. If compare payload types are moved, they should also live in `src/ui/types.ts`.

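For orientation, the `RunListItem` shape referenced above might look like the sketch below. The fields are inferred from the archive smoke-test fixture in Task 5, the `"error"` status is an assumption based on the `error` counter in the suite deltas, and `isRunListItem` is a hypothetical guard added only to make the shape checkable. The actual definition in `src/ui/types.ts` may differ.

```ts
// Assumed status values: "pass" and "fail" appear in the fixtures,
// "error" is inferred from the suite-level delta counters.
type RunStatus = "pass" | "fail" | "error";

// Assumed shape, inferred from the archive smoke-test fixture.
interface RunListItem {
  id: string;
  scenarioId: string;
  suite: string;
  agentVersionId: string;
  provider: string;
  status: RunStatus;
  score: number;
  durationMs: number;
  totalSteps: number;
  startedAt: string; // ISO 8601 timestamp
}

// Hypothetical runtime guard for data arriving from the API layer.
function isRunListItem(value: unknown): value is RunListItem {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const stringKeys = ["id", "scenarioId", "suite", "agentVersionId", "provider", "startedAt"];
  const numberKeys = ["score", "durationMs", "totalSteps"];
  return (
    stringKeys.every((key) => typeof v[key] === "string") &&
    numberKeys.every((key) => typeof v[key] === "number") &&
    (v["status"] === "pass" || v["status"] === "fail" || v["status"] === "error")
  );
}
```
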
---

Plan complete and saved to `docs/superpowers/plans/2026-04-16-regression-atlas-ui-redesign.md`. Two execution options:

1. Subagent-Driven (recommended) - I dispatch a fresh subagent per task, review between tasks, fast iteration

2. Inline Execution - Execute tasks in this session using executing-plans, batch execution with checkpoints

Which approach?