agent-regression-lab 0.3.0 → 0.4.0
- package/README.md +25 -4
- package/bin/agentlab.js +2 -0
- package/dist/config.js +13 -9
- package/dist/index.js +14 -0
- package/dist/init.js +88 -0
- package/dist/tools.js +18 -2
- package/dist/ui/App.js +49 -7
- package/dist/ui-assets/client.css +92 -0
- package/dist/ui-assets/client.js +102 -20
- package/docs/coding-agents.md +74 -0
- package/docs/superpowers/plans/2026-04-13-phase-2-lite-phase-3-plan.md +160 -0
- package/docs/superpowers/plans/2026-04-13-phase-one-npm-tools-plan.md +502 -0
- package/docs/superpowers/specs/2026-04-13-phase-2-lite-phase-3-design.md +164 -0
- package/docs/tools.md +34 -3
- package/docs/troubleshooting.md +55 -0
- package/examples/coding-tools/README.md +21 -0
- package/examples/coding-tools/index.js +11 -0
- package/examples/coding-tools/package.json +8 -0
- package/examples/support-tools/README.md +21 -0
- package/examples/support-tools/index.js +8 -0
- package/examples/support-tools/package.json +8 -0
- package/package.json +6 -4
|
package/docs/superpowers/plans/2026-04-13-phase-one-npm-tools-plan.md
@@ -0,0 +1,502 @@
|
|
|
1
|
+
# Phase One Npm Tools Implementation Plan
|
|
2
|
+
|
|
3
|
+
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
|
4
|
+
|
|
5
|
+
**Goal:** Finish the Phase 1 friction-removal work by adding npm-installable tool support, shipping package-style example tools, and making the `init` and docs flow work cleanly for `npx` users.
|
|
6
|
+
|
|
7
|
+
**Architecture:** Keep existing repo-local `modulePath` tool loading intact and add an explicit package-based loading path via `package`. Validate that each tool registration uses exactly one source, resolve file-backed tools from the project root, and resolve package-backed tools through normal Node package resolution from the current project. Ship minimal example packages in-repo to demonstrate the supported authoring shape and update `init` plus docs to make the new flow the default recommendation for installed usage.
|
|
8
|
+
|
|
9
|
+
**Tech Stack:** TypeScript, Node.js ESM dynamic import, YAML config, node:test, tsx
|
|
10
|
+
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
## File Map
|
|
14
|
+
|
|
15
|
+
**Core types and config**
|
|
16
|
+
- Modify: `src/types.ts`
|
|
17
|
+
- Modify: `src/config.ts`
|
|
18
|
+
|
|
19
|
+
**Tool loading**
|
|
20
|
+
- Modify: `src/tools.ts`
|
|
21
|
+
|
|
22
|
+
**Init flow**
|
|
23
|
+
- Modify: `src/init.ts`
|
|
24
|
+
|
|
25
|
+
**Docs**
|
|
26
|
+
- Modify: `README.md`
|
|
27
|
+
- Modify: `docs/tools.md`
|
|
28
|
+
- Modify: `docs/troubleshooting.md`
|
|
29
|
+
- Modify: `.claude/project.md`
|
|
30
|
+
- Modify: `.claude/active-tasks.md`
|
|
31
|
+
|
|
32
|
+
**Examples**
|
|
33
|
+
- Create: `examples/support-tools/package.json`
|
|
34
|
+
- Create: `examples/support-tools/index.js`
|
|
35
|
+
- Create: `examples/support-tools/README.md`
|
|
36
|
+
- Create: `examples/coding-tools/package.json`
|
|
37
|
+
- Create: `examples/coding-tools/index.js`
|
|
38
|
+
- Create: `examples/coding-tools/README.md`
|
|
39
|
+
|
|
40
|
+
**Tests**
|
|
41
|
+
- Modify: `tests/config.tier-one.test.ts`
|
|
42
|
+
- Create: `tests/packageTools.test.ts`
|
|
43
|
+
- Modify: `tests/cliPackaging.test.ts`
|
|
44
|
+
- Modify: `tests/init.test.ts`
|
|
45
|
+
|
|
46
|
+
---
|
|
47
|
+
|
|
48
|
+
### Task 1: Add Explicit Package-Based Tool Registration
|
|
49
|
+
|
|
50
|
+
**Files:**
|
|
51
|
+
- Modify: `src/types.ts`
|
|
52
|
+
- Modify: `src/config.ts`
|
|
53
|
+
- Modify: `tests/config.tier-one.test.ts`
|
|
54
|
+
|
|
55
|
+
- [ ] **Step 1: Write failing config tests for package-backed tools**
|
|
56
|
+
|
|
57
|
+
Add tests covering:
|
|
58
|
+
|
|
59
|
+
```ts
|
|
60
|
+
test("config accepts a package-backed tool registration", () => {
|
|
61
|
+
// tool has package + exportName and no modulePath
|
|
62
|
+
});
|
|
63
|
+
|
|
64
|
+
test("config rejects a tool with both modulePath and package", () => {
|
|
65
|
+
assert.throws(() => loadAgentLabConfig(), /exactly one of 'modulePath' or 'package'/i);
|
|
66
|
+
});
|
|
67
|
+
|
|
68
|
+
test("config rejects a tool with neither modulePath nor package", () => {
|
|
69
|
+
assert.throws(() => loadAgentLabConfig(), /exactly one of 'modulePath' or 'package'/i);
|
|
70
|
+
});
|
|
71
|
+
```
|
|
72
|
+
|
|
73
|
+
- [ ] **Step 2: Run the config tests to verify failure**
|
|
74
|
+
|
|
75
|
+
Run: `npx tsx --test tests/config.tier-one.test.ts`
|
|
76
|
+
|
|
77
|
+
Expected: FAIL because `ToolRegistration` and `validateToolRegistration()` currently require `modulePath`.
|
|
78
|
+
|
|
79
|
+
- [ ] **Step 3: Extend the tool registration type**
|
|
80
|
+
|
|
81
|
+
Update `src/types.ts`:
|
|
82
|
+
|
|
83
|
+
```ts
|
|
84
|
+
export type ToolRegistration = ToolSpec & {
|
|
85
|
+
modulePath?: string;
|
|
86
|
+
package?: string;
|
|
87
|
+
exportName?: string;
|
|
88
|
+
};
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
- [ ] **Step 4: Update config validation for explicit source selection**
|
|
92
|
+
|
|
93
|
+
Update `validateToolRegistration()` in `src/config.ts` so it enforces:
|
|
94
|
+
|
|
95
|
+
```ts
|
|
96
|
+
const hasModulePath = typeof value.modulePath === "string" && value.modulePath.length > 0;
|
|
97
|
+
const hasPackage = typeof value.package === "string" && value.package.length > 0;
|
|
98
|
+
|
|
99
|
+
if ((hasModulePath ? 1 : 0) + (hasPackage ? 1 : 0) !== 1) {
|
|
100
|
+
throw new Error(`Tool '${value.name}' must define exactly one of 'modulePath' or 'package'.`);
|
|
101
|
+
}
|
|
102
|
+
```
|
|
103
|
+
|
|
104
|
+
Retain:
|
|
105
|
+
|
|
106
|
+
- required `name`
|
|
107
|
+
- required `exportName`
|
|
108
|
+
- required `description`
|
|
109
|
+
- required `inputSchema`
|
|
110
|
+
|
|
111
|
+
And only enforce repo-boundary + existence checks when `modulePath` is used.
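
The rules in this step can be sketched as one self-contained function. This is a minimal sketch, not the contents of `src/config.ts`: the `ToolRegistrationInput` type, the `isNonEmptyString` helper, and the error wording for the retained fields are assumptions made for illustration. Only the exactly-one check and its message come from the plan, and the repo-boundary and existence checks for `modulePath` are left out.

```ts
// Hypothetical input shape: raw config values before validation.
type ToolRegistrationInput = {
  name?: unknown;
  description?: unknown;
  inputSchema?: unknown;
  exportName?: unknown;
  modulePath?: unknown;
  package?: unknown;
};

// Hypothetical helper, not part of the plan.
function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.length > 0;
}

function validateToolRegistration(value: ToolRegistrationInput): void {
  // Retained checks: these fields stay required for both tool sources.
  if (!isNonEmptyString(value.name)) {
    throw new Error("Tool registration requires a non-empty 'name'.");
  }
  for (const field of ["exportName", "description"] as const) {
    if (!isNonEmptyString(value[field])) {
      throw new Error(`Tool '${value.name}' requires a non-empty '${field}'.`);
    }
  }
  if (typeof value.inputSchema !== "object" || value.inputSchema === null) {
    throw new Error(`Tool '${value.name}' requires an 'inputSchema' object.`);
  }

  // New check from the plan: exactly one source must be set.
  const hasModulePath = isNonEmptyString(value.modulePath);
  const hasPackage = isNonEmptyString(value.package);
  if (hasModulePath === hasPackage) {
    throw new Error(`Tool '${value.name}' must define exactly one of 'modulePath' or 'package'.`);
  }
}
```

For two booleans, `hasModulePath === hasPackage` is true exactly when both or neither are set, so it rejects the same inputs as the count-based comparison in the plan.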
|
|
112
|
+
|
|
113
|
+
- [ ] **Step 5: Re-run the config tests**
|
|
114
|
+
|
|
115
|
+
Run: `npx tsx --test tests/config.tier-one.test.ts`
|
|
116
|
+
|
|
117
|
+
Expected: PASS
|
|
118
|
+
|
|
119
|
+
- [ ] **Step 6: Commit**
|
|
120
|
+
|
|
121
|
+
```bash
|
|
122
|
+
git add src/types.ts src/config.ts tests/config.tier-one.test.ts
|
|
123
|
+
git commit -m "feat: support package-backed tool registrations"
|
|
124
|
+
```
|
|
125
|
+
|
|
126
|
+
---
|
|
127
|
+
|
|
128
|
+
### Task 2: Load Tools From Installed Packages
|
|
129
|
+
|
|
130
|
+
**Files:**
|
|
131
|
+
- Modify: `src/tools.ts`
|
|
132
|
+
- Create: `tests/packageTools.test.ts`
|
|
133
|
+
|
|
134
|
+
- [ ] **Step 1: Write failing loader tests for package-backed tools**
|
|
135
|
+
|
|
136
|
+
Create `tests/packageTools.test.ts` with coverage for:
|
|
137
|
+
|
|
138
|
+
```ts
|
|
139
|
+
test("loadToolRegistry loads a package-backed tool from node_modules", async () => {
|
|
140
|
+
// temp workspace with package tool registration and a stub package in node_modules
|
|
141
|
+
});
|
|
142
|
+
|
|
143
|
+
test("loadToolRegistry surfaces a clear error when package import fails", async () => {
|
|
144
|
+
// missing package should mention tool name and package name
|
|
145
|
+
});
|
|
146
|
+
|
|
147
|
+
test("loadToolRegistry still loads repo-local modulePath tools", async () => {
|
|
148
|
+
// regression guard for existing behavior
|
|
149
|
+
});
|
|
150
|
+
```
|
|
151
|
+
|
|
152
|
+
- [ ] **Step 2: Run the new loader tests to verify failure**
|
|
153
|
+
|
|
154
|
+
Run: `npx tsx --test tests/packageTools.test.ts`
|
|
155
|
+
|
|
156
|
+
Expected: FAIL because `loadConfiguredTool()` only imports `modulePath`.
|
|
157
|
+
|
|
158
|
+
- [ ] **Step 3: Refactor tool loading into source-specific helpers**
|
|
159
|
+
|
|
160
|
+
In `src/tools.ts`, split the loading path:
|
|
161
|
+
|
|
162
|
+
```ts
|
|
163
|
+
async function loadConfiguredTool(tool: ToolRegistration): Promise<LoadedTool> {
|
|
164
|
+
const module = tool.package
|
|
165
|
+
? await importConfiguredPackageTool(tool)
|
|
166
|
+
: await importConfiguredFileTool(tool);
|
|
167
|
+
|
|
168
|
+
const candidate = module[tool.exportName!];
|
|
169
|
+
if (typeof candidate !== "function") {
|
|
170
|
+
throw new Error(`Tool '${tool.name}' export '${tool.exportName}' is not a function.`);
|
|
171
|
+
}
|
|
172
|
+
|
|
173
|
+
return {
|
|
174
|
+
spec: {
|
|
175
|
+
name: tool.name,
|
|
176
|
+
description: tool.description,
|
|
177
|
+
inputSchema: tool.inputSchema,
|
|
178
|
+
},
|
|
179
|
+
handler: candidate as ToolHandler,
|
|
180
|
+
};
|
|
181
|
+
}
|
|
182
|
+
```
|
|
183
|
+
|
|
184
|
+
Add:
|
|
185
|
+
|
|
186
|
+
```ts
|
|
187
|
+
async function importConfiguredFileTool(tool: ToolRegistration) {
|
|
188
|
+
// resolve() is relative to process.cwd(), which is assumed to be the project root here
const moduleUrl = pathToFileURL(resolve(tool.modulePath!)).href;
|
|
189
|
+
return await import(moduleUrl);
|
|
190
|
+
}
|
|
191
|
+
|
|
192
|
+
async function importConfiguredPackageTool(tool: ToolRegistration) {
|
|
193
|
+
try {
|
|
194
|
+
return await import(tool.package!);
|
|
195
|
+
} catch (error) {
|
|
196
|
+
const message = error instanceof Error ? error.message : String(error);
|
|
197
|
+
throw new Error(`Tool '${tool.name}' failed to load package '${tool.package}': ${message}`);
|
|
198
|
+
}
|
|
199
|
+
}
|
|
200
|
+
```
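
One detail worth checking when implementing the package path: a bare `import(specifier)` is resolved relative to the module that calls it, which for an `npx` or globally installed CLI may not be the user's project. The sketch below shows one possible way to anchor resolution at the project root using `createRequire` from `node:module`. It is an assumption-laden illustration, not the plan's implementation: `resolvePackageFromProject`, the temp workspace, and the `stub-tools` package are all hypothetical. Because `require.resolve` applies CommonJS resolution conditions, a package that exposes only an `import` condition in `exports` would not resolve this way.

```ts
import { mkdirSync, mkdtempSync, writeFileSync } from "node:fs";
import { createRequire } from "node:module";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { pathToFileURL } from "node:url";

// Hypothetical helper (not in the plan): resolve `specifier` as if it were
// required from a file inside `projectRoot`, and return a file:// URL that
// can be passed to import().
function resolvePackageFromProject(projectRoot: string, specifier: string): string {
  // createRequire only needs an absolute path inside the project to anchor
  // resolution; the file itself does not have to exist.
  const projectRequire = createRequire(join(projectRoot, "package.json"));
  return pathToFileURL(projectRequire.resolve(specifier)).href;
}

// Demo: a throwaway workspace with a stub package in node_modules.
const workspace = mkdtempSync(join(tmpdir(), "agentlab-resolve-"));
const stubDir = join(workspace, "node_modules", "stub-tools");
mkdirSync(stubDir, { recursive: true });
writeFileSync(
  join(stubDir, "package.json"),
  JSON.stringify({ name: "stub-tools", type: "module", exports: { ".": "./index.js" } }),
);
writeFileSync(join(stubDir, "index.js"), "export const ping = () => 'pong';\n");

// The stub is found from the workspace, not from this file's own location.
const resolvedUrl = resolvePackageFromProject(workspace, "stub-tools");
console.log(resolvedUrl);
```

Under these assumptions, `importConfiguredPackageTool` could call `import(resolvePackageFromProject(process.cwd(), tool.package!))` inside its existing `try` block and keep the same error wrapping.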
|
|
201
|
+
|
|
202
|
+
- [ ] **Step 4: Re-run the loader tests**
|
|
203
|
+
|
|
204
|
+
Run: `npx tsx --test tests/packageTools.test.ts`
|
|
205
|
+
|
|
206
|
+
Expected: PASS
|
|
207
|
+
|
|
208
|
+
- [ ] **Step 5: Run existing tool-related regression tests**
|
|
209
|
+
|
|
210
|
+
Run: `npx tsx --test tests/benchmarkExpansion.test.ts tests/cliPackaging.test.ts`
|
|
211
|
+
|
|
212
|
+
Expected: PASS
|
|
213
|
+
|
|
214
|
+
- [ ] **Step 6: Commit**
|
|
215
|
+
|
|
216
|
+
```bash
|
|
217
|
+
git add src/tools.ts tests/packageTools.test.ts tests/cliPackaging.test.ts
|
|
218
|
+
git commit -m "feat: load custom tools from installed packages"
|
|
219
|
+
```
|
|
220
|
+
|
|
221
|
+
---
|
|
222
|
+
|
|
223
|
+
### Task 3: Update Init To Teach The Package-Tool Path
|
|
224
|
+
|
|
225
|
+
**Files:**
|
|
226
|
+
- Modify: `src/init.ts`
|
|
227
|
+
- Modify: `tests/init.test.ts`
|
|
228
|
+
|
|
229
|
+
- [ ] **Step 1: Write failing init tests**
|
|
230
|
+
|
|
231
|
+
Add or extend tests to assert the generated config includes comments for both styles:
|
|
232
|
+
|
|
233
|
+
```ts
|
|
234
|
+
assert.match(config, /modulePath: \.\/tools\/customTool\.ts/);
|
|
235
|
+
assert.match(config, /package: "@agentlab\/example-support-tools"/);
|
|
236
|
+
```
|
|
237
|
+
|
|
238
|
+
Also verify the next-step guidance mentions package installation:
|
|
239
|
+
|
|
240
|
+
```ts
|
|
241
|
+
assert.match(output, /npm install @agentlab\/example-support-tools/);
|
|
242
|
+
```
|
|
243
|
+
|
|
244
|
+
- [ ] **Step 2: Run init tests to verify failure**
|
|
245
|
+
|
|
246
|
+
Run: `npx tsx --test tests/init.test.ts`
|
|
247
|
+
|
|
248
|
+
Expected: FAIL because the current init template only documents repo-local paths.
|
|
249
|
+
|
|
250
|
+
- [ ] **Step 3: Update the generated config and next-step guidance**
|
|
251
|
+
|
|
252
|
+
Modify `src/init.ts` so `SAMPLE_CONFIG` shows:
|
|
253
|
+
|
|
254
|
+
```yaml
|
|
255
|
+
# Tools can come from:
|
|
256
|
+
# 1. repo-local files
|
|
257
|
+
# 2. installed npm packages
|
|
258
|
+
|
|
259
|
+
# tools:
|
|
260
|
+
# - name: my.local_tool
|
|
261
|
+
# modulePath: ./tools/customTool.ts
|
|
262
|
+
# exportName: customTool
|
|
263
|
+
#
|
|
264
|
+
# - name: support.find_duplicate_charge
|
|
265
|
+
# package: "@agentlab/example-support-tools"
|
|
266
|
+
# exportName: findDuplicateCharge
|
|
267
|
+
```
|
|
268
|
+
|
|
269
|
+
And update console output to include:
|
|
270
|
+
|
|
271
|
+
```ts
|
|
272
|
+
console.log(" npm install @agentlab/example-support-tools");
|
|
273
|
+
console.log(" # then register package-backed tools in agentlab.config.yaml");
|
|
274
|
+
```
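
As a quick consistency check between Step 1 and Step 3, the commented template and the test patterns can be put side by side. This is a sketch only: the real `SAMPLE_CONFIG` in `src/init.ts` presumably contains more than this commented block, and the `documentsLocalTool` and `documentsPackageTool` names are hypothetical.

```ts
// Only the commented tools section from the plan; the rest of the real
// template is not reproduced here.
const SAMPLE_CONFIG = [
  "# Tools can come from:",
  "# 1. repo-local files",
  "# 2. installed npm packages",
  "",
  "# tools:",
  "#   - name: my.local_tool",
  "#     modulePath: ./tools/customTool.ts",
  "#     exportName: customTool",
  "#",
  "#   - name: support.find_duplicate_charge",
  '#     package: "@agentlab/example-support-tools"',
  "#     exportName: findDuplicateCharge",
].join("\n");

// The same patterns the init tests assert against the generated config.
const documentsLocalTool = /modulePath: \.\/tools\/customTool\.ts/.test(SAMPLE_CONFIG);
const documentsPackageTool = /package: "@agentlab\/example-support-tools"/.test(SAMPLE_CONFIG);

console.log({ documentsLocalTool, documentsPackageTool });
```

Note that the second pattern requires the double quotes around the scoped package name, so the template has to keep them.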
|
|
275
|
+
|
|
276
|
+
- [ ] **Step 4: Re-run init tests**
|
|
277
|
+
|
|
278
|
+
Run: `npx tsx --test tests/init.test.ts`
|
|
279
|
+
|
|
280
|
+
Expected: PASS
|
|
281
|
+
|
|
282
|
+
- [ ] **Step 5: Commit**
|
|
283
|
+
|
|
284
|
+
```bash
|
|
285
|
+
git add src/init.ts tests/init.test.ts
|
|
286
|
+
git commit -m "docs: teach init flow about package-backed tools"
|
|
287
|
+
```
|
|
288
|
+
|
|
289
|
+
---
|
|
290
|
+
|
|
291
|
+
### Task 4: Add Minimal Example Tool Packages
|
|
292
|
+
|
|
293
|
+
**Files:**
|
|
294
|
+
- Create: `examples/support-tools/package.json`
|
|
295
|
+
- Create: `examples/support-tools/index.js`
|
|
296
|
+
- Create: `examples/support-tools/README.md`
|
|
297
|
+
- Create: `examples/coding-tools/package.json`
|
|
298
|
+
- Create: `examples/coding-tools/index.js`
|
|
299
|
+
- Create: `examples/coding-tools/README.md`
|
|
300
|
+
- Modify: `tests/packageTools.test.ts` (extend)
|
|
301
|
+
|
|
302
|
+
- [ ] **Step 1: Create the support tools example package**
|
|
303
|
+
|
|
304
|
+
Create `examples/support-tools/package.json`:
|
|
305
|
+
|
|
306
|
+
```json
|
|
307
|
+
{
|
|
308
|
+
"name": "@agentlab/example-support-tools",
|
|
309
|
+
"private": true,
|
|
310
|
+
"type": "module",
|
|
311
|
+
"exports": {
|
|
312
|
+
".": "./index.js"
|
|
313
|
+
}
|
|
314
|
+
}
|
|
315
|
+
```
|
|
316
|
+
|
|
317
|
+
Create `examples/support-tools/index.js` with a minimal exported function:
|
|
318
|
+
|
|
319
|
+
```js
|
|
320
|
+
export async function findDuplicateCharge(input) {
|
|
321
|
+
const customerId = String(input?.customer_id ?? "");
|
|
322
|
+
if (!customerId) {
|
|
323
|
+
throw new Error("customer_id is required");
|
|
324
|
+
}
|
|
325
|
+
return { order_id: `dup_${customerId}` };
|
|
326
|
+
}
|
|
327
|
+
```
|
|
328
|
+
|
|
329
|
+
- [ ] **Step 2: Create the coding tools example package**
|
|
330
|
+
|
|
331
|
+
Create `examples/coding-tools/package.json`:
|
|
332
|
+
|
|
333
|
+
```json
|
|
334
|
+
{
|
|
335
|
+
"name": "@agentlab/example-coding-tools",
|
|
336
|
+
"private": true,
|
|
337
|
+
"type": "module",
|
|
338
|
+
"exports": {
|
|
339
|
+
".": "./index.js"
|
|
340
|
+
}
|
|
341
|
+
}
|
|
342
|
+
```
|
|
343
|
+
|
|
344
|
+
Create `examples/coding-tools/index.js`:
|
|
345
|
+
|
|
346
|
+
```js
|
|
347
|
+
export async function readRepoHint(input) {
|
|
348
|
+
const path = String(input?.path ?? "");
|
|
349
|
+
if (!path) {
|
|
350
|
+
throw new Error("path is required");
|
|
351
|
+
}
|
|
352
|
+
return { path, hint: "Check the target file before editing." };
|
|
353
|
+
}
|
|
354
|
+
```
|
|
355
|
+
|
|
356
|
+
- [ ] **Step 3: Add short READMEs showing intended registration**
|
|
357
|
+
|
|
358
|
+
Each README should include an `agentlab.config.yaml` snippet using `package` + `exportName`.
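
Since Task 1 keeps `description` and `inputSchema` required, a README snippet likely needs them alongside `package` and `exportName`. The sketch below models one registration as a typed object and renders it as a config snippet. Treat it as illustrative: the `ToolSpec` field types, the description wording, the JSON-Schema-style `inputSchema`, and the `toYamlSnippet` helper are assumptions, and the nested values are emitted in JSON flow syntax, which YAML 1.2 parsers should accept.

```ts
// Field names come from the plan; the exact types are assumed.
type ToolSpec = {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
};

type ToolRegistration = ToolSpec & {
  modulePath?: string;
  package?: string;
  exportName?: string;
};

// Hypothetical registration for the support-tools example package.
const supportToolRegistration: ToolRegistration = {
  name: "support.find_duplicate_charge",
  description: "Find a duplicate charge for a customer.", // illustrative wording
  inputSchema: {
    type: "object",
    properties: { customer_id: { type: "string" } },
    required: ["customer_id"],
  },
  package: "@agentlab/example-support-tools",
  exportName: "findDuplicateCharge",
};

// Hypothetical helper: render one registration as an agentlab.config.yaml snippet.
function toYamlSnippet(tool: ToolRegistration): string {
  const source =
    tool.package !== undefined
      ? `package: ${JSON.stringify(tool.package)}`
      : `modulePath: ${tool.modulePath}`;
  return [
    "tools:",
    `  - name: ${tool.name}`,
    `    ${source}`,
    `    exportName: ${tool.exportName}`,
    `    description: ${JSON.stringify(tool.description)}`,
    `    inputSchema: ${JSON.stringify(tool.inputSchema)}`,
  ].join("\n");
}

const readmeSnippet = toYamlSnippet(supportToolRegistration);
console.log(readmeSnippet);
```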
|
|
359
|
+
|
|
360
|
+
- [ ] **Step 4: Extend package tool tests to verify the example package shape**
|
|
361
|
+
|
|
362
|
+
Add a test that imports the example package directly:
|
|
363
|
+
|
|
364
|
+
```ts
|
|
365
|
+
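// pathToFileURL (from node:url) keeps the dynamic import valid on Windows absolute paths
const mod = await import(pathToFileURL(resolve("examples/support-tools/index.js")).href);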
|
|
366
|
+
assert.equal(typeof mod.findDuplicateCharge, "function");
|
|
367
|
+
```
|
|
368
|
+
|
|
369
|
+
- [ ] **Step 5: Run example-package tests**
|
|
370
|
+
|
|
371
|
+
Run: `npx tsx --test tests/packageTools.test.ts`
|
|
372
|
+
|
|
373
|
+
Expected: PASS
|
|
374
|
+
|
|
375
|
+
- [ ] **Step 6: Commit**
|
|
376
|
+
|
|
377
|
+
```bash
|
|
378
|
+
git add examples/support-tools examples/coding-tools tests/packageTools.test.ts
|
|
379
|
+
git commit -m "feat: add package-style example tool packages"
|
|
380
|
+
```
|
|
381
|
+
|
|
382
|
+
---
|
|
383
|
+
|
|
384
|
+
### Task 5: Update Docs For Npx And Package-Based Tools
|
|
385
|
+
|
|
386
|
+
**Files:**
|
|
387
|
+
- Modify: `README.md`
|
|
388
|
+
- Modify: `docs/tools.md`
|
|
389
|
+
- Modify: `docs/troubleshooting.md`
|
|
390
|
+
- Modify: `.claude/project.md`
|
|
391
|
+
- Modify: `.claude/active-tasks.md`
|
|
392
|
+
|
|
393
|
+
- [ ] **Step 1: Update public docs to describe both tool sources**
|
|
394
|
+
|
|
395
|
+
In `docs/tools.md`, change the core framing from:
|
|
396
|
+
|
|
397
|
+
```md
|
|
398
|
+
Custom tools are registered in `agentlab.config.yaml` and loaded from repo-local JS or TS modules.
|
|
399
|
+
```
|
|
400
|
+
|
|
401
|
+
to:
|
|
402
|
+
|
|
403
|
+
```md
|
|
404
|
+
Custom tools are registered in `agentlab.config.yaml` and can be loaded from either repo-local modules or installed npm packages.
|
|
405
|
+
```
|
|
406
|
+
|
|
407
|
+
Add example blocks for both `modulePath` and `package`.
|
|
408
|
+
|
|
409
|
+
- [ ] **Step 2: Update README npx flow**
|
|
410
|
+
|
|
411
|
+
Add a package-tool example under the `npx` / install guidance:
|
|
412
|
+
|
|
413
|
+
```bash
|
|
414
|
+
npm install @agentlab/example-support-tools
|
|
415
|
+
```
|
|
416
|
+
|
|
417
|
+
And explain that package-backed tools are the preferred extension path for installed and `npx`-based projects.
|
|
418
|
+
|
|
419
|
+
- [ ] **Step 3: Update troubleshooting**
|
|
420
|
+
|
|
421
|
+
Add failure cases for:
|
|
422
|
+
|
|
423
|
+
- missing installed package
|
|
424
|
+
- wrong `exportName` from a package
|
|
425
|
+
- specifying both `modulePath` and `package`
|
|
426
|
+
|
|
427
|
+
Use direct messages matching the implementation.
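
To keep the troubleshooting entries aligned with the implementation, the three failure cases can be keyed on the error messages quoted earlier in this plan. The sketch below is one way to express that mapping. The patterns mirror those messages; the hint texts, the `TROUBLESHOOTING` table, and the `troubleshootingHint` function are hypothetical and only illustrate the idea.

```ts
// Each pattern mirrors an error message from the plan; the hints are illustrative.
const TROUBLESHOOTING: Array<{ pattern: RegExp; hint: string }> = [
  {
    pattern: /failed to load package '([^']+)'/,
    hint: "Install the package in the project and check the 'package' value.",
  },
  {
    pattern: /export '([^']+)' is not a function/,
    hint: "Check that 'exportName' matches a function exported by the module or package.",
  },
  {
    pattern: /must define exactly one of 'modulePath' or 'package'/,
    hint: "Keep either 'modulePath' or 'package' on the tool, not both and not neither.",
  },
];

// Return the first matching hint, or undefined for unrelated errors.
function troubleshootingHint(message: string): string | undefined {
  return TROUBLESHOOTING.find((entry) => entry.pattern.test(message))?.hint;
}

console.log(
  troubleshootingHint("Tool 'support.find_duplicate_charge' export 'findDuplicate' is not a function."),
);
```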
|
|
428
|
+
|
|
429
|
+
- [ ] **Step 4: Update internal phase tracking**
|
|
430
|
+
|
|
431
|
+
In `.claude/project.md` and `.claude/active-tasks.md`:
|
|
432
|
+
|
|
433
|
+
- mark npm-installable tools complete
|
|
434
|
+
- keep Phase 1 open only if example packages / starter-repo / README rewrite still remain
|
|
435
|
+
|
|
436
|
+
- [ ] **Step 5: Run a docs-adjacent regression smoke**
|
|
437
|
+
|
|
438
|
+
Run: `npm run smoke:cli`
|
|
439
|
+
|
|
440
|
+
Expected: PASS
|
|
441
|
+
|
|
442
|
+
- [ ] **Step 6: Commit**
|
|
443
|
+
|
|
444
|
+
```bash
|
|
445
|
+
git add README.md docs/tools.md docs/troubleshooting.md .claude/project.md .claude/active-tasks.md
|
|
446
|
+
git commit -m "docs: document npm-installable tools for npx users"
|
|
447
|
+
```
|
|
448
|
+
|
|
449
|
+
---
|
|
450
|
+
|
|
451
|
+
### Task 6: Full Verification For Phase 1 Tooling
|
|
452
|
+
|
|
453
|
+
**Files:**
|
|
454
|
+
- Modify as needed from previous tasks only
|
|
455
|
+
|
|
456
|
+
- [ ] **Step 1: Run focused tests**
|
|
457
|
+
|
|
458
|
+
Run:
|
|
459
|
+
|
|
460
|
+
```bash
|
|
461
|
+
npx tsx --test tests/config.tier-one.test.ts tests/packageTools.test.ts tests/init.test.ts tests/cliPackaging.test.ts
|
|
462
|
+
```
|
|
463
|
+
|
|
464
|
+
Expected: PASS
|
|
465
|
+
|
|
466
|
+
- [ ] **Step 2: Run full suite**
|
|
467
|
+
|
|
468
|
+
Run:
|
|
469
|
+
|
|
470
|
+
```bash
|
|
471
|
+
npm test
|
|
472
|
+
```
|
|
473
|
+
|
|
474
|
+
Expected: PASS
|
|
475
|
+
|
|
476
|
+
- [ ] **Step 3: Run full release checks**
|
|
477
|
+
|
|
478
|
+
Run:
|
|
479
|
+
|
|
480
|
+
```bash
|
|
481
|
+
npm run check
|
|
482
|
+
npm run build
|
|
483
|
+
npm run smoke:cli
|
|
484
|
+
npm_config_cache=/tmp/agentlab-npm-cache npm pack --dry-run
|
|
485
|
+
```
|
|
486
|
+
|
|
487
|
+
Expected: PASS
|
|
488
|
+
|
|
489
|
+
- [ ] **Step 4: Reconcile internal status**
|
|
490
|
+
|
|
491
|
+
If verification is green:
|
|
492
|
+
|
|
493
|
+
- keep `.claude/project.md` Phase 1 marked in progress only if remaining tasks are truly outstanding
|
|
494
|
+
- otherwise mark the package-tool portion complete explicitly
|
|
495
|
+
|
|
496
|
+
- [ ] **Step 5: Final commit**
|
|
497
|
+
|
|
498
|
+
```bash
|
|
499
|
+
git add -A
|
|
500
|
+
git commit -m "feat: finish phase-one npm tool support"
|
|
501
|
+
```
|
|
502
|
+
|
|
package/docs/superpowers/specs/2026-04-13-phase-2-lite-phase-3-design.md
@@ -0,0 +1,164 @@
|
|
|
1
|
+
# Phase 2 Lite And Phase 3 Design
|
|
2
|
+
|
|
3
|
+
## Goal
|
|
4
|
+
|
|
5
|
+
Compress the original Phase 2 into a minimal integration-story pass, then move immediately into Phase 3 UI/demo polish.
|
|
6
|
+
|
|
7
|
+
The intent is to preserve product legibility for new users without spending weeks on broad framework coverage before the product is visually demo-ready.
|
|
8
|
+
|
|
9
|
+
## Why This Change
|
|
10
|
+
|
|
11
|
+
The current product is technically credible, but the main remaining gap is not core capability. It is demonstration quality and onboarding clarity.
|
|
12
|
+
|
|
13
|
+
Fully skipping Phase 2 would create a prettier product with weaker adoption paths:
|
|
14
|
+
|
|
15
|
+
- users would see polished UI but still ask how it fits their workflow
|
|
16
|
+
- the product would still be perceived as a tool built only for support agents
|
|
17
|
+
- the README and launch story would still lack recognizable entry points
|
|
18
|
+
|
|
19
|
+
Keeping a trimmed Phase 2 solves that without delaying Phase 3 materially.
|
|
20
|
+
|
|
21
|
+
## Recommended Roadmap Change
|
|
22
|
+
|
|
23
|
+
Use this ordering:
|
|
24
|
+
|
|
25
|
+
1. `Phase 2-lite`
|
|
26
|
+
2. `Phase 3`
|
|
27
|
+
|
|
28
|
+
Do not treat Phase 2-lite as a broad integration campaign. Treat it as the minimum viable integration story required to make Phase 3 polish meaningful.
|
|
29
|
+
|
|
30
|
+
## Phase 2 Lite Scope
|
|
31
|
+
|
|
32
|
+
Phase 2-lite should deliver only the pieces that make the product legible to new technical users.
|
|
33
|
+
|
|
34
|
+
### Keep
|
|
35
|
+
|
|
36
|
+
- `arl-test` as the canonical HTTP/live-agent example
|
|
37
|
+
- one CI example using `agentlab run --suite-def pre_merge`
|
|
38
|
+
- one coding-agent example or guide
|
|
39
|
+
- 2-3 README entry points such as:
|
|
40
|
+
  - start here for HTTP agents
|
|
41
|
+
  - start here for coding agents
|
|
42
|
+
  - start here for CI/pre-merge regression
|
|
43
|
+
|
|
44
|
+
### Skip For Now
|
|
45
|
+
|
|
46
|
+
- broad framework integration coverage
|
|
47
|
+
- multiple framework-specific guides
|
|
48
|
+
- large scenario-pack system work
|
|
49
|
+
- marketplace/community work
|
|
50
|
+
- many hero examples across every ecosystem
|
|
51
|
+
|
|
52
|
+
## Phase 2 Lite Deliverables
|
|
53
|
+
|
|
54
|
+
### 1. Canonical Integration Paths
|
|
55
|
+
|
|
56
|
+
The product should have three obvious ways in:
|
|
57
|
+
|
|
58
|
+
- HTTP/live service path
|
|
59
|
+
  - anchored by `arl-test`
|
|
60
|
+
- coding-agent path
|
|
61
|
+
  - enough to prove ARL is not just for support agents
|
|
62
|
+
- CI path
|
|
63
|
+
  - GitHub Actions example using suite definitions
|
|
64
|
+
|
|
65
|
+
These should be recognizable and copy-pasteable.
|
|
66
|
+
|
|
67
|
+
### 2. README Entry Points
|
|
68
|
+
|
|
69
|
+
The README should not just describe the architecture. It should route users by workflow.
|
|
70
|
+
|
|
71
|
+
Recommended entry sections:
|
|
72
|
+
|
|
73
|
+
- “If your agent runs as an HTTP service”
|
|
74
|
+
- “If you are validating coding-agent changes”
|
|
75
|
+
- “If you want pre-merge regression checks in CI”
|
|
76
|
+
|
|
77
|
+
Each section should point to one canonical example, not many.
|
|
78
|
+
|
|
79
|
+
### 3. Keep Scope Narrow
|
|
80
|
+
|
|
81
|
+
Phase 2-lite should avoid product expansion.
|
|
82
|
+
|
|
83
|
+
It should mainly be:
|
|
84
|
+
|
|
85
|
+
- examples
|
|
86
|
+
- README routing
|
|
87
|
+
- one CI workflow example
|
|
88
|
+
- one extra concrete use-case path beyond HTTP support
|
|
89
|
+
|
|
90
|
+
## Phase 3 Scope
|
|
91
|
+
|
|
92
|
+
After Phase 2-lite, Phase 3 becomes the main workstream.
|
|
93
|
+
|
|
94
|
+
### Primary Goal
|
|
95
|
+
|
|
96
|
+
Make the product demoable, screenshotable, and easier to understand visually.
|
|
97
|
+
|
|
98
|
+
### Core Work
|
|
99
|
+
|
|
100
|
+
- comparison view redesign
|
|
101
|
+
- clearer red/green regression presentation
|
|
102
|
+
- better trace visualization
|
|
103
|
+
- stronger run history/dashboard view
|
|
104
|
+
- visual polish that feels intentional rather than debug-console minimal
|
|
105
|
+
- README screenshots or GIFs that show the regression story quickly
|
|
106
|
+
|
|
107
|
+
### Design Constraint
|
|
108
|
+
|
|
109
|
+
Phase 3 should improve clarity, not add ornamental UI.
|
|
110
|
+
|
|
111
|
+
Every UI change should help users answer one of these questions faster:
|
|
112
|
+
|
|
113
|
+
- what changed?
|
|
114
|
+
- what failed?
|
|
115
|
+
- where did it fail?
|
|
116
|
+
- did the candidate regress?
|
|
117
|
+
- should I trust this run?
|
|
118
|
+
|
|
119
|
+
## Success Criteria
|
|
120
|
+
|
|
121
|
+
### After Phase 2 Lite
|
|
122
|
+
|
|
123
|
+
A new technical user can quickly identify:
|
|
124
|
+
|
|
125
|
+
- how to use ARL with an HTTP agent
|
|
126
|
+
- how to use ARL in CI
|
|
127
|
+
- that ARL can also support coding-agent regression workflows
|
|
128
|
+
|
|
129
|
+
### After Phase 3
|
|
130
|
+
|
|
131
|
+
The product should be visually strong enough that:
|
|
132
|
+
|
|
133
|
+
- screenshots are worth sharing
|
|
134
|
+
- demos feel polished
|
|
135
|
+
- mentors and early users understand the product faster
|
|
136
|
+
- the UI helps explain value instead of requiring explanation around it
|
|
137
|
+
|
|
138
|
+
## Non-Goals
|
|
139
|
+
|
|
140
|
+
This roadmap change does not mean:
|
|
141
|
+
|
|
142
|
+
- hosted platform work
|
|
143
|
+
- broad plugin/framework ecosystem support
|
|
144
|
+
- marketplace or virality mechanics
|
|
145
|
+
- replacing core CLI authoring with UI-first configuration
|
|
146
|
+
|
|
147
|
+
Those remain later-phase work.
|
|
148
|
+
|
|
149
|
+
## Recommended Execution Order
|
|
150
|
+
|
|
151
|
+
1. update internal roadmap/task tracking to reflect `Phase 2-lite`
|
|
152
|
+
2. implement the minimal integration-story assets
|
|
153
|
+
3. switch immediately to Phase 3 UI/demo polish
|
|
154
|
+
|
|
155
|
+
## Decision
|
|
156
|
+
|
|
157
|
+
Use a compressed integration phase, not a skipped integration phase.
|
|
158
|
+
|
|
159
|
+
That is the best tradeoff between:
|
|
160
|
+
|
|
161
|
+
- speed
|
|
162
|
+
- product clarity
|
|
163
|
+
- demo quality
|
|
164
|
+
- launch readiness
|