agents-harness 0.1.0

Files changed (56)
  1. package/README.md +317 -0
  2. package/dist/cli.d.ts +2 -0
  3. package/dist/cli.js +56 -0
  4. package/dist/cli.js.map +1 -0
  5. package/dist/commands/config.d.ts +26 -0
  6. package/dist/commands/config.js +91 -0
  7. package/dist/commands/config.js.map +1 -0
  8. package/dist/commands/init.d.ts +1 -0
  9. package/dist/commands/init.js +67 -0
  10. package/dist/commands/init.js.map +1 -0
  11. package/dist/commands/resume.d.ts +6 -0
  12. package/dist/commands/resume.js +88 -0
  13. package/dist/commands/resume.js.map +1 -0
  14. package/dist/commands/run.d.ts +8 -0
  15. package/dist/commands/run.js +106 -0
  16. package/dist/commands/run.js.map +1 -0
  17. package/dist/commands/status.d.ts +1 -0
  18. package/dist/commands/status.js +38 -0
  19. package/dist/commands/status.js.map +1 -0
  20. package/dist/core/context-manager.d.ts +19 -0
  21. package/dist/core/context-manager.js +118 -0
  22. package/dist/core/context-manager.js.map +1 -0
  23. package/dist/core/file-protocol.d.ts +14 -0
  24. package/dist/core/file-protocol.js +119 -0
  25. package/dist/core/file-protocol.js.map +1 -0
  26. package/dist/core/orchestrator.d.ts +30 -0
  27. package/dist/core/orchestrator.js +238 -0
  28. package/dist/core/orchestrator.js.map +1 -0
  29. package/dist/core/types.d.ts +136 -0
  30. package/dist/core/types.js +3 -0
  31. package/dist/core/types.js.map +1 -0
  32. package/dist/dashboard/server.d.ts +15 -0
  33. package/dist/dashboard/server.js +61 -0
  34. package/dist/dashboard/server.js.map +1 -0
  35. package/dist/dashboard/socket.d.ts +8 -0
  36. package/dist/dashboard/socket.js +22 -0
  37. package/dist/dashboard/socket.js.map +1 -0
  38. package/dist/defaults/criteria.d.ts +1 -0
  39. package/dist/defaults/criteria.js +23 -0
  40. package/dist/defaults/criteria.js.map +1 -0
  41. package/dist/defaults/prompts.d.ts +15 -0
  42. package/dist/defaults/prompts.js +123 -0
  43. package/dist/defaults/prompts.js.map +1 -0
  44. package/dist/discovery/config-loader.d.ts +12 -0
  45. package/dist/discovery/config-loader.js +64 -0
  46. package/dist/discovery/config-loader.js.map +1 -0
  47. package/dist/discovery/project-context.d.ts +12 -0
  48. package/dist/discovery/project-context.js +56 -0
  49. package/dist/discovery/project-context.js.map +1 -0
  50. package/dist/discovery/stack-detector.d.ts +15 -0
  51. package/dist/discovery/stack-detector.js +372 -0
  52. package/dist/discovery/stack-detector.js.map +1 -0
  53. package/dist/index.d.ts +12 -0
  54. package/dist/index.js +14 -0
  55. package/dist/index.js.map +1 -0
  56. package/package.json +60 -0
package/README.md ADDED
@@ -0,0 +1,317 @@
1
+ # agent-harness
2
+
3
+ A multi-agent orchestrator for autonomous software development. Three AI agents — **Planner**, **Generator**, and **Evaluator** — work together in a loop to turn your feature spec into working code.
4
+
5
+ Built on the architecture described in Anthropic's engineering blog post: [**Harness Design for Long-Running Apps**](https://www.anthropic.com/engineering/harness-design-long-running-apps). The core idea: separate generation from evaluation (like a GAN), reset context between agent invocations to prevent degradation, and use file-based handoffs so each agent starts fresh.
6
+
7
+ ## How It Works
8
+
9
+ ```
10
+ You write a spec
11
+ |
12
+ v
13
+ [Planner] -----> Expands spec, breaks it into sprints, writes contracts
14
+ |
15
+ v
16
+ [Generator] ----> Implements the sprint contract (reads/writes/edits code)
17
+ |
18
+ v
19
+ [Evaluator] ----> Critically tests the implementation against the contract
20
+ |
21
+ PASS? ----yes--> Next sprint (or done)
22
+ |
23
+ no
24
+ |
25
+ v
26
+ [Generator] ----> Tries again with evaluator feedback
27
+ |
28
+ v
29
+ (loop up to max attempts)
30
+ ```
31
+
32
+ Each agent gets a **fresh context** on every invocation — no accumulated confusion from long conversations. State is passed between agents via files in the `.harness/` directory, not conversation history.
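The retry loop in the diagram can be sketched roughly as below. This is an illustration only, not the package's orchestrator: `runSprint`, `GenerateFn`, `EvaluateFn`, and the `Evaluation` shape are hypothetical names, and the agents are passed in as plain functions so the control flow stays visible.

```typescript
// Hypothetical shapes for illustration; the package's real types are
// declared in dist/core/types.d.ts and are not reproduced here.
interface Evaluation {
  passed: boolean;
  critique: string;
}

type GenerateFn = (contract: string, feedback: string | null) => Promise<void>;
type EvaluateFn = (contract: string) => Promise<Evaluation>;

interface SprintOutcome {
  status: "passed" | "failed";
  attempts: number;
}

// Run one sprint: generate, evaluate, and retry with the evaluator's
// critique until the sprint passes or maxAttempts is exhausted.
async function runSprint(
  contract: string,
  generate: GenerateFn,
  evaluate: EvaluateFn,
  maxAttempts = 3,
): Promise<SprintOutcome> {
  let feedback: string | null = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Each call stands in for a fresh agent invocation: only the contract
    // and the previous critique are carried over, never a conversation.
    await generate(contract, feedback);
    const result = await evaluate(contract);
    if (result.passed) {
      return { status: "passed", attempts: attempt };
    }
    feedback = result.critique;
  }
  return { status: "failed", attempts: maxAttempts };
}
```

Keeping the generator and evaluator as separate callbacks reflects the separation the README describes: the generator only ever sees the critique, not the evaluator's reasoning.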
33
+
34
+ ## Quick Start
35
+
36
+ ### 1. Install
37
+
38
+ ```bash
39
+ npm install -g agent-harness
40
+ ```
41
+
42
+ ### 2. Set your API key
43
+
44
+ ```bash
45
+ # Option A: Environment variable
46
+ export ANTHROPIC_API_KEY=sk-ant-...
47
+
48
+ # Option B: Global config
49
+ agent-harness config set api-key sk-ant-...
50
+ ```
51
+
52
+ ### 3. Run
53
+
54
+ ```bash
55
+ cd your-project
56
+ agent-harness run "Add user authentication with email/password login and JWT tokens"
57
+ ```
58
+
59
+ That's it. The harness will plan, implement, and test the feature across multiple sprints.
60
+
61
+ ## Commands
62
+
63
+ ### `run` — Start a new run
64
+
65
+ ```bash
66
+ agent-harness run "<spec>"
67
+ ```
68
+
69
+ Give it a feature description and it handles the rest.
70
+
71
+ **Options:**
72
+
73
+ | Flag | Description | Default |
74
+ |------|-------------|---------|
75
+ | `-s, --scope <workspaces...>` | Limit to specific workspaces (monorepo) | All |
76
+ | `--max-attempts <n>` | Max retry attempts per sprint | 3 |
77
+ | `--max-budget <n>` | Max total spend in USD | 50 |
78
+ | `--dashboard` | Open a live web dashboard | Off |
79
+ | `--port <n>` | Dashboard port | 3117 |
80
+
81
+ **Examples:**
82
+
83
+ ```bash
84
+ # Simple feature
85
+ agent-harness run "Add a /health endpoint that returns 200 OK"
86
+
87
+ # With budget limit
88
+ agent-harness run "Refactor the auth module to use OAuth2" --max-budget 20
89
+
90
+ # Monorepo — only touch the backend
91
+ agent-harness run "Add pagination to the users API" --scope packages/api
92
+
93
+ # With live dashboard
94
+ agent-harness run "Build a notification system" --dashboard
95
+ ```
96
+
97
+ ### `init` — Initialize project config (optional)
98
+
99
+ ```bash
100
+ agent-harness init
101
+ ```
102
+
103
+ Creates a `.harness/` directory with:
104
+ - `config.yaml` — agent models, budget limits, attempt limits
105
+ - `criteria.md` — custom evaluation criteria template
106
+
107
+ The harness works without `init` — it auto-detects your stack. Only run this if you want to customize settings.
108
+
109
+ **Example output:**
110
+
111
+ ```
112
+ Detected project:
113
+ Repository type: single
114
+ Workspace: .
115
+ Language: typescript
116
+ Framework: next.js
117
+ Test runner: vitest
118
+ Test command: npx vitest run
119
+ CLAUDE.md: found
120
+
121
+ Created .harness/config.yaml
122
+ Created .harness/criteria.md
123
+ ```
124
+
125
+ ### `status` — Check run progress
126
+
127
+ ```bash
128
+ agent-harness status
129
+ ```
130
+
131
+ Shows the current state of a run — which sprint you're on, pass/fail status, and cost.
132
+
133
+ **Example output:**
134
+
135
+ ```
136
+ Status: RUNNING
137
+ Spec: Add user authentication with email/password login...
138
+ Started: 2025-03-28T10:30:00.000Z
139
+ Phase: evaluate
140
+ Cost: $2.45 / $50.00
141
+
142
+ Sprints: 2 / 3
143
+ [PASS] Sprint 1 — 1 attempt, $0.85
144
+ [....] Sprint 2 — 2 attempts, $1.60
145
+ [ ] Sprint 3
146
+ ```
147
+
148
+ ### `resume` — Resume a stopped run
149
+
150
+ ```bash
151
+ agent-harness resume
152
+ ```
153
+
154
+ Picks up where a stopped or failed run left off. Skips completed sprints.
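One way to picture "skips completed sprints" is a lookup for the first sprint that has not passed. The sketch below assumes a hypothetical `SprintState` record; the actual state format under `.harness/` is not shown in this README.

```typescript
// Hypothetical persisted sprint record (an assumption for illustration).
type SprintStatus = "pending" | "in_progress" | "passed" | "failed";

interface SprintState {
  sprint: number;
  status: SprintStatus;
}

// Return the number of the first sprint that still needs work,
// or null when every sprint has already passed.
function nextSprintToRun(sprints: SprintState[]): number | null {
  const ordered = [...sprints].sort((a, b) => a.sprint - b.sprint);
  const next = ordered.find((s) => s.status !== "passed");
  return next ? next.sprint : null;
}
```

With the state from the `status` example above (sprint 1 passed, sprint 2 in progress, sprint 3 not started), this would pick sprint 2.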
155
+
156
+ **Options:**
157
+
158
+ | Flag | Description | Default |
159
+ |------|-------------|---------|
160
+ | `--max-budget <n>` | Max total spend in USD | 50 |
161
+ | `--dashboard` | Open a live web dashboard | Off |
162
+ | `--port <n>` | Dashboard port | 3117 |
163
+
164
+ **Example:**
165
+
166
+ ```bash
167
+ # Hit Ctrl+C during a run, then later:
168
+ agent-harness resume
169
+
170
+ # Resume with a higher budget
171
+ agent-harness resume --max-budget 100
172
+ ```
173
+
174
+ ### `config` — Manage global settings
175
+
176
+ ```bash
177
+ agent-harness config set <key> <value>
178
+ agent-harness config get <key>
179
+ ```
180
+
181
+ **Examples:**
182
+
183
+ ```bash
184
+ # Save your API key globally
185
+ agent-harness config set api-key sk-ant-api03-...
186
+
187
+ # Check what's set
188
+ agent-harness config get api-key
189
+ ```
190
+
191
+ Config is stored at `~/.agent-harness/config.yaml`.
192
+
193
+ ## Configuration
194
+
195
+ ### Zero-config (default)
196
+
197
+ The harness auto-detects your project:
198
+ - **Language** — TypeScript, Python, Rust, Go
199
+ - **Framework** — Next.js, Django, etc.
200
+ - **Test runner** — vitest, jest, pytest, cargo test, go test
201
+ - **Repo type** — single repo or monorepo (npm workspaces, pnpm, lerna)
202
+ - **CLAUDE.md** — reads project conventions if present
203
+
204
+ ### Custom config (optional)
205
+
206
+ Run `agent-harness init`, then edit `.harness/config.yaml`:
207
+
208
+ ```yaml
209
+ agents:
210
+   planner:
211
+     model: sonnet
212
+   generator:
213
+     model: opus
214
+     maxTurns: 100
215
+   evaluator:
216
+     model: sonnet
217
+ max_attempts_per_sprint: 3
218
+ max_budget_per_sprint_usd: 5
219
+ max_total_budget_usd: 50
220
+ ```
221
+
222
+ **Available models:** `opus`, `sonnet`, `haiku`
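How the three limits from the sample config might combine with CLI flags can be sketched as a small merge. The precedence used here (CLI flags over the project file over built-in defaults) is an assumption, not something the README states, and `resolveLimits` is a hypothetical helper.

```typescript
// Limit keys as they appear in the sample .harness/config.yaml.
interface Limits {
  max_attempts_per_sprint: number;
  max_budget_per_sprint_usd: number;
  max_total_budget_usd: number;
}

// Default values taken from the README's option tables and sample config.
const DEFAULT_LIMITS: Limits = {
  max_attempts_per_sprint: 3,
  max_budget_per_sprint_usd: 5,
  max_total_budget_usd: 50,
};

// Assumed precedence (not confirmed by the README): CLI flags override
// the project file, which overrides the built-in defaults.
function resolveLimits(
  fileConfig: Partial<Limits>,
  cliFlags: Partial<Limits>,
): Limits {
  const merged: Limits = { ...DEFAULT_LIMITS };
  for (const source of [fileConfig, cliFlags]) {
    for (const key of Object.keys(merged) as (keyof Limits)[]) {
      const value = source[key];
      // Skip undefined so an unset flag never erases a configured value.
      if (value !== undefined) merged[key] = value;
    }
  }
  return merged;
}
```

Under that assumption, a project file with `max_total_budget_usd: 20` and a run started with `--max-budget 100` would end up with a total budget of 100.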
223
+
224
+ ### Custom evaluation criteria
225
+
226
+ Edit `.harness/criteria.md` to add project-specific rules:
227
+
228
+ ```markdown
229
+ # Custom Evaluation Criteria
230
+
231
+ - All API endpoints must return proper HTTP status codes
232
+ - Database migrations must be reversible
233
+ - All user-facing strings must be internationalized
234
+ ```
235
+
236
+ These are checked **in addition to** the built-in defaults (correctness, testing, code quality, integration).
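A rough sketch of "in addition to" follows: bullets are read from the criteria file and appended to the built-in list. The function names are hypothetical, and `BUILT_IN` only lists the category names mentioned above, not the actual `DEFAULT_CRITERIA` text.

```typescript
// Category names come from the README; the real DEFAULT_CRITERIA
// content is not shown there, so these are placeholders.
const BUILT_IN = ["correctness", "testing", "code quality", "integration"];

// Collect "- item" bullets from a criteria.md body. Lines starting with
// "#" (headings and commented-out examples) are skipped.
function parseCustomCriteria(markdown: string): string[] {
  return markdown
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith("- "))
    .map((line) => line.slice(2).trim())
    .filter((line) => line.length > 0);
}

// Custom criteria extend the defaults; they never replace them.
function combinedCriteria(markdown: string): string[] {
  return [...BUILT_IN, ...parseCustomCriteria(markdown)];
}
```

Note that a commented-out template line such as `# - Database migrations must be reversible` is ignored by this sketch until the leading `#` is removed.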
237
+
238
+ ## Live Dashboard
239
+
240
+ Enable with `--dashboard` to get a real-time web UI:
241
+
242
+ ```bash
243
+ agent-harness run "Build a feature" --dashboard
244
+ # Dashboard: http://localhost:3117
245
+ ```
246
+
247
+ The dashboard shows:
248
+ - Sprint progress with pass/fail status
249
+ - Live activity stream (every file read, edit, bash command)
250
+ - Evaluation results with passed/failed criteria
251
+ - Cost tracking with budget progress bar
252
+ - Auto-reconnects if the connection drops
253
+
254
+ ## Programmatic API
255
+
256
+ Use agent-harness as a library in your own tools:
257
+
258
+ ```typescript
259
+ import { Harness } from "agent-harness";
260
+
261
+ const harness = new Harness({
262
+ apiKey: process.env.ANTHROPIC_API_KEY!,
263
+ root: "/path/to/project",
264
+ maxTotalBudgetUsd: 20,
265
+ });
266
+
267
+ harness.on("phase:start", (data) => {
268
+ console.log(`Phase: ${data.phase}, Sprint: ${data.sprint}`);
269
+ });
270
+
271
+ harness.on("evaluation", (data) => {
272
+ console.log(`Sprint ${data.sprint}: ${data.result.passed ? "PASS" : "FAIL"}`);
273
+ });
274
+
275
+ harness.on("run:complete", (data) => {
276
+ console.log(`Done — ${data.status}, cost: $${data.totalCostUsd.toFixed(2)}`);
277
+ });
278
+
279
+ await harness.run("Add a REST API for managing todos");
280
+ ```
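For callers who want typed listeners, the event payloads can be written down as an event map. The field names below are inferred from the listeners in this README and in `dist/commands/resume.js`; the authoritative definitions are in `dist/core/types.d.ts`. `TypedEmitter` is a hypothetical stand-in, not the `Harness` class itself.

```typescript
// Payload shapes inferred from how the listeners read `data`.
interface HarnessEvents {
  "phase:start": { phase: string; sprint: number; attempt: number };
  "agent:activity": { role: string; summary: string };
  "evaluation": {
    sprint: number;
    result: {
      passed: boolean;
      passedCriteria: string[];
      failedCriteria: string[];
      critique?: string;
    };
  };
  "cost:update": { totalCostUsd: number; budgetUsd: number };
  "sprint:complete": { sprint: number; status: string; attempts: number; costUsd: number };
  "run:complete": { status: string; totalSprints: number; totalCostUsd: number; durationMs: number };
}

type Listener<T> = (data: T) => void;

// Minimal typed emitter used only to demonstrate the event map.
class TypedEmitter<Events> {
  private listeners = new Map<keyof Events, Listener<any>[]>();

  on<K extends keyof Events>(event: K, listener: Listener<Events[K]>): this {
    const list = this.listeners.get(event) ?? [];
    list.push(listener);
    this.listeners.set(event, list);
    return this;
  }

  emit<K extends keyof Events>(event: K, data: Events[K]): void {
    for (const listener of this.listeners.get(event) ?? []) listener(data);
  }
}

// Usage: the compiler now checks both the event name and the payload.
const events = new TypedEmitter<HarnessEvents>();
const costLines: string[] = [];
events.on("cost:update", (d) => {
  costLines.push(`$${d.totalCostUsd.toFixed(2)} / $${d.budgetUsd.toFixed(2)}`);
});
events.emit("cost:update", { totalCostUsd: 2.45, budgetUsd: 50 });
// costLines is now ["$2.45 / $50.00"]
```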
281
+
282
+ ### Exported classes and functions
283
+
284
+ | Export | Description |
285
+ |--------|-------------|
286
+ | `Harness` | Main orchestrator class |
287
+ | `ContextManager` | Wraps Agent SDK with fresh context per call |
288
+ | `FileProtocol` | Manages `.harness/` directory state |
289
+ | `DashboardServer` | HTTP + WebSocket dashboard server |
290
+ | `buildProjectContext` | Auto-detect project stack and config |
291
+ | `detectStack` | Detect language, framework, test runner |
292
+ | `buildSystemPrompt` | Build agent system prompts |
293
+ | `DEFAULT_CRITERIA` | Built-in evaluation criteria |
294
+
295
+ ## The Three Agents
296
+
297
+ | Agent | Model | Role | Tools |
298
+ |-------|-------|------|-------|
299
+ | **Planner** | Sonnet | Writes specs, decomposes into sprints, writes contracts | Read, Write |
300
+ | **Generator** | Opus | Implements code based on the contract | Read, Edit, Write, Bash, Glob, Grep |
301
+ | **Evaluator** | Sonnet | Critically tests implementation against contract | Read, Bash, Grep, Glob |
302
+
303
+ Key design principle from the Anthropic article: **the generator never evaluates its own work**. A separate evaluator with fresh context provides unbiased assessment.
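The tool column in the table can be read as a per-role allowlist. The sketch below copies the tool sets from the table; `canUse` is a hypothetical helper, and how the package actually passes tool restrictions to the Agent SDK is not shown here.

```typescript
type Role = "planner" | "generator" | "evaluator";
type Tool = "Read" | "Write" | "Edit" | "Bash" | "Glob" | "Grep";

// Tool sets copied from the README table.
const ALLOWED_TOOLS: Record<Role, readonly Tool[]> = {
  planner: ["Read", "Write"],
  generator: ["Read", "Edit", "Write", "Bash", "Glob", "Grep"],
  evaluator: ["Read", "Bash", "Grep", "Glob"],
};

// True when the given role is permitted to call the given tool.
function canUse(role: Role, tool: Tool): boolean {
  return ALLOWED_TOOLS[role].includes(tool);
}
```

Read this way, the evaluator has no `Edit` or `Write` tool in the table, which fits the principle that it assesses the generator's work rather than changing it.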
304
+
305
+ ## Requirements
306
+
307
+ - Node.js 18+
308
+ - An [Anthropic API key](https://console.anthropic.com/)
309
+ - The `@anthropic-ai/claude-agent-sdk` package (peer dependency)
310
+
311
+ ## Credits
312
+
313
+ This project is built on the harness architecture described in Anthropic's engineering article: [**Harness Design for Long-Running Apps**](https://www.anthropic.com/engineering/harness-design-long-running-apps). The article introduces the pattern of separating generation from evaluation, using fresh context windows per agent invocation, and file-based state handoffs to enable reliable multi-hour autonomous coding sessions.
314
+
315
+ ## License
316
+
317
+ ISC
package/dist/cli.d.ts ADDED
@@ -0,0 +1,2 @@
1
+ #!/usr/bin/env node
2
+ export {};
package/dist/cli.js ADDED
@@ -0,0 +1,56 @@
1
+ #!/usr/bin/env node
2
+ import { Command } from "commander";
3
+ import { runCommand } from "./commands/run.js";
4
+ import { initCommand } from "./commands/init.js";
5
+ import { statusCommand } from "./commands/status.js";
6
+ import { resumeCommand } from "./commands/resume.js";
7
+ import { configCommand } from "./commands/config.js";
8
+ const program = new Command();
9
+ program
10
+ .name("agent-harness")
11
+ .description("Multi-agent orchestrator for autonomous software development")
12
+ .version("0.1.0");
13
+ program
14
+ .command("run")
15
+ .description("Start a new harness run with a specification")
16
+ .argument("<spec>", "The feature specification to implement")
17
+ .option("-s, --scope <workspaces...>", "Limit to specific workspaces")
18
+ .option("--max-attempts <n>", "Max attempts per sprint", parseInt)
19
+ .option("--max-budget <n>", "Max total budget in USD", parseFloat)
20
+ .option("--dashboard", "Enable live dashboard")
21
+ .option("--port <n>", "Dashboard port", parseInt)
22
+ .action((spec, options) => {
23
+ runCommand(spec, options);
24
+ });
25
+ program
26
+ .command("init")
27
+ .description("Initialize .harness/ config for the current project")
28
+ .action(() => {
29
+ initCommand();
30
+ });
31
+ program
32
+ .command("status")
33
+ .description("Show status of the current run")
34
+ .action(() => {
35
+ statusCommand();
36
+ });
37
+ program
38
+ .command("resume")
39
+ .description("Resume a stopped or failed run")
40
+ .option("--max-budget <n>", "Max total budget in USD", parseFloat)
41
+ .option("--dashboard", "Enable live dashboard")
42
+ .option("--port <n>", "Dashboard port", parseInt)
43
+ .action((options) => {
44
+ resumeCommand(options);
45
+ });
46
+ program
47
+ .command("config")
48
+ .description("Get or set global configuration")
49
+ .argument("<action>", "Action: get or set")
50
+ .argument("[key]", "Config key (e.g. api-key)")
51
+ .argument("[value]", "Value to set (for set action)")
52
+ .action((action, key, value) => {
53
+ configCommand(action, key, value);
54
+ });
55
+ program.parse();
56
+ //# sourceMappingURL=cli.js.map
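A note on the numeric options above: Commander's documentation describes custom option processing as a callback that receives the raw value and the previous value, so passing `parseInt` directly lets that second argument land in the radix position. With no default value supplied this is probably harmless, but a small wrapper is more robust and can also reject bad input. `parsePositiveInt` below is a hypothetical helper, not part of the package.

```typescript
// Hypothetical helper: parse a CLI value as a base-10 integer and reject
// anything that is not a plain positive whole number. The unused second
// parameter matches the (value, previous) shape of an option parser.
function parsePositiveInt(value: string, _previous?: number): number {
  const parsed = Number.parseInt(value, 10);
  if (Number.isNaN(parsed) || parsed <= 0 || String(parsed) !== value.trim()) {
    throw new Error(`Expected a positive integer, got "${value}"`);
  }
  return parsed;
}
```

It could be passed in place of `parseInt` for `--max-attempts` and `--port`; inputs such as `3abc` or `0` then fail fast instead of being silently truncated or accepted.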
package/dist/cli.js.map ADDED
@@ -0,0 +1 @@
1
+ {"version":3,"file":"cli.js","sourceRoot":"","sources":["../src/cli.ts"],"names":[],"mappings":";AAEA,OAAO,EAAE,OAAO,EAAE,MAAM,WAAW,CAAC;AACpC,OAAO,EAAE,UAAU,EAAE,MAAM,mBAAmB,CAAC;AAC/C,OAAO,EAAE,WAAW,EAAE,MAAM,oBAAoB,CAAC;AACjD,OAAO,EAAE,aAAa,EAAE,MAAM,sBAAsB,CAAC;AACrD,OAAO,EAAE,aAAa,EAAE,MAAM,sBAAsB,CAAC;AACrD,OAAO,EAAE,aAAa,EAAE,MAAM,sBAAsB,CAAC;AAErD,MAAM,OAAO,GAAG,IAAI,OAAO,EAAE,CAAC;AAE9B,OAAO;KACJ,IAAI,CAAC,eAAe,CAAC;KACrB,WAAW,CAAC,8DAA8D,CAAC;KAC3E,OAAO,CAAC,OAAO,CAAC,CAAC;AAEpB,OAAO;KACJ,OAAO,CAAC,KAAK,CAAC;KACd,WAAW,CAAC,8CAA8C,CAAC;KAC3D,QAAQ,CAAC,QAAQ,EAAE,wCAAwC,CAAC;KAC5D,MAAM,CAAC,6BAA6B,EAAE,8BAA8B,CAAC;KACrE,MAAM,CAAC,oBAAoB,EAAE,yBAAyB,EAAE,QAAQ,CAAC;KACjE,MAAM,CAAC,kBAAkB,EAAE,yBAAyB,EAAE,UAAU,CAAC;KACjE,MAAM,CAAC,aAAa,EAAE,uBAAuB,CAAC;KAC9C,MAAM,CAAC,YAAY,EAAE,gBAAgB,EAAE,QAAQ,CAAC;KAChD,MAAM,CAAC,CAAC,IAAY,EAAE,OAAO,EAAE,EAAE;IAChC,UAAU,CAAC,IAAI,EAAE,OAAO,CAAC,CAAC;AAC5B,CAAC,CAAC,CAAC;AAEL,OAAO;KACJ,OAAO,CAAC,MAAM,CAAC;KACf,WAAW,CAAC,qDAAqD,CAAC;KAClE,MAAM,CAAC,GAAG,EAAE;IACX,WAAW,EAAE,CAAC;AAChB,CAAC,CAAC,CAAC;AAEL,OAAO;KACJ,OAAO,CAAC,QAAQ,CAAC;KACjB,WAAW,CAAC,gCAAgC,CAAC;KAC7C,MAAM,CAAC,GAAG,EAAE;IACX,aAAa,EAAE,CAAC;AAClB,CAAC,CAAC,CAAC;AAEL,OAAO;KACJ,OAAO,CAAC,QAAQ,CAAC;KACjB,WAAW,CAAC,gCAAgC,CAAC;KAC7C,MAAM,CAAC,kBAAkB,EAAE,yBAAyB,EAAE,UAAU,CAAC;KACjE,MAAM,CAAC,aAAa,EAAE,uBAAuB,CAAC;KAC9C,MAAM,CAAC,YAAY,EAAE,gBAAgB,EAAE,QAAQ,CAAC;KAChD,MAAM,CAAC,CAAC,OAAO,EAAE,EAAE;IAClB,aAAa,CAAC,OAAO,CAAC,CAAC;AACzB,CAAC,CAAC,CAAC;AAEL,OAAO;KACJ,OAAO,CAAC,QAAQ,CAAC;KACjB,WAAW,CAAC,iCAAiC,CAAC;KAC9C,QAAQ,CAAC,UAAU,EAAE,oBAAoB,CAAC;KAC1C,QAAQ,CAAC,OAAO,EAAE,2BAA2B,CAAC;KAC9C,QAAQ,CAAC,SAAS,EAAE,+BAA+B,CAAC;KACpD,MAAM,CAAC,CAAC,MAAc,EAAE,GAAY,EAAE,KAAc,EAAE,EAAE;IACvD,aAAa,CAAC,MAAM,EAAE,GAAG,EAAE,KAAK,CAAC,CAAC;AACpC,CAAC,CAAC,CAAC;AAEL,OAAO,CAAC,KAAK,EAAE,CAAC"}
package/dist/commands/config.d.ts ADDED
@@ -0,0 +1,26 @@
1
+ export interface GlobalConfig {
2
+ api_key?: string;
3
+ [key: string]: unknown;
4
+ }
5
+ /**
6
+ * Load global config from ~/.agent-harness/config.yaml
7
+ */
8
+ export declare function loadGlobalConfig(): GlobalConfig;
9
+ /**
10
+ * Save global config to ~/.agent-harness/config.yaml
11
+ */
12
+ export declare function saveGlobalConfig(config: GlobalConfig): void;
13
+ /**
14
+ * Resolve the API key from (in priority order):
15
+ * 1. ANTHROPIC_API_KEY environment variable
16
+ * 2. Global config file (~/.agent-harness/config.yaml)
17
+ */
18
+ export declare function resolveApiKey(): string | undefined;
19
+ /**
20
+ * CLI config command handler.
21
+ *
22
+ * Usage:
23
+ * agent-harness config set api-key <value>
24
+ * agent-harness config get api-key
25
+ */
26
+ export declare function configCommand(action: string, key?: string, value?: string): void;
package/dist/commands/config.js ADDED
@@ -0,0 +1,91 @@
1
+ import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
2
+ import { join } from "node:path";
3
+ import { homedir } from "node:os";
4
+ import { parse as parseYaml, stringify as toYaml } from "yaml";
5
+ function getGlobalConfigDir() {
6
+ return join(homedir(), ".agent-harness");
7
+ }
8
+ function getGlobalConfigPath() {
9
+ return join(getGlobalConfigDir(), "config.yaml");
10
+ }
11
+ /**
12
+ * Load global config from ~/.agent-harness/config.yaml
13
+ */
14
+ export function loadGlobalConfig() {
15
+ if (!existsSync(getGlobalConfigPath())) {
16
+ return {};
17
+ }
18
+ try {
19
+ const content = readFileSync(getGlobalConfigPath(), "utf-8");
20
+ return parseYaml(content) ?? {};
21
+ }
22
+ catch {
23
+ return {};
24
+ }
25
+ }
26
+ /**
27
+ * Save global config to ~/.agent-harness/config.yaml
28
+ */
29
+ export function saveGlobalConfig(config) {
30
+ mkdirSync(getGlobalConfigDir(), { recursive: true });
31
+ writeFileSync(getGlobalConfigPath(), toYaml(config), "utf-8");
32
+ }
33
+ /**
34
+ * Resolve the API key from (in priority order):
35
+ * 1. ANTHROPIC_API_KEY environment variable
36
+ * 2. Global config file (~/.agent-harness/config.yaml)
37
+ */
38
+ export function resolveApiKey() {
39
+ // Check env var first
40
+ const envKey = process.env.ANTHROPIC_API_KEY;
41
+ if (envKey)
42
+ return envKey;
43
+ // Check global config
44
+ const config = loadGlobalConfig();
45
+ if (config.api_key)
46
+ return config.api_key;
47
+ return undefined;
48
+ }
49
+ /**
50
+ * CLI config command handler.
51
+ *
52
+ * Usage:
53
+ * agent-harness config set api-key <value>
54
+ * agent-harness config get api-key
55
+ */
56
+ export function configCommand(action, key, value) {
57
+ // Map CLI key names to config key names
58
+ const keyMap = {
59
+ "api-key": "api_key",
60
+ };
61
+ if (action === "set") {
62
+ if (!key || !value) {
63
+ console.error("Usage: agent-harness config set <key> <value>");
64
+ return;
65
+ }
66
+ const configKey = keyMap[key] ?? key;
67
+ const config = loadGlobalConfig();
68
+ config[configKey] = value;
69
+ saveGlobalConfig(config);
70
+ console.log(`Set ${key} in global config.`);
71
+ }
72
+ else if (action === "get") {
73
+ if (!key) {
74
+ console.error("Usage: agent-harness config get <key>");
75
+ return;
76
+ }
77
+ const configKey = keyMap[key] ?? key;
78
+ const config = loadGlobalConfig();
79
+ const val = config[configKey];
80
+ if (val !== undefined) {
81
+ console.log(String(val));
82
+ }
83
+ else {
84
+ console.log(`${key} is not set.`);
85
+ }
86
+ }
87
+ else {
88
+ console.error(`Unknown config action: ${action}. Use "get" or "set".`);
89
+ }
90
+ }
91
+ //# sourceMappingURL=config.js.map
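The lookup order in `resolveApiKey` and the key mapping in `configCommand` can be exercised without touching the filesystem if the environment and config are passed in. The sketch below follows the logic of the code above; `pickApiKey` and `toConfigKey` are hypothetical names introduced for illustration.

```typescript
// Mirrors the keyMap in config.js: CLI names use dashes, stored keys
// use underscores. Unknown keys pass through unchanged.
const KEY_MAP: Record<string, string> = { "api-key": "api_key" };

function toConfigKey(cliKey: string): string {
  return KEY_MAP[cliKey] ?? cliKey;
}

// Same precedence as resolveApiKey(): the environment variable wins,
// then the global config file, otherwise undefined. Empty strings are
// treated as "not set", as in the original truthiness checks.
function pickApiKey(
  env: Record<string, string | undefined>,
  config: { api_key?: string },
): string | undefined {
  return env["ANTHROPIC_API_KEY"] || config.api_key || undefined;
}
```

One consequence of this order is that a key saved with `config set api-key` is ignored whenever `ANTHROPIC_API_KEY` is set to a non-empty value in the shell.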
package/dist/commands/config.js.map ADDED
@@ -0,0 +1 @@
1
+ {"version":3,"file":"config.js","sourceRoot":"","sources":["../../src/commands/config.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,UAAU,EAAE,SAAS,EAAE,YAAY,EAAE,aAAa,EAAE,MAAM,SAAS,CAAC;AAC7E,OAAO,EAAE,IAAI,EAAE,MAAM,WAAW,CAAC;AACjC,OAAO,EAAE,OAAO,EAAE,MAAM,SAAS,CAAC;AAClC,OAAO,EAAE,KAAK,IAAI,SAAS,EAAE,SAAS,IAAI,MAAM,EAAE,MAAM,MAAM,CAAC;AAE/D,SAAS,kBAAkB;IACzB,OAAO,IAAI,CAAC,OAAO,EAAE,EAAE,gBAAgB,CAAC,CAAC;AAC3C,CAAC;AAED,SAAS,mBAAmB;IAC1B,OAAO,IAAI,CAAC,kBAAkB,EAAE,EAAE,aAAa,CAAC,CAAC;AACnD,CAAC;AAOD;;GAEG;AACH,MAAM,UAAU,gBAAgB;IAC9B,IAAI,CAAC,UAAU,CAAC,mBAAmB,EAAE,CAAC,EAAE,CAAC;QACvC,OAAO,EAAE,CAAC;IACZ,CAAC;IACD,IAAI,CAAC;QACH,MAAM,OAAO,GAAG,YAAY,CAAC,mBAAmB,EAAE,EAAE,OAAO,CAAC,CAAC;QAC7D,OAAQ,SAAS,CAAC,OAAO,CAAkB,IAAI,EAAE,CAAC;IACpD,CAAC;IAAC,MAAM,CAAC;QACP,OAAO,EAAE,CAAC;IACZ,CAAC;AACH,CAAC;AAED;;GAEG;AACH,MAAM,UAAU,gBAAgB,CAAC,MAAoB;IACnD,SAAS,CAAC,kBAAkB,EAAE,EAAE,EAAE,SAAS,EAAE,IAAI,EAAE,CAAC,CAAC;IACrD,aAAa,CAAC,mBAAmB,EAAE,EAAE,MAAM,CAAC,MAAM,CAAC,EAAE,OAAO,CAAC,CAAC;AAChE,CAAC;AAED;;;;GAIG;AACH,MAAM,UAAU,aAAa;IAC3B,sBAAsB;IACtB,MAAM,MAAM,GAAG,OAAO,CAAC,GAAG,CAAC,iBAAiB,CAAC;IAC7C,IAAI,MAAM;QAAE,OAAO,MAAM,CAAC;IAE1B,sBAAsB;IACtB,MAAM,MAAM,GAAG,gBAAgB,EAAE,CAAC;IAClC,IAAI,MAAM,CAAC,OAAO;QAAE,OAAO,MAAM,CAAC,OAAO,CAAC;IAE1C,OAAO,SAAS,CAAC;AACnB,CAAC;AAED;;;;;;GAMG;AACH,MAAM,UAAU,aAAa,CAC3B,MAAc,EACd,GAAY,EACZ,KAAc;IAEd,wCAAwC;IACxC,MAAM,MAAM,GAA2B;QACrC,SAAS,EAAE,SAAS;KACrB,CAAC;IAEF,IAAI,MAAM,KAAK,KAAK,EAAE,CAAC;QACrB,IAAI,CAAC,GAAG,IAAI,CAAC,KAAK,EAAE,CAAC;YACnB,OAAO,CAAC,KAAK,CAAC,+CAA+C,CAAC,CAAC;YAC/D,OAAO;QACT,CAAC;QACD,MAAM,SAAS,GAAG,MAAM,CAAC,GAAG,CAAC,IAAI,GAAG,CAAC;QACrC,MAAM,MAAM,GAAG,gBAAgB,EAAE,CAAC;QAClC,MAAM,CAAC,SAAS,CAAC,GAAG,KAAK,CAAC;QAC1B,gBAAgB,CAAC,MAAM,CAAC,CAAC;QACzB,OAAO,CAAC,GAAG,CAAC,OAAO,GAAG,oBAAoB,CAAC,CAAC;IAC9C,CAAC;SAAM,IAAI,MAAM,KAAK,KAAK,EAAE,CAAC;QAC5B,IAAI,CAAC,GAAG,EAAE,CAAC;YACT,OAAO,CAAC,KAAK,CAAC,uCAAuC,CAAC,CAAC;YACvD,OAAO;QACT,CAAC;QACD,MAAM,SAAS,GAAG,MAAM,CAAC,GAAG,CAAC,IAAI,GAAG,CAAC;QACrC,MAAM,MAAM,GAAG,gBAAgB,E
AAE,CAAC;QAClC,MAAM,GAAG,GAAG,MAAM,CAAC,SAAS,CAAC,CAAC;QAC9B,IAAI,GAAG,KAAK,SAAS,EAAE,CAAC;YACtB,OAAO,CAAC,GAAG,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC,CAAC;QAC3B,CAAC;aAAM,CAAC;YACN,OAAO,CAAC,GAAG,CAAC,GAAG,GAAG,cAAc,CAAC,CAAC;QACpC,CAAC;IACH,CAAC;SAAM,CAAC;QACN,OAAO,CAAC,KAAK,CAAC,0BAA0B,MAAM,uBAAuB,CAAC,CAAC;IACzE,CAAC;AACH,CAAC"}
package/dist/commands/init.d.ts ADDED
@@ -0,0 +1 @@
1
+ export declare function initCommand(): void;
package/dist/commands/init.js ADDED
@@ -0,0 +1,67 @@
1
+ import { existsSync, mkdirSync, writeFileSync } from "node:fs";
2
+ import { join } from "node:path";
3
+ import { stringify as toYaml } from "yaml";
4
+ import { buildProjectContext } from "../discovery/project-context.js";
5
+ export function initCommand() {
6
+ const root = process.cwd();
7
+ const harnessDir = join(root, ".harness");
8
+ const configPath = join(harnessDir, "config.yaml");
9
+ if (existsSync(configPath)) {
10
+ console.log(".harness/config.yaml already exists. Skipping init.");
11
+ return;
12
+ }
13
+ // Run project discovery
14
+ const ctx = buildProjectContext(root, null);
15
+ console.log("Detected project:");
16
+ console.log(` Repository type: ${ctx.repoType}`);
17
+ for (const ws of ctx.workspaces) {
18
+ console.log(` Workspace: ${ws.path}`);
19
+ console.log(` Language: ${ws.stack.language}`);
20
+ if (ws.stack.framework) {
21
+ console.log(` Framework: ${ws.stack.framework}`);
22
+ }
23
+ if (ws.stack.testRunner) {
24
+ console.log(` Test runner: ${ws.stack.testRunner}`);
25
+ }
26
+ console.log(` Test command: ${ws.stack.testCommand}`);
27
+ }
28
+ if (ctx.rootClaudeMd) {
29
+ console.log(" CLAUDE.md: found");
30
+ }
31
+ console.log("");
32
+ // Scaffold .harness directory
33
+ mkdirSync(harnessDir, { recursive: true });
34
+ // Build initial config from detected values
35
+ const config = {
36
+ // Agent configuration (toYaml writes these as plain values; no comments are emitted)
37
+ agents: {
38
+ planner: { model: "sonnet" },
39
+ generator: { model: "opus" },
40
+ evaluator: { model: "sonnet" },
41
+ },
42
+ max_attempts_per_sprint: 3,
43
+ max_budget_per_sprint_usd: 5,
44
+ max_total_budget_usd: 50,
45
+ };
46
+ writeFileSync(configPath, toYaml(config), "utf-8");
47
+ console.log("Created .harness/config.yaml");
48
+ // Create criteria.md template
49
+ const criteriaPath = join(harnessDir, "criteria.md");
50
+ if (!existsSync(criteriaPath)) {
51
+ const criteriaTemplate = `# Custom Evaluation Criteria
52
+
53
+ Add project-specific criteria here. These are checked IN ADDITION to the default criteria.
54
+
55
+ ## Examples (uncomment and customize):
56
+ # - All API endpoints must return proper HTTP status codes
57
+ # - Database migrations must be reversible
58
+ # - All user-facing strings must be internationalized
59
+ # - Performance: API responses must be under 200ms
60
+ `;
61
+ writeFileSync(criteriaPath, criteriaTemplate, "utf-8");
62
+ console.log("Created .harness/criteria.md");
63
+ }
64
+ console.log("\nDone! Edit .harness/config.yaml to customize settings.");
65
+ console.log("Run 'agent-harness run \"<spec>\"' to start.");
66
+ }
67
+ //# sourceMappingURL=init.js.map
package/dist/commands/init.js.map ADDED
@@ -0,0 +1 @@
1
+ {"version":3,"file":"init.js","sourceRoot":"","sources":["../../src/commands/init.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,UAAU,EAAE,SAAS,EAAE,aAAa,EAAE,MAAM,SAAS,CAAC;AAC/D,OAAO,EAAE,IAAI,EAAE,MAAM,WAAW,CAAC;AACjC,OAAO,EAAE,SAAS,IAAI,MAAM,EAAE,MAAM,MAAM,CAAC;AAC3C,OAAO,EAAE,mBAAmB,EAAE,MAAM,iCAAiC,CAAC;AAEtE,MAAM,UAAU,WAAW;IACzB,MAAM,IAAI,GAAG,OAAO,CAAC,GAAG,EAAE,CAAC;IAC3B,MAAM,UAAU,GAAG,IAAI,CAAC,IAAI,EAAE,UAAU,CAAC,CAAC;IAC1C,MAAM,UAAU,GAAG,IAAI,CAAC,UAAU,EAAE,aAAa,CAAC,CAAC;IAEnD,IAAI,UAAU,CAAC,UAAU,CAAC,EAAE,CAAC;QAC3B,OAAO,CAAC,GAAG,CAAC,qDAAqD,CAAC,CAAC;QACnE,OAAO;IACT,CAAC;IAED,wBAAwB;IACxB,MAAM,GAAG,GAAG,mBAAmB,CAAC,IAAI,EAAE,IAAI,CAAC,CAAC;IAE5C,OAAO,CAAC,GAAG,CAAC,mBAAmB,CAAC,CAAC;IACjC,OAAO,CAAC,GAAG,CAAC,sBAAsB,GAAG,CAAC,QAAQ,EAAE,CAAC,CAAC;IAClD,KAAK,MAAM,EAAE,IAAI,GAAG,CAAC,UAAU,EAAE,CAAC;QAChC,OAAO,CAAC,GAAG,CAAC,gBAAgB,EAAE,CAAC,IAAI,EAAE,CAAC,CAAC;QACvC,OAAO,CAAC,GAAG,CAAC,iBAAiB,EAAE,CAAC,KAAK,CAAC,QAAQ,EAAE,CAAC,CAAC;QAClD,IAAI,EAAE,CAAC,KAAK,CAAC,SAAS,EAAE,CAAC;YACvB,OAAO,CAAC,GAAG,CAAC,kBAAkB,EAAE,CAAC,KAAK,CAAC,SAAS,EAAE,CAAC,CAAC;QACtD,CAAC;QACD,IAAI,EAAE,CAAC,KAAK,CAAC,UAAU,EAAE,CAAC;YACxB,OAAO,CAAC,GAAG,CAAC,oBAAoB,EAAE,CAAC,KAAK,CAAC,UAAU,EAAE,CAAC,CAAC;QACzD,CAAC;QACD,OAAO,CAAC,GAAG,CAAC,qBAAqB,EAAE,CAAC,KAAK,CAAC,WAAW,EAAE,CAAC,CAAC;IAC3D,CAAC;IACD,IAAI,GAAG,CAAC,YAAY,EAAE,CAAC;QACrB,OAAO,CAAC,GAAG,CAAC,oBAAoB,CAAC,CAAC;IACpC,CAAC;IACD,OAAO,CAAC,GAAG,CAAC,EAAE,CAAC,CAAC;IAEhB,8BAA8B;IAC9B,SAAS,CAAC,UAAU,EAAE,EAAE,SAAS,EAAE,IAAI,EAAE,CAAC,CAAC;IAE3C,4CAA4C;IAC5C,MAAM,MAAM,GAA4B;QACtC,+DAA+D;QAC/D,MAAM,EAAE;YACN,OAAO,EAAE,EAAE,KAAK,EAAE,QAAQ,EAAE;YAC5B,SAAS,EAAE,EAAE,KAAK,EAAE,MAAM,EAAE;YAC5B,SAAS,EAAE,EAAE,KAAK,EAAE,QAAQ,EAAE;SAC/B;QACD,uBAAuB,EAAE,CAAC;QAC1B,yBAAyB,EAAE,CAAC;QAC5B,oBAAoB,EAAE,EAAE;KACzB,CAAC;IAEF,aAAa,CAAC,UAAU,EAAE,MAAM,CAAC,MAAM,CAAC,EAAE,OAAO,CAAC,CAAC;IACnD,OAAO,CAAC,GAAG,CAAC,8BAA8B,CAAC,CAAC;IAE5C,8BAA8B;IAC9B,MAAM,YAAY,GAAG,IAAI,CAAC,UAAU,EAAE,aAAa,CAAC,CAAC;IACrD,IAAI,CAAC,UAAU,CAAC,YAAY,CAAC,EAAE,CAAC;QAC9B
,MAAM,gBAAgB,GAAG;;;;;;;;;CAS5B,CAAC;QACE,aAAa,CAAC,YAAY,EAAE,gBAAgB,EAAE,OAAO,CAAC,CAAC;QACvD,OAAO,CAAC,GAAG,CAAC,8BAA8B,CAAC,CAAC;IAC9C,CAAC;IAED,OAAO,CAAC,GAAG,CAAC,0DAA0D,CAAC,CAAC;IACxE,OAAO,CAAC,GAAG,CAAC,8CAA8C,CAAC,CAAC;AAC9D,CAAC"}
package/dist/commands/resume.d.ts ADDED
@@ -0,0 +1,6 @@
1
+ export interface ResumeOptions {
2
+ maxBudget?: number;
3
+ dashboard?: boolean;
4
+ port?: number;
5
+ }
6
+ export declare function resumeCommand(options: ResumeOptions): Promise<void>;
package/dist/commands/resume.js ADDED
@@ -0,0 +1,88 @@
1
+ import { Harness } from "../core/orchestrator.js";
2
+ import { resolveApiKey } from "./config.js";
3
+ function formatDuration(ms) {
4
+ const seconds = Math.floor(ms / 1000);
5
+ if (seconds < 60)
6
+ return `${seconds}s`;
7
+ const minutes = Math.floor(seconds / 60);
8
+ const remainingSeconds = seconds % 60;
9
+ if (minutes < 60)
10
+ return `${minutes}m ${remainingSeconds}s`;
11
+ const hours = Math.floor(minutes / 60);
12
+ const remainingMinutes = minutes % 60;
13
+ return `${hours}h ${remainingMinutes}m`;
14
+ }
15
+ export async function resumeCommand(options) {
16
+ const apiKey = resolveApiKey();
17
+ if (!apiKey) {
18
+ console.error("Error: No API key found.");
19
+ console.error("Set ANTHROPIC_API_KEY environment variable or run: agent-harness config set api-key <key>");
20
+ process.exit(1);
21
+ }
22
+ const root = process.cwd();
23
+ console.log("Resuming agent-harness run...");
24
+ console.log("");
25
+ const harness = new Harness({
26
+ apiKey,
27
+ root,
28
+ maxTotalBudgetUsd: options.maxBudget,
29
+ });
30
+ // Same event listeners as run command
31
+ harness.on("phase:start", (data) => {
32
+ const sprint = data.sprint > 0 ? ` [Sprint ${data.sprint}]` : "";
33
+ const attempt = data.attempt > 0 ? ` (attempt ${data.attempt})` : "";
34
+ console.log(`\n--- ${data.phase.toUpperCase()}${sprint}${attempt} ---`);
35
+ });
36
+ harness.on("agent:activity", (data) => {
37
+ console.log(` [${data.role}] ${data.summary}`);
38
+ });
39
+ harness.on("evaluation", (data) => {
40
+ const icon = data.result.passed ? "PASS" : "FAIL";
41
+ console.log(`\n Evaluation: ${icon}`);
42
+ if (data.result.passedCriteria.length > 0) {
43
+ for (const c of data.result.passedCriteria) {
44
+ console.log(` + ${c}`);
45
+ }
46
+ }
47
+ if (data.result.failedCriteria.length > 0) {
48
+ for (const c of data.result.failedCriteria) {
49
+ console.log(` - ${c}`);
50
+ }
51
+ }
52
+ if (!data.result.passed && data.result.critique) {
53
+ console.log(` Critique: ${data.result.critique.slice(0, 200)}`);
54
+ }
55
+ });
56
+ harness.on("cost:update", (data) => {
57
+ console.log(` Cost: $${data.totalCostUsd.toFixed(2)} / $${data.budgetUsd.toFixed(2)}`);
58
+ });
59
+ harness.on("sprint:complete", (data) => {
60
+ const icon = data.status === "passed" ? "PASS" : "FAIL";
61
+ console.log(`\nSprint ${data.sprint}: ${icon} (${data.attempts} attempt${data.attempts > 1 ? "s" : ""}, $${data.costUsd.toFixed(2)})`);
62
+ });
63
+ harness.on("run:complete", (data) => {
64
+ console.log("\n========================================");
65
+ console.log(`Run ${data.status.toUpperCase()}`);
66
+ console.log(`Sprints: ${data.totalSprints}`);
67
+ console.log(`Total cost: $${data.totalCostUsd.toFixed(2)}`);
68
+ console.log(`Duration: ${formatDuration(data.durationMs)}`);
69
+ console.log("========================================");
70
+ });
71
+ // Handle SIGINT for graceful shutdown
72
+ const handleSignal = () => {
73
+ console.log("\nReceived interrupt signal. Stopping...");
74
+ harness.stop();
75
+ };
76
+ process.on("SIGINT", handleSignal);
77
+ try {
78
+ await harness.resume();
79
+ }
80
+ catch (error) {
81
+ console.error(`\nResume failed: ${error instanceof Error ? error.message : String(error)}`);
82
+ process.exit(1);
83
+ }
84
+ finally {
85
+ process.off("SIGINT", handleSignal);
86
+ }
87
+ }
88
+ //# sourceMappingURL=resume.js.map
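The `formatDuration` helper at the top of this file truncates rather than rounds, and it drops the seconds component once a run exceeds an hour. A typed copy of the same logic with a few worked inputs makes that behaviour easier to see:

```typescript
// Same logic as formatDuration in resume.js, with type annotations added.
function formatDuration(ms: number): string {
  const seconds = Math.floor(ms / 1000);
  if (seconds < 60) return `${seconds}s`;
  const minutes = Math.floor(seconds / 60);
  if (minutes < 60) return `${minutes}m ${seconds % 60}s`;
  return `${Math.floor(minutes / 60)}h ${minutes % 60}m`;
}

formatDuration(45_999);    // "45s"   (the sub-second remainder is truncated)
formatDuration(125_000);   // "2m 5s"
formatDuration(3_725_000); // "1h 2m" (the 5 leftover seconds are dropped)
```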