@weldr/runr 0.3.0 → 0.4.0

package/CHANGELOG.md CHANGED
@@ -7,6 +7,29 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
  ## [Unreleased]
 
+ ## [0.4.0] - 2026-01-03
+
+ **Case Files** — Every run leaves a machine-readable journal.
+
+ ### Added
+
+ - **Case Files**: Auto-generated `journal.md` + `journal.json` for every run
+   - Schema v1.0 with immutable facts (timestamps, milestones, verification attempts)
+   - Living data (append-only notes)
+   - Secret redaction in error excerpts
+   - Warnings array captures all extraction issues
+ - **CLI Commands**:
+   - `runr journal [run_id]` — Generate and display journal (defaults to latest)
+   - `runr note <message> [--run-id]` — Add timestamped note (defaults to latest)
+   - `runr open [run_id]` — Open journal in $EDITOR (defaults to latest)
+ - **Auto-generation**: Journals written on run completion (stop or finish)
+ - **Non-interactive safety**: `runr open` fails cleanly in CI or when $EDITOR unset
+
+ ### Fixed
+
+ - **Package bloat**: Excluded test files from npm package (81 → 69 files)
+ - **Deprecation warnings**: Replaced deprecated `getRunsRoot()` with `getRunrPaths().runs_dir`
+
  ## [0.3.0] - 2026-01-01
 
  **Renamed to Runr.** New identity, same reliability-first mission.
package/README.md CHANGED
@@ -1,57 +1,122 @@
  # Runr
 
- Phase-gated orchestration for agent tasks.
+ **Stop losing 30 minutes when the agent derails.**
 
- > **Status**: v0.3.0 — Renamed from `agent-runner`. Early, opinionated, evolving.
+ ![Failure Recovery](demo/failure-checkpoint.gif)
+
+ *When verification fails after 3 checkpoints, progress isn't lost — Runr saves verified work as git commits.*
+
+ ## Quickstart
+
+ ```bash
+ npm install -g @weldr/runr
+ cd your-repo
+ runr init
+ runr run --task .runr/tasks/your-task.md --worktree
+ ```
 
- ## The Problem
+ **If it stops:** Run the suggested command in `.runr/runs/<run_id>/handoffs/stop.json`
 
- AI agents can write code. They can also:
- - Claim success without verification
- - Modify files they shouldn't touch
- - Get stuck in infinite loops
- - Fail in ways that are impossible to debug
+ ![Next Action](demo/next-action.gif)
 
- **Runr doesn't make agents smarter. It makes them accountable.**
+ *Runr writes a stop handoff so agents know exactly what to do next — no guessing, no hallucinating.*
 
- ## What This Does
+ ## How It Works
 
- Runr orchestrates AI workers (Claude, Codex) through a phase-based workflow with hard gates:
+ Runr orchestrates AI workers through phase gates with checkpoints:
 
  ```
  PLAN → IMPLEMENT → VERIFY → REVIEW → CHECKPOINT → done
- ↑___________| (retry if needed)
+ ↑___________| (retry if verification fails)
  ```
 
- Every phase has criteria. You don't advance without meeting them.
+ - **Phase gates** Agent can't skip verification or claim false success
+ - **Checkpoints** — Verified milestones saved as git commits
+ - **Stop handoffs** — Structured diagnostics with next actions
+ - **Scope guards** — Files outside scope are protected
+
+ > **Status**: v0.3.0 — Renamed from `agent-runner`. Early, opinionated, evolving.
 
- ## Why Phase Gates?
+ ## Meta-Agent Quickstart (Recommended)
 
- Most agent tools optimize for speed. Runr optimizes for **trust**.
+ **The easiest way to use Runr:** Let your coding agent drive it.
 
- When a run fails (and it will), you get:
- - **Structured diagnostics** — exactly why it stopped
- - **Checkpoints** — resume from where it failed
- - **Scope guards** — files it couldn't touch, it didn't touch
- - **Evidence** — "done" means "proven done"
+ Runr works as a **reliable execution backend**. Instead of learning CLI commands, your agent (Claude Code, Codex, etc.) operates Runr for you — handling runs, interpreting failures, and resuming from checkpoints.
 
- ## Quick Start
+ ### Setup (One-Time)
 
  ```bash
- # Install
- git clone https://github.com/vonwao/runr.git
- cd runr && npm install && npm run build && npm link
+ # 1. Install Runr
+ npm install -g @weldr/runr
 
- # Verify
- runr version
+ # 2. Verify environment
  runr doctor
 
- # Run a task
+ # 3. Create minimal config
+ mkdir -p .runr/tasks
+ cat > .runr/runr.config.json << 'EOF'
+ {
+   "agent": { "name": "my-project", "version": "1" },
+   "scope": {
+     "presets": ["typescript", "vitest"]
+   },
+   "verification": {
+     "tier0": ["npm run typecheck"],
+     "tier1": ["npm test"]
+   }
+ }
+ EOF
+ ```
+
+ ### Usage
+
+ Just tell your coding agent:
+
+ > "Use Runr to add user authentication with OAuth2. Create checkpoints after each milestone."
+
+ The agent will:
+ 1. Create a task file (`.runr/tasks/add-auth.md`)
+ 2. Run `runr run --task ... --worktree`
+ 3. Monitor progress with `runr status`
+ 4. Handle failures, resume from checkpoints
+ 5. Report results with commit links
+
+ **See [RUNR_OPERATOR.md](./RUNR_OPERATOR.md)** for the complete agent integration guide.
+
+ ### Why This Works
+
+ Most devs already have a coding agent open. Telling them:
+ - "Drop this in your agent, and it'll drive Runr for you"
+
+ …has near-zero friction compared to:
+ - "Learn these CLI commands, create config files, understand phase gates"
+
+ The agent becomes your operator. Runr stays the reliable execution layer.
+
+ ---
+
+ ## Quick Start (Direct CLI)
+
+ ```bash
+ # Install
+ npm install -g @weldr/runr
+
+ # Initialize in your project
  cd /your/project
- runr run --task .runr/tasks/my-task.md --worktree
+ runr init
+
+ # Run a task
+ runr run --task .runr/tasks/example-feature.md --worktree
+
+ # If it fails, resume from last checkpoint
+ runr resume <run_id>
+
+ # Get machine-readable diagnostics
+ runr summarize <run_id>
+ # Output: .runr/runs/<run_id>/handoffs/stop.json
  ```
 
- > Not on npm yet. Coming soon as `@weldr/runr`.
+ > Prefer source install? See [Development](#development).
 
  ## Configuration
 
@@ -91,15 +156,20 @@ Available: `nextjs`, `react`, `drizzle`, `prisma`, `vitest`, `jest`, `playwright
 
  | Command | What it does |
  |---------|--------------|
+ | `runr init` | Initialize config (auto-detect verify commands) |
  | `runr run --task <file>` | Start a task |
  | `runr resume <id>` | Continue from checkpoint |
+ | `runr watch <id> --auto-resume` | Watch run + auto-resume on failure |
  | `runr status [id]` | Show run state |
  | `runr follow [id]` | Tail run progress |
- | `runr report <id>` | Generate run report |
+ | `runr report <id>` | Generate run report (includes next_action) |
+ | `runr journal [id]` | Generate and display case file |
+ | `runr note <message>` | Add timestamped note to run |
+ | `runr open [id]` | Open journal in $EDITOR |
  | `runr gc` | Clean up old runs |
  | `runr doctor` | Check environment |
 
- ### The Fun Commands
+ ### Aliases
 
  Same functionality, different vibe:
 
@@ -110,6 +180,57 @@ runr scry <id> # status
  runr banish # gc
  ```
 
+ ## Case Files
+
+ Every run automatically generates a **journal.md** case file in `.runr/runs/<run_id>/journal.md` containing:
+
+ - **Run metadata** (timestamps, duration, stop reason)
+ - **Task details** (goal, requirements, success criteria)
+ - **Milestone progress** (attempted, verified, checkpoints)
+ - **Verification history** (test attempts, pass/fail counts)
+ - **Code changes** (files changed, diff stats, top files)
+ - **Error excerpts** (last failure with redacted secrets)
+ - **Next action** (suggested command to continue)
+ - **Notes** (timestamped annotations)
+
+ ### Commands
+
+ ```bash
+ # Generate and display journal for latest run
+ runr journal
+
+ # Generate journal for specific run
+ runr journal <run_id>
+
+ # Force regeneration even if up to date
+ runr journal <run_id> --force
+
+ # Add a timestamped note to latest run
+ runr note "Debugging OAuth token refresh issue"
+
+ # Add note to specific run
+ runr note "Fixed token refresh" --run-id <run_id>
+
+ # Open journal in $EDITOR (defaults to latest run)
+ runr open
+ runr open <run_id>
+ ```
+
+ **Note**: If `<run_id>` is omitted, all commands default to the most recent run in the repository.
+
+ ### Auto-Generation
+
+ Journals are automatically generated when runs complete (stop or finish). You can also:
+ - Manually regenerate with `runr journal <run_id> --force`
+ - Add timestamped notes during or after runs with `runr note` (stored in `.runr/runs/<run_id>/notes.jsonl`)
+ - Open in your editor with `runr open` (uses `$EDITOR` or `vim`)
+
+ **Use case**: Share run context with collaborators, document debugging sessions, track experiment results.
+
+ **Files generated:**
+ - `journal.md` - Human-readable case file
+ - `notes.jsonl` - Timestamped notes (one JSON object per line)
+
  ## Task Files
 
  Tasks are markdown files:
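
The `notes.jsonl` format added in this hunk (one JSON object per line, append-only) can be illustrated with a minimal sketch. This is not Runr's implementation: the helper names `formatNoteLine` and `parseNotes` and the field names `timestamp` and `message` are assumptions, since the diff does not show the actual note schema.

```javascript
// Hypothetical helpers; the real notes.jsonl field names are not shown in this diff.

// Serialize one note as a single JSONL record (one JSON object, newline-terminated).
function formatNoteLine(message, now = new Date()) {
  const note = { timestamp: now.toISOString(), message };
  // JSON.stringify escapes embedded newlines, so the record stays on one line.
  return JSON.stringify(note) + '\n';
}

// Parse the full contents of a JSONL file back into an array of notes.
function parseNotes(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Appending a note is plain concatenation; earlier records are never rewritten.
const log =
  formatNoteLine('Debugging OAuth token refresh issue', new Date('2026-01-03T10:00:00Z')) +
  formatNoteLine('Fixed token refresh', new Date('2026-01-03T11:30:00Z'));

console.log(parseNotes(log).length); // 2
```

Because each record is self-contained, a reader can recover all earlier notes even if the last line is truncated by an interrupted write, provided it skips or reports the one line that fails to parse.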
@@ -146,26 +267,23 @@ Every stop produces `stop.json` + `stop.md` with diagnostics.
 
  ## Philosophy
 
- **This is not magic.** Runs fail. The goal is *understandable, resumable* failure.
+ This isn't magic. Runs fail. The goal is understandable, resumable failure.
 
- **This is not a chatbot.** Task in, code out. No conversation.
+ This isn't a chatbot. Task in, code out.
 
- **This is not a code generator.** It orchestrates generators. Different job.
+ This isn't a code generator. It orchestrates generators.
 
- **Agents lie. Logs don't.** If it can't prove it, it didn't do it.
+ Agents lie. Logs don't. If it can't prove it, it didn't do it.
 
  ## Migrating from agent-runner
 
- If you're upgrading from `agent-runner`:
-
  | Old | New |
  |-----|-----|
  | `agent` CLI | `runr` CLI |
  | `.agent/` directory | `.runr/` directory |
  | `agent.config.json` | `runr.config.json` |
  | `.agent-worktrees/` | `.runr-worktrees/` |
-
- Both old and new locations work during the transition period. You'll see deprecation warnings for old locations.
+ Old paths still work for now, with deprecation warnings.
 
  ## Development
 
@@ -179,21 +297,21 @@ npm run dev -- run --task task.md # run from source
 
  | Version | Date | Highlights |
  |---------|------|------------|
- | v0.3.0 | 2026-01-01 | **Renamed to Runr**, new CLI, new directory structure |
- | v0.2.2 | 2025-12-31 | Worktree location fix, guard diagnostics |
- | v0.2.1 | 2025-12-29 | Scope presets, review digest |
- | v0.2.0 | 2025-12-28 | Review loop detection |
- | v0.1.0 | 2025-12-27 | Initial stable release |
+ | v0.3.0 | **Renamed to Runr**, new CLI, new directory structure |
+ | v0.2.2 | Worktree location fix, guard diagnostics |
+ | v0.2.1 | Scope presets, review digest |
+ | v0.2.0 | Review loop detection |
+ | v0.1.0 | Initial stable release |
 
- See [CHANGELOG.md](CHANGELOG.md) for detailed release notes.
+ See [CHANGELOG.md](CHANGELOG.md) for details.
 
  ## Contributing
 
- See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup and guidelines.
+ See [CONTRIBUTING.md](CONTRIBUTING.md).
 
  ## License
 
- Apache 2.0 — See [LICENSE](LICENSE)
+ Apache 2.0 — See [LICENSE](LICENSE).
 
  ---
 
package/dist/cli.js CHANGED
@@ -5,6 +5,7 @@ import { resumeCommand } from './commands/resume.js';
  import { statusCommand, statusAllCommand } from './commands/status.js';
  import { reportCommand, findLatestRunId } from './commands/report.js';
  import { summarizeCommand } from './commands/summarize.js';
+ import { nextCommand } from './commands/next.js';
  import { compareCommand } from './commands/compare.js';
  import { guardsOnlyCommand } from './commands/guards-only.js';
  import { doctorCommand } from './commands/doctor.js';
@@ -15,6 +16,9 @@ import { orchestrateCommand, resumeOrchestrationCommand, waitOrchestrationComman
  import { pathsCommand } from './commands/paths.js';
  import { metricsCommand } from './commands/metrics.js';
  import { versionCommand } from './commands/version.js';
+ import { initCommand } from './commands/init.js';
+ import { watchCommand } from './commands/watch.js';
+ import { journalCommand, noteCommand, openCommand } from './commands/journal.js';
  const program = new Command();
  // Check if invoked as deprecated 'agent' command
  const invokedAs = process.argv[1]?.split('/').pop() || 'runr';
@@ -24,6 +28,21 @@ if (invokedAs === 'agent') {
  program
      .name('runr')
      .description('Phase-gated orchestration for agent tasks');
+ program
+     .command('init')
+     .description('Initialize Runr configuration for a repository')
+     .option('--repo <path>', 'Path to repository (defaults to current directory)', '.')
+     .option('--interactive', 'Launch interactive setup wizard to configure verification commands', false)
+     .option('--print', 'Display generated config in terminal without writing to disk', false)
+     .option('--force', 'Overwrite existing .runr/runr.config.json if present', false)
+     .action(async (options) => {
+         await initCommand({
+             repo: options.repo,
+             interactive: options.interactive,
+             print: options.print,
+             force: options.force
+         });
+     });
  program
      .command('run')
      .option('--repo <path>', 'Target repo path (default: current directory)', '.')
@@ -132,6 +151,7 @@ program
      .option('--repo <path>', 'Target repo path (default: current directory)', '.')
      .option('--tail <count>', 'Tail last N events', '50')
      .option('--kpi-only', 'Show compact KPI summary only')
+     .option('--json', 'Output KPI as JSON (includes next_action and suggested_command)')
      .action(async (runId, options) => {
          let resolvedRunId = runId;
          if (runId === 'latest') {
@@ -146,7 +166,8 @@ program
              runId: resolvedRunId,
              repo: options.repo,
              tail: Number.parseInt(options.tail, 10),
-             kpiOnly: options.kpiOnly
+             kpiOnly: options.kpiOnly,
+             json: options.json
          });
      });
  program
@@ -166,6 +187,23 @@ program
          }
          await summarizeCommand({ runId: resolvedRunId, repo: options.repo });
      });
+ program
+     .command('next')
+     .description('Print suggested next command from stop handoff')
+     .argument('<runId>', 'Run ID (or "latest")')
+     .option('--repo <path>', 'Target repo path (default: current directory)', '.')
+     .action(async (runId, options) => {
+         let resolvedRunId = runId;
+         if (runId === 'latest') {
+             const latest = findLatestRunId(options.repo);
+             if (!latest) {
+                 console.error('No runs found');
+                 process.exit(1);
+             }
+             resolvedRunId = latest;
+         }
+         await nextCommand(resolvedRunId, { repo: options.repo });
+     });
  program
      .command('compare')
      .description('Compare KPIs between two runs')
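
The new `next` command repeats the same "latest" resolution block that already appears in the `report` and `summarize` handlers. A minimal sketch of how that pattern could be factored out follows. `resolveRunId` is a hypothetical helper that does not exist in the package, and the run IDs in the usage lines are made up. Unlike the CLI code, which prints the error and calls `process.exit(1)`, this version throws so the caller decides how to report the failure.

```javascript
// Hypothetical helper, not part of the published package.
// `findLatest` stands in for a call such as () => findLatestRunId(options.repo).
function resolveRunId(runId, findLatest) {
  if (runId !== 'latest') {
    return runId; // explicit IDs pass through untouched
  }
  const latest = findLatest();
  if (!latest) {
    // The CLI prints 'No runs found' and exits with code 1; this sketch
    // throws instead of terminating the process.
    throw new Error('No runs found');
  }
  return latest;
}

// Made-up run IDs, for illustration only.
console.log(resolveRunId('run-123', () => null)); // run-123
console.log(resolveRunId('latest', () => 'run-456')); // run-456
```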
@@ -258,6 +296,25 @@ program
              olderThan: Number.parseInt(options.olderThan, 10)
          });
      });
+ program
+     .command('watch')
+     .description('Watch run progress and optionally auto-resume on failure')
+     .argument('<runId>', 'Run ID to watch')
+     .option('--repo <path>', 'Target repo path (default: current directory)', '.')
+     .option('--auto-resume', 'Automatically resume on transient failures', false)
+     .option('--max-attempts <N>', 'Maximum auto-resume attempts (default: 3)', '3')
+     .option('--interval <seconds>', 'Poll interval in seconds (default: 5)', '5')
+     .option('--json', 'Output JSON events', false)
+     .action(async (runId, options) => {
+         await watchCommand({
+             runId,
+             repo: options.repo,
+             autoResume: options.autoResume,
+             maxAttempts: Number.parseInt(options.maxAttempts, 10),
+             interval: Number.parseInt(options.interval, 10) * 1000,
+             json: options.json
+         });
+     });
  program
      .command('wait')
      .description('Block until run reaches terminal state (for meta-agent coordination)')
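
The `watch` handler converts `--max-attempts` and `--interval` with `Number.parseInt` and multiplies the interval by 1000 to get milliseconds. With that code, a non-numeric value such as `--interval abc` yields `NaN` and is passed on to `watchCommand` unchecked. The sketch below shows one way the conversion could be validated. `parsePositiveInt` and `toWatchOptions` are hypothetical names and are not in the package; only the option names and the seconds-to-milliseconds conversion come from the diff.

```javascript
// Hypothetical helper: parse a CLI string as a positive integer or fail loudly.
// Note that Number.parseInt accepts trailing garbage ('5s' parses as 5).
function parsePositiveInt(raw, flagName) {
  const value = Number.parseInt(raw, 10);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${flagName} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Builds the same object shape that the handler passes to watchCommand.
function toWatchOptions(runId, options) {
  return {
    runId,
    repo: options.repo,
    autoResume: options.autoResume,
    maxAttempts: parsePositiveInt(options.maxAttempts, '--max-attempts'),
    interval: parsePositiveInt(options.interval, '--interval') * 1000, // seconds -> ms
    json: options.json,
  };
}

// Defaults as declared in the diff: '3' attempts, '5' seconds.
const watchOptions = toWatchOptions('run-123', {
  repo: '.',
  autoResume: false,
  maxAttempts: '3',
  interval: '5',
  json: false,
});
console.log(watchOptions.interval); // 5000
```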
@@ -461,4 +518,44 @@ program
              olderThan: Number.parseInt(options.olderThan, 10)
          });
      });
+ // journal - Generate case file from run
+ program
+     .command('journal')
+     .description('Generate and display journal.md for a run')
+     .argument('[runId]', 'Run ID (defaults to latest)')
+     .option('--repo <path>', 'Target repo path', '.')
+     .option('--output <file>', 'Output file path (defaults to runs/<id>/journal.md)')
+     .option('--force', 'Force regeneration even if up to date', false)
+     .action(async (runId, options) => {
+         await journalCommand({
+             repo: options.repo,
+             runId,
+             output: options.output,
+             force: options.force
+         });
+     });
+ // note - Add timestamped note to run
+ program
+     .command('note <message>')
+     .description('Add a timestamped note to a run')
+     .option('--repo <path>', 'Target repo path', '.')
+     .option('--run-id <id>', 'Run ID (defaults to latest)')
+     .action(async (message, options) => {
+         await noteCommand(message, {
+             repo: options.repo,
+             runId: options.runId
+         });
+     });
+ // open - Open journal.md in editor
+ program
+     .command('open')
+     .description('Open journal.md in $EDITOR')
+     .argument('[runId]', 'Run ID (defaults to latest)')
+     .option('--repo <path>', 'Target repo path', '.')
+     .action(async (runId, options) => {
+         await openCommand({
+             repo: options.repo,
+             runId
+         });
+     });
  program.parseAsync();
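
The changelog says `runr open` "fails cleanly in CI or when $EDITOR unset", but `commands/journal.js` is not part of this diff, so how that check is implemented is not visible here. The sketch below is one plausible shape for it, under stated assumptions: `resolveEditor` is a hypothetical name, treating a non-empty `CI` variable as "running in CI" is a common convention rather than something the source confirms, and the TTY check is an added guess. The README hunk mentions a `vim` fallback; this sketch follows the changelog wording instead.

```javascript
// Hypothetical sketch of a "fail cleanly" pre-check; not the package's code.
function resolveEditor(env, isInteractive) {
  if (env.CI) {
    // Assumption: a non-empty CI variable means a CI environment.
    return { ok: false, reason: 'refusing to open an editor in CI' };
  }
  if (!isInteractive) {
    return { ok: false, reason: 'no interactive terminal attached' };
  }
  if (!env.EDITOR) {
    return { ok: false, reason: '$EDITOR is not set' };
  }
  return { ok: true, editor: env.EDITOR };
}

console.log(resolveEditor({ EDITOR: 'nano' }, true).ok); // true
console.log(resolveEditor({ CI: 'true', EDITOR: 'nano' }, true).ok); // false
```

In a real handler this could be called as `resolveEditor(process.env, Boolean(process.stdout.isTTY))`, printing `reason` and exiting non-zero when `ok` is false, before any attempt to spawn the editor.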