@weldr/runr 0.3.0 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (26)

package/CHANGELOG.md +23 -0
package/README.md +165 -47
package/dist/cli.js +98 -1
package/dist/commands/init.js +440 -0
package/dist/commands/journal.js +167 -0
package/dist/commands/next.js +25 -0
package/dist/commands/report.js +55 -4
package/dist/commands/watch.js +187 -0
package/dist/journal/builder.js +464 -0
package/dist/journal/redactor.js +68 -0
package/dist/journal/renderer.js +201 -0
package/dist/journal/types.js +7 -0
package/dist/supervisor/runner.js +31 -0
package/package.json +3 -1
package/dist/commands/__tests__/report.test.js +0 -202
package/dist/config/__tests__/presets.test.js +0 -104
package/dist/context/__tests__/artifact.test.js +0 -130
package/dist/context/__tests__/pack.test.js +0 -191
package/dist/env/__tests__/fingerprint.test.js +0 -116
package/dist/orchestrator/__tests__/policy.test.js +0 -185
package/dist/orchestrator/__tests__/schema-version.test.js +0 -65
package/dist/supervisor/__tests__/evidence-gate.test.js +0 -111
package/dist/supervisor/__tests__/ownership.test.js +0 -103
package/dist/supervisor/__tests__/state-machine.test.js +0 -290
package/dist/workers/__tests__/claude.test.js +0 -88
package/dist/workers/__tests__/codex.test.js +0 -81

package/CHANGELOG.md CHANGED

@@ -7,6 +7,29 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [0.4.0] - 2026-01-03
+**Case Files** — Every run leaves a machine-readable journal.
+### Added
+- **Case Files**: Auto-generated `journal.md` + `journal.json` for every run
+  - Schema v1.0 with immutable facts (timestamps, milestones, verification attempts)
+  - Living data (append-only notes)
+  - Secret redaction in error excerpts
+  - Warnings array captures all extraction issues
+- **CLI Commands**:
+  - `runr journal [run_id]` — Generate and display journal (defaults to latest)
+  - `runr note <message> [--run-id]` — Add timestamped note (defaults to latest)
+  - `runr open [run_id]` — Open journal in $EDITOR (defaults to latest)
+- **Auto-generation**: Journals written on run completion (stop or finish)
+- **Non-interactive safety**: `runr open` fails cleanly in CI or when $EDITOR unset
+### Fixed
+- **Package bloat**: Excluded test files from npm package (81 → 69 files)
+- **Deprecation warnings**: Replaced deprecated `getRunsRoot()` with `getRunrPaths().runs_dir`
 ## [0.3.0] - 2026-01-01
 **Renamed to Runr.** New identity, same reliability-first mission.
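
Reviewer sketch: the "append-only notes" entry in the 0.4.0 changelog can be pictured as a JSON Lines log, which is the format the 0.4.0 README names for `notes.jsonl` (one JSON object per line). The code below is only an illustration under assumptions, not Runr's implementation: the `timestamp` and `message` field names and both helper functions are hypothetical.

```javascript
// Hypothetical sketch of an append-only JSONL note log; not Runr's code.
// The field names `timestamp` and `message` are assumptions for illustration.

// Return the log text with one more note appended as a single JSON line.
function appendNote(logText, message, now = new Date()) {
  const entry = { timestamp: now.toISOString(), message };
  // Existing lines are never rewritten; the new entry only goes at the end.
  return logText + JSON.stringify(entry) + '\n';
}

// Parse the log back into objects, skipping blank lines.
function parseNotes(logText) {
  return logText
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

let log = '';
log = appendNote(log, 'Debugging OAuth token refresh issue', new Date('2026-01-03T10:00:00Z'));
log = appendNote(log, 'Fixed token refresh', new Date('2026-01-03T10:30:00Z'));
console.log(parseNotes(log).map((note) => note.message));
```

Because `JSON.stringify` escapes embedded newlines, each note stays on one line even when the message itself spans several lines, which is what makes line-by-line parsing safe.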

package/README.md CHANGED

@@ -1,57 +1,122 @@
 # Runr
-Phase-gated orchestration for agent tasks.
+**Stop losing 30 minutes when the agent derails.**
-> **Status**: v0.3.0 — Renamed from `agent-runner`. Early, opinionated, evolving.
+![Failure Recovery](demo/failure-checkpoint.gif)
+*When verification fails after 3 checkpoints, progress isn't lost — Runr saves verified work as git commits.*
+## Quickstart
+```bash
+npm install -g @weldr/runr
+cd your-repo
+runr init
+runr run --task .runr/tasks/your-task.md --worktree
+```
-## The Problem
+**If it stops:** Run the suggested command in `.runr/runs/<run_id>/handoffs/stop.json`
-AI agents can write code. They can also:
-- Claim success without verification
-- Modify files they shouldn't touch
-- Get stuck in infinite loops
-- Fail in ways that are impossible to debug
+![Next Action](demo/next-action.gif)
-**Runr doesn't make agents smarter. It makes them accountable.**
+*Runr writes a stop handoff so agents know exactly what to do next — no guessing, no hallucinating.*
-## What This Does
+## How It Works
-Runr orchestrates AI workers (Claude, Codex) through a phase-based workflow with hard gates:
+Runr orchestrates AI workers through phase gates with checkpoints:
 ```
 PLAN → IMPLEMENT → VERIFY → REVIEW → CHECKPOINT → done
-         ↑___________|  (retry if needed)
+         ↑___________|  (retry if verification fails)
 ```
-Every phase has criteria. You don't advance without meeting them.
+- **Phase gates** — Agent can't skip verification or claim false success
+- **Checkpoints** — Verified milestones saved as git commits
+- **Stop handoffs** — Structured diagnostics with next actions
+- **Scope guards** — Files outside scope are protected
+> **Status**: v0.3.0 — Renamed from `agent-runner`. Early, opinionated, evolving.
-## Why Phase Gates?
+## Meta-Agent Quickstart (Recommended)
-Most agent tools optimize for speed. Runr optimizes for **trust**.
+**The easiest way to use Runr:** Let your coding agent drive it.
-When a run fails (and it will), you get:
-- **Structured diagnostics** — exactly why it stopped
-- **Checkpoints** — resume from where it failed
-- **Scope guards** — files it couldn't touch, it didn't touch
-- **Evidence** — "done" means "proven done"
+Runr works as a **reliable execution backend**. Instead of learning CLI commands, your agent (Claude Code, Codex, etc.) operates Runr for you — handling runs, interpreting failures, and resuming from checkpoints.
-## Quick Start
+### Setup (One-Time)
 ```bash
-# Install
-git clone https://github.com/vonwao/runr.git
-cd runr && npm install && npm run build && npm link
+# 1. Install Runr
+npm install -g @weldr/runr
-# Verify
-runr version
+# 2. Verify environment
 runr doctor
-# Run a task
+# 3. Create minimal config
+mkdir -p .runr/tasks
+cat > .runr/runr.config.json << 'EOF'
+{
+  "agent": { "name": "my-project", "version": "1" },
+  "scope": {
+    "presets": ["typescript", "vitest"]
+  },
+  "verification": {
+    "tier0": ["npm run typecheck"],
+    "tier1": ["npm test"]
+  }
+}
+EOF
+```
+### Usage
+Just tell your coding agent:
+> "Use Runr to add user authentication with OAuth2. Create checkpoints after each milestone."
+The agent will:
+1. Create a task file (`.runr/tasks/add-auth.md`)
+2. Run `runr run --task ... --worktree`
+3. Monitor progress with `runr status`
+4. Handle failures, resume from checkpoints
+5. Report results with commit links
+**See [RUNR_OPERATOR.md](./RUNR_OPERATOR.md)** for the complete agent integration guide.
+### Why This Works
+Most devs already have a coding agent open. Telling them:
+- "Drop this in your agent, and it'll drive Runr for you"
+…has near-zero friction compared to:
+- "Learn these CLI commands, create config files, understand phase gates"
+The agent becomes your operator. Runr stays the reliable execution layer.
+---
+## Quick Start (Direct CLI)
+```bash
+# Install
+npm install -g @weldr/runr
+# Initialize in your project
 cd /your/project
-runr run --task .runr/tasks/my-task.md --worktree
+runr init
+# Run a task
+runr run --task .runr/tasks/example-feature.md --worktree
+# If it fails, resume from last checkpoint
+runr resume <run_id>
+# Get machine-readable diagnostics
+runr summarize <run_id>
+# Output: .runr/runs/<run_id>/handoffs/stop.json
 ```
-> Not on npm yet. Coming soon as `@weldr/runr`.
+> Prefer source install? See [Development](#development).
 ## Configuration
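
Reviewer sketch: the minimal `.runr/runr.config.json` added to the README in this hunk has three top-level keys (`agent`, `scope`, `verification`). The check below mirrors only that example and is not Runr's own validation. The function name `checkMinimalConfig` is hypothetical, and treating both `tier0` and `tier1` as required is an assumption.

```javascript
// Hypothetical shape check for the minimal config shown in the README diff.
// Not Runr's validation logic; it only mirrors the keys in that example.
function checkMinimalConfig(config) {
  const problems = [];
  const isStringList = (value) =>
    Array.isArray(value) && value.every((item) => typeof item === 'string');

  if (typeof config?.agent?.name !== 'string') {
    problems.push('agent.name must be a string');
  }
  if (!isStringList(config?.scope?.presets)) {
    problems.push('scope.presets must be a list of strings');
  }
  // Assumption: both tiers are required, as in the README example.
  for (const tier of ['tier0', 'tier1']) {
    if (!isStringList(config?.verification?.[tier])) {
      problems.push(`verification.${tier} must be a list of command strings`);
    }
  }
  return problems;
}

const example = {
  agent: { name: 'my-project', version: '1' },
  scope: { presets: ['typescript', 'vitest'] },
  verification: { tier0: ['npm run typecheck'], tier1: ['npm test'] },
};
console.log(checkMinimalConfig(example)); // prints []
console.log(checkMinimalConfig({ agent: { name: 'x' } }));
```

Returning a list of problems rather than throwing on the first one lets a caller report every missing key in a single pass.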
@@ -91,15 +156,20 @@ Available: `nextjs`, `react`, `drizzle`, `prisma`, `vitest`, `jest`, `playwright
 | Command | What it does |
 |---------|--------------|
+| `runr init` | Initialize config (auto-detect verify commands) |
 | `runr run --task <file>` | Start a task |
 | `runr resume <id>` | Continue from checkpoint |
+| `runr watch <id> --auto-resume` | Watch run + auto-resume on failure |
 | `runr status [id]` | Show run state |
 | `runr follow [id]` | Tail run progress |
-| `runr report <id>` | Generate run report |
+| `runr report <id>` | Generate run report (includes next_action) |
+| `runr journal [id]` | Generate and display case file |
+| `runr note <message>` | Add timestamped note to run |
+| `runr open [id]` | Open journal in $EDITOR |
 | `runr gc` | Clean up old runs |
 | `runr doctor` | Check environment |
-### The Fun Commands
+### Aliases
 Same functionality, different vibe:
@@ -110,6 +180,57 @@ runr scry <id>               # status
 runr banish                  # gc
 ```
+## Case Files
+Every run automatically generates a **journal.md** case file in `.runr/runs/<run_id>/journal.md` containing:
+- **Run metadata** (timestamps, duration, stop reason)
+- **Task details** (goal, requirements, success criteria)
+- **Milestone progress** (attempted, verified, checkpoints)
+- **Verification history** (test attempts, pass/fail counts)
+- **Code changes** (files changed, diff stats, top files)
+- **Error excerpts** (last failure with redacted secrets)
+- **Next action** (suggested command to continue)
+- **Notes** (timestamped annotations)
+### Commands
+```bash
+# Generate and display journal for latest run
+runr journal
+# Generate journal for specific run
+runr journal <run_id>
+# Force regeneration even if up to date
+runr journal <run_id> --force
+# Add a timestamped note to latest run
+runr note "Debugging OAuth token refresh issue"
+# Add note to specific run
+runr note "Fixed token refresh" --run-id <run_id>
+# Open journal in $EDITOR (defaults to latest run)
+runr open
+runr open <run_id>
+```
+**Note**: If `<run_id>` is omitted, all commands default to the most recent run in the repository.
+### Auto-Generation
+Journals are automatically generated when runs complete (stop or finish). You can also:
+- Manually regenerate with `runr journal <run_id> --force`
+- Add timestamped notes during or after runs with `runr note` (stored in `.runr/runs/<run_id>/notes.jsonl`)
+- Open in your editor with `runr open` (uses `$EDITOR` or `vim`)
+**Use case**: Share run context with collaborators, document debugging sessions, track experiment results.
+**Files generated:**
+- `journal.md` - Human-readable case file
+- `notes.jsonl` - Timestamped notes (one JSON object per line)
 ## Task Files
 Tasks are markdown files:
@@ -146,26 +267,23 @@ Every stop produces `stop.json` + `stop.md` with diagnostics.
 ## Philosophy
-**This is not magic.** Runs fail. The goal is *understandable, resumable* failure.
+This isn't magic. Runs fail. The goal is understandable, resumable failure.
-**This is not a chatbot.** Task in, code out. No conversation.
+This isn't a chatbot. Task in, code out.
-**This is not a code generator.** It orchestrates generators. Different job.
+This isn't a code generator. It orchestrates generators.
-**Agents lie. Logs don't.** If it can't prove it, it didn't do it.
+Agents lie. Logs don't. If it can't prove it, it didn't do it.
 ## Migrating from agent-runner
-If you're upgrading from `agent-runner`:
 | Old | New |
 |-----|-----|
 | `agent` CLI | `runr` CLI |
 | `.agent/` directory | `.runr/` directory |
 | `agent.config.json` | `runr.config.json` |
 | `.agent-worktrees/` | `.runr-worktrees/` |
-Both old and new locations work during the transition period. You'll see deprecation warnings for old locations.
+Old paths still work for now, with deprecation warnings.
 ## Development
@@ -179,21 +297,21 @@ npm run dev -- run --task task.md  # run from source
 | Version | Date | Highlights |
 |---------|------|------------|
-| v0.3.0 | 2026-01-01 | **Renamed to Runr**, new CLI, new directory structure |
-| v0.2.2 | 2025-12-31 | Worktree location fix, guard diagnostics |
-| v0.2.1 | 2025-12-29 | Scope presets, review digest |
-| v0.2.0 | 2025-12-28 | Review loop detection |
-| v0.1.0 | 2025-12-27 | Initial stable release |
+| v0.3.0 | **Renamed to Runr**, new CLI, new directory structure |
+| v0.2.2 | Worktree location fix, guard diagnostics |
+| v0.2.1 | Scope presets, review digest |
+| v0.2.0 | Review loop detection |
+| v0.1.0 | Initial stable release |
-See [CHANGELOG.md](CHANGELOG.md) for detailed release notes.
+See [CHANGELOG.md](CHANGELOG.md) for details.
 ## Contributing
-See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup and guidelines.
+See [CONTRIBUTING.md](CONTRIBUTING.md).
 ## License
-Apache 2.0 — See [LICENSE](LICENSE)
+Apache 2.0 — See [LICENSE](LICENSE).
 ---

package/dist/cli.js CHANGED

@@ -5,6 +5,7 @@ import { resumeCommand } from './commands/resume.js';
 import { statusCommand, statusAllCommand } from './commands/status.js';
 import { reportCommand, findLatestRunId } from './commands/report.js';
 import { summarizeCommand } from './commands/summarize.js';
+import { nextCommand } from './commands/next.js';
 import { compareCommand } from './commands/compare.js';
 import { guardsOnlyCommand } from './commands/guards-only.js';
 import { doctorCommand } from './commands/doctor.js';
@@ -15,6 +16,9 @@ import { orchestrateCommand, resumeOrchestrationCommand, waitOrchestrationComman
 import { pathsCommand } from './commands/paths.js';
 import { metricsCommand } from './commands/metrics.js';
 import { versionCommand } from './commands/version.js';
+import { initCommand } from './commands/init.js';
+import { watchCommand } from './commands/watch.js';
+import { journalCommand, noteCommand, openCommand } from './commands/journal.js';
 const program = new Command();
 // Check if invoked as deprecated 'agent' command
 const invokedAs = process.argv[1]?.split('/').pop() || 'runr';
@@ -24,6 +28,21 @@ if (invokedAs === 'agent') {
 program
     .name('runr')
     .description('Phase-gated orchestration for agent tasks');
+program
+    .command('init')
+    .description('Initialize Runr configuration for a repository')
+    .option('--repo <path>', 'Path to repository (defaults to current directory)', '.')
+    .option('--interactive', 'Launch interactive setup wizard to configure verification commands', false)
+    .option('--print', 'Display generated config in terminal without writing to disk', false)
+    .option('--force', 'Overwrite existing .runr/runr.config.json if present', false)
+    .action(async (options) => {
+    await initCommand({
+        repo: options.repo,
+        interactive: options.interactive,
+        print: options.print,
+        force: options.force
+    });
+});
 program
     .command('run')
     .option('--repo <path>', 'Target repo path (default: current directory)', '.')
@@ -132,6 +151,7 @@ program
     .option('--repo <path>', 'Target repo path (default: current directory)', '.')
     .option('--tail <count>', 'Tail last N events', '50')
     .option('--kpi-only', 'Show compact KPI summary only')
+    .option('--json', 'Output KPI as JSON (includes next_action and suggested_command)')
     .action(async (runId, options) => {
     let resolvedRunId = runId;
     if (runId === 'latest') {
@@ -146,7 +166,8 @@ program
         runId: resolvedRunId,
         repo: options.repo,
         tail: Number.parseInt(options.tail, 10),
-        kpiOnly: options.kpiOnly
+        kpiOnly: options.kpiOnly,
+        json: options.json
     });
 });
 program
@@ -166,6 +187,23 @@ program
     }
     await summarizeCommand({ runId: resolvedRunId, repo: options.repo });
 });
+program
+    .command('next')
+    .description('Print suggested next command from stop handoff')
+    .argument('<runId>', 'Run ID (or "latest")')
+    .option('--repo <path>', 'Target repo path (default: current directory)', '.')
+    .action(async (runId, options) => {
+    let resolvedRunId = runId;
+    if (runId === 'latest') {
+        const latest = findLatestRunId(options.repo);
+        if (!latest) {
+            console.error('No runs found');
+            process.exit(1);
+        }
+        resolvedRunId = latest;
+    }
+    await nextCommand(resolvedRunId, { repo: options.repo });
+});
 program
     .command('compare')
     .description('Compare KPIs between two runs')
@@ -258,6 +296,25 @@ program
         olderThan: Number.parseInt(options.olderThan, 10)
     });
 });
+program
+    .command('watch')
+    .description('Watch run progress and optionally auto-resume on failure')
+    .argument('<runId>', 'Run ID to watch')
+    .option('--repo <path>', 'Target repo path (default: current directory)', '.')
+    .option('--auto-resume', 'Automatically resume on transient failures', false)
+    .option('--max-attempts <N>', 'Maximum auto-resume attempts (default: 3)', '3')
+    .option('--interval <seconds>', 'Poll interval in seconds (default: 5)', '5')
+    .option('--json', 'Output JSON events', false)
+    .action(async (runId, options) => {
+    await watchCommand({
+        runId,
+        repo: options.repo,
+        autoResume: options.autoResume,
+        maxAttempts: Number.parseInt(options.maxAttempts, 10),
+        interval: Number.parseInt(options.interval, 10) * 1000,
+        json: options.json
+    });
+});
 program
     .command('wait')
     .description('Block until run reaches terminal state (for meta-agent coordination)')
@@ -461,4 +518,44 @@ program
         olderThan: Number.parseInt(options.olderThan, 10)
     });
 });
+// journal - Generate case file from run
+program
+    .command('journal')
+    .description('Generate and display journal.md for a run')
+    .argument('[runId]', 'Run ID (defaults to latest)')
+    .option('--repo <path>', 'Target repo path', '.')
+    .option('--output <file>', 'Output file path (defaults to runs/<id>/journal.md)')
+    .option('--force', 'Force regeneration even if up to date', false)
+    .action(async (runId, options) => {
+    await journalCommand({
+        repo: options.repo,
+        runId,
+        output: options.output,
+        force: options.force
+    });
+});
+// note - Add timestamped note to run
+program
+    .command('note <message>')
+    .description('Add a timestamped note to a run')
+    .option('--repo <path>', 'Target repo path', '.')
+    .option('--run-id <id>', 'Run ID (defaults to latest)')
+    .action(async (message, options) => {
+    await noteCommand(message, {
+        repo: options.repo,
+        runId: options.runId
+    });
+});
+// open - Open journal.md in editor
+program
+    .command('open')
+    .description('Open journal.md in $EDITOR')
+    .argument('[runId]', 'Run ID (defaults to latest)')
+    .option('--repo <path>', 'Target repo path', '.')
+    .action(async (runId, options) => {
+    await openCommand({
+        repo: options.repo,
+        runId
+    });
+});
 program.parseAsync();
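
Reviewer sketch: the new `next` action resolves the literal `latest` to a concrete run ID in the same way the visible part of the `report` action does. One way to picture that logic in isolation is the helper below. `resolveRunId` is hypothetical and not part of the package, and the finder is passed in as a parameter so the sketch runs on its own. In `cli.js` that role is played by `findLatestRunId` from `./commands/report.js`.

```javascript
// Hypothetical helper; not part of @weldr/runr.
// `findLatest` stands in for findLatestRunId from ./commands/report.js.
function resolveRunId(runId, repo, findLatest) {
  if (runId !== 'latest') {
    return runId; // explicit IDs pass through unchanged
  }
  const latest = findLatest(repo);
  if (!latest) {
    // cli.js logs this message and exits with code 1;
    // throwing instead keeps the sketch easy to test.
    throw new Error('No runs found');
  }
  return latest;
}

// Usage with a stub finder (the run ID here is made up).
const stubFinder = () => '20260103-example';
console.log(resolveRunId('latest', '.', stubFinder)); // prints 20260103-example
console.log(resolveRunId('abc123', '.', stubFinder)); // prints abc123
```

Note that `journal`, `note`, and `open` take a different route in this diff: they accept an optional run ID and leave the "defaults to latest" behavior to the command implementations, so a helper like this would apply only to the commands that take the explicit `latest` keyword.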