npm - @sebastianandreasson/pi-autonomous-agents - Versions diffs - 0.4.0 → 0.5.1 - Mend

@sebastianandreasson/pi-autonomous-agents 0.4.0 → 0.5.1

Files changed (8)

package/README.md CHANGED

@@ -1,58 +1,79 @@
-# PI Harness
+# PI Autonomous Agents
-`pi-harness` is a portable CLI/workflow package for running a local PI-based unattended loop with:
+`@sebastianandreasson/pi-autonomous-agents` is an npm package for running a bounded unattended [PI](https://pi.dev/) workflow inside another repository.
-- a `developer` pass
-- a fast verification step
-- a skeptical `tester` pass
-- optional periodic multimodal visual review
-- tester-owned final commit by default
+It orchestrates:
-The package is intentionally generic. It does not know how to navigate or test a specific app on its own.
+- a `developer` turn
+- a fast local verification step
+- an independent `tester` turn
+- an optional focused `developerFix` turn when verification/tester finds a real issue
+- optional periodic visual review from screenshots
-## What Belongs In The Package
+The package is intentionally generic. It handles supervision, prompts, runtime state, telemetry, retries, and guardrails. The consuming repo still owns its own tasks, instructions, tests, model endpoints, and screenshot capture flow.
-- supervisor/orchestration
-- PI adapter/runtime integration
+## Install
+```bash
+npm install -D @sebastianandreasson/pi-autonomous-agents
+```
+Then in the consuming repo, tell your agent:
+```text
+Find SETUP.md in @sebastianandreasson/pi-autonomous-agents and set everything up for this repository.
+```
+The package ships a top-level [SETUP.md](./SETUP.md) specifically for that workflow.
+## What This Package Owns
+- unattended loop orchestration
+- PI adapter integration
 - config loading
-- telemetry
-- loop guards, timeout guards, and retries
-- tester feedback + visual feedback handoff
-- optional legacy harness git finalize step for `commitMode: "plan"`
-- multimodal visual review client
+- prompt assembly
+- verification/tester/visual-review handoff
+- timeout and loop guards
+- telemetry and run summaries
+- runtime isolation and stale-run recovery
-## What Stays Per Project
+## What Each Repo Must Provide
 - `TODOS.md`
-- project instructions
-- browser tests
-- visual capture flow
-- app-specific verification commands
-- app/server startup scripts
+- repo-specific `pi/DEVELOPER.md`
+- repo-specific `pi/TESTER.md`
+- a fast bounded `testCommand`
+- model configuration that actually matches the local/cloud providers in use
+- optionally a screenshot capture command for visual review
-## Layout
+## Quick Start In A Repo
+The normal setup shape is:
 ```text
-packages/pi-harness/
-  package.json
-  pi.config.json
-  templates/DEVELOPER.md
-  templates/TESTER.md
-  docs/PI_SUPERVISOR.md
-  src/
-    cli.mjs
-    pi-client.mjs
-    pi-config.mjs
-    pi-prompts.mjs
-    pi-repo.mjs
-    pi-report.mjs
-    pi-rpc-adapter.mjs
-    pi-supervisor.mjs
-    pi-telemetry.mjs
-    pi-visual-once.mjs
-    pi-visual-review.mjs
+TODOS.md
+pi.config.json
+pi/
+  DEVELOPER.md
+  TESTER.md
+```
+Typical scripts:
+```json
+{
+  "scripts": {
+    "pi:mock": "PI_CONFIG_FILE=pi.config.json PI_TRANSPORT=mock PI_TEST_CMD= pi-harness once",
+    "pi:once": "PI_CONFIG_FILE=pi.config.json pi-harness once",
+    "pi:run": "PI_CONFIG_FILE=pi.config.json pi-harness run",
+    "pi:report": "PI_CONFIG_FILE=pi.config.json pi-harness report",
+    "pi:visual:once": "PI_CONFIG_FILE=pi.config.json pi-harness visual-once"
+  }
+}
 ```
+Start from [templates/pi.config.example.json](./templates/pi.config.example.json), [templates/DEVELOPER.md](./templates/DEVELOPER.md), [templates/TESTER.md](./templates/TESTER.md), and [templates/gitignore.fragment](./templates/gitignore.fragment).
 ## CLI
 ```bash
@@ -65,62 +86,212 @@ pi-harness adapter
 pi-harness visual-review-worker
 ```
-Use `PI_CONFIG_FILE` to point the harness at a project-local config file. If you do not provide one, the bundled generic `pi.config.json` is used as a fallback.
-## Setup In Another Repo
-After installing the package:
+Use `PI_CONFIG_FILE` to point at the repo-local config file:
 ```bash
-npm install -D @sebastianandreasson/pi-autonomous-agents
+PI_CONFIG_FILE=pi.config.json pi-harness once
 ```
-you can tell another agent in that repo:
-```text
-Find SETUP.md in @sebastianandreasson/pi-autonomous-agents and set everything up for this repository.
+If `PI_CONFIG_FILE` is not set, the package falls back to the bundled generic [pi.config.json](./pi.config.json).
+## Core Workflow
+Each real iteration works like this:
+1. `developer` implements one unchecked task from `TODOS.md`.
+2. The harness runs the configured fast verification command.
+3. If verification passes, `tester` reviews the change independently.
+4. If tester or verification fails, the findings go back to `developerFix` for one focused repair pass.
+5. If tester reaches `PASS`, tester creates the final commit directly by default.
+6. Every `N` successful iterations, optional visual review can inspect screenshots and veto the success if it finds a real problem.
+The default commit model is `commitMode: "agent"`. The older harness-managed parsed commit-plan flow still exists as `commitMode: "plan"`, but it is now a compatibility mode rather than the default.
+## Recommended Model Setup
+The package supports:
+- one default text model via `piModel`
+- one default visual-review model via `visualReviewModel`
+- optional per-role overrides via `roleModels`
+- per-model endpoint config in `models`
+Typical pattern:
+- local model for `developer`
+- local model for `developerRetry`
+- local model for `developerFix`
+- local or slightly stronger model for `tester`
+- stronger frontier model only for `visualReview`
+Example:
+```json
+{
+  "piModel": "local/text-model",
+  "visualReviewModel": "cloud/vision-model",
+  "models": {
+    "local/text-model": {
+      "baseUrl": "http://localhost:8000/v1",
+      "apiKey": "local",
+      "vision": false
+    },
+    "local/tester-model": {
+      "baseUrl": "http://localhost:8000/v1",
+      "apiKey": "local",
+      "vision": false
+    },
+    "cloud/vision-model": {
+      "baseUrl": "https://api.openai.com/v1",
+      "apiKeyEnv": "OPENAI_API_KEY",
+      "vision": true
+    }
+  },
+  "roleModels": {
+    "developer": "local/text-model",
+    "developerRetry": "local/text-model",
+    "developerFix": "local/text-model",
+    "tester": "local/tester-model",
+    "visualReview": "cloud/vision-model"
+  }
+}
 ```
-The package ships a top-level [SETUP.md](./SETUP.md) specifically for that workflow.
+Important:
+- do not guess model ids
+- if using a custom OpenAI-compatible provider, verify `<baseUrl>/models`
+- if using PI models directly, verify `pi --list-models`
+- if `PI_CODING_AGENT_DIR` points at a repo-local PI home, make sure it is bootstrapped and contains `models.json`
+The harness now preflights those checks before starting a real run.
-If you want to wipe all harness-generated state and start over cleanly in a repo, run:
+## Important Config Fields
-```bash
-PI_CONFIG_FILE=pi.config.json pi-harness clear-history
-```
+Common fields in `pi.config.json`:
+- `taskFile`
+- `developerInstructionsFile`
+- `testerInstructionsFile`
+- `transport`
+- `adapterCommand`
+- `piModel`
+- `models`
+- `roleModels`
+- `commitMode`
+- `promptMode`
+- `testCommand`
+- `visualReviewEnabled`
+- `visualCaptureCommand`
+- `continueAfterSeconds`
+- `toolContinueAfterSeconds`
+- `noEventTimeoutSeconds`
+- `toolNoEventTimeoutSeconds`
+- `largeFileWarningLines`
+- `largeSpecWarningLines`
+Key defaults:
+- `transport`: `adapter`
+- `commitMode`: `agent`
+- `promptMode`: `compact`
+- `piTools`: `read,edit,write,find,ls,bash`
+- `continueAfterSeconds`: `300`
+- `toolContinueAfterSeconds`: `900`
+- `noEventTimeoutSeconds`: `900`
+- `toolNoEventTimeoutSeconds`: `1800`
+## Prompt and Tooling Behavior
+The package is optimized for local models by default:
+- prompts are compacted before handoff
+- changed-file lists and feedback excerpts are capped
+- prompts prefer `read` for source inspection
+- shell is intended for `git`, tests, and narrow diagnostics
+- the adapter warns on obvious oversized shell-based file reads
+- the supervisor emits large-file/spec warnings when touched files are getting risky
+This is deliberate. Large monolith files, huge e2e specs, and broad TODO items are one of the main causes of local-model drift and retry loops.
+Recommended repo shape:
+- keep TODO items very small and implementation-shaped
+- split giant stores/modules before they become constant edit hotspots
+- split ever-growing end-to-end specs into scenario files
+- keep the default `testCommand` to a bounded smoke check, not a multi-minute happy-path run
+## Runtime Isolation And Recovery
+Recent versions of the package isolate each run more aggressively:
-The command removes configured harness history/runtime files and verifies that no configured history paths remain afterward.
+- active ownership lock at `.pi-runtime/active-run.json`
+- per-run runtime directory under `.pi-runtime/runs/<runId>/`
+- per-run PI sessions and telemetry
+- `runId` added to telemetry
+- in-progress iteration state persisted before agent work starts
+- stale run locks recovered when the owning PID is gone
+- timeout cleanup kills the full spawned process group, not only the direct child
-For prompt debugging, the harness also writes the exact assembled prompt for the current role to `.pi-last-prompt.txt` by default.
-For flow debugging, it also writes a machine-readable `.pi-last-iteration.json` summary with the selected task, tester verdict, commit-plan state, and terminal reason.
+That is meant to prevent orphaned timed-out agents or concurrent supervisors from corrupting shared state.
-## Generic Contracts
+## Debugging Artifacts
-- `taskFile`: usually `TODOS.md`
-- `developerInstructionsFile`: per-project developer instructions
-- `testerInstructionsFile`: per-project tester instructions
-- `roleModels`: optional per-role model overrides
-- `commitMode`: `agent` by default, `plan` only for legacy harness-managed commit parsing
-- `promptMode`: `compact` by default
-- `testCommand`: fast verification command
-- `visualCaptureCommand`: project-defined screenshot capture command
-- `visualFeedbackFile`: latest visual-review handoff
-- `testerFeedbackFile`: latest tester-review handoff
+Useful files during a run:
-For unattended loops, keep `testCommand` fast and bounded, such as a smoke suite. Long real-time Playwright happy-path specs belong in an explicit nightly or post-run lane, not the default developer/tester inner loop.
+- `.pi-last-prompt.txt`
+  Exact assembled prompt for the current role.
+- `.pi-last-output.txt`
+  Latest agent output snapshot.
+- `.pi-last-verification.txt`
+  Latest verification output snapshot.
+- `.pi-last-iteration.json`
+  Structured summary of the last completed iteration.
+- `.pi-state.json`
+  Persistent harness state, including in-progress iteration data.
+- `pi.log`
+  Main run log.
+- `pi_telemetry.jsonl`
+- `pi_telemetry.csv`
+- `.pi-runtime/active-run.json`
+- `.pi-runtime/runs/<runId>/...`
-Keep TODO items extremely small and implementation-shaped when using weaker local models. Broad tasks tend to produce much longer turns, more retries, and more tester drift than narrow one-step tasks.
+`pi-harness report` summarizes recent telemetry and surfaces things like terminal reasons and large-file warnings.
-The adapter heartbeat is PI-RPC-event based. Streaming shell output does not count as progress on its own, so long-running tools should rely on the tool-aware watchdog thresholds rather than terminal streaming.
+## Visual Review Contract
-`piModel` remains the default text model, but you can override specific roles with `roleModels` such as `developer`, `developerRetry`, `developerFix`, `tester`, and `visualReview`. `testerCommit` is only relevant if you opt back into `commitMode: "plan"`.
+Visual review is optional and generic. The harness does not know how to navigate your app.
-By default, successful tester passes should stage and create the commit directly in the same PI turn. The old commit-plan parsing flow is still available as `commitMode: "plan"`, but it is now a compatibility mode rather than the default.
+If enabled, your repo must provide a real screenshot capture command that writes a manifest under the configured capture directory. The manifest shape is documented in [docs/PI_SUPERVISOR.md](./docs/PI_SUPERVISOR.md).
-Prompt/context handoff is compact by default. The harness now caps prior feedback excerpts, changed-file lists, verification excerpts, and prompt note handoff. If needed, tune `maxPromptChangedFiles`, `maxVisualFeedbackLines`, `maxTesterFeedbackLines`, `maxPromptNotesLines`, and `maxVerificationExcerptLines`.
+Visual review should be used as a periodic audit, not as the default inner-loop gate.
-The default coding tool mix is now safer for local models: `read,edit,write,find,ls,bash`. Prompts explicitly steer source inspection toward `read` and reserve shell usage for `git`, tests, and narrow diagnostics.
+## Resetting Harness State
-The harness also emits lightweight large-file warnings for touched source/spec files and carries them into `.pi-last-iteration.json`, `pi-harness report`, and relevant prompts. Tune `largeFileWarningLines` and `largeSpecWarningLines` if needed.
+If you want to wipe harness-generated state and start fresh:
+```bash
+PI_CONFIG_FILE=pi.config.json pi-harness clear-history
+```
+That clears configured harness runtime/history artifacts and verifies they are gone. It does not remove project source files.
+## Docs
+- [SETUP.md](./SETUP.md)
+  Agent-facing setup instructions for consuming repos.
+- [docs/PI_SUPERVISOR.md](./docs/PI_SUPERVISOR.md)
+  More detailed flow, adapter, and runtime documentation.
+- [templates/PROJECT_SETUP.md](./templates/PROJECT_SETUP.md)
+  Minimal consuming-repo layout summary.
+## Development
+In this package repo:
+```bash
+npm run check
+npm test
+```
-The harness expects screenshot capture to produce a `manifest.json` plus image files under the configured visual capture directory.
+The package requires Node `>=20`.
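
The "Recommended Model Setup" section added to the README describes a default text model (`piModel`), a default visual-review model (`visualReviewModel`), and optional per-role overrides in `roleModels`. The following is a minimal sketch of how such a lookup could behave. The helper names `pickModelForRole` and `findUnconfiguredModels` are hypothetical, and the fallback order is an assumption based on the README wording, not the package's actual `resolveRoleModelName` implementation.

```javascript
// Hypothetical helper, not the package's resolveRoleModelName.
function pickModelForRole(config, role) {
  const override = config.roleModels?.[role]
  if (typeof override === 'string' && override.trim() !== '') {
    return override.trim()
  }
  // Assumption: visual review falls back to visualReviewModel,
  // every other role falls back to piModel.
  const fallback = role === 'visualReview' ? config.visualReviewModel : config.piModel
  return typeof fallback === 'string' ? fallback.trim() : ''
}

// Assumption: for custom providers, a model id is only usable
// when it has a matching entry under `models`.
function findUnconfiguredModels(config, roles) {
  const known = new Set(Object.keys(config.models ?? {}))
  return roles
    .map((role) => ({ role, model: pickModelForRole(config, role) }))
    .filter(({ model }) => model !== '' && !known.has(model))
}

const exampleConfig = {
  piModel: 'local/text-model',
  visualReviewModel: 'cloud/vision-model',
  models: {
    'local/text-model': { baseUrl: 'http://localhost:8000/v1', apiKey: 'local', vision: false },
    'cloud/vision-model': { baseUrl: 'https://api.openai.com/v1', apiKeyEnv: 'OPENAI_API_KEY', vision: true },
  },
  // The tester override points at an id with no `models` entry on purpose.
  roleModels: { tester: 'local/tester-model' },
}

console.log(pickModelForRole(exampleConfig, 'developer'))
console.log(findUnconfiguredModels(exampleConfig, ['developer', 'tester', 'visualReview']))
```

A check of this kind is one way to act on the README's "do not guess model ids" advice before starting a run. It does not replace verifying `<baseUrl>/models` or `pi --list-models` against the real provider.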

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "@sebastianandreasson/pi-autonomous-agents",
   "private": false,
-  "version": "0.4.0",
+  "version": "0.5.1",
   "type": "module",
   "description": "Portable unattended PI harness for developer/tester/visual-review loops.",
   "license": "MIT",

package/src/pi-client.mjs CHANGED

@@ -103,7 +103,7 @@ async function runAdapterTurn({ config, model, sessionId, sessionFile, prompt, i
     instructionsFile: config.instructionsFile,
     developerInstructionsFile: config.developerInstructionsFile,
     testerInstructionsFile: config.testerInstructionsFile,
-    runtimeDir: config.piRuntimeDir,
+    runtimeDir: config.runRuntimeDir || config.piRuntimeDir,
     piCli: config.piCli,
     model: model ?? config.piModel,
     tools: config.piTools,

package/src/pi-config.mjs CHANGED

@@ -246,6 +246,7 @@ export function loadConfig(mode = 'once') {
     lastPromptFile: resolveFromCwd(cwd, 'PI_LAST_PROMPT_FILE', file.lastPromptFile, '.pi-last-prompt.txt'),
     lastIterationSummaryFile: resolveFromCwd(cwd, 'PI_LAST_ITERATION_SUMMARY_FILE', file.lastIterationSummaryFile, '.pi-last-iteration.json'),
     piRuntimeDir: resolveFromCwd(cwd, 'PI_RUNTIME_DIR', file.piRuntimeDir, '.pi-runtime'),
+    activeRunFile: resolveFromCwd(cwd, 'PI_ACTIVE_RUN_FILE', file.activeRunFile, '.pi-runtime/active-run.json'),
     piCli: readString('PI_CLI', file.piCli, 'pi'),
     piModel,
     piModelProfile: resolvedPiModel,
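
The new `activeRunFile` setting follows the same call shape as the neighbouring path settings: a working directory, an environment variable name, a value from the config file, and a default. The body of `resolveFromCwd` is not part of this diff, so the sketch below is only a guess at its behavior. It assumes the environment variable wins over the config file value, which wins over the default. `resolvePathSetting` and the example paths are hypothetical.

```javascript
import path from 'node:path'

// Hypothetical stand-in for resolveFromCwd. The precedence below
// (env var, then config file value, then default) is an assumption.
function resolvePathSetting(cwd, envName, fileValue, fallback, env = process.env) {
  const fromEnv = String(env[envName] ?? '').trim()
  const fromFile = String(fileValue ?? '').trim()
  const chosen = fromEnv !== '' ? fromEnv : fromFile !== '' ? fromFile : fallback
  // path.resolve keeps absolute paths as-is and anchors relative ones at cwd.
  return path.resolve(cwd, chosen)
}

const cwd = process.cwd()
const defaultLock = '.pi-runtime/active-run.json'
const fileConfig = { activeRunFile: 'tmp/run-lock.json' } // illustrative value

// Nothing set: the default is used.
console.log(resolvePathSetting(cwd, 'PI_ACTIVE_RUN_FILE', undefined, defaultLock, {}))
// Config file value set: it replaces the default.
console.log(resolvePathSetting(cwd, 'PI_ACTIVE_RUN_FILE', fileConfig.activeRunFile, defaultLock, {}))
// Environment variable set: it takes priority over both.
console.log(resolvePathSetting(cwd, 'PI_ACTIVE_RUN_FILE', fileConfig.activeRunFile, defaultLock, {
  PI_ACTIVE_RUN_FILE: '/var/tmp/active-run.json',
}))
```

If the real helper works along these lines, the default lock path is a fixed string and is not derived from `piRuntimeDir`. A repo that overrides `PI_RUNTIME_DIR` may then also want to set `PI_ACTIVE_RUN_FILE` so the lock lives next to the relocated runtime directory.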

package/src/pi-repo.mjs CHANGED

@@ -1,6 +1,7 @@
 import fs from 'node:fs/promises'
 import { readFileSync } from 'node:fs'
 import process from 'node:process'
+import { randomUUID } from 'node:crypto'
 import { execFileSync, spawn } from 'node:child_process'
 import path from 'node:path'
@@ -9,7 +10,17 @@ export function timestamp() {
 }
 export async function appendLog(logFile, message) {
-  await fs.appendFile(logFile, `[${timestamp()}] ${message}\n`, 'utf8')
+  const runId = String(process.env.PI_RUN_ID ?? '').trim()
+  const prefix = runId !== '' ? `[run:${runId}] ` : ''
+  const line = `[${timestamp()}] ${prefix}${message}\n`
+  await fs.mkdir(path.dirname(logFile), { recursive: true })
+  await fs.appendFile(logFile, line, 'utf8')
+  const runLogFile = String(process.env.PI_RUN_LOG_FILE ?? '').trim()
+  if (runLogFile !== '' && runLogFile !== logFile) {
+    await fs.mkdir(path.dirname(runLogFile), { recursive: true })
+    await fs.appendFile(runLogFile, line, 'utf8')
+  }
 }
 export function ensureRepo(cwd) {
@@ -30,7 +41,27 @@ export async function ensureFileExists(filePath, label) {
 export async function readState(stateFile) {
   try {
     const raw = await fs.readFile(stateFile, 'utf8')
-    return JSON.parse(raw)
+    const parsed = JSON.parse(raw)
+    if (!parsed || typeof parsed !== 'object' || Array.isArray(parsed)) {
+      throw new Error('Invalid state file payload')
+    }
+    return {
+      iteration: 0,
+      lastTransport: '',
+      lastPiModel: '',
+      sessionId: '',
+      sessionFile: '',
+      consecutiveFailures: 0,
+      successfulIterations: 0,
+      lastPhase: '',
+      lastStatus: '',
+      lastVerificationStatus: '',
+      lastVisualStatus: '',
+      lastRunAt: '',
+      runId: '',
+      inProgress: null,
+      ...parsed,
+    }
   } catch {
     return {
       iteration: 0,
@@ -38,22 +69,165 @@ export async function readState(stateFile) {
       lastPiModel: '',
       sessionId: '',
       sessionFile: '',
-        consecutiveFailures: 0,
-        successfulIterations: 0,
-        lastPhase: '',
-        lastStatus: '',
-        lastVerificationStatus: '',
-        lastVisualStatus: '',
-        lastRunAt: '',
-      }
+      consecutiveFailures: 0,
+      successfulIterations: 0,
+      lastPhase: '',
+      lastStatus: '',
+      lastVerificationStatus: '',
+      lastVisualStatus: '',
+      lastRunAt: '',
+      runId: '',
+      inProgress: null,
+    }
   }
 }
 export async function writeState(stateFile, state) {
   const formatted = `${JSON.stringify(state, null, 2)}\n`
+  await fs.mkdir(path.dirname(stateFile), { recursive: true })
   await fs.writeFile(stateFile, formatted, 'utf8')
 }
+export function createRunId() {
+  return randomUUID()
+}
+function normalizePid(raw) {
+  const pid = Number.parseInt(String(raw ?? ''), 10)
+  return Number.isInteger(pid) && pid > 0 ? pid : 0
+}
+export function isProcessRunning(pid) {
+  const normalizedPid = normalizePid(pid)
+  if (normalizedPid <= 0) {
+    return false
+  }
+  try {
+    process.kill(normalizedPid, 0)
+    return true
+  } catch (error) {
+    if (error && typeof error === 'object' && 'code' in error) {
+      return error.code === 'EPERM'
+    }
+    return false
+  }
+}
+export async function readJsonFile(filePath, fallback = null) {
+  try {
+    const raw = await fs.readFile(filePath, 'utf8')
+    return JSON.parse(raw)
+  } catch {
+    return fallback
+  }
+}
+async function writeJsonFile(filePath, value, flags) {
+  const formatted = `${JSON.stringify(value, null, 2)}\n`
+  await fs.mkdir(path.dirname(filePath), { recursive: true })
+  await fs.writeFile(filePath, formatted, { encoding: 'utf8', flag: flags })
+}
+export async function acquireRunLock(lockFile, lockState) {
+  const desired = {
+    runId: String(lockState?.runId ?? ''),
+    pid: normalizePid(lockState?.pid),
+    startedAt: String(lockState?.startedAt ?? timestamp()),
+    heartbeatAt: String(lockState?.heartbeatAt ?? timestamp()),
+    status: String(lockState?.status ?? 'starting'),
+    iteration: Number.isFinite(Number(lockState?.iteration)) ? Number(lockState.iteration) : 0,
+    phase: String(lockState?.phase ?? ''),
+    task: String(lockState?.task ?? ''),
+    mode: String(lockState?.mode ?? ''),
+    configFile: String(lockState?.configFile ?? ''),
+    cwd: String(lockState?.cwd ?? ''),
+  }
+  await fs.mkdir(path.dirname(lockFile), { recursive: true })
+  try {
+    await writeJsonFile(lockFile, desired, 'wx')
+    return { acquired: true, staleLock: null }
+  } catch (error) {
+    if (!error || typeof error !== 'object' || !('code' in error) || error.code !== 'EEXIST') {
+      throw error
+    }
+  }
+  const existing = await readJsonFile(lockFile, null)
+  const existingPid = normalizePid(existing?.pid)
+  if (existing && existingPid > 0 && isProcessRunning(existingPid) && existingPid !== process.pid) {
+    throw new Error(
+      `Another pi-harness run is active (runId=${String(existing.runId ?? '')} pid=${existingPid} startedAt=${String(existing.startedAt ?? '')}).`
+    )
+  }
+  await fs.rm(lockFile, { force: true })
+  try {
+    await writeJsonFile(lockFile, desired, 'wx')
+  } catch (error) {
+    if (error && typeof error === 'object' && 'code' in error && error.code === 'EEXIST') {
+      const current = await readJsonFile(lockFile, null)
+      throw new Error(
+        `Another pi-harness run acquired the lock first (runId=${String(current?.runId ?? '')} pid=${String(current?.pid ?? '')}).`
+      )
+    }
+    throw error
+  }
+  return { acquired: true, staleLock: existing }
+}
+export async function updateRunLock(lockFile, lockState) {
+  const current = await readJsonFile(lockFile, null)
+  if (!current) {
+    return false
+  }
+  const next = {
+    ...current,
+    ...lockState,
+    pid: normalizePid(lockState?.pid ?? current.pid),
+    heartbeatAt: String(lockState?.heartbeatAt ?? timestamp()),
+  }
+  await writeJsonFile(lockFile, next)
+  return true
+}
+export async function releaseRunLock(lockFile, runId) {
+  const current = await readJsonFile(lockFile, null)
+  if (!current) {
+    return false
+  }
+  if (String(current.runId ?? '') !== String(runId ?? '')) {
+    return false
+  }
+  await fs.rm(lockFile, { force: true })
+  return true
+}
+export function signalProcessTree(pid, signal) {
+  const normalizedPid = normalizePid(pid)
+  if (normalizedPid <= 0) {
+    return false
+  }
+  try {
+    if (process.platform !== 'win32') {
+      process.kill(-normalizedPid, signal)
+    } else {
+      process.kill(normalizedPid, signal)
+    }
+    return true
+  } catch {
+    return false
+  }
+}
 export async function readSessionId(sessionFile) {
   try {
     return (await fs.readFile(sessionFile, 'utf8')).trim()
@@ -297,6 +471,7 @@ export async function runShellCommand({
     const child = spawn('/bin/zsh', ['-lc', command], {
       cwd,
       env: process.env,
+      detached: process.platform !== 'win32',
       stdio: ['pipe', 'pipe', 'pipe'],
     })
@@ -308,9 +483,9 @@ export async function runShellCommand({
     killTimer = setTimeout(() => {
       timedOut = true
-      child.kill('SIGTERM')
+      signalProcessTree(child.pid, 'SIGTERM')
       forceKillTimer = setTimeout(() => {
-        child.kill('SIGKILL')
+        signalProcessTree(child.pid, 'SIGKILL')
       }, 10000)
     }, timeoutSeconds * 1000)
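
The locking added in this file rests on two standard Node.js behaviors. Writing with the `wx` flag fails with `EEXIST` when the file already exists, and `process.kill(pid, 0)` probes whether a process exists without delivering a signal. The sketch below is a simplified, self-contained illustration of that pattern, not the package's `acquireRunLock`. The names `isAlive`, `tryAcquire`, and `demo` are hypothetical, and unlike the diff it returns `acquired: false` instead of throwing when another live process holds the lock.

```javascript
import fs from 'node:fs/promises'
import os from 'node:os'
import path from 'node:path'
import { spawnSync } from 'node:child_process'

// Signal 0 performs an existence check without delivering a signal.
function isAlive(pid) {
  if (!Number.isInteger(pid) || pid <= 0) return false
  try {
    process.kill(pid, 0)
    return true
  } catch (error) {
    // EPERM means the process exists but belongs to another user.
    return error?.code === 'EPERM'
  }
}

// Hypothetical, simplified lock helper.
async function tryAcquire(lockFile, owner) {
  const body = `${JSON.stringify(owner, null, 2)}\n`
  await fs.mkdir(path.dirname(lockFile), { recursive: true })
  try {
    // 'wx' makes creation exclusive: it fails if the file already exists.
    await fs.writeFile(lockFile, body, { encoding: 'utf8', flag: 'wx' })
    return { acquired: true, recoveredFrom: null }
  } catch (error) {
    if (error?.code !== 'EEXIST') throw error
  }
  let existing = null
  try {
    existing = JSON.parse(await fs.readFile(lockFile, 'utf8'))
  } catch {
    existing = null // an unreadable lock is treated as stale
  }
  if (existing && isAlive(existing.pid) && existing.pid !== process.pid) {
    return { acquired: false, recoveredFrom: null }
  }
  // Stale lock: remove it, then retry the exclusive create so a
  // competing process still cannot take the lock at the same time.
  await fs.rm(lockFile, { force: true })
  await fs.writeFile(lockFile, body, { encoding: 'utf8', flag: 'wx' })
  return { acquired: true, recoveredFrom: existing }
}

// Demo: plant a lock owned by a process that has already exited.
async function demo() {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'pi-lock-demo-'))
  const lockFile = path.join(dir, '.pi-runtime', 'active-run.json')
  const deadPid = spawnSync(process.execPath, ['-e', '']).pid
  await fs.mkdir(path.dirname(lockFile), { recursive: true })
  await fs.writeFile(lockFile, JSON.stringify({ runId: 'old-run', pid: deadPid }), 'utf8')
  const result = await tryAcquire(lockFile, { runId: 'new-run', pid: process.pid })
  await fs.rm(dir, { recursive: true, force: true })
  return result
}

demo().then((result) => console.log(result))
```

The second exclusive create after removing the stale file is what keeps two recovering supervisors from both believing they own the run. A PID check alone cannot rule out PID reuse by an unrelated process, so this kind of liveness test is best read as a heuristic.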

package/src/pi-rpc-adapter.mjs CHANGED

@@ -10,6 +10,7 @@ import {
   getHeartbeatDecision,
   resolveHeartbeatConfig,
 } from './pi-heartbeat.mjs'
+import { signalProcessTree } from './pi-repo.mjs'
 function createJsonlReader(stream, onLine) {
   const rl = createInterface({ input: stream })
@@ -151,6 +152,7 @@ async function run() {
   const child = spawn(cli, args, {
     cwd: request.cwd,
     env: process.env,
+    detached: process.platform !== 'win32',
     stdio: ['pipe', 'pipe', 'pipe'],
   })
@@ -239,10 +241,10 @@ async function run() {
     closeAssistantLine()
     writeLive(`[PI guard] ${formatHeartbeatTimeoutMessage(decision)} Aborting current turn (pid=${child.pid ?? 'unknown'}).\n`)
     void send({ type: 'abort' }).catch(() => {})
-    child.kill('SIGTERM')
+    signalProcessTree(child.pid, 'SIGTERM')
     setTimeout(() => {
       if (child.exitCode === null) {
-        child.kill('SIGKILL')
+        signalProcessTree(child.pid, 'SIGKILL')
       }
     }, 1000)
   }
@@ -578,10 +580,10 @@ async function run() {
     }
     pending.clear()
-    child.kill('SIGTERM')
+    signalProcessTree(child.pid, 'SIGTERM')
     await new Promise((resolve) => {
       const timeout = setTimeout(() => {
-        child.kill('SIGKILL')
+        signalProcessTree(child.pid, 'SIGKILL')
         resolve()
       }, 1000)
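
Both spawn sites in this diff now pass `detached` on non-Windows platforms and replace `child.kill(...)` with `signalProcessTree(child.pid, ...)`. On POSIX systems a detached child becomes the leader of its own process group, and a negative PID passed to `process.kill` targets that whole group, so grandchildren started by the child are signalled too. The sketch below shows the pattern in isolation. `signalGroup` and `runWithTimeout` are hypothetical names, and the timings are illustrative rather than the package's values.

```javascript
import { spawn } from 'node:child_process'

// Hypothetical helper mirroring the idea behind signalProcessTree.
function signalGroup(pid, signal) {
  if (!Number.isInteger(pid) || pid <= 0) return false
  try {
    // A negative pid targets the whole process group on POSIX systems.
    process.kill(process.platform !== 'win32' ? -pid : pid, signal)
    return true
  } catch {
    return false
  }
}

function runWithTimeout(command, args, timeoutMs, graceMs = 1000) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, {
      // The child leads its own process group, so the group id equals child.pid.
      detached: process.platform !== 'win32',
      stdio: 'ignore',
    })
    let timedOut = false
    let forceTimer = null
    const killTimer = setTimeout(() => {
      timedOut = true
      signalGroup(child.pid, 'SIGTERM')
      // Escalate if the group has not exited after the grace period.
      forceTimer = setTimeout(() => signalGroup(child.pid, 'SIGKILL'), graceMs)
    }, timeoutMs)
    child.once('error', (error) => {
      clearTimeout(killTimer)
      clearTimeout(forceTimer)
      reject(error)
    })
    child.once('exit', (code, signal) => {
      clearTimeout(killTimer)
      clearTimeout(forceTimer)
      resolve({ code, signal, timedOut })
    })
  })
}

// A child that would otherwise run forever is stopped after 200 ms.
runWithTimeout(process.execPath, ['-e', 'setInterval(() => {}, 1000)'], 200)
  .then((result) => console.log(result))
```

On Windows the sketch falls back to signalling only the direct child, as the diff does, so descendants may survive there. Without `detached`, the child would share the supervisor's process group and a negative-PID signal would either fail or hit the wrong group.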

package/src/pi-supervisor.mjs CHANGED

@@ -12,9 +12,11 @@ import {
 } from './pi-prompts.mjs'
 import { appendTelemetry, ensureTelemetryFiles } from './pi-telemetry.mjs'
 import {
+  acquireRunLock,
   appendLog,
   collectLargeFileWarnings,
   commitStagedFiles,
+  createRunId,
   didRepoChange,
   ensureFileExists,
   ensureRepo,
@@ -25,10 +27,12 @@ import {
   readOptionalTextFile,
   readSessionId,
   readState,
+  releaseRunLock,
   runVerification,
   runShellCommand,
   stageFiles,
   unstageFiles,
+  updateRunLock,
   runVisualCapture,
   timestamp,
   writeChangedFiles,
@@ -66,7 +70,7 @@ function printTerminalSummary(config, summary) {
   }
   const lines = [
-    `[PI supervisor] iteration=${summary.iteration} phase="${summary.phase}"`,
+    `[PI supervisor] run_id=${summary.runId || config.runId || ''} iteration=${summary.iteration} phase="${summary.phase}"`,
     `[PI supervisor] task=${summary.taskFile || toDisplayPath(config, config.taskFile)} developer_instructions=${summary.developerInstructionsFile || toDisplayPath(config, config.developerInstructionsFile)} tester_instructions=${summary.testerInstructionsFile || toDisplayPath(config, config.testerInstructionsFile)}`,
     `[PI supervisor] transport=${config.transport} developer_model=${summary.developerModel || resolveRoleModelName(config, 'developer') || '(PI default)'} tester_model=${summary.testerModel || resolveRoleModelName(config, 'tester') || '(PI default)'}`,
     `[PI supervisor] developer=${summary.developerStatus} tester=${summary.testerStatus} verification=${summary.verificationStatus}`,
@@ -152,9 +156,13 @@ function formatIterationSummary(summary) {
 async function writeIterationSummary(config, summary) {
   await writeTextFile(config.lastIterationSummaryFile, formatIterationSummary(summary))
+  if (config.runLastIterationSummaryFile && config.runLastIterationSummaryFile !== config.lastIterationSummaryFile) {
+    await writeTextFile(config.runLastIterationSummaryFile, formatIterationSummary(summary))
+  }
 }
 function createIterationSummary({
+  runId,
   iteration,
   phase,
   task,
@@ -174,6 +182,7 @@ function createIterationSummary({
   visualModel,
 }) {
   return {
+    runId,
     iteration,
     phase,
     task,
@@ -194,6 +203,26 @@ function createIterationSummary({
   }
 }
+async function persistStateSnapshot(config, state) {
+  await writeState(config.stateFile, state)
+  if (config.runStateFile && config.runStateFile !== config.stateFile) {
+    await writeState(config.runStateFile, state)
+  }
+}
+async function updateRunOwnership(config, fields = {}) {
+  if (!config.activeRunFile || !config.runId) {
+    return
+  }
+  await updateRunLock(config.activeRunFile, {
+    runId: config.runId,
+    pid: process.pid,
+    heartbeatAt: timestamp(),
+    ...fields,
+  })
+}
 function didInvocationCreateCommit(invocation) {
   return invocation?.beforeSnapshot?.head !== invocation?.afterSnapshot?.head
 }
@@ -272,6 +301,7 @@ function isInfrastructureVerificationFailure(output) {
 async function recordEvent(config, event) {
   await appendTelemetry(config, {
     timestamp: timestamp(),
+    runId: config.runId || '',
     ...event,
   })
 }
@@ -1076,6 +1106,13 @@ async function runIteration({ config, state, iteration }) {
   const iterationStartSnapshot = getRepoSnapshot(config.cwd)
   const taskInfo = findFirstUncheckedTaskInfo(config.taskFile)
   if (!taskInfo.hasUncheckedTasks) {
+    await updateRunOwnership(config, {
+      status: 'idle',
+      iteration,
+      phase: taskInfo.phase || 'complete',
+      task: '',
+      lastCompletedIteration: iteration,
+    })
     await appendLog(config.logFile, 'No unchecked tasks remain in TODOS.md')
     return {
       stateUpdate: {
@@ -1086,9 +1123,12 @@ async function runIteration({ config, state, iteration }) {
         lastPhase: taskInfo.phase,
         lastStatus: 'complete',
         lastVerificationStatus: 'not_needed',
+        runId: config.runId || '',
+        inProgress: null,
         lastRunAt: timestamp(),
       },
       summary: {
+        runId: config.runId || '',
         iteration,
         phase: taskInfo.phase || 'complete',
         task: '',
@@ -1118,6 +1158,26 @@ async function runIteration({ config, state, iteration }) {
   const phase = taskInfo.phase || 'unknown'
   const task = taskInfo.task || 'unknown'
+  const inProgressState = {
+    ...state,
+    runId: config.runId || '',
+    inProgress: {
+      runId: config.runId || '',
+      status: 'in_progress',
+      iteration,
+      phase,
+      task,
+      startedAt: timestamp(),
+      transport: config.transport,
+    },
+  }
+  await persistStateSnapshot(config, inProgressState)
+  await updateRunOwnership(config, {
+    status: 'iteration_in_progress',
+    iteration,
+    phase,
+    task,
+  })
   const canResumePriorSession = (
     state.lastTransport === config.transport
     && state.lastPiModel === developerModelName
@@ -1486,8 +1546,19 @@ async function runIteration({ config, state, iteration }) {
     lastRunAt: timestamp(),
     successfulIterations,
     lastVisualStatus: visualStatus,
+    runId: config.runId || '',
+    inProgress: null,
   }
+  await updateRunOwnership(config, {
+    status: 'idle',
+    iteration,
+    phase,
+    task,
+    lastCompletedIteration: iteration,
+    lastStatus: finalStatus,
+  })
   await appendLog(
     config.logFile,
     `Finished iteration ${iteration} with status=${finalStatus} verification=${finalVerificationStatus} tester_verdict=${testerVerdict} commit_plan_found=${commitPlanFound} terminal_reason=${terminalReason}${largeFileWarnings.length > 0 ? ` large_file_warnings=${formatLargeFileWarningsInline(largeFileWarnings)}` : ''}`
@@ -1495,6 +1566,7 @@ async function runIteration({ config, state, iteration }) {
   const iterationEndSnapshot = getRepoSnapshot(config.cwd)
   const iterationSummary = createIterationSummary({
+    runId: config.runId || '',
     iteration,
     phase,
     task,
@@ -1548,6 +1620,7 @@ async function runIteration({ config, state, iteration }) {
   return {
     stateUpdate: nextState,
     summary: {
+      runId: config.runId || '',
       iteration,
       phase,
       task,
@@ -1578,40 +1651,95 @@ async function runIteration({ config, state, iteration }) {
 async function main() {
   const config = loadConfig(process.argv[2] ?? 'once')
+  const runId = createRunId()
+  const runStartedAt = timestamp()
+  const runDir = path.join(config.piRuntimeDir, 'runs', runId)
+  config.runId = runId
+  config.runStartedAt = runStartedAt
+  config.runRuntimeDir = runDir
+  config.runLogFile = path.join(runDir, 'pi.log')
+  config.runTelemetryJsonl = path.join(runDir, 'pi_telemetry.jsonl')
+  config.runTelemetryCsv = path.join(runDir, 'pi_telemetry.csv')
+  config.runStateFile = path.join(runDir, 'state.json')
+  config.runLastIterationSummaryFile = path.join(runDir, 'last-iteration.json')
   ensureRepo(config.cwd)
   await ensureFileExists(config.taskFile, 'task file')
   await ensureFileExists(config.developerInstructionsFile, 'developer instructions file')
   await ensureFileExists(config.testerInstructionsFile, 'tester instructions file')
-  await ensureTelemetryFiles(config)
-  await runStartupPreflight(config)
-  let state = await readState(config.stateFile)
-  let completedIterations = 0
-  while (!stopRequested) {
-    const iteration = state.iteration + 1
-    const result = await runIteration({ config, state, iteration })
-    await writeIterationSummary(config, result.iterationSummary ?? result.summary)
-    state = result.stateUpdate
-    await writeState(config.stateFile, state)
-    printTerminalSummary(config, result.summary)
-    completedIterations += 1
-    if (result.shouldStop || config.mode !== 'run' || completedIterations >= config.maxIterations) {
-      break
+  const lockResult = await acquireRunLock(config.activeRunFile, {
+    runId,
+    pid: process.pid,
+    startedAt: runStartedAt,
+    heartbeatAt: runStartedAt,
+    status: 'starting',
+    iteration: 0,
+    phase: '',
+    task: '',
+    mode: config.mode,
+    configFile: config.configFile,
+    cwd: config.cwd,
+  })
+  try {
+    process.env.PI_RUN_ID = runId
+    process.env.PI_RUN_LOG_FILE = config.runLogFile
+    await ensureTelemetryFiles(config)
+    await appendLog(config.logFile, `Run started pid=${process.pid} mode=${config.mode}`)
+    if (lockResult.staleLock) {
+      await appendLog(
+        config.logFile,
+        `Recovered stale run lock from runId=${String(lockResult.staleLock.runId ?? '')} pid=${String(lockResult.staleLock.pid ?? '')} startedAt=${String(lockResult.staleLock.startedAt ?? '')}`
+      )
     }
+    await runStartupPreflight(config)
-    await sleep(config.sleepBetweenSeconds)
-  }
+    let state = await readState(config.stateFile)
+    if (state?.inProgress?.status === 'in_progress') {
+      await appendLog(
+        config.logFile,
+        `Recovering unfinished iteration=${state.inProgress.iteration} phase="${state.inProgress.phase || ''}" task="${state.inProgress.task || ''}" from runId=${String(state.inProgress.runId || state.runId || '')}`
+      )
+    }
+    let completedIterations = 0
+    while (!stopRequested) {
+      const iteration = state?.inProgress?.status === 'in_progress'
+        ? Number(state.inProgress.iteration) || (state.iteration + 1)
+        : state.iteration + 1
+      await updateRunOwnership(config, {
+        status: 'starting_iteration',
+        iteration,
+      })
+      const result = await runIteration({ config, state, iteration })
+      await writeIterationSummary(config, result.iterationSummary ?? result.summary)
+      state = result.stateUpdate
+      await persistStateSnapshot(config, state)
+      printTerminalSummary(config, result.summary)
+      completedIterations += 1
+      if (result.shouldStop || config.mode !== 'run' || completedIterations >= config.maxIterations) {
+        break
+      }
+      await sleep(config.sleepBetweenSeconds)
+    }
-  if (stopRequested) {
-    await appendLog(config.logFile, 'Stop requested by signal')
+    if (stopRequested) {
+      await appendLog(config.logFile, 'Stop requested by signal')
+    }
+  } finally {
+    await updateRunOwnership(config, {
+      status: stopRequested ? 'stopped' : 'finished',
+      heartbeatAt: timestamp(),
+    })
+    await releaseRunLock(config.activeRunFile, runId)
+    delete process.env.PI_RUN_ID
+    delete process.env.PI_RUN_LOG_FILE
   }
 }
 main().catch(async (error) => {
   const config = loadConfig(process.argv[2] ?? 'once')
-  await ensureTelemetryFiles(config)
   await appendLog(config.logFile, `Supervisor error: ${error instanceof Error ? error.stack ?? error.message : String(error)}`)
   console.error(error instanceof Error ? error.message : String(error))
   process.exitCode = 1

package/src/pi-telemetry.mjs CHANGED

@@ -1,6 +1,7 @@
 import fs from 'node:fs/promises'
+import path from 'node:path'
-const CSV_HEADER = 'timestamp,iteration,phase,kind,status,transport,session_id,timed_out,exit_code,duration_seconds,commit_before,commit_after,repo_changed,changed_files_count,verification_status,retry_count,role,model,tool_calls,tool_errors,message_updates,stop_reason,loop_detected,loop_signature,tester_verdict,commit_plan_found,terminal_reason,risk_warnings,notes\n'
+const CSV_HEADER = 'timestamp,run_id,iteration,phase,kind,status,transport,session_id,timed_out,exit_code,duration_seconds,commit_before,commit_after,repo_changed,changed_files_count,verification_status,retry_count,role,model,tool_calls,tool_errors,message_updates,stop_reason,loop_detected,loop_signature,tester_verdict,commit_plan_found,terminal_reason,risk_warnings,notes\n'
 function csvEscape(value) {
   const text = String(value ?? '')
@@ -14,22 +15,42 @@ export async function ensureTelemetryFiles(config) {
   await fs.writeFile(config.lastPromptFile, '', 'utf8')
   await fs.writeFile(config.lastIterationSummaryFile, '', 'utf8')
+  await fs.mkdir(path.dirname(config.logFile), { recursive: true })
+  await fs.mkdir(path.dirname(config.telemetryJsonl), { recursive: true })
+  await fs.mkdir(path.dirname(config.telemetryCsv), { recursive: true })
   await fs.appendFile(config.logFile, '', 'utf8')
   await fs.appendFile(config.telemetryJsonl, '', 'utf8')
+  if (config.runTelemetryJsonl && config.runTelemetryJsonl !== config.telemetryJsonl) {
+    await fs.mkdir(path.dirname(config.runTelemetryJsonl), { recursive: true })
+    await fs.appendFile(config.runTelemetryJsonl, '', 'utf8')
+  }
   try {
     await fs.access(config.telemetryCsv)
   } catch {
     await fs.writeFile(config.telemetryCsv, CSV_HEADER, 'utf8')
   }
+  if (config.runTelemetryCsv && config.runTelemetryCsv !== config.telemetryCsv) {
+    try {
+      await fs.access(config.runTelemetryCsv)
+    } catch {
+      await fs.mkdir(path.dirname(config.runTelemetryCsv), { recursive: true })
+      await fs.writeFile(config.runTelemetryCsv, CSV_HEADER, 'utf8')
+    }
+  }
 }
 export async function appendTelemetry(config, event) {
   const jsonLine = `${JSON.stringify(event)}\n`
   await fs.appendFile(config.telemetryJsonl, jsonLine, 'utf8')
+  if (config.runTelemetryJsonl && config.runTelemetryJsonl !== config.telemetryJsonl) {
+    await fs.appendFile(config.runTelemetryJsonl, jsonLine, 'utf8')
+  }
   const csvRow = [
     event.timestamp,
+    event.runId,
     event.iteration,
     event.phase,
     event.kind,
@@ -61,6 +82,9 @@ export async function appendTelemetry(config, event) {
   ].map(csvEscape).join(',')
   await fs.appendFile(config.telemetryCsv, `${csvRow}\n`, 'utf8')
+  if (config.runTelemetryCsv && config.runTelemetryCsv !== config.telemetryCsv) {
+    await fs.appendFile(config.runTelemetryCsv, `${csvRow}\n`, 'utf8')
+  }
 }
 export async function readTelemetry(config) {