@lnilluv/pi-ralph-loop 0.1.4-dev.0 → 0.1.4-dev.1

Files changed (38)
  1. package/README.md +63 -12
  2. package/package.json +1 -1
  3. package/src/index.ts +1034 -168
  4. package/src/ralph-draft-llm.ts +35 -7
  5. package/src/ralph-draft.ts +1 -1
  6. package/src/ralph.ts +708 -51
  7. package/src/runner-rpc.ts +434 -0
  8. package/src/runner-state.ts +822 -0
  9. package/src/runner.ts +957 -0
  10. package/tests/fixtures/parity/migrate/OPEN_QUESTIONS.md +3 -0
  11. package/tests/fixtures/parity/migrate/RALPH.md +27 -0
  12. package/tests/fixtures/parity/migrate/golden/MIGRATED.md +15 -0
  13. package/tests/fixtures/parity/migrate/legacy/source.md +6 -0
  14. package/tests/fixtures/parity/migrate/legacy/source.yaml +3 -0
  15. package/tests/fixtures/parity/migrate/scripts/show-legacy.sh +10 -0
  16. package/tests/fixtures/parity/migrate/scripts/verify.sh +15 -0
  17. package/tests/fixtures/parity/research/OPEN_QUESTIONS.md +3 -0
  18. package/tests/fixtures/parity/research/RALPH.md +45 -0
  19. package/tests/fixtures/parity/research/claim-evidence-checklist.md +15 -0
  20. package/tests/fixtures/parity/research/expected-outputs.md +22 -0
  21. package/tests/fixtures/parity/research/scripts/show-snapshots.sh +13 -0
  22. package/tests/fixtures/parity/research/scripts/verify.sh +55 -0
  23. package/tests/fixtures/parity/research/snapshots/app-factory-ai-cli.md +11 -0
  24. package/tests/fixtures/parity/research/snapshots/docs-factory-ai-cli-features-missions.md +11 -0
  25. package/tests/fixtures/parity/research/snapshots/factory-ai-news-missions.md +11 -0
  26. package/tests/fixtures/parity/research/source-manifest.md +20 -0
  27. package/tests/index.test.ts +3169 -104
  28. package/tests/parity/README.md +9 -0
  29. package/tests/parity/harness.py +526 -0
  30. package/tests/parity-harness.test.ts +42 -0
  31. package/tests/parity-research-fixture.test.ts +34 -0
  32. package/tests/ralph-draft-llm.test.ts +82 -9
  33. package/tests/ralph-draft.test.ts +1 -1
  34. package/tests/ralph.test.ts +1265 -36
  35. package/tests/runner-event-contract.test.ts +235 -0
  36. package/tests/runner-rpc.test.ts +358 -0
  37. package/tests/runner-state.test.ts +553 -0
  38. package/tests/runner.test.ts +1347 -0
package/README.md CHANGED
@@ -49,11 +49,45 @@ That saves the draft but does not launch the loop.
 
  ### Smart drafting
 
- Smart drafting sends the selected repo excerpts from the current repo context to the currently selected active pi model, including models chosen with `/model` or by cycling within `/scoped-models`. It excludes common secret-bearing paths from that context, and non-analysis drafts use the shared `policy:secret-bearing-paths` token so runtime write protection stays aligned with the same policy. It does not switch models automatically. If no active authenticated model is available, drafting falls back to the deterministic path.
+ Smart drafting sends the selected repo excerpts from the current repo context to the currently selected active pi model when you start `/ralph` interactively, including models chosen with `/model` or by cycling within `/scoped-models`. It excludes common secret-bearing paths from that context, and non-analysis drafts use the shared `policy:secret-bearing-paths` token so runtime write protection stays aligned with the same policy. It does not switch models automatically. When the active model is used to strengthen an existing draft, it now accepts validated body-and-commands drafts instead of body-only drafts. If no active authenticated model is available, drafting falls back to the deterministic path.
 
  ## How it works
 
- On each iteration, pi-ralph reads `RALPH.md`, runs the configured commands, injects their output into the prompt through `{{ commands.<name> }}` placeholders, starts a fresh session, sends the prompt, and waits for completion. Failed command output appears in the next iteration, which creates a self-healing loop.
+ Each iteration re-reads `RALPH.md`, runs the configured commands, injects their output into `{{ commands.<name> }}` placeholders, and sends the task to a fresh `pi --mode rpc` subprocess instead of keeping a long-lived in-process session. If `RALPH_PROGRESS.md` exists at the task root, Ralph injects it into every prompt as a short writable memory and ignores its churn when deciding whether the loop made durable progress. Failed command output appears in the next iteration, which keeps the loop self-healing.
+
+ ### Subprocess runner
+
+ `runner.ts` orchestrates the loop for each iteration:
+ - re-read `RALPH.md` so live edits apply on the next turn
+ - run any configured pre-iteration commands
+ - snapshot the task directory before the agent runs
+ - spawn a fresh RPC subprocess
+ - compare before/after snapshots and evaluate completion
+ - stop on max iterations, timeout, or no-progress exhaustion
+
+ `runner-rpc.ts` manages the subprocess:
+ - starts `pi --mode rpc --no-session`
+ - sends `set_model`, `set_thinking_level`, and `prompt` over stdin as JSONL
+ - reads JSONL events from stdout
+ - keeps stdin open until `agent_end`
+ - handles timeouts and process lifecycle
+
+ ### Model selection
+
+ `modelPattern` supports `provider/modelId` and `provider/modelId:thinkingLevel`. When a thinking level is present, the runner sends `set_model` first, then `set_thinking_level`, and only then sends the prompt. The RPC manager waits for acknowledgments before continuing.
+
+ ### Progress detection
+
+ The runner hashes task-directory files before and after each iteration and diffs the snapshots. New or modified files count as progress. `.git`, `node_modules`, `.ralph-runner`, and similar ignored paths are excluded. If a snapshot is truncated because it exceeds 200 files or 2 MB, progress is reported as `"unknown"` instead of `false`. After the subprocess exits, the runner waits 100 ms and polls once more for late file writes.
+
+ ### Durable state
+
+ `runner-state.ts` stores durable state in `.ralph-runner/` inside the task directory:
+ - `status.json` — current status, loop token, and timestamps
+ - `iterations.jsonl` — appended iteration records
+ - `stop.flag` — graceful stop signal
+
+ `status.json` records runner states such as `initializing`, `running`, `complete`, `max-iterations`, `no-progress-exhaustion`, `stopped`, `timeout`, `error`, and `cancelled`. `/ralph-stop` now writes `stop.flag`, and the runner checks it before each iteration.
 
  ## Smart `/ralph` behavior
 
@@ -71,12 +105,14 @@ On each iteration, pi-ralph reads `RALPH.md`, runs the configured commands, inje
  Use these when you want to skip heuristics:
 
  ```text
- /ralph --path my-task
+ /ralph --path my-task --arg owner="Ada Lovelace"
  /ralph --task "reverse engineer the billing flow"
  /ralph-draft --path my-task
  /ralph-draft --task "fix flaky auth tests"
  ```
 
+ `--arg` is for reusable templates that already declare runtime parameters. It is applied only when `/ralph` runs an existing `RALPH.md`; `/ralph-draft` leaves arg placeholders untouched for now. It accepts quoted multiword values like `--arg owner="Ada Lovelace"`.
+
  ### Interactive review
 
  Draft flows require an interactive UI because the extension uses a Mission Brief and editor dialog before saving or starting. In non-interactive contexts, pass an existing task folder or `RALPH.md` path instead.
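Reviewer note on the `--arg` hunk above: the README says `--arg` accepts quoted multiword values and that arg names follow `^\w[\w-]*$`. The sketch below shows one way such a flag could be tokenized. The helper name `parseArgFlags` and its return shape are hypothetical and are not taken from `package/src/index.ts`; the real parser may differ, for example in how it treats escapes or repeated names.

```typescript
// Hypothetical helper for illustration only; not the package's actual parser.
// The name pattern is the one the README documents for `args[]`.
const ARG_NAME = /^\w[\w-]*$/;

interface ParsedArgs {
  rest: string; // command text with every --arg flag removed
  args: Record<string, string>; // name -> value pairs
}

function parseArgFlags(input: string): ParsedArgs {
  const args: Record<string, string> = {};
  // --arg name=value, where value is "double quoted", 'single quoted', or a bare word.
  const flag = /--arg\s+([^\s=]+)=(?:"([^"]*)"|'([^']*)'|(\S*))/g;
  const stripped = input.replace(
    flag,
    (_match: string, name: string, dq?: string, sq?: string, bare?: string) => {
      if (!ARG_NAME.test(name)) {
        throw new Error(`invalid arg name: ${name}`);
      }
      args[name] = dq ?? sq ?? bare ?? "";
      return "";
    },
  );
  // Collapse the whitespace left behind by the removed flags.
  const rest = stripped.replace(/\s+/g, " ").trim();
  return { rest, args };
}

const parsed = parseArgFlags('--path my-task --arg owner="Ada Lovelace"');
console.log(parsed.rest, parsed.args);
```

Keeping the values in a separate map, instead of substituting them into the file, matches the README's statement that the template is never rewritten with supplied values.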
@@ -85,6 +121,8 @@ Draft flows require an interactive UI because the extension uses a Mission Brief
 
  ```md
  ---
+ args:
+ - owner
  commands:
  - name: tests
  run: npm test
@@ -93,6 +131,7 @@ commands:
  run: npm run lint
  timeout: 60
  max_iterations: 25
+ inter_iteration_delay: 0
  timeout: 300
  completion_promise: "DONE"
  guardrails:
@@ -117,14 +156,19 @@ Iteration {{ ralph.iteration }} of {{ ralph.name }}.
  Apply the smallest safe fix and explain why it works.
  ```
 
+ Strengthened body-and-commands drafts keep the deterministic baseline exact: command `name -> run` pairs must match the baseline, commands may only be reordered, dropped, or have timeouts stay the same or decrease, `max_iterations` and top-level `timeout` may stay the same or decrease, every `{{ commands.<name> }}` used in the strengthened draft must point to an accepted command, `completion_promise` must stay unchanged, including staying absent when absent, and guardrails stay fixed in this phase. If the strengthened frontmatter is invalid or unsupported, pi rejects the whole strengthened draft and falls back automatically instead of splicing fields.
+
  | Field | Type | Default | Description |
  |-------|------|---------|-------------|
  | `commands` | array | `[]` | Commands to run each iteration |
- | `commands[].name` | string | required | Key for `{{ commands.<name> }}` |
+ | `commands[].name` | string | required | Must match `^\w[\w-]*$`; key for `{{ commands.<name> }}` |
  | `commands[].run` | string | required | Shell command |
- | `commands[].timeout` | number | `60` | Seconds before kill |
- | `max_iterations` | number | `50` | Stop after N iterations |
- | `timeout` | number | `300` | Per-iteration timeout in seconds; stops the loop if the agent is stuck |
+ | `commands[].timeout` | number | `60` | Seconds before kill; greater than 0 and at most 300 seconds, and must be `<= timeout` |
+ | `args` | string[] | `[]` | Declared runtime parameters for reusable templates |
+ | `args[]` | string | required | Must match `^\w[\w-]*$`; key for `{{ args.<name> }}` |
+ | `max_iterations` | integer | `50` | Stop after N iterations; must be 1-50 |
+ | `inter_iteration_delay` | integer | `0` | Wait N seconds between completed iterations; must be a non-negative integer |
+ | `timeout` | number | `300` | Per-iteration timeout in seconds; must be greater than 0 and at most 300; stops the loop if the agent is stuck |
  | `completion_promise` | string | — | Agent signals completion by sending `<promise>DONE</promise>`; loop breaks on match |
  | `guardrails.block_commands` | string[] | `[]` | Regex patterns to block in bash |
  | `guardrails.protected_files` | string[] | `[]` | Glob patterns, or the shared `policy:secret-bearing-paths` token, enforced on `write`/`edit` tool calls |
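Reviewer note on the field table above: the new bounds (names matching `^\w[\w-]*$`, `max_iterations` in 1-50, timeouts in (0, 300], command timeouts no greater than the top-level `timeout`, non-negative integer `inter_iteration_delay`) can be read as a single validation pass. The sketch below is an assumption about how those rules compose; the `Frontmatter` type, the `validateFrontmatter` name, and the error strings are hypothetical and do not come from `package/src/ralph.ts`. Guardrail and `completion_promise` checks are omitted.

```typescript
// Hypothetical types and function; only the numeric bounds, defaults, and the
// name pattern are taken from the README table.
interface CommandSpec {
  name: string;
  run: string;
  timeout?: number; // seconds, documented default 60
}

interface Frontmatter {
  commands?: CommandSpec[];
  args?: string[];
  max_iterations?: number; // documented default 50
  inter_iteration_delay?: number; // documented default 0
  timeout?: number; // seconds, documented default 300
}

const NAME_PATTERN = /^\w[\w-]*$/;

function validateFrontmatter(fm: Frontmatter): string[] {
  const errors: string[] = [];
  const timeout = fm.timeout ?? 300;
  const maxIterations = fm.max_iterations ?? 50;
  const delay = fm.inter_iteration_delay ?? 0;

  if (!(timeout > 0 && timeout <= 300)) {
    errors.push("timeout must be greater than 0 and at most 300");
  }
  if (!Number.isInteger(maxIterations) || maxIterations < 1 || maxIterations > 50) {
    errors.push("max_iterations must be an integer from 1 to 50");
  }
  if (!Number.isInteger(delay) || delay < 0) {
    errors.push("inter_iteration_delay must be a non-negative integer");
  }
  for (const name of fm.args ?? []) {
    if (!NAME_PATTERN.test(name)) errors.push(`invalid arg name: ${name}`);
  }
  for (const cmd of fm.commands ?? []) {
    const cmdTimeout = cmd.timeout ?? 60;
    if (!NAME_PATTERN.test(cmd.name)) errors.push(`invalid command name: ${cmd.name}`);
    if (cmd.run.trim() === "") errors.push(`command ${cmd.name} has an empty run string`);
    if (!(cmdTimeout > 0 && cmdTimeout <= 300 && cmdTimeout <= timeout)) {
      errors.push(`command ${cmd.name} timeout must be in (0, 300] and <= timeout`);
    }
  }
  return errors;
}

// Mirrors the README example frontmatter.
const ok = validateFrontmatter({
  args: ["owner"],
  commands: [
    { name: "tests", run: "npm test" },
    { name: "lint", run: "npm run lint", timeout: 60 },
  ],
  max_iterations: 25,
  inter_iteration_delay: 0,
  timeout: 300,
});
console.log(ok);
```

Returning a list of messages instead of throwing on the first failure is a design choice of this sketch; the diff does not show which approach the package takes.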
@@ -133,17 +177,19 @@ Apply the smallest safe fix and explain why it works.
 
  | Placeholder | Description |
  |-------------|-------------|
- | `{{ commands.<name> }}` | Output from the named command |
+ | `{{ commands.<name> }}` | Output from the command named `<name>` |
+ | `{{ args.<name> }}` | Runtime value supplied with `--arg name=value` during `/ralph` |
  | `{{ ralph.iteration }}` | Current 1-based iteration number |
  | `{{ ralph.name }}` | Directory name containing the `RALPH.md` |
+ | `{{ ralph.max_iterations }}` | Top-level iteration limit from the current frontmatter |
 
- HTML comments (`<!-- ... -->`) are stripped from the prompt body after placeholder resolution, so you can annotate your `RALPH.md` freely. Generated drafts also escape literal `<!--` and `-->` in the visible task line, and the leading metadata comment is URL-encoded so task text can safely contain comment-like sequences.
+ HTML comments (`<!-- ... -->`) are stripped from the prompt body after placeholder resolution, so you can annotate your `RALPH.md` freely. `args` are resolved at runtime during `/ralph` runs only; the template file is never rewritten with supplied values. Generated drafts also escape literal `<!--` and `-->` in the visible task line, and the leading metadata comment is URL-encoded so task text can safely contain comment-like sequences.
 
  ## Commands
 
  - `/ralph [path-or-task]` - Start Ralph from a task folder or `RALPH.md`, or draft a new loop from natural language.
  - `/ralph-draft [path-or-task]` - Draft or edit a Ralph task without starting it.
- - `/ralph-stop` - Request a graceful stop after the current iteration.
+ - `/ralph-stop` - Request a graceful stop after the current iteration by writing `.ralph-runner/stop.flag`.
 
  ## Pi-only features
 
@@ -161,7 +207,7 @@ In the `tool_result` hook, bash outputs are scanned for failure patterns. After
 
  ### Completion promise
 
- When `completion_promise` is set (for example, `"DONE"`), the loop scans the agent's messages for `<promise>DONE</promise>` after each iteration. If found, the loop stops early.
+ When `completion_promise` is set (for example, `"DONE"`), the loop scans the agent's messages for `<promise>DONE</promise>` after each iteration. If found, the loop only stops early once the completion gate passes: any configured required outputs must exist, and `OPEN_QUESTIONS.md` must have no remaining P0/P1 items.
 
  ### Iteration timeout
 
@@ -169,7 +215,7 @@ Each iteration has a configurable timeout (default 300 seconds). If the agent is
 
  ### Input validation
 
- The extension validates `RALPH.md` frontmatter before starting and on each re-parse: `max_iterations` must be a positive integer, `timeout` must be positive, `block_commands` regexes must compile, and commands must have non-empty names and run strings with positive timeouts.
+ The extension validates `RALPH.md` frontmatter before starting and on each re-parse: `max_iterations` must be an integer from 1 to 50, `timeout` must be greater than 0 and at most 300 seconds, command names must match `^\w[\w-]*$`, command timeouts must be greater than 0 and at most 300 seconds and no greater than top-level `timeout`, `block_commands` regexes must compile, and commands must have non-empty names and run strings. The current runtime also rejects unsafe `completion_promise` values (non-string, blank, multiline, or angle-bracketed) and universal `guardrails.protected_files` globs such as `**/*`.
 
  ## Comparison table
 
@@ -177,6 +223,11 @@ The extension validates `RALPH.md` frontmatter before starting and on each re-pa
  |---------|----------------------------|----------|-----------------|--------|----------|
  | Command output injection | ✓ | ✗ | ✗ | ✗ | ✓ |
  | Fresh-context sessions | ✓ | ✓ | ✗ | ✓ | ✓ |
+ | Subprocess isolation | ✓ | ✗ | ✗ | ✗ | ✗ |
+ | Durable state | ✓ | ✗ | ✗ | ✗ | ✗ |
+ | Model selection | ✓ | ✗ | ✗ | ✗ | ✗ |
+ | Progress detection | ✓ | ✗ | ✗ | ✗ | ✗ |
+ | Live RALPH.md editing | ✓ | ✗ | ✗ | ✗ | ✗ | |
  | Mid-turn guardrails | ✓ | ✗ | ✗ | ✗ | ✗ |
  | Cross-iteration memory | ✓ | ✗ | ✗ | ✗ | ✗ |
  | Mid-turn steering | ✓ | ✗ | ✗ | ✗ | ✗ |
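Reviewer note on the "Subprocess isolation" and "Model selection" rows: the README diff states that `modelPattern` accepts `provider/modelId` or `provider/modelId:thinkingLevel`, and that the runner sends `set_model`, then `set_thinking_level`, then `prompt` as JSONL over stdin. The sketch below illustrates only that parsing and ordering. The message field names (`provider`, `modelId`, `level`, `message`), the function names, and the rule of splitting on the first `/` and the last `:` are assumptions, not confirmed by this diff; the real `runner-rpc.ts` also waits for an acknowledgment after each command, which this sketch does not model.

```typescript
// Illustrative only: field names and parsing rules are assumptions.
type RpcCommand =
  | { type: "set_model"; provider: string; modelId: string }
  | { type: "set_thinking_level"; level: string }
  | { type: "prompt"; message: string };

interface ModelSelection {
  provider: string;
  modelId: string;
  thinkingLevel?: string;
}

function parseModelPattern(pattern: string): ModelSelection {
  const slash = pattern.indexOf("/");
  if (slash <= 0 || slash === pattern.length - 1) {
    throw new Error(`expected provider/modelId, got: ${pattern}`);
  }
  const provider = pattern.slice(0, slash);
  const remainder = pattern.slice(slash + 1);
  // Assumption: the thinking level, if any, follows the last colon.
  const colon = remainder.lastIndexOf(":");
  if (colon === -1) return { provider, modelId: remainder };
  const modelId = remainder.slice(0, colon);
  const thinkingLevel = remainder.slice(colon + 1);
  if (modelId === "" || thinkingLevel === "") {
    throw new Error(`malformed thinking level in: ${pattern}`);
  }
  return { provider, modelId, thinkingLevel };
}

// Order matters: model first, then thinking level (if present), then the prompt.
function buildCommandSequence(pattern: string, prompt: string): RpcCommand[] {
  const sel = parseModelPattern(pattern);
  const commands: RpcCommand[] = [
    { type: "set_model", provider: sel.provider, modelId: sel.modelId },
  ];
  if (sel.thinkingLevel !== undefined) {
    commands.push({ type: "set_thinking_level", level: sel.thinkingLevel });
  }
  commands.push({ type: "prompt", message: prompt });
  return commands;
}

// JSONL: one JSON object per line, each terminated by a newline.
function toJsonl(commands: RpcCommand[]): string {
  return commands.map((c) => JSON.stringify(c)).join("\n") + "\n";
}

const sequence = buildCommandSequence("example-provider/example-model:high", "Fix the failing test.");
console.log(toJsonl(sequence));
```

In a real runner each line would be written to the subprocess's stdin only after the previous command was acknowledged, and stdin would stay open until `agent_end`, as the README describes.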
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@lnilluv/pi-ralph-loop",
- "version": "0.1.4-dev.0",
+ "version": "0.1.4-dev.1",
  "description": "Pi-native ralph loop — autonomous coding iterations with mid-turn supervision",
  "type": "module",
  "pi": {