spec-and-loop 3.3.0 → 3.3.3
- package/OPENSPEC-RALPH-BP.md +42 -6
- package/lib/mini-ralph/runner.js +886 -2
- package/lib/mini-ralph/status.js +35 -0
- package/package.json +1 -1
- package/scripts/mini-ralph-cli.js +27 -1
- package/scripts/ralph-run.sh +21 -1
package/OPENSPEC-RALPH-BP.md
CHANGED
|
@@ -21,6 +21,7 @@ Enforced rules:
|
|
|
21
21
|
- Title is one outcome, not a list. If you need "and" twice, split.
|
|
22
22
|
- Scope names files so the loop does not hunt.
|
|
23
23
|
- `Done when` bullets are observable or runnable. No soft verbs (`ensure`, `support`, `validate`, `keep`) without attached evidence.
|
|
24
|
+
- Verifier commands use the narrowest runnable command that proves the scoped change. Prefer a named test file, spec pattern, package script, or static check over a full-suite command.
|
|
24
25
|
- `Stop and hand off if` gives the loop written permission to halt.
|
|
25
26
|
|
|
26
27
|
## Ordering
|
|
@@ -52,22 +53,55 @@ Rules:
|
|
|
52
53
|
|
|
53
54
|
Split test: if the loop stopped halfway, would the repo be clean and reviewable? If yes and there's a verifier for each half, split. If no half is meaningful alone, don't split.
|
|
54
55
|
|
|
56
|
+
## Surgical validation
|
|
57
|
+
|
|
58
|
+
Task validators must be surgical and efficient so the loop spends tokens on implementation signal, not unrelated test noise.
|
|
59
|
+
|
|
60
|
+
- Start every task with the cheapest verifier that proves the task's stated scope: direct unit test file, targeted node/browser spec, exact lint/typecheck command for touched files if available, schema validator, or focused `rg` assertion.
|
|
61
|
+
- Verify command routing before writing it into `tasks.md`. If `npm test -- <pattern>` or similar still runs unrelated suites in that repo, write the direct runner command instead (for example, `pnpm exec vitest --config <config> --run <test-file>`).
|
|
62
|
+
- Use broad gates (`npm test`, `pnpm typecheck`, `make all`, browser/e2e suites) only when the task owns repo-wide integration behavior, when they are recorded as pre-flight baselines, or in a final integrated quality-gate task.
|
|
63
|
+
- If a broad gate is still required for a narrow task, pair it with explicit baseline classification: `` `<gate command>` exits 0, or failures match the pre-flight baseline with no new failures in this task's scope ``.
|
|
64
|
+
- Prefer one focused verifier per task. Add a second verifier only when it proves a different artifact class, such as a schema validator plus one targeted unit test.
|
|
65
|
+
|
|
55
66
|
## Quality gates
|
|
56
67
|
|
|
57
68
|
- A failing `Done when` check means the task is NOT done. No rationalization.
|
|
58
69
|
- "Pre-existing" requires a before-baseline. Without one, any failure could be a regression.
|
|
59
70
|
- First task in a chain that needs clean gates must be a pre-flight baseline that records gate output.
|
|
60
71
|
- Explicitly distinguish known-broken validators (document and continue) from required-clean validators (hard stop). If only one is named, the loop generalizes permissively.
|
|
72
|
+
- If a pre-flight baseline records a failing gate, later tasks MUST NOT require only a strict clean result for that same gate unless the task is intentionally responsible for fixing that baseline failure. Use one of these explicit forms:
|
|
73
|
+
- Baseline classification: `` `<gate command>` exits 0, or failures match the pre-flight baseline with no new failures in this task's scope ``
|
|
74
|
+
- Authorized cleanup: `` `<gate command>` exits 0 after fixing the named baseline failures in `<path/one.ts>` and `<path/two.ts>` ``
|
|
75
|
+
- Hard blocker: `` `<gate command>` exits 0; baseline failures are not allowed for this task ``
|
|
76
|
+
- When strict clean-gate text conflicts with a failing pre-flight baseline and no classification/cleanup rule is written, `ralph-run` will warn the agent to stop with `BLOCKED_HANDOFF` instead of spending iterations on unauthorized cleanup.
|
|
77
|
+
- When a task refers to a pre-flight baseline, or follows a completed pre-flight baseline task, but the matching `.ralph/baselines/<change>-<gate>.txt` artifact is missing, `ralph-run` will warn the agent to stop with `BLOCKED_HANDOFF` instead of treating undocumented failures as known.
|
|
78
|
+
- A pre-flight baseline task must produce runner-recognizable artifacts, not just human-readable logs: baseline files must live under the change-local `.ralph/baselines/` directory that `ralph-run` reads, their filenames must identify the gate (`typecheck`, `lint`, `test`, etc.), and every captured gate file must end with a literal `EXIT=<integer>` line.
|
|
79
|
+
- If a later task is allowed to repair baseline artifact compatibility, say so explicitly. Its `Scope:` must name the change-local `.ralph/baselines/` directory and its `Done when:` bullets must require the missing or malformed baseline files to be restored with parseable `EXIT=<integer>` footers. Without that authorization, baseline artifact repair remains an operator handoff, not product implementation work.
|
|
80
|
+
- Authorized cleanup is intentionally narrow: the named files must be backticked, the cleanup is limited to compiler/lint-only fixes, and `ralph-run` gives the agent one repair attempt for those files on that task. If the gate still fails after that attempt, the next prompt tells the agent to hand off instead of retrying.
|
|
61
81
|
|
|
62
82
|
Pre-flight template:
|
|
63
83
|
```markdown
|
|
64
84
|
- [ ] **Pre-flight: record quality gate baselines**
|
|
65
|
-
- Scope: no code edits
|
|
85
|
+
- Scope: no code edits; writes only under `.ralph/baselines/`
|
|
66
86
|
- Change: Capture current state of all gates later tasks require.
|
|
67
87
|
- Done when:
|
|
68
|
-
- `.ralph/baselines/<change>-<gate>.txt` exists for each gate with full output
|
|
69
|
-
-
|
|
70
|
-
|
|
88
|
+
- `.ralph/baselines/<gate>.txt` or `.ralph/baselines/<change>-<gate>.txt` exists for each gate with full output
|
|
89
|
+
- every captured gate file ends with a literal `EXIT=<integer>` line
|
|
90
|
+
- `.ralph/baselines/<change>-readme.md` lists passing/failing gates, exit codes, and exact failing identifiers
|
|
91
|
+
- Stop and hand off if: any gate is nondeterministic across two runs, or any captured baseline file is missing the `EXIT=<integer>` final line after retrying the capture command.
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
Baseline artifact compatibility repair template:
|
|
95
|
+
```markdown
|
|
96
|
+
- [ ] **Repair pre-flight baseline artifact compatibility**
|
|
97
|
+
- Scope: `.ralph/baselines/`, `tasks.md`
|
|
98
|
+
- Change: Restore or regenerate baseline artifacts so `ralph-run` can classify later quality-gate failures.
|
|
99
|
+
- Done when:
|
|
100
|
+
- change-local `.ralph/baselines/<gate>.txt` files exist for every gate referenced by later baseline-classified tasks
|
|
101
|
+
- every restored gate file ends with a literal `EXIT=<integer>` line
|
|
102
|
+
- the baseline readme records the source of any restored artifact and the exit code for each gate
|
|
103
|
+
- Stop and hand off if:
|
|
104
|
+
- the original gate output is missing, the original exit code cannot be recovered, or restoring the artifact would require rerunning a nondeterministic gate.
|
|
71
105
|
```
|
|
72
106
|
|
|
73
107
|
## Anti-patterns (do not do these)
|
|
@@ -80,6 +114,8 @@ Pre-flight template:
|
|
|
80
114
|
- `Done when` that only checks unit tests when real behavior is end-to-end
|
|
81
115
|
- Visual verification without splitting from code changes (context overflow risk)
|
|
82
116
|
- "Maybe this, maybe that" wording in tasks or specs once loop starts
|
|
117
|
+
- Repo-wide or slow validators for a narrow task when a focused verifier exists (`npm test`, `make all`, full browser/e2e suites)
|
|
118
|
+
- Ambiguous package-manager forwarding such as `npm test -- event-schema` unless confirmed to execute only the intended test scope
|
|
83
119
|
|
|
84
120
|
## Examples
|
|
85
121
|
|
|
@@ -119,7 +155,7 @@ Pre-flight template:
|
|
|
119
155
|
- Change: Harbor components registered once at boot, typed for TSX.
|
|
120
156
|
- Done when:
|
|
121
157
|
- `rg "registerHarbor" src` returns exactly one call site
|
|
122
|
-
- `npm
|
|
158
|
+
- `npm exec vitest --run src/components/harbor-bootstrap.test.tsx` exits 0
|
|
123
159
|
- Stop and hand off if: more than one registration site is required.
|
|
124
160
|
```
|
|
125
161
|
|
|
@@ -136,7 +172,7 @@ Pre-flight template:
|
|
|
136
172
|
- Change: ReleaseCard renders timestamps through the shared helper.
|
|
137
173
|
- Done when:
|
|
138
174
|
- `rg "toLocaleDateString" src/components/ReleaseCard.tsx` returns no matches
|
|
139
|
-
- `npm
|
|
175
|
+
- `npm exec vitest --run src/components/ReleaseCard.test.tsx` exits 0
|
|
140
176
|
- Stop and hand off if: `formatDate` does not cover a required locale.
|
|
141
177
|
```
|
|
142
178
|
|
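The `EXIT=<integer>` footer rule and the "baseline classification" form in the Quality gates section above can be illustrated with a small sketch. This is not `ralph-run`'s parser: `parseBaselineExit` and `classifyGate` are hypothetical names, and tolerating trailing blank lines and accepting only non-negative integers are assumptions made for the example.

```javascript
// Hypothetical helper, not part of ralph-run: returns the exit code recorded in
// a captured gate file, or null when the final `EXIT=<integer>` line is missing.
function parseBaselineExit(text) {
  if (typeof text !== 'string') return null;
  const lines = text.split(/\r?\n/);
  // Assumption: trailing blank lines are ignored so a final newline is harmless.
  while (lines.length > 0 && lines[lines.length - 1].trim() === '') {
    lines.pop();
  }
  if (lines.length === 0) return null;
  const match = /^EXIT=(\d+)$/.exec(lines[lines.length - 1].trim());
  return match ? Number.parseInt(match[1], 10) : null;
}

// Hypothetical classifier for the "baseline classification" Done-when form:
// clean on exit 0, otherwise every current failure must already be in the baseline.
function classifyGate(currentExit, currentFailures, baselineFailures) {
  if (currentExit === 0) return 'clean';
  const known = new Set(baselineFailures);
  const fresh = currentFailures.filter((id) => !known.has(id));
  return fresh.length === 0 ? 'matches_baseline' : 'new_failures';
}

console.log(parseBaselineExit('src/a.ts(3,1): error TS2322\nEXIT=2\n')); // 2
console.log(parseBaselineExit('all good\n'));                             // null
console.log(classifyGate(2, ['a.ts:TS2322'], ['a.ts:TS2322']));           // matches_baseline
console.log(classifyGate(2, ['b.ts:TS7006'], ['a.ts:TS2322']));           // new_failures
```

The failure identifiers are made-up strings; a real baseline readme would list whatever identifiers the gate actually prints.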
package/lib/mini-ralph/runner.js
CHANGED
|
@@ -50,6 +50,10 @@ const DEFAULTS = {
|
|
|
50
50
|
// toward the streak because their signal is already surfaced via the
|
|
51
51
|
// `Recent Loop Signals` feedback block.
|
|
52
52
|
stallThreshold: 3,
|
|
53
|
+
// Opt-in continuation after a BLOCKED_HANDOFF only when the handoff note has
|
|
54
|
+
// explicit evidence for a safe, bounded resolution class.
|
|
55
|
+
autoResolveHandoffs: false,
|
|
56
|
+
autoResolveHandoffMaxPerRun: 6,
|
|
53
57
|
};
|
|
54
58
|
|
|
55
59
|
/**
|
|
@@ -80,6 +84,170 @@ function _iterationIsStalled(iterationSignals) {
|
|
|
80
84
|
return true;
|
|
81
85
|
}
|
|
82
86
|
|
|
87
|
+
function _resolveAutoResolveHandoffConfig(options, existingState) {
|
|
88
|
+
const enabled = options.autoResolveHandoffs === true;
|
|
89
|
+
const maxPerRun =
|
|
90
|
+
Number.isInteger(options.autoResolveHandoffMaxPerRun) &&
|
|
91
|
+
options.autoResolveHandoffMaxPerRun > 0
|
|
92
|
+
? options.autoResolveHandoffMaxPerRun
|
|
93
|
+
: DEFAULTS.autoResolveHandoffMaxPerRun;
|
|
94
|
+
const previous =
|
|
95
|
+
existingState &&
|
|
96
|
+
existingState.autoResolveHandoffs &&
|
|
97
|
+
typeof existingState.autoResolveHandoffs === 'object'
|
|
98
|
+
? existingState.autoResolveHandoffs
|
|
99
|
+
: {};
|
|
100
|
+
const previousAttempts =
|
|
101
|
+
previous.attempts && typeof previous.attempts === 'object'
|
|
102
|
+
? previous.attempts
|
|
103
|
+
: {};
|
|
104
|
+
const previousTotal = Number.isInteger(previous.totalAttempts)
|
|
105
|
+
? previous.totalAttempts
|
|
106
|
+
: 0;
|
|
107
|
+
|
|
108
|
+
return {
|
|
109
|
+
enabled,
|
|
110
|
+
maxPerRun,
|
|
111
|
+
state: {
|
|
112
|
+
enabled,
|
|
113
|
+
maxPerRun,
|
|
114
|
+
totalAttempts: previousTotal,
|
|
115
|
+
attempts: Object.assign({}, previousAttempts),
|
|
116
|
+
lastDecision: previous.lastDecision || null,
|
|
117
|
+
},
|
|
118
|
+
};
|
|
119
|
+
}
|
|
120
|
+
|
|
121
|
+
function _handoffHasFocusedVerifierEvidence(note) {
|
|
122
|
+
if (!note) return false;
|
|
123
|
+
const text = String(note);
|
|
124
|
+
const mentionsFocusedVerifier =
|
|
125
|
+
/\bfocused\b[\s\S]{0,500}\b(verifier|command|test|spec|vitest)\b/i.test(text) ||
|
|
126
|
+
/\b(verifier|command|test|spec|vitest)\b[\s\S]{0,500}\bfocused\b/i.test(text);
|
|
127
|
+
const saysFocusedPasses =
|
|
128
|
+
/\b(passes?|passed|exits?\s+0|exit(?:ed)?\s+0|green)\b/i.test(text);
|
|
129
|
+
const saysBroadFails =
|
|
130
|
+
/\b(broad|full|required|suite|repo-wide)\b[\s\S]{0,500}\b(fails?|failed|red|non[-\s]?zero)\b/i.test(text) ||
|
|
131
|
+
/\b(fails?|failed|red|non[-\s]?zero)\b[\s\S]{0,500}\b(broad|full|required|suite|repo-wide)\b/i.test(text);
|
|
132
|
+
const saysFailuresAreUnrelated =
|
|
133
|
+
/\b(unrelated|pre[-\s]?existing|out[-\s]?of[-\s]?scope|known failures?|not introduced|baseline)\b/i.test(text);
|
|
134
|
+
|
|
135
|
+
return mentionsFocusedVerifier && saysFocusedPasses && saysBroadFails && saysFailuresAreUnrelated;
|
|
136
|
+
}
|
|
137
|
+
|
|
138
|
+
function _classifyAutoResolvableHandoff(blockerNote, baselineGateConflict) {
|
|
139
|
+
if (_handoffHasFocusedVerifierEvidence(blockerNote)) {
|
|
140
|
+
return {
|
|
141
|
+
className: 'verifier_narrowing',
|
|
142
|
+
summary: 'focused verifier passes while the broad verifier fails on unrelated/pre-existing failures',
|
|
143
|
+
allowedFiles: [],
|
|
144
|
+
};
|
|
145
|
+
}
|
|
146
|
+
|
|
147
|
+
if (
|
|
148
|
+
baselineGateConflict &&
|
|
149
|
+
baselineGateConflict.mode === 'authorized_cleanup' &&
|
|
150
|
+
baselineGateConflict.budgetUsed !== true &&
|
|
151
|
+
Array.isArray(baselineGateConflict.allowedFiles) &&
|
|
152
|
+
baselineGateConflict.allowedFiles.length > 0
|
|
153
|
+
) {
|
|
154
|
+
return {
|
|
155
|
+
className: 'authorized_cleanup',
|
|
156
|
+
summary: 'task text explicitly authorizes one cleanup attempt for named files',
|
|
157
|
+
allowedFiles: baselineGateConflict.allowedFiles.slice(),
|
|
158
|
+
};
|
|
159
|
+
}
|
|
160
|
+
|
|
161
|
+
return null;
|
|
162
|
+
}
|
|
163
|
+
|
|
164
|
+
function _autoResolveHandoffBudgetKey(currentTaskMeta, className) {
|
|
165
|
+
const taskId =
|
|
166
|
+
currentTaskMeta && currentTaskMeta.number
|
|
167
|
+
? currentTaskMeta.number
|
|
168
|
+
: currentTaskMeta && currentTaskMeta.description
|
|
169
|
+
? currentTaskMeta.description
|
|
170
|
+
: 'unknown-task';
|
|
171
|
+
return `${taskId}:${className || 'unknown'}`;
|
|
172
|
+
}
|
|
173
|
+
|
|
174
|
+
function _decideAutoResolveHandoff(config, blockerNote, currentTaskMeta, baselineGateConflict) {
|
|
175
|
+
const disabledDecision = { allowed: false, reason: 'disabled', className: '', budgetKey: '' };
|
|
176
|
+
if (!config || config.enabled !== true) return disabledDecision;
|
|
177
|
+
|
|
178
|
+
const classification = _classifyAutoResolvableHandoff(blockerNote, baselineGateConflict);
|
|
179
|
+
if (!classification) {
|
|
180
|
+
return {
|
|
181
|
+
allowed: false,
|
|
182
|
+
reason: 'ambiguous_or_unsupported_handoff',
|
|
183
|
+
className: '',
|
|
184
|
+
budgetKey: '',
|
|
185
|
+
};
|
|
186
|
+
}
|
|
187
|
+
|
|
188
|
+
const budgetKey = _autoResolveHandoffBudgetKey(currentTaskMeta, classification.className);
|
|
189
|
+
const totalAttempts = Number.isInteger(config.state && config.state.totalAttempts)
|
|
190
|
+
? config.state.totalAttempts
|
|
191
|
+
: 0;
|
|
192
|
+
const maxPerRun = Number.isInteger(config.maxPerRun)
|
|
193
|
+
? config.maxPerRun
|
|
194
|
+
: DEFAULTS.autoResolveHandoffMaxPerRun;
|
|
195
|
+
const attempts = config.state && config.state.attempts ? config.state.attempts : {};
|
|
196
|
+
|
|
197
|
+
if (totalAttempts >= maxPerRun) {
|
|
198
|
+
return Object.assign({}, classification, {
|
|
199
|
+
allowed: false,
|
|
200
|
+
reason: 'global_budget_exhausted',
|
|
201
|
+
budgetKey,
|
|
202
|
+
});
|
|
203
|
+
}
|
|
204
|
+
|
|
205
|
+
if (attempts[budgetKey]) {
|
|
206
|
+
return Object.assign({}, classification, {
|
|
207
|
+
allowed: false,
|
|
208
|
+
reason: 'task_class_budget_exhausted',
|
|
209
|
+
budgetKey,
|
|
210
|
+
});
|
|
211
|
+
}
|
|
212
|
+
|
|
213
|
+
return Object.assign({}, classification, {
|
|
214
|
+
allowed: true,
|
|
215
|
+
reason: 'authorized',
|
|
216
|
+
budgetKey,
|
|
217
|
+
});
|
|
218
|
+
}
|
|
219
|
+
|
|
220
|
+
function _consumeAutoResolveHandoffBudget(config, decision, iteration) {
|
|
221
|
+
if (!config || !config.state || !decision || decision.allowed !== true || !decision.budgetKey) {
|
|
222
|
+
return null;
|
|
223
|
+
}
|
|
224
|
+
|
|
225
|
+
const attempts = Object.assign({}, config.state.attempts || {});
|
|
226
|
+
attempts[decision.budgetKey] = {
|
|
227
|
+
className: decision.className,
|
|
228
|
+
iteration,
|
|
229
|
+
attemptedAt: new Date().toISOString(),
|
|
230
|
+
};
|
|
231
|
+
|
|
232
|
+
const totalAttempts = (Number.isInteger(config.state.totalAttempts)
|
|
233
|
+
? config.state.totalAttempts
|
|
234
|
+
: 0) + 1;
|
|
235
|
+
|
|
236
|
+
config.state = Object.assign({}, config.state, {
|
|
237
|
+
totalAttempts,
|
|
238
|
+
attempts,
|
|
239
|
+
lastDecision: {
|
|
240
|
+
className: decision.className,
|
|
241
|
+
reason: decision.reason,
|
|
242
|
+
budgetKey: decision.budgetKey,
|
|
243
|
+
iteration,
|
|
244
|
+
allowedFiles: decision.allowedFiles || [],
|
|
245
|
+
},
|
|
246
|
+
});
|
|
247
|
+
|
|
248
|
+
return config.state;
|
|
249
|
+
}
|
|
250
|
+
|
|
83
251
|
function _isFailedIteration(result) {
|
|
84
252
|
if (!result || typeof result !== 'object') return false;
|
|
85
253
|
if (result.signal !== null && result.signal !== undefined && result.signal !== '') {
|
|
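To see which handoff notes the `verifier_narrowing` class accepts, the sketch below copies `_handoffHasFocusedVerifierEvidence` from the hunk above and runs it on two sample notes. The notes are invented for illustration and do not come from the package.

```javascript
// Copied from the hunk above so the sample runs on its own.
function _handoffHasFocusedVerifierEvidence(note) {
  if (!note) return false;
  const text = String(note);
  const mentionsFocusedVerifier =
    /\bfocused\b[\s\S]{0,500}\b(verifier|command|test|spec|vitest)\b/i.test(text) ||
    /\b(verifier|command|test|spec|vitest)\b[\s\S]{0,500}\bfocused\b/i.test(text);
  const saysFocusedPasses =
    /\b(passes?|passed|exits?\s+0|exit(?:ed)?\s+0|green)\b/i.test(text);
  const saysBroadFails =
    /\b(broad|full|required|suite|repo-wide)\b[\s\S]{0,500}\b(fails?|failed|red|non[-\s]?zero)\b/i.test(text) ||
    /\b(fails?|failed|red|non[-\s]?zero)\b[\s\S]{0,500}\b(broad|full|required|suite|repo-wide)\b/i.test(text);
  const saysFailuresAreUnrelated =
    /\b(unrelated|pre[-\s]?existing|out[-\s]?of[-\s]?scope|known failures?|not introduced|baseline)\b/i.test(text);

  return mentionsFocusedVerifier && saysFocusedPasses && saysBroadFails && saysFailuresAreUnrelated;
}

// Illustrative notes (invented for this example).
const narrowingNote =
  'Focused verifier `vitest --run a.test.ts` passes. ' +
  'The full suite fails on unrelated pre-existing failures.';
const vagueNote = 'Cannot find the config file; need operator input.';

console.log(_handoffHasFocusedVerifierEvidence(narrowingNote)); // true
console.log(_handoffHasFocusedVerifierEvidence(vagueNote));     // false
console.log(_handoffHasFocusedVerifierEvidence(''));            // false
```

The check is keyword-based: a note qualifies only when all four kinds of evidence appear somewhere in the text, so the wording of the handoff note matters.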
@@ -462,6 +630,7 @@ async function run(opts) {
|
|
|
462
630
|
const resumeIteration = _resolveStartIteration(existingState, options);
|
|
463
631
|
const priorRunWasBlockedHandoff =
|
|
464
632
|
existingState && existingState.exitReason === 'blocked_handoff';
|
|
633
|
+
const autoResolveHandoffs = _resolveAutoResolveHandoffConfig(options, existingState);
|
|
465
634
|
|
|
466
635
|
if (options.verbose && resumeIteration > 1) {
|
|
467
636
|
process.stderr.write(
|
|
@@ -488,6 +657,9 @@ async function run(opts) {
|
|
|
488
657
|
resumeIteration > 1 && existingState && existingState.startedAt
|
|
489
658
|
? existingState.startedAt
|
|
490
659
|
: nowIso;
|
|
660
|
+
let pendingDirtyPaths = _normalizePendingDirtyPaths(
|
|
661
|
+
existingState && existingState.pendingDirtyPaths
|
|
662
|
+
);
|
|
491
663
|
|
|
492
664
|
state.init(ralphDir, {
|
|
493
665
|
active: true,
|
|
@@ -508,6 +680,8 @@ async function run(opts) {
|
|
|
508
680
|
completedAt: null,
|
|
509
681
|
stoppedAt: null,
|
|
510
682
|
exitReason: null,
|
|
683
|
+
pendingDirtyPaths,
|
|
684
|
+
autoResolveHandoffs: autoResolveHandoffs.state,
|
|
511
685
|
});
|
|
512
686
|
stateInitialized = true;
|
|
513
687
|
|
|
@@ -532,6 +706,20 @@ async function run(opts) {
|
|
|
532
706
|
: [];
|
|
533
707
|
const currentTask = _getCurrentTaskDescription(tasksBefore);
|
|
534
708
|
const currentTaskMeta = _getCurrentTaskMeta(tasksBefore);
|
|
709
|
+
pendingDirtyPaths = _refreshPendingDirtyPaths(pendingDirtyPaths);
|
|
710
|
+
state.update(ralphDir, { pendingDirtyPaths });
|
|
711
|
+
|
|
712
|
+
if (
|
|
713
|
+
pendingDirtyPaths &&
|
|
714
|
+
!_samePendingTask(pendingDirtyPaths, currentTaskMeta, currentTask)
|
|
715
|
+
) {
|
|
716
|
+
reporter.note(
|
|
717
|
+
_formatPendingDirtyPathsBlock(pendingDirtyPaths, currentTaskMeta, currentTask),
|
|
718
|
+
'error'
|
|
719
|
+
);
|
|
720
|
+
exitReason = 'pending_dirty_paths';
|
|
721
|
+
break;
|
|
722
|
+
}
|
|
535
723
|
|
|
536
724
|
reporter.iterationStarted({
|
|
537
725
|
iteration: iterationCount,
|
|
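The guard in the hunk above halts the loop with `pending_dirty_paths` when recorded paths belong to a different task than the current one. As a rough illustration, the sketch below copies `_samePendingTask` from later in this diff and calls it with made-up task records. The task numbers and descriptions are assumptions for the example only.

```javascript
// Copied from the runner.js diff so the sample runs on its own.
function _samePendingTask(pending, currentTaskMeta, currentTask) {
  if (!pending) return true;
  const currentNumber = currentTaskMeta && currentTaskMeta.number ? currentTaskMeta.number : '';
  const currentDescription = currentTaskMeta && currentTaskMeta.description ? currentTaskMeta.description : '';
  const currentFull = currentTask || '';

  if (pending.taskNumber && currentNumber) {
    return pending.taskNumber === currentNumber;
  }

  if (pending.taskDescription && currentDescription) {
    return pending.taskDescription === currentDescription;
  }

  return Boolean(pending.task && currentFull && pending.task === currentFull);
}

// Made-up records: paths left dirty by a blocked handoff on task 2.1.
const pending = {
  taskNumber: '2.1',
  taskDescription: 'Add helper',
  task: '2.1 Add helper',
  files: ['src/helper.js'],
};

// Same task resumes: the loop may continue.
console.log(_samePendingTask(pending, { number: '2.1', description: 'Add helper' }, '2.1 Add helper')); // true
// A different task is now current: the guard would stop the run.
console.log(_samePendingTask(pending, { number: '2.2', description: 'Wire helper' }, '2.2 Wire helper')); // false
// Nothing pending: no restriction.
console.log(_samePendingTask(null, { number: '2.2', description: 'Wire helper' }, '2.2 Wire helper')); // true
```

Task numbers take precedence over descriptions, and the full task string is only compared when neither pair is available.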
@@ -542,6 +730,7 @@ async function run(opts) {
|
|
|
542
730
|
let result;
|
|
543
731
|
let promptSize = null;
|
|
544
732
|
let responseSize = { bytes: 0, chars: 0, tokens: 0 };
|
|
733
|
+
let baselineGateConflict = null;
|
|
545
734
|
|
|
546
735
|
try {
|
|
547
736
|
// Build the prompt for this iteration
|
|
@@ -558,6 +747,7 @@ async function run(opts) {
|
|
|
558
747
|
// iteration N" line, so the 3-entry window is sufficient to surface
|
|
559
748
|
// recurring patterns without bloating the prompt.
|
|
560
749
|
const recentHistory = history.recent(ralphDir, 3);
|
|
750
|
+
const fullHistory = history.read(ralphDir);
|
|
561
751
|
const errorEntries = errors.readEntries(ralphDir, 3);
|
|
562
752
|
const blockerArtifacts = _detectBlockerArtifacts(ralphDir, {
|
|
563
753
|
repoRoot: process.cwd(),
|
|
@@ -570,6 +760,14 @@ async function run(opts) {
|
|
|
570
760
|
errorEntries,
|
|
571
761
|
blockerArtifacts,
|
|
572
762
|
);
|
|
763
|
+
baselineGateConflict = _analyzeBaselineGateConflict(
|
|
764
|
+
ralphDir,
|
|
765
|
+
options.tasksFile,
|
|
766
|
+
currentTaskMeta,
|
|
767
|
+
fullHistory,
|
|
768
|
+
);
|
|
769
|
+
const baselineGateFeedback = _formatBaselineGateFeedback(baselineGateConflict);
|
|
770
|
+
const autoResolveHandoffFeedback = _buildAutoResolveHandoffFeedback(recentHistory);
|
|
573
771
|
|
|
574
772
|
// Inject any pending context
|
|
575
773
|
const pendingContext = context.consume(ralphDir);
|
|
@@ -577,10 +775,18 @@ async function run(opts) {
|
|
|
577
775
|
const lessonsSection = lessons.inject(ralphDir, { limit: 15 });
|
|
578
776
|
const promptSections = [renderedPrompt];
|
|
579
777
|
|
|
778
|
+
if (baselineGateFeedback) {
|
|
779
|
+
promptSections.push(`## Baseline Gate Conflict\n\n${baselineGateFeedback}`);
|
|
780
|
+
}
|
|
781
|
+
|
|
580
782
|
if (iterationFeedback) {
|
|
581
783
|
promptSections.push(`## Recent Loop Signals\n\n${iterationFeedback}`);
|
|
582
784
|
}
|
|
583
785
|
|
|
786
|
+
if (autoResolveHandoffFeedback) {
|
|
787
|
+
promptSections.push(`## Auto-Resolve Handoff\n\n${autoResolveHandoffFeedback}`);
|
|
788
|
+
}
|
|
789
|
+
|
|
584
790
|
if (lessonsSection) {
|
|
585
791
|
promptSections.push(lessonsSection);
|
|
586
792
|
}
|
|
@@ -674,6 +880,24 @@ async function run(opts) {
|
|
|
674
880
|
const blockerNote = hasBlockedHandoff
|
|
675
881
|
? _extractBlockerNote(outputText, blockedHandoffPromise)
|
|
676
882
|
: '';
|
|
883
|
+
const autoResolveHandoffDecision = hasBlockedHandoff
|
|
884
|
+
? _decideAutoResolveHandoff(
|
|
885
|
+
autoResolveHandoffs,
|
|
886
|
+
blockerNote,
|
|
887
|
+
currentTaskMeta,
|
|
888
|
+
baselineGateConflict,
|
|
889
|
+
)
|
|
890
|
+
: null;
|
|
891
|
+
if (autoResolveHandoffDecision && autoResolveHandoffDecision.allowed) {
|
|
892
|
+
const nextAutoResolveState = _consumeAutoResolveHandoffBudget(
|
|
893
|
+
autoResolveHandoffs,
|
|
894
|
+
autoResolveHandoffDecision,
|
|
895
|
+
iterationCount,
|
|
896
|
+
);
|
|
897
|
+
if (nextAutoResolveState) {
|
|
898
|
+
state.update(ralphDir, { autoResolveHandoffs: nextAutoResolveState });
|
|
899
|
+
}
|
|
900
|
+
}
|
|
677
901
|
const tasksAfter = options.tasksMode && options.tasksFile
|
|
678
902
|
? tasks.parseTasks(options.tasksFile)
|
|
679
903
|
: [];
|
|
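The decide-then-consume sequence in the hunk above enforces two limits: one attempt per task and class pair, and a global cap per run. The sketch below is a simplified re-implementation for illustration, not the package code. `decide` and `consume` are hypothetical names, and the cap of 2 is chosen only to keep the example short.

```javascript
// Simplified sketch of the two budget checks; the reason strings match the diff.
function decide(state, maxPerRun, taskId, className) {
  const budgetKey = `${taskId}:${className}`;
  if (state.totalAttempts >= maxPerRun) {
    return { allowed: false, reason: 'global_budget_exhausted', budgetKey };
  }
  if (state.attempts[budgetKey]) {
    return { allowed: false, reason: 'task_class_budget_exhausted', budgetKey };
  }
  return { allowed: true, reason: 'authorized', budgetKey };
}

// Returns a new state object; a denied decision leaves the state unchanged.
function consume(state, decision, iteration) {
  if (!decision.allowed) return state;
  return {
    totalAttempts: state.totalAttempts + 1,
    attempts: Object.assign({}, state.attempts, {
      [decision.budgetKey]: { iteration },
    }),
  };
}

let state = { totalAttempts: 0, attempts: {} };

const first = decide(state, 2, '2.1', 'verifier_narrowing');
state = consume(state, first, 4);

const repeat = decide(state, 2, '2.1', 'verifier_narrowing');

const otherTask = decide(state, 2, '2.2', 'verifier_narrowing');
state = consume(state, otherTask, 7);

const overCap = decide(state, 2, '2.3', 'authorized_cleanup');

console.log(first.reason);     // authorized
console.log(repeat.reason);    // task_class_budget_exhausted
console.log(otherTask.reason); // authorized
console.log(overCap.reason);   // global_budget_exhausted
```

Because the attempts map is carried over from existing state in `_resolveAutoResolveHandoffConfig`, a resumed run appears to inherit the budget already spent rather than starting from zero.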
@@ -701,13 +925,42 @@ async function run(opts) {
|
|
|
701
925
|
result.filesChanged.length > 0 &&
|
|
702
926
|
(hasCompletion || (options.tasksMode && hasTask))
|
|
703
927
|
) {
|
|
928
|
+
const filesToStage = _buildAutoCommitAllowlist(
|
|
929
|
+
_mergePathLists(result.filesChanged, pendingDirtyPaths ? pendingDirtyPaths.files : []),
|
|
930
|
+
completedTasks,
|
|
931
|
+
options.tasksFile
|
|
932
|
+
);
|
|
704
933
|
commitResult = _autoCommit(iterationCount, {
|
|
705
934
|
completedTasks,
|
|
706
|
-
filesToStage
|
|
935
|
+
filesToStage,
|
|
707
936
|
tasksFile: options.tasksFile,
|
|
708
937
|
verbose: options.verbose,
|
|
709
938
|
reporter,
|
|
710
939
|
});
|
|
940
|
+
if (commitResult.committed && pendingDirtyPaths) {
|
|
941
|
+
pendingDirtyPaths = _remainingPendingDirtyPathsAfterCommit(
|
|
942
|
+
pendingDirtyPaths,
|
|
943
|
+
commitResult.anomaly
|
|
944
|
+
);
|
|
945
|
+
state.update(ralphDir, { pendingDirtyPaths });
|
|
946
|
+
}
|
|
947
|
+
}
|
|
948
|
+
|
|
949
|
+
if (
|
|
950
|
+
!commitResult.committed &&
|
|
951
|
+
Array.isArray(result.filesChanged) &&
|
|
952
|
+
result.filesChanged.length > 0 &&
|
|
953
|
+
(_isFailedIteration(result) || hasBlockedHandoff)
|
|
954
|
+
) {
|
|
955
|
+
pendingDirtyPaths = _recordPendingDirtyPaths(pendingDirtyPaths, {
|
|
956
|
+
iteration: iterationCount,
|
|
957
|
+
reason: hasBlockedHandoff ? 'blocked_handoff' : 'failed_iteration',
|
|
958
|
+
task: currentTask,
|
|
959
|
+
taskNumber: currentTaskMeta.number,
|
|
960
|
+
taskDescription: currentTaskMeta.description,
|
|
961
|
+
files: result.filesChanged,
|
|
962
|
+
});
|
|
963
|
+
state.update(ralphDir, { pendingDirtyPaths });
|
|
711
964
|
}
|
|
712
965
|
|
|
713
966
|
// Record iteration in history after commit handling so operator-visible
|
|
@@ -727,11 +980,30 @@ async function run(opts) {
|
|
|
727
980
|
signal: result.signal || '',
|
|
728
981
|
failureStage: result.failureStage || '',
|
|
729
982
|
completedTasks: completedTasks.map((task) => task.fullDescription || task.description),
|
|
983
|
+
...(autoResolveHandoffDecision
|
|
984
|
+
? {
|
|
985
|
+
autoResolveHandoffAttempted: autoResolveHandoffDecision.allowed === true,
|
|
986
|
+
autoResolveHandoffClass: autoResolveHandoffDecision.className || '',
|
|
987
|
+
autoResolveHandoffReason: autoResolveHandoffDecision.reason || '',
|
|
988
|
+
autoResolveHandoffBudgetKey: autoResolveHandoffDecision.budgetKey || '',
|
|
989
|
+
autoResolveHandoffAllowedFiles: autoResolveHandoffDecision.allowedFiles || [],
|
|
990
|
+
}
|
|
991
|
+
: {}),
|
|
730
992
|
commitAttempted: commitResult.attempted,
|
|
731
993
|
commitCreated: commitResult.committed,
|
|
732
994
|
commitAnomaly: commitResult.anomaly ? commitResult.anomaly.message : '',
|
|
733
995
|
commitAnomalyType: commitResult.anomaly ? commitResult.anomaly.type : '',
|
|
734
996
|
protectedArtifacts: commitResult.anomaly ? commitResult.anomaly.protectedArtifacts || [] : [],
|
|
997
|
+
...(baselineGateConflict
|
|
998
|
+
? {
|
|
999
|
+
baselineGateConflictMode: baselineGateConflict.mode,
|
|
1000
|
+
baselineGateRepairAllowedFiles: baselineGateConflict.allowedFiles || [],
|
|
1001
|
+
baselineGateRepairAttempted: _baselineGateRepairAttempted(
|
|
1002
|
+
baselineGateConflict,
|
|
1003
|
+
result.filesChanged || []
|
|
1004
|
+
),
|
|
1005
|
+
}
|
|
1006
|
+
: {}),
|
|
735
1007
|
...(commitResult.anomaly && commitResult.anomaly.ignoredPaths && commitResult.anomaly.ignoredPaths.length > 0
|
|
736
1008
|
? { ignoredPaths: commitResult.anomaly.ignoredPaths }
|
|
737
1009
|
: {}),
|
|
@@ -824,9 +1096,21 @@ async function run(opts) {
|
|
|
824
1096
|
reporter.note(
|
|
825
1097
|
handoffPath
|
|
826
1098
|
? `agent emitted ${blockedHandoffPromise}; blocker note saved to ${handoffPath}.`
|
|
827
|
-
: `agent emitted ${blockedHandoffPromise};
|
|
1099
|
+
: `agent emitted ${blockedHandoffPromise}; HANDOFF.md write failed (see stderr).`,
|
|
828
1100
|
'warn'
|
|
829
1101
|
);
|
|
1102
|
+
if (autoResolveHandoffDecision && autoResolveHandoffDecision.allowed) {
|
|
1103
|
+
reporter.note(
|
|
1104
|
+
`auto-resolve handoffs: continuing once for ${autoResolveHandoffDecision.className} (${autoResolveHandoffDecision.budgetKey}).`,
|
|
1105
|
+
'warn'
|
|
1106
|
+
);
|
|
1107
|
+
if (options.verbose) {
|
|
1108
|
+
process.stderr.write(
|
|
1109
|
+
`[mini-ralph] auto-resolve handoff consumed budget key ${autoResolveHandoffDecision.budgetKey}; continuing.\n`
|
|
1110
|
+
);
|
|
1111
|
+
}
|
|
1112
|
+
continue;
|
|
1113
|
+
}
|
|
830
1114
|
if (options.verbose) {
|
|
831
1115
|
process.stderr.write(
|
|
832
1116
|
`[mini-ralph] ${blockedHandoffPromise} detected at iteration ${iterationCount}; halting.\n`
|
|
@@ -915,6 +1199,145 @@ function _containsPromise(text, promiseName) {
|
|
|
915
1199
|
.some((line) => line.trim() === expectedTag);
|
|
916
1200
|
}
|
|
917
1201
|
|
|
1202
|
+
function _normalizePendingDirtyPaths(pending) {
|
|
1203
|
+
if (!pending || typeof pending !== 'object') return null;
|
|
1204
|
+
const files = _mergePathLists(pending.files || pending.paths || []);
|
|
1205
|
+
if (files.length === 0) return null;
|
|
1206
|
+
|
|
1207
|
+
return {
|
|
1208
|
+
iteration: typeof pending.iteration === 'number' ? pending.iteration : null,
|
|
1209
|
+
reason: pending.reason || 'blocked_handoff',
|
|
1210
|
+
task: pending.task || '',
|
|
1211
|
+
taskNumber: pending.taskNumber || '',
|
|
1212
|
+
taskDescription: pending.taskDescription || '',
|
|
1213
|
+
files,
|
|
1214
|
+
recordedAt: pending.recordedAt || new Date().toISOString(),
|
|
1215
|
+
};
|
|
1216
|
+
}
|
|
1217
|
+
|
|
1218
|
+
function _recordPendingDirtyPaths(existing, update) {
|
|
1219
|
+
const normalized = _normalizePendingDirtyPaths({
|
|
1220
|
+
iteration: update && typeof update.iteration === 'number' ? update.iteration : null,
|
|
1221
|
+
reason: update && update.reason ? update.reason : 'blocked_handoff',
|
|
1222
|
+
task: update && update.task ? update.task : '',
|
|
1223
|
+
taskNumber: update && update.taskNumber ? update.taskNumber : '',
|
|
1224
|
+
taskDescription: update && update.taskDescription ? update.taskDescription : '',
|
|
1225
|
+
files: _mergePathLists(
|
|
1226
|
+
existing && existing.files ? existing.files : [],
|
|
1227
|
+
update && update.files ? update.files : []
|
|
1228
|
+
),
|
|
1229
|
+
recordedAt: update && update.recordedAt ? update.recordedAt : new Date().toISOString(),
|
|
1230
|
+
});
|
|
1231
|
+
|
|
1232
|
+
return normalized;
|
|
1233
|
+
}
|
|
1234
|
+
|
|
1235
|
+
function _remainingPendingDirtyPathsAfterCommit(pending, anomaly) {
|
|
1236
|
+
const normalized = _normalizePendingDirtyPaths(pending);
|
|
1237
|
+
if (!normalized) return null;
|
|
1238
|
+
|
|
1239
|
+
const ignoredPaths = anomaly && Array.isArray(anomaly.ignoredPaths)
|
|
1240
|
+
? anomaly.ignoredPaths.map(_repoRelativePath).filter(Boolean)
|
|
1241
|
+
: [];
|
|
1242
|
+
if (ignoredPaths.length === 0) return null;
|
|
1243
|
+
|
|
1244
|
+
const ignoredSet = new Set(ignoredPaths);
|
|
1245
|
+
const files = normalized.files.filter((file) => ignoredSet.has(file));
|
|
1246
|
+
if (files.length === 0) return null;
|
|
1247
|
+
return Object.assign({}, normalized, { files });
|
|
1248
|
+
}
|
|
1249
|
+
|
|
1250
|
+
function _refreshPendingDirtyPaths(pending) {
|
|
1251
|
+
const normalized = _normalizePendingDirtyPaths(pending);
|
|
1252
|
+
if (!normalized) return null;
|
|
1253
|
+
|
|
1254
|
+
const dirtyPaths = _currentDirtyPathSet();
|
|
1255
|
+
if (!dirtyPaths) return normalized;
|
|
1256
|
+
const files = normalized.files.filter((file) => dirtyPaths.has(file));
|
|
1257
|
+
if (files.length === 0) return null;
|
|
1258
|
+
|
|
1259
|
+
return Object.assign({}, normalized, { files });
|
|
1260
|
+
}
|
|
1261
|
+
|
|
1262
|
+
function _samePendingTask(pending, currentTaskMeta, currentTask) {
|
|
1263
|
+
if (!pending) return true;
|
|
1264
|
+
const currentNumber = currentTaskMeta && currentTaskMeta.number ? currentTaskMeta.number : '';
|
|
1265
|
+
const currentDescription = currentTaskMeta && currentTaskMeta.description ? currentTaskMeta.description : '';
|
|
1266
|
+
const currentFull = currentTask || '';
|
|
1267
|
+
|
|
1268
|
+
if (pending.taskNumber && currentNumber) {
|
|
1269
|
+
return pending.taskNumber === currentNumber;
|
|
1270
|
+
}
|
|
1271
|
+
|
|
1272
|
+
if (pending.taskDescription && currentDescription) {
|
|
1273
|
+
return pending.taskDescription === currentDescription;
|
|
1274
|
+
}
|
|
1275
|
+
|
|
1276
|
+
return Boolean(pending.task && currentFull && pending.task === currentFull);
|
|
1277
|
+
}
+
+function _formatPendingDirtyPathsBlock(pending, currentTaskMeta, currentTask) {
+  const currentStamp = currentTaskMeta && currentTaskMeta.number
+    ? `${currentTaskMeta.number} ${currentTaskMeta.description || ''}`.trim()
+    : (currentTask || 'the current task');
+  const pendingStamp = pending.taskNumber
+    ? `${pending.taskNumber} ${pending.taskDescription || ''}`.trim()
+    : (pending.task || 'a prior blocked handoff');
+  const files = (pending.files || []).slice(0, 8);
+  const extra = (pending.files || []).length - files.length;
+  const fileLines = files.map((file) => ` - ${file}`).join('\n');
+  const suffix = extra > 0 ? `\n - (+${extra} more)` : '';
+
+  return [
+    `pending dirty paths from ${pending.reason || 'blocked_handoff'} iteration ${pending.iteration || 'unknown'} remain unresolved.`,
+    `Prior task: ${pendingStamp}`,
+    `Current task: ${currentStamp}`,
+    'Resolve the prior patch before Ralph can safely continue: commit it with the same task, revert it, or move it to a separate change.',
+    'Pending paths:',
+    `${fileLines}${suffix}`,
+  ].join('\n');
+}
+
+function _currentDirtyPathSet() {
+  try {
+    const output = childProcess.execFileSync('git', ['status', '--porcelain'], {
+      encoding: 'utf8',
+      stdio: ['pipe', 'pipe', 'pipe'],
+    });
+    const paths = new Set();
+    for (const line of output.split('\n')) {
+      for (const file of _parseGitStatusPaths(line)) {
+        if (file) paths.add(file);
+      }
+    }
+    return paths;
+  } catch (_) {
+    return null;
+  }
+}
+
+function _parseGitStatusPaths(line) {
+  if (!line || typeof line !== 'string') return [];
+  const rawPath = line.slice(3).trim();
+  if (!rawPath) return [];
+  if (rawPath.includes(' -> ')) {
+    return rawPath.split(' -> ').map(_stripGitStatusQuotes).filter(Boolean);
+  }
+  return [_stripGitStatusQuotes(rawPath)].filter(Boolean);
+}
+
+function _stripGitStatusQuotes(value) {
+  if (!value) return '';
+  const trimmed = value.trim();
+  if (!(trimmed.startsWith('"') && trimmed.endsWith('"'))) {
+    return trimmed;
+  }
+  return trimmed
+    .slice(1, -1)
+    .replace(/\\"/g, '"')
+    .replace(/\\\\/g, '\\');
+}
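To see how the two parsing helpers treat `git status --porcelain` lines, here is a small sketch that runs standalone copies of them over hand-written sample lines. The function names without the leading underscore and the sample paths are hypothetical; the bodies follow the hunk above.

```javascript
// Standalone copies of the parsing helpers, for illustration only.
function stripGitStatusQuotes(value) {
  if (!value) return '';
  const trimmed = value.trim();
  if (!(trimmed.startsWith('"') && trimmed.endsWith('"'))) return trimmed;
  // Only \" and \\ are unescaped; other C-style escapes pass through unchanged.
  return trimmed.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, '\\');
}

function parseGitStatusPaths(line) {
  if (!line || typeof line !== 'string') return [];
  // Porcelain v1 lines start with a two-character status and a space.
  const rawPath = line.slice(3).trim();
  if (!rawPath) return [];
  if (rawPath.includes(' -> ')) {
    // Renames report both the old and the new path.
    return rawPath.split(' -> ').map(stripGitStatusQuotes).filter(Boolean);
  }
  return [stripGitStatusQuotes(rawPath)].filter(Boolean);
}

console.log(parseGitStatusPaths(' M lib/runner.js'));       // [ 'lib/runner.js' ]
console.log(parseGitStatusPaths('R  old.js -> new.js'));    // [ 'old.js', 'new.js' ]
console.log(parseGitStatusPaths('?? "docs/my notes.md"'));  // [ 'docs/my notes.md' ]
```

One consequence worth keeping in mind: a rename contributes both paths to the dirty set, and quoted paths containing octal escapes (which git may emit for non-ASCII names) would not be decoded by this logic.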
+
 /**
  * Validate required options and throw descriptive errors.
  *
@@ -1163,6 +1586,19 @@ function _filterGitignored(paths, cwd) {
   }
 }
 
+function _mergePathLists(...lists) {
+  const merged = new Set();
+  for (const list of lists) {
+    for (const file of list || []) {
+      const relativeFile = _repoRelativePath(file);
+      if (relativeFile) {
+        merged.add(relativeFile);
+      }
+    }
+  }
+  return Array.from(merged);
+}
+
 /**
  * Build the explicit per-iteration git staging allowlist.
  *
@@ -1558,6 +1994,430 @@ function _buildIterationFeedback(recentHistory, errorEntries, blockerArtifacts)
   return sections.join('\n');
 }
 
+function _buildAutoResolveHandoffFeedback(recentHistory) {
+  if (!Array.isArray(recentHistory) || recentHistory.length === 0) return '';
+
+  const entry = recentHistory
+    .slice()
+    .reverse()
+    .find((item) => item && item.autoResolveHandoffAttempted === true);
+
+  if (!entry) return '';
+
+  const className = entry.autoResolveHandoffClass || 'unknown';
+  const lines = [
+    `The previous iteration emitted BLOCKED_HANDOFF, but auto-resolution is enabled and spent its bounded attempt for ${className}.`,
+    'You have exactly one continuation attempt for this task/blocker class. Do not broaden task scope, do not repair unrelated snapshots or UI behavior, and do not keep retrying if the evidence does not hold.',
+  ];
+
+  if (className === 'verifier_narrowing') {
+    lines.push(
+      'Allowed action: if the handoff explicitly names a focused verifier that passes and a broad verifier that fails only on unrelated/pre-existing failures, update only the current task verifier from the broad command to that focused command, run the focused command once, and complete the task only if it passes. If the focused command is absent, ambiguous, or fails, emit BLOCKED_HANDOFF instead of retrying.'
+    );
+  } else if (className === 'authorized_cleanup') {
+    const files = Array.isArray(entry.autoResolveHandoffAllowedFiles)
+      ? entry.autoResolveHandoffAllowedFiles.filter(Boolean)
+      : [];
+    lines.push(
+      `Allowed action: make one cleanup attempt only in the task-authorized file list${files.length > 0 ? ` (${files.join(', ')})` : ''}. If the gate still fails, emit BLOCKED_HANDOFF instead of continuing.`
+    );
+  } else {
+    lines.push(
+      'Allowed action: continue only if the blocker evidence remains explicit and within the runner-approved safe class; otherwise emit BLOCKED_HANDOFF.'
+    );
+  }
+
+  return lines.join('\n');
+}
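`_buildAutoResolveHandoffFeedback` keys off the most recent history entry whose `autoResolveHandoffAttempted` flag is `true`. The sketch below reproduces only that selection step on a hypothetical history array; the `iteration` field and all sample values are assumptions made for illustration, not fields this function is shown to read.

```javascript
// Hypothetical history entries; only the autoResolveHandoff* fields
// appear in the diff above, the rest is sample data.
const recentHistory = [
  { iteration: 3, autoResolveHandoffAttempted: true, autoResolveHandoffClass: 'verifier_narrowing' },
  { iteration: 4 },
  {
    iteration: 5,
    autoResolveHandoffAttempted: true,
    autoResolveHandoffClass: 'authorized_cleanup',
    autoResolveHandoffAllowedFiles: ['src/a.ts', '', 'src/b.ts'],
  },
  { iteration: 6 },
];

// Same selection as in the hunk: copy, reverse, take the first flagged entry.
const entry = recentHistory
  .slice()
  .reverse()
  .find((item) => item && item.autoResolveHandoffAttempted === true);

console.log(entry.iteration); // 5
console.log(entry.autoResolveHandoffClass); // authorized_cleanup
console.log(entry.autoResolveHandoffAllowedFiles.filter(Boolean).join(', ')); // src/a.ts, src/b.ts
```

The `.slice()` before `.reverse()` keeps the caller's array in its original order, since `reverse` mutates in place.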
+
+function _buildBaselineGateFeedback(ralphDir, tasksFile, currentTaskMeta, recentHistory) {
+  return _formatBaselineGateFeedback(
+    _analyzeBaselineGateConflict(ralphDir, tasksFile, currentTaskMeta, recentHistory)
+  );
+}
+
+function _analyzeBaselineGateConflict(ralphDir, tasksFile, currentTaskMeta, recentHistory) {
+  if (!ralphDir || !tasksFile || !currentTaskMeta || !currentTaskMeta.description) {
+    return null;
+  }
+
+  const taskBlock = _extractCurrentTaskBlock(tasksFile, currentTaskMeta);
+  if (!taskBlock) return null;
+
+  const strictGates = _detectStrictCleanGates(taskBlock);
+  if (strictGates.length === 0) return null;
+
+  const recordedBaselines = _detectRecordedBaselineGates(ralphDir);
+  const missingBaselines = _detectMissingBaselineGates(
+    strictGates,
+    recordedBaselines,
+    taskBlock,
+    tasksFile
+  );
+
+  if (missingBaselines.length > 0) {
+    return {
+      mode: 'missing_baseline',
+      conflicts: [],
+      missingBaselines,
+      allowedFiles: [],
+      budgetUsed: false,
+    };
+  }
+
+  const failingBaselines = recordedBaselines.filter((gate) => gate.exitCode !== 0);
+  if (failingBaselines.length === 0) return null;
+
+  const baselineByGate = new Map(failingBaselines.map((gate) => [gate.name, gate]));
+  const conflicts = strictGates
+    .map((gate) => ({ gate, baseline: baselineByGate.get(gate.name) }))
+    .filter((item) => item.baseline);
+
+  if (conflicts.length === 0) return null;
+
+  const cleanup = _detectAuthorizedBaselineCleanup(taskBlock);
+  if (cleanup.allowedFiles.length > 0) {
+    return {
+      mode: 'authorized_cleanup',
+      conflicts,
+      allowedFiles: cleanup.allowedFiles,
+      budgetUsed: _baselineGateRepairBudgetUsed(recentHistory, currentTaskMeta, cleanup.allowedFiles),
+    };
+  }
+
+  if (_taskExplicitlyHandlesBaselineFailures(taskBlock)) {
+    return {
+      mode: 'baseline_classification',
+      conflicts,
+      allowedFiles: [],
+      budgetUsed: false,
+    };
+  }
+
+  return {
+    mode: 'missing_policy',
+    conflicts,
+    allowedFiles: [],
+    budgetUsed: false,
+  };
+}
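The branch order in `_analyzeBaselineGateConflict` determines which of the four modes wins when several conditions hold at once. The following is a hypothetical condensation of that order as a pure function; `pickMode` and its parameter names are invented for this sketch, and the real function derives these inputs from `tasks.md` and the baselines directory rather than taking them as arguments.

```javascript
// Hypothetical condensation of the branch order; not part of the package.
function pickMode({
  missingBaselines = [],
  conflicts = [],
  cleanupFiles = [],
  handlesBaseline = false,
}) {
  // A missing baseline artifact is reported before anything else.
  if (missingBaselines.length > 0) return 'missing_baseline';
  // No strict gate overlaps a failing baseline: no feedback at all.
  if (conflicts.length === 0) return null;
  // Named cleanup files take priority over baseline classification.
  if (cleanupFiles.length > 0) return 'authorized_cleanup';
  if (handlesBaseline) return 'baseline_classification';
  return 'missing_policy';
}

console.log(pickMode({ missingBaselines: ['lint'] })); // missing_baseline
console.log(pickMode({ conflicts: ['typecheck'], cleanupFiles: ['src/a.ts'], handlesBaseline: true })); // authorized_cleanup
console.log(pickMode({ conflicts: ['typecheck'], handlesBaseline: true })); // baseline_classification
console.log(pickMode({ conflicts: ['typecheck'] })); // missing_policy
console.log(pickMode({})); // null
```

The sketch omits the early `null` returns for a missing task block or a task with no strict gates, which happen before any mode is considered.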
+
+function _formatBaselineGateFeedback(conflict) {
+  const conflicts = Array.isArray(conflict && conflict.conflicts) ? conflict.conflicts : [];
+  const missingBaselines = Array.isArray(conflict && conflict.missingBaselines)
+    ? conflict.missingBaselines
+    : [];
+
+  if (!conflict || (conflicts.length === 0 && missingBaselines.length === 0)) {
+    return '';
+  }
+
+  const conflictLines = conflicts.map(({ gate, baseline }) =>
+    `- ${gate.command}: baseline ${baseline.file} exits ${baseline.exitCode}.`
+  );
+  const missingLines = missingBaselines.map((gate) =>
+    `- ${gate.command}: no matching baseline artifact found under .ralph/baselines.`
+  );
+
+  if (conflict.mode === 'missing_baseline') {
+    return [
+      'The current task uses a strict clean quality gate and the task plan indicates a pre-flight baseline should exist, but the matching baseline artifact is missing.',
+      'Do not classify failures as pre-existing or spend an implementation iteration trying to satisfy an impossible task contract.',
+      'emit BLOCKED_HANDOFF and ask the operator to rerun or restore the pre-flight baseline artifact, or update the task spec to authorize a different gate policy.',
+      '',
+      ...missingLines,
+    ].join('\n');
+  }
+
+  if (conflict.mode === 'authorized_cleanup') {
+    if (conflict.budgetUsed) {
+      return [
+        'The current task explicitly authorized cleanup for baseline gate failures, but its one repair attempt has already been used.',
+        'Do not keep iterating on cleanup or broaden the edit scope.',
+        'If the gate is still failing, emit BLOCKED_HANDOFF with the remaining failing identifiers and ask for either a broader cleanup task or a task-spec change.',
+        '',
+        `Authorized cleanup files: ${conflict.allowedFiles.join(', ')}`,
+        ...conflictLines,
+      ].join('\n');
+    }
+
+    return [
+      'The current task explicitly authorizes cleanup for baseline gate failures in named files.',
+      'You have exactly one repair attempt for this task. Limit edits to compiler/lint-only fixes in the authorized files; do not change behavior or edit other files for this cleanup.',
+      'If this attempt does not clear the gate, emit BLOCKED_HANDOFF instead of continuing to retry.',
+      '',
+      `Authorized cleanup files: ${conflict.allowedFiles.join(', ')}`,
+      ...conflictLines,
+    ].join('\n');
+  }
+
+  if (conflict.mode === 'baseline_classification') {
+    return [
+      'The current task has strict quality-gate checks, and matching pre-flight baselines are already failing.',
+      'The task text appears to authorize baseline classification, so do not repair unrelated baseline failures unless the task explicitly names those files.',
+      'Complete the task only if the current run has no new failures beyond the named baseline failures.',
+      '',
+      ...conflictLines,
+    ].join('\n');
+  }
+
+  return [
+    'The current task requires a clean gate that already has a failing pre-flight baseline, but the task text does not say whether baseline-matching failures may be classified.',
+    'Do not spend iterations repairing unrelated files outside the current task scope.',
+    'If the only remaining gate failures match the baseline, emit BLOCKED_HANDOFF with a task-spec correction request: either allow baseline classification for this gate, or explicitly authorize the named out-of-scope repair.',
+    '',
+    ...conflictLines,
+  ].join('\n');
+}
+
+function _extractCurrentTaskBlock(tasksFile, currentTaskMeta) {
+  const fs = require('fs');
+  if (!tasksFile || !fs.existsSync(tasksFile)) return '';
+
+  const lines = fs.readFileSync(tasksFile, 'utf8').split(/\r?\n/);
+  const taskHeader = /^-\s+\[[ x/]\]\s+(.+)$/;
+  const targetNumber = currentTaskMeta.number || '';
+  const targetDescription = (currentTaskMeta.description || '').trim();
+  let start = -1;
+
+  for (let i = 0; i < lines.length; i++) {
+    const match = lines[i].match(taskHeader);
+    if (!match) continue;
+
+    const fullDescription = match[1].trim();
+    const numMatch = fullDescription.match(/^(\d+\.\d+)\s+(.+)$/);
+    const number = numMatch ? numMatch[1] : '';
+    const description = (numMatch ? numMatch[2] : fullDescription).trim();
+
+    if (
+      (targetNumber && number === targetNumber) ||
+      (!targetNumber && description === targetDescription) ||
+      (targetNumber && description === targetDescription)
+    ) {
+      start = i;
+      break;
+    }
+  }
+
+  if (start === -1) return '';
+
+  let end = lines.length;
+  for (let i = start + 1; i < lines.length; i++) {
+    if (taskHeader.test(lines[i])) {
+      end = i;
+      break;
+    }
+  }
+
+  return lines.slice(start, end).join('\n');
+}
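The block extraction in `_extractCurrentTaskBlock` can be tried on an in-memory string instead of a file. This sketch reuses the same `taskHeader` regular expression and matches by task number only; `extractBlock` and the sample `tasks.md` content are hypothetical.

```javascript
// Same header pattern as in the hunk: an unindented "- [ ]", "- [x]" or "- [/]" item.
const taskHeader = /^-\s+\[[ x/]\]\s+(.+)$/;

// Hypothetical tasks.md content.
const text = [
  '- [x] 1.1 Record pre-flight baselines',
  '  - Done when: baselines exist',
  '- [ ] 1.2 Add retry helper',
  '  - Done when: `pnpm typecheck` exits 0',
  '- [ ] 1.3 Wire helper into client',
].join('\n');

// Simplified variant that matches on the task number only.
function extractBlock(source, targetNumber) {
  const lines = source.split(/\r?\n/);
  const start = lines.findIndex((line) => {
    const match = line.match(taskHeader);
    if (!match) return false;
    const numMatch = match[1].trim().match(/^(\d+\.\d+)\s+(.+)$/);
    return Boolean(numMatch) && numMatch[1] === targetNumber;
  });
  if (start === -1) return '';

  // The block runs until the next top-level task header or end of file.
  let end = lines.length;
  for (let i = start + 1; i < lines.length; i++) {
    if (taskHeader.test(lines[i])) {
      end = i;
      break;
    }
  }
  return lines.slice(start, end).join('\n');
}

console.log(extractBlock(text, '1.2'));
```

Because the pattern is anchored at column zero, indented sub-bullets stay inside the block of the task above them, which is what lets the later gate detection read a task's `Done when` lines.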
+
+function _detectStrictCleanGates(taskBlock) {
+  if (!taskBlock) return [];
+
+  const gates = [
+    {
+      name: 'typecheck',
+      command: 'pnpm typecheck',
+      pattern: /`?pnpm\s+typecheck`?[^\n]*(?:exits?|returns?)\s+0/i,
+    },
+    {
+      name: 'lint',
+      command: 'pnpm lint',
+      pattern: /`?pnpm\s+lint`?[^\n]*(?:exits?|returns?)\s+0/i,
+    },
+    {
+      name: 'test',
+      command: 'pnpm test',
+      pattern: /`?pnpm\s+test`?[^\n]*(?:exits?|returns?)\s+0/i,
+    },
+  ];
+
+  return gates.filter((gate) => gate.pattern.test(taskBlock));
+}
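A quick way to see what counts as a "strict clean gate" is to run one of the patterns against a few task lines. The sample lines below are hypothetical; the regular expression is the `typecheck` pattern from the hunk above.

```javascript
// The typecheck pattern from _detectStrictCleanGates.
const pattern = /`?pnpm\s+typecheck`?[^\n]*(?:exits?|returns?)\s+0/i;

// Command and "exits 0" on the same line: detected.
console.log(pattern.test('- Done when: `pnpm typecheck` exits 0')); // true
// "returns 0" is accepted as well, even when more text follows.
console.log(pattern.test('- Done when: pnpm typecheck returns 0 errors')); // true
// The match cannot cross a line break.
console.log(pattern.test('- Done when: `pnpm typecheck`\n  exits 0')); // false
// Only literal pnpm invocations are recognized.
console.log(pattern.test('- Done when: `npm run typecheck` exits 0')); // false
```

As written, detection is tied to `pnpm typecheck`, `pnpm lint`, and `pnpm test`; tasks that phrase the gate through another package manager or a direct runner command would not be picked up by these patterns.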
+
+function _detectFailingBaselineGates(ralphDir) {
+  return _detectRecordedBaselineGates(ralphDir).filter((gate) => gate.exitCode !== 0);
+}
+
+function _detectRecordedBaselineGates(ralphDir) {
+  const fs = require('fs');
+  const fsPath = require('path');
+  const baselinesDir = fsPath.join(ralphDir, 'baselines');
+  if (!fs.existsSync(baselinesDir) || !fs.statSync(baselinesDir).isDirectory()) {
+    return [];
+  }
+
+  const gates = [];
+  for (const name of fs.readdirSync(baselinesDir)) {
+    if (!/\.txt$/i.test(name)) continue;
+
+    const gateName = _gateNameFromBaselineFile(name);
+    if (!gateName) continue;
+
+    const file = fsPath.join(baselinesDir, name);
+    const tail = _readFileTail(file, 16384);
+    const exitMatch = tail.match(/(?:^|\n)EXIT=(\d+)(?:\n|$)/);
+    if (!exitMatch) continue;
+
+    const exitCode = Number(exitMatch[1]);
+    if (!Number.isInteger(exitCode)) continue;
+
+    gates.push({ name: gateName, file: fsPath.join('baselines', name), exitCode });
+  }
+
+  const priority = { typecheck: 1, lint: 2, test: 3 };
+  return gates.sort((a, b) =>
+    (priority[a.name] || 99) - (priority[b.name] || 99) ||
+    a.file.localeCompare(b.file)
+  );
+}
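Baseline artifacts are recognized by an `EXIT=<code>` marker that sits on a line of its own within the last 16 KiB of the file. The sketch below applies the same regular expression to hypothetical log tails; `exitCodeFromTail` is an invented helper name.

```javascript
// The marker pattern from _detectRecordedBaselineGates.
const exitPattern = /(?:^|\n)EXIT=(\d+)(?:\n|$)/;

// Hypothetical helper: returns the recorded exit code, or null when no marker is found.
function exitCodeFromTail(tail) {
  const match = tail.match(exitPattern);
  return match ? Number(match[1]) : null;
}

console.log(exitCodeFromTail('src/a.ts(3,1): error TS2304\nEXIT=2\n')); // 2
console.log(exitCodeFromTail('all good\nEXIT=0')); // 0
// A marker embedded in the middle of a line is ignored.
console.log(exitCodeFromTail('log line with EXIT=1 inline\n')); // null
```

A baseline file without a standalone marker is skipped entirely, so it would later surface as a missing baseline instead of a passing one.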
+
+function _detectMissingBaselineGates(strictGates, recordedBaselines, taskBlock, tasksFile) {
+  if (!Array.isArray(strictGates) || strictGates.length === 0) return [];
+
+  const expectsBaseline =
+    _taskExplicitlyHandlesBaselineFailures(taskBlock) ||
+    _completedPreflightBaselineExists(tasksFile);
+
+  if (!expectsBaseline) return [];
+
+  const recordedNames = new Set((recordedBaselines || []).map((gate) => gate.name));
+  return strictGates.filter((gate) => !recordedNames.has(gate.name));
+}
+
+function _completedPreflightBaselineExists(tasksFile) {
+  const fs = require('fs');
+  if (!tasksFile || !fs.existsSync(tasksFile)) return false;
+
+  const lines = fs.readFileSync(tasksFile, 'utf8').split(/\r?\n/);
+  return lines.some((line) =>
+    /^-\s+\[x\]\s+.*\bpre-?flight\b.*\bbaselines?\b/i.test(line)
+  );
+}
+
+function _gateNameFromBaselineFile(fileName) {
+  const normalized = fileName.toLowerCase();
+  if (/(^|[-_.])typecheck([-_.]|\.|$)/.test(normalized)) return 'typecheck';
+  if (/(^|[-_.])lint([-_.]|\.|$)/.test(normalized)) return 'lint';
+  if (/(^|[-_.])test([-_.]|\.|$)/.test(normalized)) return 'test';
+  return '';
+}
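The filename-to-gate mapping is easiest to read through examples. This sketch runs a standalone copy of `_gateNameFromBaselineFile` over hypothetical file names; the copy's name and the sample names are illustrative only.

```javascript
// Standalone copy of the mapping in _gateNameFromBaselineFile.
function gateNameFromBaselineFile(fileName) {
  const normalized = fileName.toLowerCase();
  // The gate word must be delimited by start/end of name or one of "-", "_", ".".
  if (/(^|[-_.])typecheck([-_.]|\.|$)/.test(normalized)) return 'typecheck';
  if (/(^|[-_.])lint([-_.]|\.|$)/.test(normalized)) return 'lint';
  if (/(^|[-_.])test([-_.]|\.|$)/.test(normalized)) return 'test';
  return '';
}

console.log(gateNameFromBaselineFile('typecheck.txt'));           // typecheck
console.log(gateNameFromBaselineFile('preflight-LINT-2024.txt')); // lint
console.log(gateNameFromBaselineFile('unit_test.txt'));           // test
// "latest" contains "test" but not as a delimited word, so nothing matches.
console.log(gateNameFromBaselineFile('latest.txt') === '');       // true
```

The checks run in a fixed order, so a name containing more than one delimited gate word resolves to the first of `typecheck`, `lint`, `test`.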
+
+function _readFileTail(file, maxBytes) {
+  const fs = require('fs');
+  let fd = null;
+  try {
+    const stat = fs.statSync(file);
+    const length = Math.min(stat.size, maxBytes);
+    const offset = Math.max(0, stat.size - length);
+    const buffer = Buffer.alloc(length);
+    fd = fs.openSync(file, 'r');
+    fs.readSync(fd, buffer, 0, length, offset);
+    return buffer.toString('utf8');
+  } catch {
+    return '';
+  } finally {
+    if (fd !== null) {
+      try {
+        fs.closeSync(fd);
+      } catch {
+        // Ignore close failures while building best-effort feedback.
+      }
+    }
+  }
+}
+
+function _taskExplicitlyHandlesBaselineFailures(taskBlock) {
+  return /\bbaseline\b/i.test(taskBlock) &&
+    /\b(match|matches|matching|classif(?:y|ied|ication)|pre-existing|preexisting|no new failures?)\b/i.test(taskBlock);
+}
+
+function _detectAuthorizedBaselineCleanup(taskBlock) {
+  if (!taskBlock || !/\b(authori[sz]ed cleanup|after fixing|fixing the named baseline failures?)\b/i.test(taskBlock)) {
+    return { allowedFiles: [] };
+  }
+
+  const allowedFiles = [];
+  const seen = new Set();
+  const backtickPattern = /`([^`]+)`/g;
+  let match;
+
+  while ((match = backtickPattern.exec(taskBlock)) !== null) {
+    const candidate = match[1].trim();
+    if (!_looksLikeCleanupPath(candidate)) continue;
+
+    const normalized = candidate.replace(/\\/g, '/');
+    if (seen.has(normalized)) continue;
+
+    seen.add(normalized);
+    allowedFiles.push(normalized);
+  }
+
+  return { allowedFiles };
+}
+
+function _looksLikeCleanupPath(value) {
+  if (!value || /\s/.test(value)) return false;
+  if (/^(pnpm|npm|yarn|node|gtimeout|timeout|rg|git)(\s|$)/i.test(value)) return false;
+  if (/^--?/.test(value)) return false;
+  if (/[*{}]/.test(value)) return false;
+  return value.includes('/') || /\.[A-Za-z0-9]+$/.test(value);
+}
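`_detectAuthorizedBaselineCleanup` collects every backticked token in the task block and keeps those that `_looksLikeCleanupPath` accepts. The sketch below exercises a standalone copy of that predicate on hypothetical tokens to show which ones would end up in the allowed-file list.

```javascript
// Standalone copy of the predicate in _looksLikeCleanupPath.
function looksLikeCleanupPath(value) {
  if (!value || /\s/.test(value)) return false;                        // commands with arguments
  if (/^(pnpm|npm|yarn|node|gtimeout|timeout|rg|git)(\s|$)/i.test(value)) return false; // bare tool names
  if (/^--?/.test(value)) return false;                                // CLI flags
  if (/[*{}]/.test(value)) return false;                               // globs and brace patterns
  return value.includes('/') || /\.[A-Za-z0-9]+$/.test(value);         // has a directory or an extension
}

// Hypothetical backticked tokens from a task block.
const tokens = ['src/api/client.ts', 'tsconfig.json', 'pnpm lint', 'pnpm', '--fix', 'src/**/*.ts', 'README'];
for (const token of tokens) {
  console.log(`${token} -> ${looksLikeCleanupPath(token)}`);
}
// src/api/client.ts -> true
// tsconfig.json -> true
// pnpm lint -> false
// pnpm -> false
// --fix -> false
// src/**/*.ts -> false
// README -> false
```

Since globs are rejected, a task that authorizes cleanup only through a wildcard would produce an empty list and fall through to the classification or missing-policy branch.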
+
+function _baselineGateRepairBudgetUsed(recentHistory, currentTaskMeta, allowedFiles) {
+  if (!Array.isArray(recentHistory) || recentHistory.length === 0) return false;
+
+  return recentHistory.some((entry) => {
+    if (!_historyEntryMatchesTask(entry, currentTaskMeta)) return false;
+    if (entry.baselineGateRepairAttempted === true) return true;
+
+    return _baselineGateRepairAttempted(
+      { mode: 'authorized_cleanup', allowedFiles },
+      entry.filesChanged || []
+    );
+  });
+}
+
+function _baselineGateRepairAttempted(conflict, filesChanged) {
+  if (
+    !conflict ||
+    conflict.mode !== 'authorized_cleanup' ||
+    !Array.isArray(conflict.allowedFiles) ||
+    conflict.allowedFiles.length === 0 ||
+    !Array.isArray(filesChanged) ||
+    filesChanged.length === 0
+  ) {
+    return false;
+  }
+
+  return _pathsIntersect(conflict.allowedFiles, filesChanged);
+}
+
+function _historyEntryMatchesTask(entry, currentTaskMeta) {
+  if (!entry || !currentTaskMeta) return false;
+
+  const currentNumber = currentTaskMeta.number || '';
+  const currentDescription = currentTaskMeta.description || '';
+
+  if (currentNumber && entry.taskNumber === currentNumber) return true;
+  if (!currentNumber && currentDescription && entry.taskDescription === currentDescription) return true;
+
+  return false;
+}
+
+function _pathsIntersect(left, right) {
+  const normalizedLeft = new Set((left || []).map(_normalizeComparablePath));
+  return (right || []).some((pathValue) => normalizedLeft.has(_normalizeComparablePath(pathValue)));
+}
+
+function _normalizeComparablePath(pathValue) {
+  return String(pathValue || '')
+    .replace(/\\/g, '/')
+    .replace(/^\.\//, '')
+    .replace(/\/+$/, '');
+}
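The one-repair-attempt budget hinges on whether any previously changed file overlaps the authorized cleanup list after normalization. A minimal sketch of that comparison, using standalone copies of the two path helpers and hypothetical paths:

```javascript
// Standalone copies of _normalizeComparablePath and _pathsIntersect.
function normalizeComparablePath(pathValue) {
  return String(pathValue || '')
    .replace(/\\/g, '/')   // Windows separators become forward slashes
    .replace(/^\.\//, '')  // a leading "./" is dropped
    .replace(/\/+$/, '');  // trailing slashes are dropped
}

function pathsIntersect(left, right) {
  const normalizedLeft = new Set((left || []).map((p) => normalizeComparablePath(p)));
  return (right || []).some((p) => normalizedLeft.has(normalizeComparablePath(p)));
}

console.log(normalizeComparablePath('./src\\a.ts'));             // src/a.ts
console.log(pathsIntersect(['src/a.ts'], ['./src\\a.ts']));      // true
// The comparison stays case-sensitive.
console.log(pathsIntersect(['src/a.ts'], ['src/A.ts']));         // false
console.log(pathsIntersect(['src/a.ts'], []));                   // false
```

Under this logic, a history entry for the same task that touched any authorized file is treated as having spent the repair attempt, even if the explicit `baselineGateRepairAttempted` flag was never recorded.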
+
 function _extractErrorForIteration(errorEntries, iteration) {
   if (!Array.isArray(errorEntries) || errorEntries.length === 0) return null;
 
@@ -1769,12 +2629,36 @@ module.exports = {
   _validateOptions,
   _autoCommit,
   _buildAutoCommitAllowlist,
+  _mergePathLists,
+  _normalizePendingDirtyPaths,
+  _recordPendingDirtyPaths,
+  _remainingPendingDirtyPathsAfterCommit,
+  _refreshPendingDirtyPaths,
+  _samePendingTask,
+  _currentDirtyPathSet,
   _filterGitignored,
   _resolveStartIteration,
   _completedTaskDelta,
   _formatAutoCommitMessage,
   _truncateSubjectSummary,
   _buildIterationFeedback,
+  _buildAutoResolveHandoffFeedback,
+  _resolveAutoResolveHandoffConfig,
+  _handoffHasFocusedVerifierEvidence,
+  _classifyAutoResolvableHandoff,
+  _decideAutoResolveHandoff,
+  _consumeAutoResolveHandoffBudget,
+  _buildBaselineGateFeedback,
+  _analyzeBaselineGateConflict,
+  _formatBaselineGateFeedback,
+  _extractCurrentTaskBlock,
+  _detectStrictCleanGates,
+  _detectFailingBaselineGates,
+  _detectRecordedBaselineGates,
+  _detectMissingBaselineGates,
+  _detectAuthorizedBaselineCleanup,
+  _baselineGateRepairBudgetUsed,
+  _baselineGateRepairAttempted,
   _extractErrorForIteration,
   _getCurrentTaskDescription,
   _getCurrentTaskMeta,
package/lib/mini-ralph/status.js
CHANGED
@@ -60,6 +60,31 @@ function render(ralphDir, tasksFile) {
     lines.push(`Exit reason: ${loopState.exitReason}`);
   }
 
+  const pendingDirtyPaths = _pendingDirtyPaths(loopState);
+  if (pendingDirtyPaths) {
+    lines.push('');
+    lines.push('--- Pending Dirty Paths ---');
+    lines.push(` Reason: ${pendingDirtyPaths.reason || 'blocked_handoff'}`);
+    if (pendingDirtyPaths.iteration) {
+      lines.push(` From iteration: ${pendingDirtyPaths.iteration}`);
+    }
+    const task = pendingDirtyPaths.taskNumber
+      ? `${pendingDirtyPaths.taskNumber} ${pendingDirtyPaths.taskDescription || ''}`.trim()
+      : (pendingDirtyPaths.task || '');
+    if (task) {
+      lines.push(` Prior task: ${task}`);
+    }
+    const files = pendingDirtyPaths.files.slice(0, 10);
+    for (const file of files) {
+      lines.push(` - ${file}`);
+    }
+    if (pendingDirtyPaths.files.length > files.length) {
+      lines.push(` - (+${pendingDirtyPaths.files.length - files.length} more)`);
+    }
+    lines.push(' Resolve before continuing: commit with the same task, revert, or move to a separate change.');
+    lines.push('-'.repeat(50));
+  }
+
   const latestCommitAnomaly = _latestCommitAnomaly(history.recent(ralphDir, 20));
   if (latestCommitAnomaly) {
     lines.push(`Commit issue: ${latestCommitAnomaly.commitAnomaly}`);
@@ -186,6 +211,16 @@ function _promptSummary(loopState) {
   return '';
 }
 
+function _pendingDirtyPaths(loopState) {
+  const pending = loopState && loopState.pendingDirtyPaths;
+  if (!pending || typeof pending !== 'object') return null;
+  const files = Array.isArray(pending.files)
+    ? pending.files.filter((file) => typeof file === 'string' && file.trim())
+    : [];
+  if (files.length === 0) return null;
+  return Object.assign({}, pending, { files });
+}
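The status view only renders the "Pending Dirty Paths" section when `_pendingDirtyPaths` returns a record with at least one usable file. The sketch below runs a standalone copy of that sanitizer over a hypothetical loop state; the field values are made up.

```javascript
// Standalone copy of the sanitizer in status.js.
function pendingDirtyPaths(loopState) {
  const pending = loopState && loopState.pendingDirtyPaths;
  if (!pending || typeof pending !== 'object') return null;
  // Keep only non-blank string entries.
  const files = Array.isArray(pending.files)
    ? pending.files.filter((file) => typeof file === 'string' && file.trim())
    : [];
  if (files.length === 0) return null;
  // Return a shallow copy so the stored loop state is left untouched.
  return Object.assign({}, pending, { files });
}

// Hypothetical loop state with some malformed file entries.
const loopState = {
  pendingDirtyPaths: {
    reason: 'blocked_handoff',
    iteration: 7,
    taskNumber: '2.1',
    files: ['src/a.ts', '', 42, '   '],
  },
};

console.log(pendingDirtyPaths(loopState).files);                    // [ 'src/a.ts' ]
console.log(pendingDirtyPaths({ pendingDirtyPaths: { files: [] } })); // null
console.log(loopState.pendingDirtyPaths.files.length);              // 4
```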
+
 /**
  * Try to find a tasks file path from loop state.
  *
package/package.json
CHANGED
package/scripts/mini-ralph-cli.js
CHANGED
@@ -27,6 +27,10 @@
  *                               Loop exits cleanly with `blocked_handoff` when the
  *                               agent emits this tag and writes the agent's note
  *                               to <ralph-dir>/HANDOFF.md.
+ *   --auto-resolve-handoffs     Enable bounded continuation attempts for
+ *                               explicit, safe BLOCKED_HANDOFF classes
+ *   --no-auto-resolve-handoffs
+ *                               Disable auto-resolution even when enabled by env
  *   --no-commit                 Suppress auto-commit
  *   --model <name>              Optional model override
  *   --verbose                   Verbose output
@@ -44,6 +48,11 @@ const miniRalph = require('../lib/mini-ralph/index');
 // Argument parsing
 // ---------------------------------------------------------------------------
 
+function _envFlagDefaultEnabled(value) {
+  if (value === undefined) return true;
+  return !/^(0|false|no|off)$/i.test(String(value || '').trim());
+}
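Because `_envFlagDefaultEnabled` supplies the default for `autoResolveHandoffs`, it is worth spelling out how `RALPH_AUTO_RESOLVE_HANDOFFS` values are interpreted. The sketch below uses a standalone copy of the helper with hypothetical inputs.

```javascript
// Standalone copy of _envFlagDefaultEnabled.
function envFlagDefaultEnabled(value) {
  // An unset variable means the feature is on.
  if (value === undefined) return true;
  // Only these four words (any case, surrounding whitespace ignored) turn it off.
  return !/^(0|false|no|off)$/i.test(String(value || '').trim());
}

console.log(envFlagDefaultEnabled(undefined)); // true
console.log(envFlagDefaultEnabled('0'));       // false
console.log(envFlagDefaultEnabled(' OFF '));   // false
console.log(envFlagDefaultEnabled('yes'));     // true
// An empty string does not disable the feature.
console.log(envFlagDefaultEnabled(''));        // true
```

In `parseArgs`, this value is only the starting point: an explicit `--auto-resolve-handoffs` or `--no-auto-resolve-handoffs` on the command line overwrites it.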
+
 function parseArgs(argv) {
   const args = argv.slice(2);
   const opts = {
@@ -59,6 +68,7 @@ function parseArgs(argv) {
     completionPromise: 'COMPLETE',
     taskPromise: 'READY_FOR_NEXT_TASK',
     blockedHandoffPromise: 'BLOCKED_HANDOFF',
+    autoResolveHandoffs: _envFlagDefaultEnabled(process.env.RALPH_AUTO_RESOLVE_HANDOFFS),
     noCommit: false,
     model: '',
     verbose: false,
@@ -110,6 +120,12 @@ function parseArgs(argv) {
       case '--blocked-handoff-promise':
         opts.blockedHandoffPromise = args[++i];
         break;
+      case '--auto-resolve-handoffs':
+        opts.autoResolveHandoffs = true;
+        break;
+      case '--no-auto-resolve-handoffs':
+        opts.autoResolveHandoffs = false;
+        break;
       case '--no-commit':
         opts.noCommit = true;
         break;
@@ -165,6 +181,8 @@ Options:
   --task-promise <s>           Task promise string
   --blocked-handoff-promise <s>
                                Blocked-handoff promise string (default: BLOCKED_HANDOFF)
+  --auto-resolve-handoffs      Enable bounded continuation for explicit safe handoffs
+  --no-auto-resolve-handoffs   Disable bounded continuation for explicit safe handoffs
   --no-commit                  Suppress auto-commit
   --model <name>               Model override
   --verbose                    Verbose output
@@ -224,6 +242,7 @@ async function main() {
       completionPromise: opts.completionPromise,
       taskPromise: opts.taskPromise,
       blockedHandoffPromise: opts.blockedHandoffPromise,
+      autoResolveHandoffs: opts.autoResolveHandoffs,
       noCommit: opts.noCommit,
       model: opts.model,
       verbose: opts.verbose,
@@ -251,4 +270,11 @@ async function main() {
   }
 }
 
-main
+if (require.main === module) {
+  main();
+}
+
+module.exports = {
+  _envFlagDefaultEnabled,
+  _parseArgs: parseArgs,
+};
package/scripts/ralph-run.sh
CHANGED
@@ -129,6 +129,7 @@ resolve_ralph_command() {
 CHANGE_NAME=""
 MAX_ITERATIONS=""
 NO_COMMIT=false
+AUTO_RESOLVE_HANDOFFS=""
 SHOW_STATUS=false
 SHOW_VERSION=false
 ADD_CONTEXT=""
@@ -186,6 +187,9 @@ OPTIONS:
     --change <name>         Specify the OpenSpec change to execute (default: auto-detect)
     --max-iterations <n>    Maximum iterations for Ralph loop (default: 50)
     --no-commit             Suppress automatic git commits during the loop
+    --auto-resolve-handoffs Enable bounded continuation for explicit safe handoffs
+    --no-auto-resolve-handoffs
+                            Disable bounded continuation for explicit safe handoffs
     --verbose, -v           Enable verbose mode for debugging
     --quiet                 Suppress the per-iteration progress stream
     --version               Print the version and exit
@@ -232,6 +236,14 @@ parse_arguments() {
                 NO_COMMIT=true
                 shift
                 ;;
+            --auto-resolve-handoffs)
+                AUTO_RESOLVE_HANDOFFS=true
+                shift
+                ;;
+            --no-auto-resolve-handoffs)
+                AUTO_RESOLVE_HANDOFFS=false
+                shift
+                ;;
             --verbose|-v)
                 VERBOSE=true
                 shift
@@ -1006,6 +1018,12 @@ Do not create git commits yourself. The Ralph runner manages automatic task comm
         mini_ralph_args+=("--no-commit")
     fi
 
+    if [[ "$AUTO_RESOLVE_HANDOFFS" == true ]]; then
+        mini_ralph_args+=("--auto-resolve-handoffs")
+    elif [[ "$AUTO_RESOLVE_HANDOFFS" == false ]]; then
+        mini_ralph_args+=("--no-auto-resolve-handoffs")
+    fi
+
     if [[ "$VERBOSE" == true ]]; then
         mini_ralph_args+=("--verbose")
     fi
@@ -1182,6 +1200,7 @@ rules:
   tasks:
     - Use the task template from OPENSPEC-RALPH-BP.md
     - Each task has one dominant outcome and one verification cluster
+    - Use surgical, scope-targeted validation commands; reserve broad gates for pre-flight baselines or final integration tasks
     - Include explicit stop-and-hand-off conditions
   design:
     - Do not leave core policy choices unresolved
@@ -1203,6 +1222,7 @@ Before generating any OpenSpec artifacts, you MUST:
 - Read `openspec/OPENSPEC-RALPH-BP.md` (Ralph Wiggum authoring guide)
 - Verify proposals against the Ralph authoring checklist
 - Ensure tasks use the task template with objective done-when conditions
+- Ensure each task uses the narrowest verifier that proves its scope; use broad gates only with baseline classification or final integration tasks
 - Include explicit stop-and-hand-off conditions in every task
 RALPH_AGENTS
         log_verbose "Updated $agents_file with Ralph Wiggum compliance section"
@@ -1311,7 +1331,7 @@ WARNING_BOX
     fi
     local ralph_guidance=""
     if [[ -f "$bp_file" ]]; then
-        ralph_guidance=" When creating artifacts, read ${bp_file} and follow the Ralph Wiggum task template and authoring checklist. Ensure the proposal includes explicit scope, non-goals, first-rollout boundaries, and capabilities that map to Ralph-friendly tasks. Ensure tasks use the task template with objective done-when conditions and explicit stop-and-hand-off conditions. Do NOT restore or copy from any .bak backup files - write fresh artifacts from scratch."
+        ralph_guidance=" When creating artifacts, read ${bp_file} and follow the Ralph Wiggum task template and authoring checklist. Ensure the proposal includes explicit scope, non-goals, first-rollout boundaries, and capabilities that map to Ralph-friendly tasks. Ensure tasks use the task template with objective done-when conditions, surgical scope-targeted verifier commands, and explicit stop-and-hand-off conditions. Prefer direct test-file or validator commands over full-suite commands; reserve broad gates for pre-flight baselines or final integration tasks. Do NOT restore or copy from any .bak backup files - write fresh artifacts from scratch."
     fi
 
     log_info "Invoking opencode to regenerate proposal and tasks with Ralph Wiggum best practices..."
|