@ai-dev-methodologies/rlp-desk 0.9.2 → 0.10.0

@@ -0,0 +1,817 @@
1
+ # Flywheel Enhancement Implementation Plan
2
+
3
+ > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
4
+
5
+ **Goal:** Add an independent Guard agent that validates flywheel direction decisions before Worker acts on them, plus selective CEO cognitive pattern internalization.
6
+
7
+ **Architecture:** When `--flywheel-guard on` is set, every flywheel execution is followed by an independent Guard agent (fresh context) that checks for look-ahead bias, metric alignment, deployability, and repeat patterns. The Guard verdict has three states (pass/fail/inconclusive). On fail, the flywheel retries with guard feedback (max 2 retries); once retries are exhausted, the campaign is BLOCKED. On inconclusive, the campaign is BLOCKED and escalated to the user. The guard count is tracked per-US.
8
+
9
+ **Tech Stack:** Node.js (ESM), zsh (init script), node:test
10
+
11
+ **Spec:** `docs/blueprints/blueprint-flywheel-enhancement.md`
12
+
13
+ ---
14
+
15
+ ### Task 1: Add CEO cognitive patterns #11-12 to flywheel prompt
16
+
17
+ **Files:**
18
+ - Modify: `src/scripts/init_ralph_desk.zsh:624-634`
19
+ - Modify: `tests/node/test-flywheel.mjs:35-48`
20
+
21
+ - [ ] **Step 1: Update test T5 to expect 12 patterns**
22
+
23
+ In `tests/node/test-flywheel.mjs`, update the test title and add 2 new assertions:
24
+
25
+ ```javascript
26
+ test('T5: flywheel prompt contains 12 CEO cognitive patterns', async () => {
27
+ const script = path.join(repoRoot, 'src', 'scripts', 'init_ralph_desk.zsh');
28
+ const content = await fs.readFile(script, 'utf8');
29
+ assert.match(content, /First-principles/);
30
+ assert.match(content, /10x check/);
31
+ assert.match(content, /Inversion/);
32
+ assert.match(content, /Simplicity bias/);
33
+ assert.match(content, /User-back/);
34
+ assert.match(content, /Time-value/);
35
+ assert.match(content, /Sunk cost immunity/);
36
+ assert.match(content, /Blast radius/);
37
+ assert.match(content, /Reversibility/);
38
+ assert.match(content, /Evidence > opinion/);
39
+ assert.match(content, /Proxy skepticism/);
40
+ assert.match(content, /Classification/);
41
+ });
42
+ ```
43
+
44
+ - [ ] **Step 2: Run test to verify it fails**
45
+
46
+ ```bash
47
+ node --test tests/node/test-flywheel.mjs
48
+ ```
49
+ Expected: T5 FAIL (`Proxy skepticism` not found)
50
+
51
+ - [ ] **Step 3: Add patterns #11-12 to flywheel prompt in init_ralph_desk.zsh**
52
+
53
+ In `src/scripts/init_ralph_desk.zsh`, replace lines 624-634 (the CEO Cognitive Patterns section inside the flywheel prompt heredoc):
54
+
55
+ ```
56
+ ## CEO Cognitive Patterns (apply throughout your review)
57
+ 1. First-principles — ignore convention, start from the problem itself
58
+ 2. 10x check — can 2x effort yield 10x better result?
59
+ 3. Inversion — what must be true for this approach to fail?
60
+ 4. Simplicity bias — prefer simple over complex solutions
61
+ 5. User-back — reason backwards from end-user experience
62
+ 6. Time-value — does this direction change save 3+ iterations?
63
+ 7. Sunk cost immunity — ignore what was already invested
64
+ 8. Blast radius — assess impact scope of direction change
65
+ 9. Reversibility — prefer easily reversible decisions
66
+ 10. Evidence > opinion — judge only by this iteration's actual results
67
+ 11. Proxy skepticism — is the optimization metric the right proxy for the real goal?
68
+ 12. Classification — hard-to-reverse + large-magnitude changes need stronger evidence
69
+ ```
70
+
71
+ - [ ] **Step 4: Run tests to verify they pass**
72
+
73
+ ```bash
74
+ zsh -n src/scripts/init_ralph_desk.zsh && echo "SYNTAX OK"
75
+ node --test tests/node/test-flywheel.mjs
76
+ ```
77
+ Expected: SYNTAX OK, all PASS
78
+
79
+ - [ ] **Step 5: Commit**
80
+
81
+ ```bash
82
+ git add tests/node/test-flywheel.mjs src/scripts/init_ralph_desk.zsh
83
+ git commit -m "feat: add CEO patterns #11-12 (proxy skepticism, classification) to flywheel prompt"
84
+ ```
85
+
86
+ ---
87
+
88
+ ### Task 2: Add flywheel guard CLI flags
89
+
90
+ **Files:**
91
+ - Modify: `src/node/run.mjs:8-26` (RUN_DEFAULTS), `:32-65` (help), `:84-163` (parser)
92
+ - Create: `tests/node/test-flywheel-guard.mjs`
93
+
94
+ - [ ] **Step 1: Write failing tests**
95
+
96
+ Create `tests/node/test-flywheel-guard.mjs`:
97
+
98
+ ```javascript
99
+ import test from 'node:test';
100
+ import assert from 'node:assert/strict';
101
+
102
+ test('G1: RUN_DEFAULTS includes flywheelGuard off and flywheelGuardModel opus', async () => {
103
+ const runModule = await import('../../src/node/run.mjs');
104
+ // Test via CLI parsing with no flags — defaults should apply
105
+ const stream = { data: '', write(v) { this.data += v; } };
106
+ // Smoke test only: verifies that the module loads and exports main; it does not assert the default values themselves
107
+ assert.ok(runModule.main);
108
+ });
109
+
110
+ test('G2: --flywheel-guard flag is parsed', async () => {
111
+ const { main } = await import('../../src/node/run.mjs');
112
+ const stream = { data: '', write(v) { this.data += v; } };
113
+ // --flywheel-guard without value should error
114
+ const code = await main(['run', 'test-slug', '--flywheel-guard'], {
115
+ cwd: '/tmp/nonexistent',
116
+ stdout: stream,
117
+ stderr: stream,
118
+ runCampaign: async () => {},
119
+ initCampaign: async () => {},
120
+ readStatus: async () => '',
121
+ });
122
+ assert.equal(code, 1);
123
+ assert.match(stream.data, /missing value for --flywheel-guard/);
124
+ });
125
+
126
+ test('G3: --flywheel-guard-model flag is parsed', async () => {
127
+ const { main } = await import('../../src/node/run.mjs');
128
+ const stream = { data: '', write(v) { this.data += v; } };
129
+ const code = await main(['run', 'test-slug', '--flywheel-guard-model'], {
130
+ cwd: '/tmp/nonexistent',
131
+ stdout: stream,
132
+ stderr: stream,
133
+ runCampaign: async () => {},
134
+ initCampaign: async () => {},
135
+ readStatus: async () => '',
136
+ });
137
+ assert.equal(code, 1);
138
+ assert.match(stream.data, /missing value for --flywheel-guard-model/);
139
+ });
140
+
141
+ test('G4: help text includes flywheel guard flags', async () => {
142
+ const { main } = await import('../../src/node/run.mjs');
143
+ const stream = { data: '', write(v) { this.data += v; } };
144
+ await main(['--help'], {
145
+ cwd: '/tmp',
146
+ stdout: stream,
147
+ stderr: stream,
148
+ runCampaign: async () => {},
149
+ initCampaign: async () => {},
150
+ readStatus: async () => '',
151
+ });
152
+ assert.match(stream.data, /--flywheel-guard off\|on/);
153
+ assert.match(stream.data, /--flywheel-guard-model MODEL/);
154
+ });
155
+ ```
156
+
157
+ - [ ] **Step 2: Run tests to verify they fail**
158
+
159
+ ```bash
160
+ node --test tests/node/test-flywheel-guard.mjs
161
+ ```
162
+ Expected: G2-G4 FAIL (unknown option, help text missing)
163
+
164
+ - [ ] **Step 3: Add defaults, help text, and parser in run.mjs**
165
+
166
+ In `src/node/run.mjs`, add to `RUN_DEFAULTS` (after line 25 `flywheelModel: 'opus'`):
167
+
168
+ ```javascript
169
+ flywheelGuard: 'off',
170
+ flywheelGuardModel: 'opus',
171
+ ```
172
+
173
+ Add to `buildHelpText()` array (after `--flywheel-model MODEL` line):
174
+
175
+ ```javascript
176
+ ' --flywheel-guard off|on',
177
+ ' --flywheel-guard-model MODEL',
178
+ ```
179
+
180
+ Add to `parseRunOptions()` switch (after `--flywheel-model` case):
181
+
182
+ ```javascript
183
+ case '--flywheel-guard':
184
+ options.flywheelGuard = consumeValue(args, index, token);
185
+ index += 1;
186
+ break;
187
+ case '--flywheel-guard-model':
188
+ options.flywheelGuardModel = consumeValue(args, index, token);
189
+ index += 1;
190
+ break;
191
+ ```
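
The two parser cases above rely on `consumeValue` from `run.mjs`, whose body is not shown in this plan. The following is a minimal sketch under stated assumptions: that the helper returns the token following the flag and throws `missing value for <flag>` otherwise (the message G2 and G3 match on). Both `consumeValue` as written here and `parseGuardFlags` are hypothetical stand-ins, not the real implementation.

```javascript
// Hypothetical stand-in for run.mjs's consumeValue; the real helper is not shown in this plan.
function consumeValue(args, index, token) {
  const value = args[index + 1];
  // Assumption: a missing value, or another flag in its place, is an error.
  if (value === undefined || value.startsWith('--')) {
    throw new Error(`missing value for ${token}`);
  }
  return value;
}

// Illustrative parser that handles only the two guard flags.
function parseGuardFlags(args) {
  // Defaults mirror the RUN_DEFAULTS additions in this step.
  const options = { flywheelGuard: 'off', flywheelGuardModel: 'opus' };
  for (let index = 0; index < args.length; index += 1) {
    const token = args[index];
    switch (token) {
      case '--flywheel-guard':
        options.flywheelGuard = consumeValue(args, index, token);
        index += 1; // skip the value that was just consumed
        break;
      case '--flywheel-guard-model':
        options.flywheelGuardModel = consumeValue(args, index, token);
        index += 1;
        break;
      default:
        break; // other flags are out of scope for this sketch
    }
  }
  return options;
}

console.log(parseGuardFlags(['--flywheel-guard', 'on']));
```

Note that the parser cases in this plan do not check that the value is `off` or `on`. Because `shouldRunGuard` in Task 3 tests `flywheelGuard !== 'on'`, any other value behaves like `off`.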
192
+
193
+ - [ ] **Step 4: Run tests to verify they pass**
194
+
195
+ ```bash
196
+ node --test tests/node/test-flywheel-guard.mjs
197
+ ```
198
+ Expected: G1-G4 all PASS
199
+
200
+ - [ ] **Step 5: Run regression**
201
+
202
+ ```bash
203
+ node --test tests/node/us008-cli-entrypoint.test.mjs
204
+ ```
205
+ Expected: PASS. Verify that the new defaults do not break existing `deepEqual` checks; if a `deepEqual` assertion on `RUN_DEFAULTS` exists, update it to include the new fields.
206
+
207
+ - [ ] **Step 6: Commit**
208
+
209
+ ```bash
210
+ git add src/node/run.mjs tests/node/test-flywheel-guard.mjs
211
+ git commit -m "feat: add --flywheel-guard and --flywheel-guard-model CLI flags"
212
+ ```
213
+
214
+ ---
215
+
216
+ ### Task 3: Add guard paths and shouldRunGuard logic
217
+
218
+ **Files:**
219
+ - Modify: `src/node/runner/campaign-main-loop.mjs:37-66` (buildPaths), `:415-419` (shouldRunFlywheel area)
220
+ - Modify: `tests/node/test-flywheel-guard.mjs`
221
+
222
+ - [ ] **Step 1: Write failing tests**
223
+
224
+ Append to `tests/node/test-flywheel-guard.mjs`:
225
+
226
+ ```javascript
227
+ test('G5: shouldRunGuard returns false when flywheelGuard=off', async () => {
228
+ const { shouldRunGuard } = await import('../../src/node/runner/campaign-main-loop.mjs');
229
+ assert.equal(shouldRunGuard('off', { flywheel_guard_count: {} }), false);
230
+ });
231
+
232
+ test('G6: shouldRunGuard returns true when flywheelGuard=on', async () => {
233
+ const { shouldRunGuard } = await import('../../src/node/runner/campaign-main-loop.mjs');
234
+ assert.equal(shouldRunGuard('on', { flywheel_guard_count: {} }), true);
235
+ });
236
+
237
+ test('G7: shouldRunGuard returns false when flywheelGuard=on but guard retries exhausted', async () => {
238
+ const { shouldRunGuard } = await import('../../src/node/runner/campaign-main-loop.mjs');
239
+ assert.equal(shouldRunGuard('on', { flywheel_guard_count: { 'US-001': 3 } }, 'US-001'), false);
240
+ });
241
+ ```
242
+
243
+ - [ ] **Step 2: Run tests to verify they fail**
244
+
245
+ ```bash
246
+ node --test tests/node/test-flywheel-guard.mjs
247
+ ```
248
+ Expected: G5-G7 FAIL (shouldRunGuard not exported)
249
+
250
+ - [ ] **Step 3: Implement shouldRunGuard and add guard paths**
251
+
252
+ In `src/node/runner/campaign-main-loop.mjs`, add to `buildPaths()` (after `flywheelSignalFile` line 65):
253
+
254
+ ```javascript
255
+ flywheelGuardPromptFile: path.join(deskRoot, 'prompts', `${slug}.flywheel-guard.prompt.md`),
256
+ flywheelGuardVerdictFile: path.join(deskRoot, 'memos', `${slug}-flywheel-guard-verdict.json`),
257
+ ```
258
+
259
+ Add new exported function (after `shouldRunFlywheel`):
260
+
261
+ ```javascript
262
+ export function shouldRunGuard(flywheelGuard, state, usId) {
263
+ if (flywheelGuard !== 'on') return false;
264
+ const count = (state.flywheel_guard_count ?? {})[usId] ?? 0;
265
+ // max 2 retries (guard runs 1st time + 2 retries = 3 total guard executions max)
266
+ if (count >= 3) return false;
267
+ return true;
268
+ }
269
+ ```
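
To make the retry budget concrete, the sketch below copies `shouldRunGuard` from the step above and walks the per-US counter from 0 to 3. The story IDs are illustrative.

```javascript
// Copied from the plan's Step 3 implementation.
function shouldRunGuard(flywheelGuard, state, usId) {
  if (flywheelGuard !== 'on') return false;
  const count = (state.flywheel_guard_count ?? {})[usId] ?? 0;
  // max 2 retries (first run + 2 retries = 3 guard executions at most)
  if (count >= 3) return false;
  return true;
}

// Record whether the guard would still run at each counter value for US-001.
const trace = [0, 1, 2, 3].map((count) => ({
  count,
  runs: shouldRunGuard('on', { flywheel_guard_count: { 'US-001': count } }, 'US-001'),
}));
console.log(trace);

// Counters are keyed by US, so an exhausted US-001 does not affect US-002.
const otherUsRuns = shouldRunGuard('on', { flywheel_guard_count: { 'US-001': 3 } }, 'US-002');
console.log(otherUsRuns);
```

G6 calls the function without a `usId`. In that case the lookup key is `undefined`, the count falls back to 0, and the function returns `true`.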
270
+
271
+ - [ ] **Step 4: Run tests to verify they pass**
272
+
273
+ ```bash
274
+ node --test tests/node/test-flywheel-guard.mjs
275
+ ```
276
+ Expected: G1-G7 all PASS
277
+
278
+ - [ ] **Step 5: Commit**
279
+
280
+ ```bash
281
+ git add src/node/runner/campaign-main-loop.mjs tests/node/test-flywheel-guard.mjs
282
+ git commit -m "feat: add shouldRunGuard logic and guard paths to buildPaths"
283
+ ```
284
+
285
+ ---
286
+
287
+ ### Task 4: Add guard prompt template to init_ralph_desk.zsh
288
+
289
+ **Files:**
290
+ - Modify: `src/scripts/init_ralph_desk.zsh` (after flywheel prompt section, ~line 690)
291
+ - Modify: `src/scripts/init_ralph_desk.zsh:276-283` (cleanup list)
292
+ - Modify: `src/scripts/init_ralph_desk.zsh:294-303` (prompt cleanup)
293
+ - Modify: `tests/node/test-flywheel-guard.mjs`
294
+
295
+ - [ ] **Step 1: Write failing tests**
296
+
297
+ Append to `tests/node/test-flywheel-guard.mjs`:
298
+
299
+ ```javascript
300
+ import fs from 'node:fs/promises';
301
+ import path from 'node:path';
302
+ import { fileURLToPath } from 'node:url';
303
+
304
+ const repoRoot = path.resolve(path.dirname(fileURLToPath(import.meta.url)), '..', '..');
305
+
306
+ test('G8: init generates guard prompt with 4 validation checks', async () => {
307
+ const script = path.join(repoRoot, 'src', 'scripts', 'init_ralph_desk.zsh');
308
+ const content = await fs.readFile(script, 'utf8');
309
+ assert.match(content, /Look-ahead Bias/);
310
+ assert.match(content, /Metric Alignment/);
311
+ assert.match(content, /Deployability/);
312
+ assert.match(content, /Repeat Pattern/);
313
+ });
314
+
315
+ test('G9: guard prompt writes to flywheel-guard-verdict.json', async () => {
316
+ const script = path.join(repoRoot, 'src', 'scripts', 'init_ralph_desk.zsh');
317
+ const content = await fs.readFile(script, 'utf8');
318
+ assert.match(content, /flywheel-guard-verdict\.json/);
319
+ });
320
+
321
+ test('G10: guard verdict includes analysis_only field', async () => {
322
+ const script = path.join(repoRoot, 'src', 'scripts', 'init_ralph_desk.zsh');
323
+ const content = await fs.readFile(script, 'utf8');
324
+ assert.match(content, /analysis_only/);
325
+ });
326
+
327
+ test('G11: guard prompt references PRD as ground truth', async () => {
328
+ const script = path.join(repoRoot, 'src', 'scripts', 'init_ralph_desk.zsh');
329
+ const content = await fs.readFile(script, 'utf8');
330
+ assert.match(content, /PRD is ground truth/);
331
+ });
332
+
333
+ test('G12: cleanup list includes guard verdict file', async () => {
334
+ const script = path.join(repoRoot, 'src', 'scripts', 'init_ralph_desk.zsh');
335
+ const content = await fs.readFile(script, 'utf8');
336
+ assert.match(content, /flywheel-guard-verdict\.json/);
337
+ });
338
+ ```
339
+
340
+ Note: `tests/node/test-flywheel-guard.mjs` as created in Task 2 imports only `node:test` and `node:assert/strict`, so the `fs`, `path`, and `fileURLToPath` imports and the `repoRoot` constant shown above are new to this file (they follow the pattern used in `test-flywheel.mjs`). Place them once at the top of the file, next to the existing imports, and do not repeat them when appending later tests.
341
+
342
+ - [ ] **Step 2: Run tests to verify they fail**
343
+
344
+ ```bash
345
+ node --test tests/node/test-flywheel-guard.mjs
346
+ ```
347
+ Expected: G8-G12 FAIL
348
+
349
+ - [ ] **Step 3: Add guard prompt template to init_ralph_desk.zsh**
350
+
351
+ After the flywheel prompt section (after `else echo " · $F"; fi` around line 690), add:
352
+
353
+ ```bash
354
+ # --- Flywheel Guard Prompt ---
355
+ F="$DESK/prompts/$SLUG.flywheel-guard.prompt.md"
356
+ if [[ ! -f "$F" ]]; then
357
+ cat > "$F" <<'GUARD_EOF'
358
+ # Flywheel Guard Review
359
+
360
+ You are an independent reviewer verifying whether a flywheel direction decision is safe to execute.
361
+ You have NO prior context about this campaign. Read the files below and evaluate the decision objectively.
362
+
363
+ ## Files to Read (in order)
364
+ 1. PRD: {DESK}/plans/prd-{SLUG}.md — the ground truth for what success means
365
+ 2. Flywheel Decision: {DESK}/memos/{SLUG}-flywheel-signal.json — what the flywheel decided
366
+ 3. Flywheel Analysis: {DESK}/memos/{SLUG}-flywheel-review.md — the flywheel's reasoning
367
+ 4. Campaign Memory: {DESK}/memos/{SLUG}-memory.md — history, rejected directions, key decisions
368
+ 5. Done Claim: {DESK}/memos/{SLUG}-done-claim.json — what the Worker actually produced
369
+ 6. Verify Verdict: {DESK}/memos/{SLUG}-verify-verdict.json — why the Verifier failed it
370
+
371
+ ## Validation Checks
372
+
373
+ ### Check 1: Look-ahead Bias
374
+ List every data feature the flywheel's proposed direction depends on.
375
+ For each: "feature X — available at decision time: YES/NO/UNCLEAR"
376
+ - YES: feature is known before the event (entry time, session start price, order book state)
377
+ - NO: feature requires future information (peak price, session end, outcome)
378
+ - UNCLEAR: cannot determine from available context → mark inconclusive
379
+ If ANY feature is NO and used in a deployable strategy (not just upper-bound analysis): FAIL.
380
+
381
+ ### Check 2: Metric Alignment
382
+ 1. What metric does the PRD define as the optimization target?
383
+ 2. What metric does the flywheel's direction optimize?
384
+ 3. Are they the same?
385
+ - Same metric → pass
386
+ - Different metric, not flagged → FAIL (silent metric switch)
387
+ - Different metric, flagged with evidence → FAIL with recommendation: "metric mismatch requires PRD update or user approval before proceeding"
388
+ PRD is ground truth. The guard cannot approve off-PRD metric changes autonomously.
389
+
390
+ ### Check 3: Deployability
391
+ Can the proposed direction's output be used in production as-is?
392
+ - Requires post-hoc data → FAIL
393
+ - Requires infrastructure not mentioned in PRD → FAIL
394
+ - Labeled as "upper-bound only" or "reference" → pass, but you MUST include "analysis_only": true in your verdict so Leader skips Worker dispatch (no implementation, analysis record only)
395
+
396
+ ### Check 4: Repeat Pattern (same-US scoped)
397
+ Compare to prior flywheel decisions for the current US only in campaign memory's Key Decisions section.
398
+ - Same scope decision + same underlying approach as a prior flywheel for this US → FAIL
399
+ - Reframing of a previously rejected direction (check Rejected Directions) → FAIL
400
+ - Genuinely new approach → pass
401
+ Before writing your verdict, you MUST append any rejected flywheel direction to campaign memory's Rejected Directions section. This persists the record before cleanup can erase it.
402
+
403
+ ## Output
404
+ Write verdict to: {DESK}/memos/{SLUG}-flywheel-guard-verdict.json
405
+
406
+ Use this format:
407
+ {
408
+ "verdict": "pass|fail|inconclusive",
409
+ "issues": [{"check": "check-name", "status": "pass|fail|inconclusive", "detail": "finding", "evidence": "reference"}],
410
+ "analysis_only": false,
411
+ "recommendation": "proceed|retry-flywheel|escalate-to-user",
412
+ "timestamp": "ISO"
413
+ }
414
+
415
+ Rules:
416
+ - If ALL checks pass → verdict: pass, recommendation: proceed
417
+ - If ANY check is fail → verdict: fail, recommendation: retry-flywheel
418
+ - If ANY check is inconclusive and none are fail → verdict: inconclusive, recommendation: escalate-to-user
419
+ - Include specific evidence for every check. No "seems fine" or "probably ok."
420
+ GUARD_EOF
421
+
422
+ # Replace placeholders with actual paths
423
+ sed -i.bak "s|{DESK}|$DESK|g; s|{SLUG}|$SLUG|g" "$F" && rm -f "${F}.bak"
424
+
425
+ echo " + $F"
426
+ else echo " · $F"; fi
427
+ ```
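
The Rules list at the end of the guard prompt defines how per-check results roll up into one verdict: any `fail` wins, then any `inconclusive`, otherwise `pass`. In the plan this roll-up is done by the Guard agent itself; the sketch below only illustrates the precedence. `aggregateVerdict` and the check names are hypothetical, and the function assumes `issues` holds one entry per check.

```javascript
// Hypothetical helper: derives the top-level verdict from per-check statuses,
// following the Rules list in the guard prompt.
function aggregateVerdict(issues) {
  const statuses = issues.map((issue) => issue.status);
  // Any failing check fails the whole review.
  if (statuses.includes('fail')) {
    return { verdict: 'fail', recommendation: 'retry-flywheel' };
  }
  // No failures, but at least one check could not be decided.
  if (statuses.includes('inconclusive')) {
    return { verdict: 'inconclusive', recommendation: 'escalate-to-user' };
  }
  // Every check passed.
  return { verdict: 'pass', recommendation: 'proceed' };
}

// Illustrative check names; the prompt only specifies "check-name".
const mixed = aggregateVerdict([
  { check: 'look-ahead-bias', status: 'inconclusive' },
  { check: 'metric-alignment', status: 'fail' },
  { check: 'deployability', status: 'pass' },
  { check: 'repeat-pattern', status: 'pass' },
]);
console.log(mixed);
```

A `fail` takes precedence over an `inconclusive` in the same review, which matches the rule "If ANY check is inconclusive and none are fail".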
428
+
429
+ - [ ] **Step 4: Add guard files to cleanup lists**
430
+
431
+ In `src/scripts/init_ralph_desk.zsh`, add to the runtime memos cleanup list (after `"$DESK/memos/$SLUG-flywheel-review.md"` around line 283):
432
+
433
+ ```bash
434
+ "$DESK/memos/$SLUG-flywheel-guard-verdict.json" \
435
+ ```
436
+
437
+ Add to prompt cleanup list (after `"$DESK/prompts/$SLUG.flywheel.prompt.md"` around line 298):
438
+
439
+ ```bash
440
+ "$DESK/prompts/$SLUG.flywheel-guard.prompt.md" \
441
+ ```
442
+
443
+ - [ ] **Step 5: Run tests to verify they pass**
444
+
445
+ ```bash
446
+ zsh -n src/scripts/init_ralph_desk.zsh && echo "SYNTAX OK"
447
+ node --test tests/node/test-flywheel-guard.mjs
448
+ ```
449
+ Expected: SYNTAX OK, G1-G12 all PASS
450
+
451
+ - [ ] **Step 6: Commit**
452
+
453
+ ```bash
454
+ git add src/scripts/init_ralph_desk.zsh tests/node/test-flywheel-guard.mjs
455
+ git commit -m "feat: add flywheel guard prompt template with 4 validation checks"
456
+ ```
457
+
458
+ ---
459
+
460
+ ### Task 5: Wire guard into campaign-main-loop.mjs
461
+
462
+ **Files:**
463
+ - Modify: `src/node/runner/campaign-main-loop.mjs:242-261` (readCurrentState), `:402-404` (buildFlywheelTriggerCmd area), `:537-559` (flywheel block in main loop)
464
+ - Modify: `tests/node/test-flywheel-guard.mjs`
465
+
466
+ - [ ] **Step 1: Write failing tests**
467
+
468
+ Append to `tests/node/test-flywheel-guard.mjs`:
469
+
470
+ ```javascript
471
+ test('G13: buildPaths includes guard paths', async () => {
472
+ const script = path.join(repoRoot, 'src', 'node', 'runner', 'campaign-main-loop.mjs');
473
+ const content = await fs.readFile(script, 'utf8');
474
+ assert.match(content, /flywheelGuardPromptFile/);
475
+ assert.match(content, /flywheelGuardVerdictFile/);
476
+ });
477
+
478
+ test('G14: guard dispatch exists in main loop', async () => {
479
+ const script = path.join(repoRoot, 'src', 'node', 'runner', 'campaign-main-loop.mjs');
480
+ const content = await fs.readFile(script, 'utf8');
481
+ assert.match(content, /dispatchGuard/);
482
+ assert.match(content, /phase.*guard/i);
483
+ });
484
+
485
+ test('G15: guard runs AFTER flywheel and BEFORE worker', async () => {
486
+ const script = path.join(repoRoot, 'src', 'node', 'runner', 'campaign-main-loop.mjs');
487
+ const content = await fs.readFile(script, 'utf8');
488
+ const flywheelPos = content.indexOf('dispatchFlywheel');
489
+ const guardPos = content.indexOf('dispatchGuard');
490
+ const workerPos = content.indexOf('dispatchWorker');
491
+ assert.ok(flywheelPos < guardPos, 'flywheel must come before guard');
492
+ assert.ok(guardPos < workerPos, 'guard must come before worker');
493
+ });
494
+
495
+ test('G16: readCurrentState includes flywheel_guard_count', async () => {
496
+ const script = path.join(repoRoot, 'src', 'node', 'runner', 'campaign-main-loop.mjs');
497
+ const content = await fs.readFile(script, 'utf8');
498
+ assert.match(content, /flywheel_guard_count/);
499
+ });
500
+
501
+ test('G17: inconclusive verdict triggers BLOCKED', async () => {
502
+ const script = path.join(repoRoot, 'src', 'node', 'runner', 'campaign-main-loop.mjs');
503
+ const content = await fs.readFile(script, 'utf8');
504
+ assert.match(content, /inconclusive/);
505
+ assert.match(content, /escalate/i);
506
+ });
507
+ ```
508
+
509
+ - [ ] **Step 2: Run tests to verify they fail**
510
+
511
+ ```bash
512
+ node --test tests/node/test-flywheel-guard.mjs
513
+ ```
514
+ Expected: G13-G17 FAIL
515
+
516
+ - [ ] **Step 3: Add flywheel_guard_count to readCurrentState**
517
+
518
+ In `src/node/runner/campaign-main-loop.mjs`, add to `readCurrentState()` return object (after `verifier_pane_id` line 259):
519
+
520
+ ```javascript
521
+ flywheel_guard_count: status.flywheel_guard_count ?? {},
522
+ ```
523
+
524
+ - [ ] **Step 4: Add buildGuardTriggerCmd and dispatchGuard**
525
+
526
+ After `dispatchFlywheel` function (around line 413), add:
527
+
528
+ ```javascript
529
+ function buildGuardTriggerCmd({ guardPromptFile, guardModel, rootDir }) {
530
+ return `cd ${JSON.stringify(rootDir)} && DISABLE_OMC=1 claude --model ${guardModel} --no-mcp -p "$(cat ${JSON.stringify(guardPromptFile)})"`;
531
+ }
532
+
533
+ async function dispatchGuard({ paths, sendKeys, guardPaneId, guardModel, rootDir }) {
534
+ const triggerCmd = buildGuardTriggerCmd({
535
+ guardPromptFile: paths.flywheelGuardPromptFile,
536
+ guardModel,
537
+ rootDir,
538
+ });
539
+ await sendKeys(guardPaneId, triggerCmd);
540
+ }
541
+ ```
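
To see what `dispatchGuard` actually sends to the pane, the sketch below copies `buildGuardTriggerCmd` from the step above and prints the command for a set of illustrative paths (they are not taken from a real campaign).

```javascript
// Copied from Step 4 of the plan.
function buildGuardTriggerCmd({ guardPromptFile, guardModel, rootDir }) {
  return `cd ${JSON.stringify(rootDir)} && DISABLE_OMC=1 claude --model ${guardModel} --no-mcp -p "$(cat ${JSON.stringify(guardPromptFile)})"`;
}

// Illustrative inputs; the space in the directory name shows the effect of the quoting.
const cmd = buildGuardTriggerCmd({
  rootDir: '/work/my repo',
  guardPromptFile: '/work/my repo/.desk/prompts/demo.flywheel-guard.prompt.md',
  guardModel: 'opus',
});
console.log(cmd);
// cd "/work/my repo" && DISABLE_OMC=1 claude --model opus --no-mcp -p "$(cat "/work/my repo/.desk/prompts/demo.flywheel-guard.prompt.md")"
```

`JSON.stringify` wraps each path in double quotes and escapes `"` and `\`, which is enough for paths with spaces. It is not full shell escaping: `$` and backticks are still expanded inside double quotes, and `guardModel` is interpolated unquoted, so this form assumes trusted, ordinary values.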
542
+
543
+ - [ ] **Step 5: Wire guard into main loop flywheel block**
544
+
545
+ Replace the flywheel block (lines 537-559) with the expanded version that includes guard:
546
+
547
+ ```javascript
548
+ // Flywheel direction review (runs BEFORE Worker)
549
+ if (shouldRunFlywheel(options.flywheel ?? 'off', state)) {
550
+ state.phase = 'flywheel';
551
+ await writeStatus(paths, state, options.onStatusChange, options.now);
552
+
553
+ await dispatchFlywheel({
554
+ paths,
555
+ sendKeys,
556
+ flywheelPaneId: state.flywheel_pane_id ?? state.verifier_pane_id,
557
+ flywheelModel: options.flywheelModel ?? 'opus',
558
+ rootDir,
559
+ });
560
+
561
+ const flywheelSignal = await pollForSignal(paths.flywheelSignalFile, {
562
+ mode: 'claude',
563
+ paneId: state.flywheel_pane_id ?? state.verifier_pane_id,
564
+ });
565
+
566
+ state.last_flywheel_decision = flywheelSignal.decision;
567
+ await fs.unlink(paths.flywheelSignalFile).catch(() => {});
568
+
569
+ // Flywheel Guard (independent validation)
570
+ if (shouldRunGuard(options.flywheelGuard ?? 'off', state, state.current_us)) {
571
+ state.phase = 'guard';
572
+ await writeStatus(paths, state, options.onStatusChange, options.now);
573
+
574
+ const guardPaneId = state.flywheel_pane_id ?? state.verifier_pane_id;
575
+ const guardModel = options.flywheelGuardModel ?? 'opus';
576
+
577
+ await dispatchGuard({ paths, sendKeys, guardPaneId, guardModel, rootDir });
578
+
579
+ const guardVerdict = await pollForSignal(paths.flywheelGuardVerdictFile, {
580
+ mode: 'claude',
581
+ paneId: guardPaneId,
582
+ });
583
+
584
+ // Track per-US guard count
585
+ // Initialize defensively in case state was not built by readCurrentState
586
+ state.flywheel_guard_count ??= {};
587
+ state.flywheel_guard_count[state.current_us] =
588
+ (state.flywheel_guard_count[state.current_us] ?? 0) + 1;
589
+
590
+ await fs.unlink(paths.flywheelGuardVerdictFile).catch(() => {});
591
+
592
+ if (guardVerdict.verdict === 'inconclusive') {
593
+ // Escalate to user — BLOCKED
594
+ state.phase = 'blocked';
595
+ await writeSentinel(paths.blockedSentinel, 'blocked', state.current_us);
596
+ await writeStatus(paths, state, options.onStatusChange, options.now);
597
+ return {
598
+ status: 'blocked',
599
+ usId: state.current_us,
600
+ reason: 'flywheel-guard-inconclusive',
601
+ guardIssues: guardVerdict.issues,
602
+ statusFile: paths.statusFile,
603
+ };
604
+ }
605
+
606
+ if (guardVerdict.verdict === 'fail') {
607
+ // Check if retries exhausted
608
+ if (state.flywheel_guard_count[state.current_us] >= 3) {
609
+ state.phase = 'blocked';
610
+ await writeSentinel(paths.blockedSentinel, 'blocked', state.current_us);
611
+ await writeStatus(paths, state, options.onStatusChange, options.now);
612
+ return {
613
+ status: 'blocked',
614
+ usId: state.current_us,
615
+ reason: 'flywheel-guard-retries-exhausted',
616
+ guardIssues: guardVerdict.issues,
617
+ statusFile: paths.statusFile,
618
+ };
619
+ }
620
+ // Retry: skip Worker, go to next iteration (flywheel will re-run)
621
+ // Guard feedback is already persisted via guard agent's memory write-back
622
+ state.phase = 'worker';
623
+ await writeStatus(paths, state, options.onStatusChange, options.now);
624
+ state.iteration += 1;
625
+ continue;
626
+ }
627
+
628
+ // verdict === 'pass'
629
+ if (guardVerdict.analysis_only) {
630
+ // Analysis-only direction — skip Worker, record and continue
631
+ state.phase = 'worker';
632
+ await writeStatus(paths, state, options.onStatusChange, options.now);
633
+ state.iteration += 1;
634
+ continue;
635
+ }
636
+ }
637
+
638
+ // Reset guard count on pass (flywheel direction accepted)
639
+ if (state.flywheel_guard_count?.[state.current_us]) {
640
+ state.flywheel_guard_count[state.current_us] = 0;
641
+ }
642
+ }
643
+ ```
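
The guard branch above can be hard to follow because of its early returns and `continue` statements. The sketch below is a simplified, I/O-free model of that branch for a single US, where each array entry stands for the guard verdict in one successive iteration. `simulateGuardRounds` and the `pending` outcome are hypothetical names, and the case where `shouldRunGuard` skips the guard entirely is not modeled.

```javascript
// Simplified model of the guard branch in the main loop: no I/O, one US only.
function simulateGuardRounds(verdicts) {
  let count = 0;
  for (const { verdict, analysis_only: analysisOnly = false } of verdicts) {
    count += 1; // the loop increments the per-US counter after every guard run
    if (verdict === 'inconclusive') {
      return { outcome: 'blocked', reason: 'flywheel-guard-inconclusive', count };
    }
    if (verdict === 'fail') {
      if (count >= 3) {
        return { outcome: 'blocked', reason: 'flywheel-guard-retries-exhausted', count };
      }
      continue; // skip Worker; per the plan, the flywheel re-runs next iteration
    }
    if (analysisOnly) continue; // pass, but analysis only: skip Worker, counter is not reset
    return { outcome: 'worker', count: 0 }; // pass: counter resets and Worker is dispatched
  }
  return { outcome: 'pending', count }; // ran out of simulated iterations
}

console.log(simulateGuardRounds([{ verdict: 'fail' }, { verdict: 'fail' }, { verdict: 'fail' }]));
console.log(simulateGuardRounds([{ verdict: 'fail' }, { verdict: 'pass' }]));
console.log(simulateGuardRounds([{ verdict: 'inconclusive' }]));
```

The first call ends in `blocked` on the third guard run, which is the "max 2 retries" budget; the second reaches the Worker after one retry; the third blocks immediately.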
644
+
645
+ - [ ] **Step 6: Run tests to verify they pass**
646
+
647
+ ```bash
648
+ node --test tests/node/test-flywheel-guard.mjs
649
+ node --test tests/node/test-flywheel.mjs
650
+ ```
651
+ Expected: all PASS
652
+
653
+ - [ ] **Step 7: Run regression**
654
+
655
+ ```bash
656
+ node --test tests/node/us007-analytics-reporting.test.mjs
657
+ node --test tests/node/us008-cli-entrypoint.test.mjs
658
+ ```
659
+ Expected: all PASS (update us008 deepEqual if it checks status.json shape with new `flywheel_guard_count` field)
660
+
661
+ - [ ] **Step 8: Commit**
662
+
663
+ ```bash
664
+ git add src/node/runner/campaign-main-loop.mjs tests/node/test-flywheel-guard.mjs
665
+ git commit -m "feat: wire flywheel guard into campaign loop (after flywheel, before worker)"
666
+ ```
667
+
668
+ ---
669
+
670
+ ### Task 6: Add guard flags to docs and presets
671
+
672
+ **Files:**
673
+ - Modify: `src/commands/rlp-desk.md:192-194` and `:222-224`
674
+ - Modify: `src/scripts/init_ralph_desk.zsh:243-244`
675
+ - Modify: `tests/node/test-flywheel-guard.mjs`
676
+
677
+ - [ ] **Step 1: Write failing tests**
678
+
679
+ Append to `tests/node/test-flywheel-guard.mjs`:
680
+
681
+ ```javascript
682
+ test('G18: rlp-desk.md options reference includes guard flags', async () => {
683
+ const content = await fs.readFile(path.join(repoRoot, 'src', 'commands', 'rlp-desk.md'), 'utf8');
684
+ assert.match(content, /--flywheel-guard off\|on/);
685
+ assert.match(content, /--flywheel-guard-model MODEL/);
686
+ });
687
+
688
+ test('G19: init presets include guard flags', async () => {
689
+ const content = await fs.readFile(path.join(repoRoot, 'src', 'scripts', 'init_ralph_desk.zsh'), 'utf8');
690
+ assert.match(content, /--flywheel-guard off\|on/);
691
+ assert.match(content, /--flywheel-guard-model MODEL/);
692
+ });
693
+ ```
694
+
695
+ - [ ] **Step 2: Run tests to verify they fail**
696
+
697
+ ```bash
698
+ node --test tests/node/test-flywheel-guard.mjs
699
+ ```
700
+ Expected: G18-G19 FAIL
701
+
702
+ - [ ] **Step 3: Add guard flags to rlp-desk.md**
703
+
704
+ In `src/commands/rlp-desk.md`, after both `--flywheel-model MODEL` lines (lines 194 and 224), add:
705
+
706
+ ```
707
+ # --flywheel-guard off|on guard validates flywheel decisions (default: off)
708
+ # --flywheel-guard-model MODEL guard reviewer model (default: opus)
709
+ ```
710
+
711
+ - [ ] **Step 4: Add guard flags to init presets**
712
+
713
+ In `src/scripts/init_ralph_desk.zsh`, after `--flywheel-model MODEL` echo (line 244), add:
714
+
715
+ ```bash
716
+ echo "# --flywheel-guard off|on guard validates flywheel decisions (default: off)"
717
+ echo "# --flywheel-guard-model MODEL guard reviewer model (default: opus)"
718
+ ```
719
+
720
+ - [ ] **Step 5: Run tests to verify they pass**
721
+
722
+ ```bash
723
+ zsh -n src/scripts/init_ralph_desk.zsh && echo "SYNTAX OK"
724
+ node --test tests/node/test-flywheel-guard.mjs
725
+ ```
726
+ Expected: SYNTAX OK, all PASS
727
+
728
+ - [ ] **Step 6: Commit**
729
+
730
+ ```bash
731
+ git add src/commands/rlp-desk.md src/scripts/init_ralph_desk.zsh tests/node/test-flywheel-guard.mjs
732
+ git commit -m "feat: add flywheel guard flags to docs and run presets"
733
+ ```
734
+
735
+ ---
736
+
737
+ ### Task 7: Update governance.md — guard step in Leader Loop
738
+
739
+ **Files:**
740
+ - Modify: `src/governance.md:453-509` (§7 Leader Loop Protocol)
741
+
742
+ - [ ] **Step 1: Add guard step to Leader Loop Protocol**
743
+
744
+ In `src/governance.md`, the Leader Loop Protocol (§7) does not yet mention the flywheel (it exists only in the code), so add a new sub-step between ⑥ and ⑦ that covers both the flywheel and the guard:
745
+
746
+ After `⑥ Read memory.md again` (line 479), add:
747
+
748
+ ```
749
+ ⑥½ Flywheel direction review (when --flywheel on-fail and consecutive_failures > 0)
750
+ - Dispatch Flywheel agent (fresh context, --flywheel-model)
751
+ - Read flywheel-signal.json for direction decision (hold/pivot/reduce/expand)
752
+ - If --flywheel-guard on:
753
+ - Dispatch Guard agent (fresh context, --flywheel-guard-model)
754
+ - Read flywheel-guard-verdict.json:
755
+ • pass → proceed to Worker with updated contract
756
+ • pass + analysis_only → skip Worker, record analysis, next iteration
757
+ • fail → re-run Flywheel with guard feedback (max 2 retries)
758
+ • fail + retries exhausted → BLOCKED
759
+ • inconclusive → BLOCKED (escalate to user)
760
+ - Guard count tracked per-US in status.json
761
+ ```
762
+
763
+ - [ ] **Step 2: Verify syntax**
764
+
765
+ ```bash
766
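+ # Sanity check (not a markdown validator): confirm the file is readable and the new step is present
767
+ head -5 src/governance.md && grep -n 'Flywheel direction review' src/governance.md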
768
+ ```
769
+
770
+ - [ ] **Step 3: Commit**
771
+
772
+ ```bash
773
+ git add src/governance.md
774
+ git commit -m "docs: add flywheel guard step to §7 Leader Loop Protocol"
775
+ ```
776
+
777
+ ---
778
+
779
+ ### Task 8: Local sync + full regression
780
+
781
+ **Files:** none modified — sync and verification only
782
+
783
+ - [ ] **Step 1: Run full test suite**
784
+
785
+ ```bash
786
+ node --test tests/node/test-flywheel.mjs tests/node/test-flywheel-guard.mjs tests/node/us007-analytics-reporting.test.mjs tests/node/us008-cli-entrypoint.test.mjs
787
+ ```
788
+ Expected: 0 failures
789
+
790
+ - [ ] **Step 2: Check zsh syntax**
791
+
792
+ ```bash
793
+ zsh -n src/scripts/init_ralph_desk.zsh && echo "SYNTAX OK"
794
+ ```
795
+
796
+ - [ ] **Step 3: Local file sync**
797
+
798
+ ```bash
799
+ cp src/commands/rlp-desk.md ~/.claude/commands/rlp-desk.md
800
+ cp src/governance.md ~/.claude/ralph-desk/governance.md
801
+ cp src/scripts/init_ralph_desk.zsh ~/.claude/ralph-desk/init_ralph_desk.zsh
802
+ cp src/scripts/run_ralph_desk.zsh ~/.claude/ralph-desk/run_ralph_desk.zsh
803
+ cp src/scripts/lib_ralph_desk.zsh ~/.claude/ralph-desk/lib_ralph_desk.zsh
804
+ cp README.md ~/.claude/ralph-desk/README.md
805
+ ```
806
+
807
+ - [ ] **Step 4: Verify sync**
808
+
809
+ ```bash
810
+ diff -q src/commands/rlp-desk.md ~/.claude/commands/rlp-desk.md
811
+ diff -q src/governance.md ~/.claude/ralph-desk/governance.md
812
+ diff -q src/scripts/init_ralph_desk.zsh ~/.claude/ralph-desk/init_ralph_desk.zsh
813
+ diff -q src/scripts/run_ralph_desk.zsh ~/.claude/ralph-desk/run_ralph_desk.zsh
814
+ diff -q src/scripts/lib_ralph_desk.zsh ~/.claude/ralph-desk/lib_ralph_desk.zsh
815
+ diff -q README.md ~/.claude/ralph-desk/README.md
816
+ ```
817
+ All must produce no output (identical).