wiggum-cli 0.15.0 → 0.17.0

Files changed (110)
  1. package/README.md +7 -1
  2. package/bin/ralph.js +0 -0
  3. package/dist/agent/memory/ingest.d.ts +14 -0
  4. package/dist/agent/memory/ingest.js +77 -0
  5. package/dist/agent/memory/store.d.ts +15 -0
  6. package/dist/agent/memory/store.js +98 -0
  7. package/dist/agent/memory/types.d.ts +16 -0
  8. package/dist/agent/memory/types.js +14 -0
  9. package/dist/agent/orchestrator.d.ts +7 -0
  10. package/dist/agent/orchestrator.js +266 -0
  11. package/dist/agent/resolve-config.d.ts +26 -0
  12. package/dist/agent/resolve-config.js +43 -0
  13. package/dist/agent/tools/backlog.d.ts +27 -0
  14. package/dist/agent/tools/backlog.js +51 -0
  15. package/dist/agent/tools/dry-run.d.ts +106 -0
  16. package/dist/agent/tools/dry-run.js +119 -0
  17. package/dist/agent/tools/execution.d.ts +51 -0
  18. package/dist/agent/tools/execution.js +256 -0
  19. package/dist/agent/tools/feature-state.d.ts +43 -0
  20. package/dist/agent/tools/feature-state.js +184 -0
  21. package/dist/agent/tools/introspection.d.ts +23 -0
  22. package/dist/agent/tools/introspection.js +40 -0
  23. package/dist/agent/tools/memory.d.ts +44 -0
  24. package/dist/agent/tools/memory.js +99 -0
  25. package/dist/agent/tools/preflight.d.ts +7 -0
  26. package/dist/agent/tools/preflight.js +137 -0
  27. package/dist/agent/tools/reporting.d.ts +58 -0
  28. package/dist/agent/tools/reporting.js +119 -0
  29. package/dist/agent/tools/schemas.d.ts +2 -0
  30. package/dist/agent/tools/schemas.js +3 -0
  31. package/dist/agent/types.d.ts +45 -0
  32. package/dist/agent/types.js +1 -0
  33. package/dist/ai/conversation/conversation-manager.js +8 -0
  34. package/dist/ai/conversation/url-fetcher.js +27 -0
  35. package/dist/ai/providers.js +5 -5
  36. package/dist/commands/agent.d.ts +17 -0
  37. package/dist/commands/agent.js +114 -0
  38. package/dist/commands/monitor.js +50 -183
  39. package/dist/commands/new-auto.d.ts +15 -0
  40. package/dist/commands/new-auto.js +237 -0
  41. package/dist/commands/run.js +20 -10
  42. package/dist/commands/sync.d.ts +15 -0
  43. package/dist/commands/sync.js +68 -0
  44. package/dist/generator/config.d.ts +1 -41
  45. package/dist/generator/config.js +7 -0
  46. package/dist/generator/index.d.ts +2 -2
  47. package/dist/generator/templates.d.ts +3 -0
  48. package/dist/generator/templates.js +22 -1
  49. package/dist/index.d.ts +14 -1
  50. package/dist/index.js +333 -40
  51. package/dist/repl/command-parser.d.ts +5 -0
  52. package/dist/repl/command-parser.js +5 -0
  53. package/dist/templates/prompts/PROMPT.md.tmpl +13 -10
  54. package/dist/templates/prompts/PROMPT_e2e.md.tmpl +162 -5
  55. package/dist/templates/prompts/PROMPT_feature.md.tmpl +39 -3
  56. package/dist/templates/prompts/PROMPT_review_auto.md.tmpl +33 -8
  57. package/dist/templates/prompts/PROMPT_review_manual.md.tmpl +4 -1
  58. package/dist/templates/prompts/PROMPT_review_merge.md.tmpl +40 -10
  59. package/dist/templates/prompts/PROMPT_verify.md.tmpl +5 -2
  60. package/dist/templates/scripts/feature-loop.sh.tmpl +611 -95
  61. package/dist/tui/app.d.ts +34 -2
  62. package/dist/tui/app.js +31 -5
  63. package/dist/tui/components/ActivityFeed.d.ts +18 -0
  64. package/dist/tui/components/ActivityFeed.js +31 -0
  65. package/dist/tui/components/IssuePicker.d.ts +27 -0
  66. package/dist/tui/components/IssuePicker.js +64 -0
  67. package/dist/tui/components/RunCompletionSummary.d.ts +27 -1
  68. package/dist/tui/components/RunCompletionSummary.js +103 -10
  69. package/dist/tui/components/SummaryBox.d.ts +4 -0
  70. package/dist/tui/components/SummaryBox.js +4 -2
  71. package/dist/tui/hooks/useAgentOrchestrator.d.ts +29 -0
  72. package/dist/tui/hooks/useAgentOrchestrator.js +453 -0
  73. package/dist/tui/hooks/useBackgroundRuns.js +1 -1
  74. package/dist/tui/orchestration/interview-orchestrator.d.ts +5 -1
  75. package/dist/tui/orchestration/interview-orchestrator.js +27 -6
  76. package/dist/tui/screens/AgentScreen.d.ts +21 -0
  77. package/dist/tui/screens/AgentScreen.js +159 -0
  78. package/dist/tui/screens/InitScreen.js +4 -0
  79. package/dist/tui/screens/InterviewScreen.d.ts +3 -1
  80. package/dist/tui/screens/InterviewScreen.js +146 -10
  81. package/dist/tui/screens/MainShell.d.ts +1 -1
  82. package/dist/tui/screens/MainShell.js +36 -1
  83. package/dist/tui/screens/RunScreen.d.ts +15 -15
  84. package/dist/tui/screens/RunScreen.js +96 -11
  85. package/dist/tui/utils/build-run-summary.d.ts +1 -1
  86. package/dist/tui/utils/build-run-summary.js +44 -85
  87. package/dist/tui/utils/clear-screen.d.ts +14 -0
  88. package/dist/tui/utils/clear-screen.js +16 -0
  89. package/dist/tui/utils/git-summary.d.ts +13 -0
  90. package/dist/tui/utils/git-summary.js +30 -0
  91. package/dist/tui/utils/loop-status.d.ts +94 -0
  92. package/dist/tui/utils/loop-status.js +430 -10
  93. package/dist/tui/utils/pr-summary.d.ts +3 -2
  94. package/dist/tui/utils/pr-summary.js +41 -6
  95. package/dist/utils/ci.d.ts +8 -0
  96. package/dist/utils/ci.js +13 -0
  97. package/dist/utils/config.d.ts +8 -0
  98. package/dist/utils/config.js +8 -0
  99. package/dist/utils/github.d.ts +32 -0
  100. package/dist/utils/github.js +106 -0
  101. package/dist/utils/spec-names.js +5 -1
  102. package/package.json +10 -2
  103. package/src/templates/prompts/PROMPT.md.tmpl +13 -10
  104. package/src/templates/prompts/PROMPT_e2e.md.tmpl +162 -5
  105. package/src/templates/prompts/PROMPT_feature.md.tmpl +39 -3
  106. package/src/templates/prompts/PROMPT_review_auto.md.tmpl +33 -8
  107. package/src/templates/prompts/PROMPT_review_manual.md.tmpl +4 -1
  108. package/src/templates/prompts/PROMPT_review_merge.md.tmpl +40 -10
  109. package/src/templates/prompts/PROMPT_verify.md.tmpl +5 -2
  110. package/src/templates/scripts/feature-loop.sh.tmpl +611 -95
@@ -0,0 +1,32 @@
1
+ export declare function isGhInstalled(): Promise<boolean>;
2
+ export declare function _resetGhCache(): void;
3
+ export interface GitHubIssueDetail {
4
+ title: string;
5
+ body: string;
6
+ labels: string[];
7
+ }
8
+ export declare function fetchGitHubIssue(owner: string, repo: string, number: number): Promise<GitHubIssueDetail | null>;
9
+ export interface GitHubIssueListItem {
10
+ number: number;
11
+ title: string;
12
+ state: 'open' | 'closed';
13
+ labels: string[];
14
+ createdAt: string;
15
+ }
16
+ export interface ListIssuesResult {
17
+ issues: GitHubIssueListItem[];
18
+ error?: string;
19
+ }
20
+ export declare function listRepoIssues(owner: string, repo: string, search?: string, limit?: number): Promise<ListIssuesResult>;
21
+ export declare function detectGitHubRemote(projectRoot: string): Promise<GitHubRepo | null>;
22
+ export interface ParsedGitHubIssue {
23
+ owner: string;
24
+ repo: string;
25
+ number: number;
26
+ }
27
+ export declare function isGitHubIssueUrl(input: string): ParsedGitHubIssue | null;
28
+ export interface GitHubRepo {
29
+ owner: string;
30
+ repo: string;
31
+ }
32
+ export declare function parseGitHubRemote(remoteUrl: string): GitHubRepo | null;
@@ -0,0 +1,106 @@
1
+ import { execFile as execFileCb } from 'node:child_process';
2
+ /**
3
+ * Safe command execution using execFile (no shell, array-based args).
4
+ */
5
+ function safeExec(cmd, args, cwd) {
6
+ return new Promise((resolve, reject) => {
7
+ execFileCb(cmd, args, { cwd, timeout: 10000 }, (error, stdout) => {
8
+ if (error)
9
+ reject(error);
10
+ else
11
+ resolve(String(stdout));
12
+ });
13
+ });
14
+ }
15
+ let ghInstalledCache = null;
16
+ export async function isGhInstalled() {
17
+ if (ghInstalledCache !== null)
18
+ return ghInstalledCache;
19
+ try {
20
+ await safeExec('gh', ['--version']);
21
+ ghInstalledCache = true;
22
+ }
23
+ catch {
24
+ ghInstalledCache = false;
25
+ }
26
+ return ghInstalledCache;
27
+ }
28
+ export function _resetGhCache() {
29
+ ghInstalledCache = null;
30
+ }
31
+ export async function fetchGitHubIssue(owner, repo, number) {
32
+ try {
33
+ const stdout = await safeExec('gh', [
34
+ 'issue', 'view', String(number),
35
+ '--repo', `${owner}/${repo}`,
36
+ '--json', 'title,body,labels',
37
+ ]);
38
+ const data = JSON.parse(stdout);
39
+ return {
40
+ title: data.title ?? '',
41
+ body: data.body ?? '',
42
+ labels: (data.labels ?? []).map((l) => l.name),
43
+ };
44
+ }
45
+ catch {
46
+ return null;
47
+ }
48
+ }
49
+ export async function listRepoIssues(owner, repo, search, limit = 20) {
50
+ try {
51
+ const args = [
52
+ 'issue', 'list',
53
+ '--repo', `${owner}/${repo}`,
54
+ '--limit', String(limit),
55
+ '--json', 'number,title,state,labels,createdAt',
56
+ '--state', 'open',
57
+ ];
58
+ if (search) {
59
+ args.push('--search', search);
60
+ }
61
+ const stdout = await safeExec('gh', args);
62
+ const data = JSON.parse(stdout);
63
+ const issues = data.map((item) => ({
64
+ number: item.number,
65
+ title: item.title,
66
+ state: item.state.toLowerCase(),
67
+ labels: (item.labels ?? []).map((l) => l.name),
68
+ createdAt: item.createdAt ?? '',
69
+ }));
70
+ return { issues };
71
+ }
72
+ catch (err) {
73
+ const msg = err instanceof Error ? err.message : String(err);
74
+ if (msg.includes('auth') || msg.includes('login') || msg.includes('not logged')) {
75
+ return { issues: [], error: 'Run "gh auth login" to enable issue browsing' };
76
+ }
77
+ return { issues: [] };
78
+ }
79
+ }
80
+ export async function detectGitHubRemote(projectRoot) {
81
+ try {
82
+ const stdout = await safeExec('git', ['remote', 'get-url', 'origin'], projectRoot);
83
+ return parseGitHubRemote(stdout.trim());
84
+ }
85
+ catch {
86
+ return null;
87
+ }
88
+ }
89
+ const GITHUB_ISSUE_RE = /^https?:\/\/github\.com\/([^/]+)\/([^/]+)\/(issues|pull)\/(\d+)\/?/;
90
+ export function isGitHubIssueUrl(input) {
91
+ const match = input.match(GITHUB_ISSUE_RE);
92
+ if (!match)
93
+ return null;
94
+ return { owner: match[1], repo: match[2], number: parseInt(match[4], 10) };
95
+ }
96
+ const SSH_REMOTE_RE = /^git@github\.com:([^/]+)\/(.+?)(?:\.git)?$/;
97
+ const HTTPS_REMOTE_RE = /^https?:\/\/github\.com\/([^/]+)\/([^/]+?)(?:\.git)?$/;
98
+ export function parseGitHubRemote(remoteUrl) {
99
+ const ssh = remoteUrl.match(SSH_REMOTE_RE);
100
+ if (ssh)
101
+ return { owner: ssh[1], repo: ssh[2] };
102
+ const https = remoteUrl.match(HTTPS_REMOTE_RE);
103
+ if (https)
104
+ return { owner: https[1], repo: https[2] };
105
+ return null;
106
+ }
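To illustrate what the URL helpers in this file accept, here is a minimal standalone sketch. The three regular expressions and the two function bodies are copied from the diff above; the sample URLs (`acme/widgets`) are hypothetical and chosen only for illustration.

```javascript
// Regular expressions copied verbatim from the added file.
const GITHUB_ISSUE_RE = /^https?:\/\/github\.com\/([^/]+)\/([^/]+)\/(issues|pull)\/(\d+)\/?/;
const SSH_REMOTE_RE = /^git@github\.com:([^/]+)\/(.+?)(?:\.git)?$/;
const HTTPS_REMOTE_RE = /^https?:\/\/github\.com\/([^/]+)\/([^/]+?)(?:\.git)?$/;

// Returns { owner, repo, number } for issue or pull request URLs, else null.
function isGitHubIssueUrl(input) {
  const match = input.match(GITHUB_ISSUE_RE);
  if (!match) return null;
  return { owner: match[1], repo: match[2], number: parseInt(match[4], 10) };
}

// Returns { owner, repo } for SSH or HTTPS GitHub remotes, else null.
function parseGitHubRemote(remoteUrl) {
  const ssh = remoteUrl.match(SSH_REMOTE_RE);
  if (ssh) return { owner: ssh[1], repo: ssh[2] };
  const https = remoteUrl.match(HTTPS_REMOTE_RE);
  if (https) return { owner: https[1], repo: https[2] };
  return null;
}

// Hypothetical sample inputs (not taken from the package).
const sshRemote = parseGitHubRemote('git@github.com:acme/widgets.git');
const httpsRemote = parseGitHubRemote('https://github.com/acme/widgets');
const issue = isGitHubIssueUrl('https://github.com/acme/widgets/issues/42');
const pull = isGitHubIssueUrl('https://github.com/acme/widgets/pull/7');
const otherHost = parseGitHubRemote('https://gitlab.com/acme/widgets.git');

console.log(sshRemote, httpsRemote, issue, pull, otherHost);
```

Note that, despite its name, `isGitHubIssueUrl` also matches `/pull/` URLs, because the pattern accepts `(issues|pull)`.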
@@ -17,7 +17,11 @@ export async function listSpecNames(specsDir) {
17
17
  return [];
18
18
  }
19
19
  return entries
20
- .filter((e) => e.isFile() && e.name.endsWith('.md'))
20
+ .filter((e) => e.isFile() &&
21
+ e.name.endsWith('.md') &&
22
+ !e.name.startsWith('_') &&
23
+ e.name !== 'README.md' &&
24
+ !e.name.endsWith('-implementation-plan.md'))
21
25
  .map((e) => e.name.slice(0, -3))
22
26
  .sort();
23
27
  }
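The effect of the stricter filter can be seen in a small sketch. The predicate below mirrors the new conditions but works on plain strings, so the `e.isFile()` check is omitted; the file names are hypothetical.

```javascript
// Mirrors the new filter conditions from listSpecNames, applied to file names only.
function isSpecFile(name) {
  return name.endsWith('.md') &&
    !name.startsWith('_') &&
    name !== 'README.md' &&
    !name.endsWith('-implementation-plan.md');
}

// Hypothetical directory listing.
const names = [
  'auth-flow.md',
  'auth-flow-implementation-plan.md',
  'README.md',
  '_template.md',
  'notes.txt',
  'billing.md',
];

// Same post-processing as the original: drop the ".md" suffix, then sort.
const specNames = names.filter(isSpecFile).map((n) => n.slice(0, -3)).sort();
console.log(specNames); // [ 'auth-flow', 'billing' ]
```

Under the previous filter, `README`, `_template`, and `auth-flow-implementation-plan` would also have been returned as spec names.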
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "wiggum-cli",
3
- "version": "0.15.0",
3
+ "version": "0.17.0",
4
4
  "description": "AI-powered feature development loop CLI",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",
@@ -22,6 +22,7 @@
22
22
  "test": "vitest run",
23
23
  "test:watch": "vitest",
24
24
  "typecheck": "tsc --noEmit",
25
+ "e2e:bridge": "tsx e2e/bridge/server.ts",
25
26
  "prepublishOnly": "npm run build && npm test"
26
27
  },
27
28
  "keywords": [
@@ -60,13 +61,20 @@
60
61
  "react": "^18.3.1",
61
62
  "zod": "^4.3.5"
62
63
  },
64
+ "overrides": {
65
+ "minimatch": ">=10.2.1"
66
+ },
63
67
  "devDependencies": {
64
68
  "@types/node": "^20.10.0",
65
69
  "@types/react": "^19.2.9",
70
+ "@types/ws": "^8.5.10",
66
71
  "@vitest/coverage-v8": "^4.0.18",
67
72
  "ink-testing-library": "^4.0.0",
73
+ "node-pty": "^1.0.0",
74
+ "tsx": "^4.7.0",
68
75
  "typescript": "^5.3.0",
69
- "vitest": "^4.0.17"
76
+ "vitest": "^4.0.17",
77
+ "ws": "^8.16.0"
70
78
  },
71
79
  "engines": {
72
80
  "node": ">=18.0.0"
@@ -1,5 +1,5 @@
1
1
  ## Context
2
- Study @.ralph/AGENTS.md for commands and patterns.
2
+ If @.ralph/guides/AGENTS.md exists, study it for commands and patterns.
3
3
  Study @.ralph/specs/$FEATURE.md for feature specification.
4
4
  Study @.ralph/specs/$FEATURE-implementation-plan.md for current tasks.
5
5
  {{#if frameworkVariant}}For detailed architecture, see @{{appDir}}/.claude/CLAUDE.md{{/if}}
@@ -18,10 +18,10 @@ Key patterns: parallel fetches, direct imports, React.cache(), lazy loading.
18
18
  - Search codebase before assuming something doesn't exist
19
19
 
20
20
  ## Task
21
- Pick the next incomplete task from the implementation plan.
21
+ Work through ALL incomplete tasks in the implementation plan in a single session.
22
22
  **Skip E2E tasks** (tasks starting with `E2E:`) - those are handled in a separate phase.
23
- Implement it following the patterns in AGENTS.md.
24
- Write tests for the implementation.
23
+ For each task: implement it, write tests, validate, commit, then move to the next task.
24
+ Do not stop after one task — keep going until all non-E2E tasks are complete.
25
25
 
26
26
  ## Validation
27
27
  After changes, ALL must pass:
@@ -36,7 +36,8 @@ If any validation fails, fix the issue before proceeding.
36
36
  Before committing, review your changes against @.ralph/guides/SECURITY.md:
37
37
  1. **Quick scan**: Input validation, injection prevention, auth checks, data exposure
38
38
  2. **Run**: `cd {{appDir}} && {{packageManager}} audit` (check for vulnerable dependencies)
39
- 3. **Check**: `mcp__supabase__get_advisors` with type "security" (RLS policies)
39
+ {{#if hasSupabase}}3. **Check**: `mcp__supabase__get_advisors` with type "security" (RLS policies)
40
+ {{/if}}
40
41
  4. **Red team**: Can auth be bypassed? Can other users' data be accessed?
41
42
 
42
43
  Flag any security issues in the implementation plan and fix before committing.
@@ -54,7 +55,7 @@ If any check fails, fix before committing.
54
55
 
55
56
  ## Completion
56
57
  When ALL validations pass:
57
- 1. Update @.ralph/specs/$FEATURE-implementation-plan.md - mark task done with commit hash
58
+ 1. Update @.ralph/specs/$FEATURE-implementation-plan.md: change the task's `- [ ]` to `- [x]` and append the commit hash (e.g., `- [x] Task description - abc1234`). The harness tracks progress by counting checkboxes, so this step is mandatory.
58
59
  2. `git -C {{appDir}} add -A`
59
60
  3. `git -C {{appDir}} commit -m "type(scope): description"`
60
61
  4. `git -C {{appDir}} push origin feat/$FEATURE`
@@ -69,9 +70,11 @@ If this iteration revealed something useful, append to @.ralph/LEARNINGS.md:
69
70
  Format: `- [YYYY-MM-DD] [$FEATURE] Brief description`
70
71
 
71
72
  ## Rules
72
- - One task per iteration
73
+ - Complete ALL remaining non-E2E tasks before ending the session
74
+ - Commit after each task so progress is preserved if the session is interrupted
73
75
  - Tests are mandatory - no task is complete without tests
74
76
  - Search codebase before assuming something doesn't exist
75
- - If blocked, document in implementation plan and move to next task
76
- - Use Supabase MCP for database operations
77
- - Use PostHog MCP for analytics queries
77
+ - If blocked on a task, document in implementation plan and move to the next task
78
+ {{#if hasSupabase}}- Use Supabase MCP for database operations
79
+ {{/if}}{{#if hasPosthog}}- Use PostHog MCP for analytics queries
80
+ {{/if}}
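The template above states that the harness tracks progress by counting checkboxes. The harness logic itself is not part of this hunk, so the following is only a hedged sketch of how such counting could work; `countCheckboxes` and the sample plan are hypothetical.

```javascript
// Hypothetical helper: counts "- [x]" and "- [ ]" task lines in a plan.
// This is an illustration, not the harness's actual implementation.
function countCheckboxes(planText) {
  let done = 0;
  let pending = 0;
  for (const line of planText.split('\n')) {
    if (/^\s*- \[[xX]\] /.test(line)) done += 1;
    else if (/^\s*- \[ \] /.test(line)) pending += 1;
  }
  return { done, pending };
}

// Hypothetical implementation plan excerpt.
const plan = [
  '### Phase 1: Setup',
  '- [x] Add config loader - abc1234',
  '- [ ] Add config tests',
  '#### Task 3: heading-style task (not counted)',
  '- [ ] E2E: Init in bare project',
].join('\n');

const progress = countCheckboxes(plan);
console.log(progress); // { done: 1, pending: 2 }
```

A counter of this kind ignores heading-style tasks entirely, which is one plausible reason the templates insist on `- [ ]` syntax for every task.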
@@ -1,5 +1,5 @@
1
1
  ## Context
2
- Study @.ralph/AGENTS.md for commands and patterns.
2
+ If @.ralph/guides/AGENTS.md exists, study it for commands and patterns.
3
3
  Study @.ralph/specs/$FEATURE.md for feature specification.
4
4
  Study @.ralph/specs/$FEATURE-implementation-plan.md for E2E test scenarios.
5
5
  {{#if frameworkVariant}}For detailed architecture, see @{{appDir}}/.claude/CLAUDE.md{{/if}}
@@ -12,12 +12,164 @@ Pay special attention to "E2E Pitfalls" section to avoid known issues.
12
12
  Before E2E testing, verify:
13
13
  1. Build passes: `cd {{appDir}} && {{buildCommand}}`
14
14
  2. All unit tests pass: `cd {{appDir}} && {{testCommand}}`
15
+ {{#unless isTui}}
15
16
  3. Clear cache if issues: `rm -rf {{appDir}}/.next`
17
+ {{/unless}}
16
18
 
17
19
  If either fails, fix issues before proceeding with E2E tests.
18
20
 
21
+ {{#if isTui}}
22
+ ## Task
23
+ Execute automated E2E tests for the completed TUI feature using the xterm.js bridge and agent-browser.
24
+ Run ALL scenarios in a single session — do not end between scenarios.
25
+
26
+ ### Step 1: Start Bridge
27
+ Check the bridge is running:
28
+ ```bash
29
+ curl -s http://localhost:3999/health || (cd {{projectRoot}} && npm run e2e:bridge &)
30
+ sleep 3
31
+ ```
32
+
33
+ ### Step 2: Parse E2E Test Scenarios
34
+ Read E2E test scenarios from @.ralph/specs/$FEATURE-implementation-plan.md.
35
+ Each scenario is marked with `- [ ] E2E:` prefix and follows this format:
36
+
37
+ ```
38
+ - [ ] E2E: [Scenario name]
39
+ - **Command:** [wiggum command, e.g., init, new auth-flow]
40
+ - **CWD:** [working directory, e.g., e2e/fixtures/bare-project]
41
+ - **Steps:**
42
+ 1. [Action] -> [expected terminal output]
43
+ - **Verify:** [text that should appear in terminal]
44
+ ```
45
+
46
+ ### Step 3: Execute Each Scenario
47
+
48
+ For each scenario:
49
+
50
+ 1. **Open the TUI in the bridge:**
51
+ ```bash
52
+ agent-browser open "http://localhost:3999?cmd=<command>&cwd=<path>"
53
+ ```
54
+
55
+ 2. **Wait for terminal ready:**
56
+ ```bash
57
+ agent-browser wait --text "Ready" --source title --timeout 10000
58
+ ```
59
+
60
+ 3. **Read terminal content:**
61
+ ```bash
62
+ agent-browser eval "document.getElementById('terminal-mirror').textContent"
63
+ ```
64
+
65
+ 4. **Interact with TUI:**
66
+ ```bash
67
+ # Click to focus terminal
68
+ agent-browser snapshot -i
69
+ agent-browser click @<terminal-ref>
70
+
71
+ # Type text (Enter = press Enter after)
72
+ agent-browser type @<terminal-ref> "/help"
73
+ agent-browser key Enter
74
+
75
+ # Arrow navigation
76
+ agent-browser key ArrowDown
77
+ agent-browser key ArrowDown
78
+ agent-browser key Enter
79
+
80
+ # Escape
81
+ agent-browser key Escape
82
+ ```
83
+
84
+ 5. **Assert expected output:**
85
+ ```bash
86
+ # Read terminal and check for expected text
87
+ CONTENT=$(agent-browser eval "document.getElementById('terminal-mirror').textContent")
88
+ # Verify CONTENT contains expected strings
89
+ ```
90
+
91
+ 6. **Reset between scenarios:**
92
+ ```bash
93
+ agent-browser close
94
+ ```
95
+
96
+ ### TUI Interaction Cheatsheet
97
+
98
+ | Action | Command |
99
+ |--------|---------|
100
+ | Open TUI | `agent-browser open "http://localhost:3999?cmd=init&cwd=/path"` |
101
+ | Read screen | `agent-browser eval "document.getElementById('terminal-mirror').textContent"` |
102
+ | Take snapshot | `agent-browser snapshot -i` |
103
+ | Click element | `agent-browser click @ref` |
104
+ | Type text | `agent-browser type @ref "text"` |
105
+ | Press Enter | `agent-browser key Enter` |
106
+ | Arrow down | `agent-browser key ArrowDown` |
107
+ | Escape | `agent-browser key Escape` |
108
+ | Screenshot | `agent-browser screenshot e2e-failure.png` |
109
+ | Wait for text | `agent-browser wait --text "expected" --timeout 10000` |
110
+ | Close session | `agent-browser close` |
111
+
112
+ ### Key Rules
113
+ - Always wait for expected text before asserting (TUI renders async via React)
114
+ - Use `agent-browser eval` with `terminal-mirror` for reliable text reading
115
+ - Take screenshots on failures for debugging
116
+ - Each scenario navigates to a fresh URL (clean state)
117
+ - Wait 500ms after key presses before reading (Ink re-render delay)
118
+
119
+ ### Step 4: Report Results
120
+ Update @.ralph/specs/$FEATURE-implementation-plan.md for each scenario:
121
+
122
+ **Passed:**
123
+ ```markdown
124
+ - [x] E2E: scenario name - PASSED
125
+ ```
126
+
127
+ **Failed:**
128
+ ```markdown
129
+ - [ ] E2E: scenario name - FAILED: [brief reason]
130
+ - Error: [what went wrong]
131
+ - Screenshot: [if captured]
132
+ - Fix needed: [suggested action]
133
+ ```
134
+
135
+ ## Error Recovery
136
+
137
+ If a scenario fails:
138
+ 1. Document the failure with specific error details
139
+ 2. Take a screenshot: `agent-browser screenshot e2e-failure-<scenario>.png`
140
+ 3. Note what fix is likely needed (code bug vs test spec issue)
141
+ 4. Continue with remaining scenarios
142
+ 5. At end, summary shows total passed/failed
143
+
144
+ Failures will trigger a fix iteration in the loop.
145
+
146
+ ## Completion
147
+
148
+ When all scenarios are executed:
149
+ 1. Update implementation plan with results for each scenario
150
+ 2. Update the Implementation Summary status to `[PASSED]` if all passed
151
+ 3. **Commit the updated implementation plan:**
152
+ ```bash
153
+ git -C {{appDir}} add -A && git -C {{appDir}} commit -m "test($FEATURE): E2E tests passed via agent-browser"
154
+ ```
155
+ 4. **Push to remote:**
156
+ ```bash
157
+ git -C {{appDir}} push origin feat/$FEATURE
158
+ ```
159
+ 5. If all passed: signal ready for PR phase
160
+ 6. If any failed: failures documented, loop will retry after fix iteration
161
+
162
+ ## Learning Capture
163
+ If E2E testing revealed issues worth remembering, append to @.ralph/LEARNINGS.md:
164
+ - Flaky test patterns -> Add under "## Anti-Patterns" > "E2E Pitfalls"
165
+ - TUI timing issues -> Add under "## Anti-Patterns"
166
+ - Useful agent-browser techniques -> Add under "## Tool Usage"
167
+
168
+ Format: `- [YYYY-MM-DD] [$FEATURE] Brief description`
169
+ {{else}}
19
170
  ## Task
20
171
  Execute automated E2E tests for the completed feature using Playwright MCP tools.
172
+ Run ALL scenarios in a single session — do not end between scenarios.
21
173
 
22
174
  ### Step 1: Check Dev Server
23
175
  Verify dev server is running at http://localhost:3000. If not accessible, start it:
@@ -26,7 +178,7 @@ cd {{appDir}} && {{devCommand}} &
26
178
  ```
27
179
  Wait ~10 seconds for server startup, then verify with a simple browser_navigate.
28
180
 
29
- ### Step 1.5: Seed Test Data (if needed)
181
+ {{#if hasSupabase}}### Step 1.5: Seed Test Data (if needed)
30
182
 
31
183
  Check if test scenarios require specific data volumes (e.g., pagination needs >10 rows).
32
184
 
@@ -49,6 +201,7 @@ query: "DELETE FROM table_name WHERE data->>'_test' = 'true';"
49
201
  ```
50
202
 
51
203
  **If seeding is impractical:** Document in implementation plan that E2E was skipped but unit tests provide coverage.
204
+ {{/if}}
52
205
 
53
206
  ### Step 2: Parse E2E Test Scenarios
54
207
  Read E2E test scenarios from @.ralph/specs/$FEATURE-implementation-plan.md.
@@ -96,7 +249,7 @@ For each E2E test scenario:
96
249
  - Check console: `browser_console_messages` for JS errors
97
250
  - Document failure details
98
251
 
99
- ### Step 4: Database Verification
252
+ {{#if hasSupabase}}### Step 4: Database Verification
100
253
  For scenarios with database checks, use Supabase MCP:
101
254
  ```
102
255
  mcp__plugin_supabase_supabase__execute_sql
@@ -105,6 +258,7 @@ query: "SELECT * FROM survey_responses WHERE ..."
105
258
  ```
106
259
 
107
260
  Verify returned data matches expected state.
261
+ {{/if}}
108
262
 
109
263
  ### Unique Test Data (for Parallel Execution)
110
264
  When creating test data, use unique identifiers to avoid conflicts with other loops:
@@ -165,11 +319,12 @@ Update @.ralph/specs/$FEATURE-implementation-plan.md for each scenario:
165
319
  2. Verify URL contains expected path/params
166
320
  ```
167
321
 
168
- ### Database State
322
+ {{#if hasSupabase}}### Database State
169
323
  ```
170
324
  1. mcp__plugin_supabase_supabase__execute_sql with SELECT query
171
325
  2. Verify row count, column values match expectations
172
326
  ```
327
+ {{/if}}
173
328
 
174
329
  ## Browser State Management
175
330
 
@@ -214,8 +369,9 @@ If code changes don't appear in the browser:
214
369
 
215
370
  ### Stale Data
216
371
  - Clear browser storage: Use `browser_close` between scenarios
217
- - Check Supabase for stale test data from previous runs
372
+ {{#if hasSupabase}}- Check Supabase for stale test data from previous runs
218
373
  - Delete test data: `DELETE FROM table WHERE data->>'_test' = 'true'`
374
+ {{/if}}
219
375
 
220
376
  ## Rules
221
377
  - Always get a fresh `browser_snapshot` after actions before making assertions
@@ -232,3 +388,4 @@ If E2E testing revealed issues worth remembering, append to @.ralph/LEARNINGS.md
232
388
  - Timing issues or race conditions -> Add under "## Anti-Patterns"
233
389
 
234
390
  Format: `- [YYYY-MM-DD] [$FEATURE] Brief description`
391
+ {{/if}}
@@ -1,5 +1,5 @@
1
1
  ## Context
2
- Study @.ralph/AGENTS.md for commands and patterns.
2
+ If @.ralph/guides/AGENTS.md exists, study it for commands and patterns.
3
3
  Study @.ralph/specs/README.md for spec structure.
4
4
  Study @.ralph/specs/$FEATURE.md for feature specification.
5
5
  {{#if frameworkVariant}}For detailed architecture, see @{{appDir}}/.claude/CLAUDE.md{{/if}}
@@ -44,6 +44,28 @@ Study @.ralph/specs/$FEATURE.md for feature specification.
44
44
  - [ ] Task N (additional polish)
45
45
 
46
46
  ### Phase 5: E2E Testing
47
+ {{#if isTui}}
48
+ TUI E2E tests executed via xterm.js bridge + agent-browser.
49
+ Fixture projects in `e2e/fixtures/`. Bridge at `http://localhost:3999`.
50
+
51
+ - [ ] E2E: [Scenario name] - [brief description]
52
+ - **Command:** [wiggum command, e.g., init, new auth-flow]
53
+ - **CWD:** [working directory, e.g., e2e/fixtures/bare-project]
54
+ - **Steps:**
55
+ 1. [Action] -> [expected terminal output]
56
+ 2. [Action] -> [expected terminal output]
57
+ - **Verify:** [text that should appear in terminal]
58
+
59
+ Example TUI E2E scenario:
60
+ - [ ] E2E: Init in bare project - happy path
61
+ - **Command:** init
62
+ - **CWD:** e2e/fixtures/bare-project
63
+ - **Steps:**
64
+ 1. Open bridge with init command -> Welcome screen renders
65
+ 2. Arrow down to select option -> Option highlighted
66
+ 3. Press Enter -> Next screen appears
67
+ - **Verify:** "initialized" text visible in terminal
68
+ {{else}}
47
69
  Browser-based tests executed via Playwright MCP tools.
48
70
 
49
71
  - [ ] E2E: [Scenario name] - [brief description]
@@ -67,15 +89,29 @@ Example E2E scenario:
67
89
  5. Wait for "Thank You!" -> Success card displays
68
90
  - **Verify:** "successfully submitted" text visible
69
91
  - **Database check:** SELECT * FROM survey_responses WHERE survey_id = '{surveyId}'
92
+ {{/if}}
70
93
 
71
94
  ## Done
72
95
  - [x] Completed task - [commit hash]
73
96
  - [x] E2E: Scenario name - PASSED
74
97
  ```
75
98
 
99
+ ## CRITICAL CONSTRAINT — PLANNING ONLY
100
+ **You are in the PLANNING phase. Your ONLY job is to produce an implementation plan.**
101
+ - Do NOT write any source code, test code, or configuration files
102
+ - Do NOT create, modify, or touch any files outside `.ralph/specs/`
103
+ - Do NOT run build, test, or lint commands
104
+ - Do NOT make git commits
105
+ - If you feel the urge to "just implement a small piece", STOP — that is a phase violation
106
+ - The implementation phase runs AFTER this session ends, in a separate session
107
+ - Violation of this constraint wastes tokens and breaks the harness automation
108
+
76
109
  ## Rules
77
- - Plan only in this phase, do NOT implement
78
- - One task = one commit-sized unit of work
110
+ - You MUST use `- [ ]` checkbox syntax for every task in the plan
111
+ - Do NOT use heading-based task formats (e.g., `#### Task 1:`) for individual tasks
112
+ - The harness parses `- [ ]` lines to track progress — other formats will break automation
113
+ - Use `### Phase N:` headings only for phase grouping, not for individual tasks
114
+ - One task = one commit-sized unit of work (but tasks can be grouped into phases for batch implementation)
79
115
  - Every implementation task needs a corresponding test task
80
116
  - Use Supabase MCP to check existing schema
81
117
  - Use PostHog MCP to check existing analytics setup
@@ -1,5 +1,5 @@
1
1
  ## Context
2
- Study @.ralph/AGENTS.md for commands and patterns.
2
+ If @.ralph/guides/AGENTS.md exists, study it for commands and patterns.
3
3
  Study @.ralph/specs/$FEATURE.md for feature specification.
4
4
  Study @.ralph/specs/$FEATURE-implementation-plan.md for completed tasks.
5
5
 
@@ -9,6 +9,7 @@ Capture any review feedback patterns for future iterations.
9
9
 
10
10
  ## Task
11
11
  All implementation and E2E tasks are complete. Create PR and request review.
12
+ Complete ALL steps in a single pass — do not end the session between steps.
12
13
 
13
14
  ### Step 1: Verify Ready State
14
15
  1. Check all tasks are complete in implementation plan (no `- [ ]` items)
@@ -52,12 +53,14 @@ cd {{appDir}} && gh pr create --base main --head feat/$FEATURE \
52
53
  [Read from implementation plan - list completed phases]
53
54
 
54
55
  ## Testing
55
- - [x] Unit/integration tests: 97 passing
56
- - [x] E2E tests: All scenarios passed via Playwright MCP
56
+ - [x] Unit/integration tests: passing
57
+ - [x] E2E tests: All scenarios passed
57
58
  - [x] Build succeeds
58
59
 
59
60
  ## E2E Test Results
60
- [Copy from implementation plan Phase 9]
61
+ [Copy from implementation plan E2E section]
62
+
63
+ Closes #[Read the source issue number from the spec file metadata or context section]
61
64
 
62
65
  Generated with Claude Code
63
66
  EOF
@@ -88,14 +91,29 @@ Review the git diff against main and check:
88
91
 
89
92
  Run: git diff main
90
93
 
91
- Respond with:
92
- - APPROVED if everything looks good
93
- - Or list specific issues with file:line references that need to be fixed"
94
+ Then:
95
+ 1. Post your complete review as a comment on the PR using:
96
+ gh pr comment --body '<your review in markdown>'
97
+ Format the comment with: a summary, specific issues with file:line refs (if any), and the verdict.
98
+ 2. Print your final verdict as the LAST line of stdout. Print exactly one of:
99
+ VERDICT: APPROVED
100
+ VERDICT: NOT APPROVED
101
+ This line is parsed by the automation — do not omit it."
94
102
  fi
95
103
  ```
96
104
 
105
+ After the review completes, check its output:
106
+ - If it contains "VERDICT: APPROVED", echo that line so the automation detects it:
107
+ ```bash
108
+ echo "VERDICT: APPROVED"
109
+ ```
110
+ - If issues were found, echo:
111
+ ```bash
112
+ echo "VERDICT: NOT APPROVED"
113
+ ```
114
+
97
115
  **Handle review feedback:**
98
- - If Claude outputs "APPROVED" -> Done. The PR is ready for manual merge by the user.
116
+ - If Claude outputs "VERDICT: APPROVED" -> Done. The PR is ready for manual merge by the user.
99
117
  - If Claude lists issues:
100
118
  1. Address each issue with code fixes
101
119
  2. Commit: `git -C {{appDir}} add -A && git -C {{appDir}} commit -m "fix($FEATURE): address review feedback"`
@@ -108,8 +126,15 @@ After review is complete (approved or max iterations reached):
108
126
  1. Post a summary comment on the PR with the review outcome
109
127
  2. Do NOT merge — the user will review and merge manually
110
128
 
129
+ ## IMPORTANT: Review scope
130
+ If you discover that no implementation code exists (empty diff, no source files changed),
131
+ do NOT implement the feature yourself. Instead, report "VERDICT: NOT APPROVED —
132
+ no implementation found" so the harness can trigger a new implementation iteration.
133
+
111
134
  ## Rules
135
+ - **NEVER approve if any tests are failing.** Output "VERDICT: NOT APPROVED — test failures" if any tests fail.
112
136
  - Do NOT merge the PR — auto mode only reviews, the user merges
137
+ - Do NOT implement missing features — only review and fix minor issues
113
138
  - Address ALL review comments before marking as approved
114
139
  - If gh CLI fails, check authentication: `gh auth status`
115
140
  - Keep review conversation focused and professional
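The review template now requires a machine-readable verdict on the last line of stdout. The parsing code is not shown in this hunk, so the sketch below is an assumption about how an automation could read that line; `parseVerdict` and its return values are hypothetical names.

```javascript
// Hypothetical parser for the verdict line described in the template.
// Looks only at the last non-empty line of the captured output.
function parseVerdict(stdout) {
  const lines = stdout
    .split('\n')
    .map((l) => l.trim())
    .filter((l) => l.length > 0);
  const last = lines.length > 0 ? lines[lines.length - 1] : '';
  // "NOT APPROVED" may carry a reason suffix, so match by prefix.
  if (last.startsWith('VERDICT: NOT APPROVED')) return 'not_approved';
  if (last === 'VERDICT: APPROVED') return 'approved';
  return 'unknown';
}

// Hypothetical review outputs.
const approved = parseVerdict('Summary: looks good\nVERDICT: APPROVED\n');
const rejected = parseVerdict('Issue at src/a.ts:10\nVERDICT: NOT APPROVED\n');
const missing = parseVerdict('Review finished without a verdict line\n');

console.log(approved, rejected, missing); // approved not_approved unknown
```

Treating a missing verdict as `unknown` rather than as approval is a deliberately conservative choice in this sketch.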
@@ -1,5 +1,5 @@
1
1
  ## Context
2
- Study @.ralph/AGENTS.md for commands and patterns.
2
+ If @.ralph/guides/AGENTS.md exists, study it for commands and patterns.
3
3
  Study @.ralph/specs/$FEATURE.md for feature specification.
4
4
  Study @.ralph/specs/$FEATURE-implementation-plan.md for completed tasks.
5
5
 
@@ -9,6 +9,7 @@ Capture any review feedback patterns for future iterations.
9
9
 
10
10
  ## Task
11
11
  All implementation and E2E tasks are complete. Create PR for manual review.
12
+ Complete ALL steps in a single pass — do not end the session between steps.
12
13
 
13
14
  ### Step 1: Verify Ready State
14
15
  1. Check all tasks are complete in implementation plan (no `- [ ]` items)
@@ -59,6 +60,8 @@ cd {{appDir}} && gh pr create --base main --head feat/$FEATURE \
59
60
  ## E2E Test Results
60
61
  [Copy from implementation plan if E2E phase exists]
61
62
 
63
+ Closes #[Read the source issue number from the spec file metadata or context section]
64
+
62
65
  Generated with Claude Code
63
66
  EOF
64
67
  )"