@pageai/ralph-loop 1.0.1 → 1.2.0

package/.agent/PROMPT.md CHANGED
@@ -1,21 +1,19 @@
1
- @.agent/tasks.json
2
-
3
1
  ## Overview
4
2
 
5
3
  You are implementing the project described in @.agent/prd/SUMMARY.md
6
4
 
7
- Tasks are listed in @.agent/tasks.json
8
-
9
5
  ## Required Setup
10
6
 
11
- Run `npm run dev` (as background process).
12
- App will be running at http://localhost:6006
7
+ Run `npm run dev` (as background process) in `src` directory.
8
+ App will be running at http://localhost:3000
13
9
 
14
10
  ## Before Starting
15
11
 
16
12
  Check @.agent/STEERING.md for critical work. Complete items in sequence, remove when done. Only proceed to implement tasks if no critical work pending.
17
13
 
18
- ## Task Flow (ONE TASK ONLY)
14
+ ## Task Flow
15
+
16
+ Tasks are listed in @.agent/tasks.json
19
17
 
20
18
  1. Pick highest-priority task with `passes: false` in `tasks.json`
21
19
  2. Read full spec: `.agent/tasks/TASK-${ID}.json`
@@ -31,14 +29,15 @@ Check @.agent/STEERING.md for critical work. Complete items in sequence, remove
31
29
  8. All tests must pass. Broke unrelated test? Fix it before proceeding.
32
30
  9. When tests pass, set `passes: true` in `tasks.json` for the task you completed.
33
31
  10. Log entry → `.agent/logs/LOG.md` (date, brief summary, screenshot path)
34
- 11. Update `.agent/STRUCTURE.md` if dirs changed
32
+ 11. Update `.agent/STRUCTURE.md` if dirs changed. Exclude dotfiles, tests and config.
35
33
  12. Commit changes, using the Conventional Commit format.
36
34
 
37
35
  ## Rules
38
36
 
37
+ - **IMPORTANT**: only work on one task at a time and **exit closing all background processes**. **DO NOT** start another task.
39
38
  - No git init/remote changes. **No git push**.
40
39
  - Check the last 5 tasks in `.agent/logs/LOG.md` for past work
41
- - When ALL tasks pass → output `<promise>COMPLETE</promise>` and **nothing else**.
40
+ - **CRITICAL**: When ALL tasks pass → output `<promise>COMPLETE</promise>` and **nothing else**.
42
41
 
43
42
  ## Help Tags
44
43
 
@@ -112,8 +112,6 @@ Include the following sections:
112
112
  - Security considerations
113
113
  - Development phases/milestones
114
114
  - Assumptions and dependencies
115
- - Potential challenges and solutions
116
- - Future expansion possibilities
117
115
 
118
116
  Save as: `PROJECT_ROOT/.agent/prd/PRD.md`
119
117
 
package/README.md CHANGED
@@ -1,56 +1,33 @@
1
- # Ralph Loop
1
+ # A Ralph Wiggum Loop implementation that works™
2
2
 
3
- A long-running AI agent loop. Ralph automates software development tasks by iteratively working through a task list until completion.
3
+ Ralph is a long-running AI agent loop. Ralph automates software development tasks by iteratively working through a task list until completion.
4
4
 
5
- This is a hackable script so you can configure it to your env and favorite agentic AI CLI. It's set up by default to use Claude Code in a Docker sandbox.
5
+ This is an implementation that actually works, containing a hackable script so you can configure it to your env and favorite agentic AI CLI. It's set up by default to use Claude Code in a Docker sandbox.
6
6
 
7
7
  ![Ralph Wiggum Loop](https://github.com/user-attachments/assets/052d5290-7e83-4bfb-a6b5-6be761cbe890)
8
8
 
9
-
10
- ## Quick Start
11
-
12
- ```bash
13
- # Run the agent loop (default: 10 iterations)
14
- ./ralph.sh
15
-
16
- # Run with custom iteration limit
17
- ./ralph.sh 5
18
- ./ralph.sh -n 5
19
- ./ralph.sh --max-iterations 5
20
-
21
- # Run exactly one iteration
22
- ./ralph.sh --once
23
-
24
- # Show help
25
- ./ralph.sh --help
26
- ```
27
-
28
- > NB: you might need to run `chmod +x ralph.sh` to make the script executable.
29
-
30
- ## How It Works
31
-
32
- Each iteration, Ralph will:
33
- 1. Find the highest-priority incomplete task from `.agent/tasks.json`
34
- 2. Work through the task steps defined in `.agent/tasks/TASK-{ID}.json`
35
- 3. Run tests, linting, and type checking
36
- 4. Update task status and commit changes
37
- 5. Repeat until all tasks pass or max iterations reached
38
-
39
- ## How Is This Different from Other Ralphs?
40
-
41
- This was kept hackable so you can make it your own.<br/>
42
- The script follows the original concepts of the Ralph Wiggum Loop, working with fresh contexts and providing clear verifiable feedback.
43
-
44
- It also works generically with any task set.
45
-
46
- Besides that:
47
-
48
- - it allows you to dump unstructured requirements and have the agent create a PRD and task list for you.
49
- - it uses a task lookup table with individual detailed steps -> more scalable as you get 100s of tasks done.
50
- - it's sandboxed and more secure
51
- - it shows progress and stats so you can keep an eye on what's been done
52
- - it instructs the agent to write and run automated tests and screenshots per task
53
- - it provides observability and traceability of the agent's work, showing a stream of output and capturing full historical logs per iteration
9
+ - [Getting Started](#getting-started)
10
+ - [Step 1: Install Ralph](#step-1-install-ralph)
11
+ - [Step 2: Create a PRD + task list](#step-2-create-a-prd--task-list)
12
+ - [Step 3: Set up the agent inside Docker sandbox](#step-3-set-up-the-agent-inside-docker-sandbox)
13
+ - [Step 4: Run Ralph](#step-4-run-ralph)
14
+ - [(optional) Adjusting to your language/framework](#optional-adjusting-to-your-languageframework)
15
+ - [Run the loop](#run-the-loop)
16
+ - [How It Works](#how-it-works)
17
+ - [How Is This Different from Other Ralphs?](#how-is-this-different-from-other-ralphs)
18
+ - [Steering the Agent](#steering-the-agent)
19
+ - [Features](#features)
20
+ - [Support](#support)
21
+ - [Promise Tags](#promise-tags)
22
+ - [Exit Codes](#exit-codes)
23
+ - [Structure](#structure)
24
+ - [Skills](#skills)
25
+ - [Available Skills](#available-skills)
26
+ - [Skills Directory Structure](#skills-directory-structure)
27
+ - [Reference](#reference)
28
+ - [Playwright configuration](#playwright-configuration)
29
+ - [Vitest configuration](#vitest-configuration)
30
+ - [License](#license)
54
31
 
55
32
  ## Getting Started
56
33
 
@@ -94,7 +71,7 @@ Then follow the Skill's instructions and verify the PRD and then tasks.<br/>
94
71
  Authenticate inside the Docker sandbox before running Ralph. Run:
95
72
 
96
73
  ```bash
97
- docker sandbox run --credentials host claude
74
+ docker sandbox run claude .
98
75
  ```
99
76
 
100
77
  And follow the instructions to log in into Claude Code.
@@ -119,9 +96,9 @@ This script assumes the following are installed:
119
96
  I recommend using a CLI to bootstrap your project with the necessary tools and dependencies, e.g.:
120
97
 
121
98
  ```bash
122
- npx create-vite@latest my-app --template react-ts
99
+ npx create-vite@latest src --template react-ts
123
100
  # or
124
- npx create-next-app@latest my-app
101
+ npx create-next-app@latest src
125
102
  ```
126
103
 
127
104
  If you must start from a blank slate, which is not recommended, you can use the following commands to install the necessary tools and dependencies:
@@ -141,6 +118,51 @@ npm i @vitejs/plugin-react @testing-library/dom @testing-library/jest-dom @testi
141
118
 
142
119
  ⚠️ The default "mode" is "implementation". Depending on your use case, you might want to change `.agent/PROMPT.md` to a different mode, e.g. "refactor", "review", "test" etc.
143
120
 
121
+ ## Run the loop
122
+
123
+ ```bash
124
+ # Run the agent loop (default: 10 iterations)
125
+ ./ralph.sh
126
+
127
+ # Run with custom iteration limit
128
+ ./ralph.sh 5
129
+ ./ralph.sh -n 5
130
+ ./ralph.sh --max-iterations 5
131
+
132
+ # Run exactly one iteration
133
+ ./ralph.sh --once
134
+
135
+ # Show help
136
+ ./ralph.sh --help
137
+ ```
138
+
139
+ > NB: you might need to run `chmod +x ralph.sh` to make the script executable.
140
+
141
+ ## How It Works
142
+
143
+ Each iteration, Ralph will:
144
+ 1. Find the highest-priority incomplete task from `.agent/tasks.json`
145
+ 2. Work through the task steps defined in `.agent/tasks/TASK-{ID}.json`
146
+ 3. Run tests, linting, and type checking
147
+ 4. Update task status and commit changes
148
+ 5. Repeat until all tasks pass or max iterations reached
149
+
150
+ ## How Is This Different from Other Ralphs?
151
+
152
+ This was kept hackable so you can make it your own.<br/>
153
+ The script follows the original concepts of the Ralph Wiggum Loop, working with fresh contexts and providing clear verifiable feedback.
154
+
155
+ It also works generically with any task set.
156
+
157
+ Besides that:
158
+
159
+ - it allows you to dump unstructured requirements and have the agent create a PRD and task list for you.
160
+ - it uses a task lookup table with individual detailed steps -> more scalable as you get 100s of tasks done.
161
+ - it's sandboxed and more secure
162
+ - it shows progress and stats so you can keep an eye on what's been done
163
+ - it instructs the agent to write and run automated tests and screenshots per task
164
+ - it provides observability and traceability of the agent's work, showing a stream of output and capturing full historical logs per iteration
165
+
144
166
  ## Steering the Agent
145
167
 
146
168
  In some cases, you might notice the agent is having trouble, slowed down or struggling to overcome a blocker.
@@ -243,6 +265,107 @@ Skills are symlinked from `.agent/skills/` to multiple locations for cross-tool
243
265
  .cursor/skills/
244
266
  ```
245
267
 
268
+
269
+ ## Reference
270
+ ### Playwright configuration
271
+
272
+ If you are using Playwright, here is a recommended configuration:
273
+
274
+ ```typescript:playwright.config.ts
275
+ import { defineConfig, devices } from '@playwright/test';
276
+
277
+ /**
278
+ * See https://playwright.dev/docs/test-configuration.
279
+ */
280
+ export default defineConfig({
281
+ testDir: './tests',
282
+ /* Total timeout for the entire test run (30 minutes) */
283
+ globalTimeout: 30 * 60 * 1000,
284
+ /* Run tests in files in parallel */
285
+ fullyParallel: true,
286
+ /* Fail the build on CI if you accidentally left test.only in the source code. */
287
+ forbidOnly: !!process.env.CI,
288
+ /* Retry on failure - 2 on CI, 1 locally */
289
+ retries: process.env.CI ? 2 : 1,
290
+ /* Number of parallel workers */
291
+ workers: process.env.CI ? 1 : 6,
292
+ /* Reporter to use. See https://playwright.dev/docs/test-reporters */
293
+ reporter: 'html',
294
+ /* Shared settings for all the projects below. See https://playwright.dev/docs/api/class-testoptions. */
295
+ use: {
296
+ /* Base URL to use in actions like `await page.goto('')`. */
297
+ baseURL: 'http://localhost:3000',
298
+
299
+ /* Collect trace when retrying the failed test. See https://playwright.dev/docs/trace-viewer */
300
+ trace: 'on-first-retry',
301
+ },
302
+
303
+ // NB: Only test in Desktop Chrome and nothing else.
304
+ projects: [
305
+ {
306
+ name: 'chromium',
307
+ use: { ...devices['Desktop Chrome'] },
308
+ }
309
+ ],
310
+ });
311
+ ```
312
+
313
+ ### Vitest configuration
314
+
315
+ If you are using Vitest, here is a recommended configuration:
316
+
317
+ ```typescript:vitest.config.ts
318
+ import { defineConfig } from 'vitest/config'
319
+ import react from '@vitejs/plugin-react'
320
+ import path from 'path'
321
+
322
+ export default defineConfig({
323
+ plugins: [react()],
324
+ test: {
325
+ environment: 'jsdom',
326
+ globals: true,
327
+ setupFiles: ['./vitest.setup.ts'],
328
+ include: ['**/*.test.{ts,tsx}'],
329
+ exclude: ['node_modules', '.next', 'tests'],
330
+ },
331
+ resolve: {
332
+ alias: {
333
+ '@': path.resolve(__dirname, './'),
334
+ },
335
+ },
336
+ })
337
+
338
+ ```
339
+
340
+ And:
341
+
342
+ ```typescript:vitest.setup.ts
343
+ import '@testing-library/jest-dom/vitest'
344
+ import { vi } from 'vitest'
345
+ import React from 'react'
346
+
347
+ // Mock next/image
348
+ vi.mock('next/image', () => ({
349
+ default: ({ src, alt, ...props }: { src: string; alt: string }) => {
350
+ return React.createElement('img', { src, alt, ...props })
351
+ },
352
+ }))
353
+
354
+ // Mock next/link
355
+ vi.mock('next/link', () => ({
356
+ default: ({
357
+ children,
358
+ href,
359
+ ...props
360
+ }: {
361
+ children: React.ReactNode
362
+ href: string
363
+ }) => {
364
+ return React.createElement('a', { href, ...props }, children)
365
+ },
366
+ }))
367
+ ```
368
+
246
369
  ## License
247
370
 
248
371
  MIT
package/bin/cli.js CHANGED
@@ -8,11 +8,34 @@
8
8
 
9
9
  const fs = require('fs');
10
10
  const path = require('path');
11
+ const { execSync } = require('child_process');
11
12
  const display = require('./lib/display');
12
- const { copyFile, copyDir, mergeDir, exists } = require('./lib/copy');
13
+ const { copyFile, copyDir, mergeDir, exists, ensureDir } = require('./lib/copy');
13
14
 
14
15
  const PACKAGE_ROOT = path.resolve(__dirname, '..');
15
16
  const TARGET_DIR = process.cwd();
17
+ const DEFAULT_APP_DIR = 'src';
18
+
19
+ // Directories to ensure exist (created even if source doesn't exist)
20
+ const DIRS_TO_ENSURE = [
21
+ '.agent/history',
22
+ '.agents/skills',
23
+ '.claude/skills',
24
+ '.codex/skills',
25
+ '.cursor/skills',
26
+ ];
27
+
28
+ // Symlinks to create in skills directories (relative to each skills dir)
29
+ const SKILL_SYMLINKS = [
30
+ { name: 'component-refactoring', target: '../../.agent/skills/component-refactoring/' },
31
+ { name: 'e2e-tester', target: '../../.agent/skills/e2e-tester/' },
32
+ { name: 'frontend-code-review', target: '../../.agent/skills/frontend-code-review/' },
33
+ { name: 'frontend-testing', target: '../../.agent/skills/frontend-testing/' },
34
+ { name: 'prd-creator', target: '../../.agent/skills/prd-creator/' },
35
+ { name: 'skill-creator', target: '../../.agent/skills/skill-creator/' },
36
+ { name: 'vercel-react-best-practices', target: '../../.agent/skills/vercel-react-best-practices/' },
37
+ { name: 'web-design-guidelines', target: '../../.agent/skills/web-design-guidelines/' },
38
+ ];
16
39
 
17
40
  // Files to copy (always overwrite)
18
41
  const FILES_TO_COPY = [
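The `target` values in `SKILL_SYMLINKS` are relative paths. A relative symlink target is resolved against the directory that contains the link, not against the current working directory. The sketch below illustrates that resolution. The project root `/proj` and the helper name `resolveSkillTarget` are assumptions made for illustration and do not appear in the package.

```javascript
const path = require('path');

// Hypothetical helper: compute where a relative symlink target points.
// A relative target is resolved against the directory holding the link.
function resolveSkillTarget(skillsDir, target) {
  return path.posix.resolve(skillsDir, target);
}

// '/proj' is an assumed project root, used only for illustration.
const resolved = resolveSkillTarget(
  '/proj/.claude/skills',
  '../../.agent/skills/prd-creator/'
);
console.log(resolved); // /proj/.agent/skills/prd-creator
```

Because every entry in `DIRS_TO_ENSURE` that ends in `/skills` sits two levels below the project root, the same `../../.agent/skills/...` target works for each of them.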
@@ -27,7 +50,6 @@ const CONFIG_FILES = [
27
50
  '.agent/PROMPT.md',
28
51
  '.agent/STEERING.md',
29
52
  '.agent/tasks.json',
30
- '.agent/history/.gitignore',
31
53
  ];
32
54
 
33
55
  // Directories to fully copy (overwrite)
@@ -56,8 +78,11 @@ const EXTRA_DIR_FILES = [
56
78
  { dir: '.claude', exclude: ['settings.local.json', 'skills', 'agents', 'commands', 'hooks'] },
57
79
  ];
58
80
 
59
- function main() {
81
+ async function main() {
82
+ const clack = await import('@clack/prompts');
83
+
60
84
  display.showRalph();
85
+ clack.intro('Setting up your Ralph Loop');
61
86
 
62
87
  // Check if we're in the package directory itself
63
88
  if (path.resolve(TARGET_DIR) === path.resolve(PACKAGE_ROOT)) {
@@ -65,6 +90,52 @@ function main() {
65
90
  process.exit(1);
66
91
  }
67
92
 
93
+ // Prompt 1 — App source directory
94
+ const appDir = await clack.text({
95
+ message: 'Where does your app source code live? (e.g. src, public, etc.)',
96
+ placeholder: DEFAULT_APP_DIR,
97
+ defaultValue: DEFAULT_APP_DIR,
98
+ });
99
+
100
+ if (clack.isCancel(appDir)) {
101
+ clack.cancel('Setup cancelled.');
102
+ process.exit(0);
103
+ }
104
+
105
+ // Prompt 2 — Install Playwright
106
+ const installPlaywright = await clack.confirm({
107
+ message: 'Set up Playwright for E2E testing?',
108
+ initialValue: true,
109
+ });
110
+
111
+ if (clack.isCancel(installPlaywright)) {
112
+ clack.cancel('Setup cancelled.');
113
+ process.exit(0);
114
+ }
115
+
116
+ // Prompt 3 — Dev server address
117
+ const devServerRaw = await clack.text({
118
+ message: 'Where does your dev server run?',
119
+ placeholder: 'localhost:3000',
120
+ defaultValue: 'localhost:3000',
121
+ });
122
+
123
+ if (clack.isCancel(devServerRaw)) {
124
+ clack.cancel('Setup cancelled.');
125
+ process.exit(0);
126
+ }
127
+
128
+ // Normalize dev server URL
129
+ let devServerUrl = devServerRaw.trim();
130
+ if (!/^https?:\/\//.test(devServerUrl)) {
131
+ devServerUrl = `http://${devServerUrl}`;
132
+ }
133
+
134
+ // Create app directory if needed
135
+ if (appDir !== '.') {
136
+ ensureDir(path.join(TARGET_DIR, appDir));
137
+ }
138
+
68
139
  display.printLocation(TARGET_DIR);
69
140
 
70
141
  // Copy individual files
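The dev server normalization in this hunk can be read as a small pure function: trim the input, then prepend `http://` only when no `http://` or `https://` scheme is present. A minimal sketch follows. The function name `normalizeDevServerUrl` is hypothetical; the diff performs the same steps inline.

```javascript
// Hypothetical helper mirroring the inline normalization in the diff.
function normalizeDevServerUrl(raw) {
  const trimmed = raw.trim();
  // Keep an explicit http:// or https:// scheme, otherwise assume http.
  return /^https?:\/\//.test(trimmed) ? trimmed : `http://${trimmed}`;
}

console.log(normalizeDevServerUrl(' localhost:5173 '));   // http://localhost:5173
console.log(normalizeDevServerUrl('https://dev.local'));  // https://dev.local
```

Only the scheme is checked. Input such as `localhost` without a port is passed through with `http://` prepended, so any further validation would have to be added separately.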
@@ -124,7 +195,41 @@ function main() {
124
195
  display.printSuccess(`${dir}/`);
125
196
  }
126
197
  } else {
127
- display.printWarning(`${dir}/ not found, skipping`);
198
+ // Source doesn't exist, but we'll create it in the ensure step
199
+ ensureDir(dest);
200
+ display.printSuccess(`${dir}/ (created)`);
201
+ }
202
+ }
203
+
204
+ // Ensure required directories exist and create symlinks
205
+ console.log();
206
+ display.printStep('🔧', 'Ensuring directories & symlinks');
207
+ for (const dir of DIRS_TO_ENSURE) {
208
+ const dest = path.join(TARGET_DIR, dir);
209
+ ensureDir(dest);
210
+
211
+ // Create symlinks for skills directories
212
+ if (dir.endsWith('/skills')) {
213
+ for (const link of SKILL_SYMLINKS) {
214
+ const linkPath = path.join(dest, link.name);
215
+ if (!exists(linkPath)) {
216
+ try {
217
+ fs.symlinkSync(link.target, linkPath);
218
+ } catch {
219
+ // Symlink might fail on some systems, that's ok
220
+ }
221
+ }
222
+ }
223
+ display.printSuccess(`${dir}/ (with symlinks)`);
224
+ } else if (dir === '.agent/history') {
225
+ // Create .gitignore for history folder
226
+ const gitignorePath = path.join(dest, '.gitignore');
227
+ if (!exists(gitignorePath)) {
228
+ fs.writeFileSync(gitignorePath, '*\n!.gitignore\n');
229
+ }
230
+ display.printSuccess(`${dir}/ (with .gitignore)`);
231
+ } else {
232
+ display.printSuccess(`${dir}/`);
128
233
  }
129
234
  }
130
235
 
@@ -149,8 +254,74 @@ function main() {
149
254
  }
150
255
  }
151
256
 
257
+ // Replace dev server URL in copied files if different from default
258
+ if (devServerUrl !== 'http://localhost:3000') {
259
+ const filesToPatch = [
260
+ path.join(TARGET_DIR, '.agent/PROMPT.md'),
261
+ path.join(TARGET_DIR, 'scripts/assets/playwright.config.ts'),
262
+ ];
263
+
264
+ console.log();
265
+ display.printStep('🌐', 'Dev server URL');
266
+ for (const file of filesToPatch) {
267
+ if (exists(file)) {
268
+ const content = fs.readFileSync(file, 'utf8');
269
+ const updated = content.replaceAll('http://localhost:3000', devServerUrl);
270
+ fs.writeFileSync(file, updated, 'utf8');
271
+ display.printSuccess(`${path.relative(TARGET_DIR, file)} → ${devServerUrl}`);
272
+ }
273
+ }
274
+ }
275
+
276
+ // Replace app directory in PROMPT.md if different from default
277
+ if (appDir !== DEFAULT_APP_DIR) {
278
+ const promptFile = path.join(TARGET_DIR, '.agent/PROMPT.md');
279
+ if (exists(promptFile)) {
280
+ console.log();
281
+ display.printStep('📂', 'App directory');
282
+ const content = fs.readFileSync(promptFile, 'utf8');
283
+ const updated = content.replaceAll('`' + DEFAULT_APP_DIR + '`', `\`${appDir}\``);
284
+ fs.writeFileSync(promptFile, updated, 'utf8');
285
+ display.printSuccess(`.agent/PROMPT.md → ${appDir}`);
286
+ }
287
+ }
288
+
289
+ // Playwright setup
290
+ if (installPlaywright) {
291
+ console.log();
292
+ display.printStep('🎭', 'Playwright setup');
293
+
294
+ // Copy playwright config to app directory
295
+ const playwrightSrc = path.join(TARGET_DIR, 'scripts/assets/playwright.config.ts');
296
+ const playwrightDest = path.join(TARGET_DIR, appDir, 'playwright.config.ts');
297
+
298
+ if (exists(playwrightSrc)) {
299
+ copyFile(playwrightSrc, playwrightDest);
300
+ display.printSuccess(`playwright.config.ts → ${appDir}/`);
301
+ }
302
+
303
+ // Install Playwright browsers
304
+ const s = clack.spinner();
305
+ s.start('Installing Playwright browsers (chromium)...');
306
+ try {
307
+ execSync('npx playwright install --with-deps chromium', {
308
+ cwd: path.join(TARGET_DIR, appDir),
309
+ stdio: 'pipe',
310
+ });
311
+ s.stop('Playwright browsers installed');
312
+ } catch (err) {
313
+ s.stop('Playwright browser install failed');
314
+ display.printWarning('Could not install Playwright browsers. Run manually:');
315
+ display.printWarning(` cd ${appDir} && npx playwright install --with-deps chromium`);
316
+ }
317
+ }
318
+
152
319
  // Success message
153
320
  display.showComplete();
321
+ clack.outro('Happy looping!');
154
322
  }
155
323
 
156
- main();
324
+ main().catch(err => {
325
+ display.printError(err.message);
326
+ process.exit(1);
327
+ });
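The app directory patch above searches for the default directory wrapped in backticks, so only the inline-code mention in `PROMPT.md` is rewritten and bare occurrences of `src` are left alone. A minimal sketch of that behaviour, assuming a hypothetical helper name `patchAppDir` and a made-up sample string:

```javascript
// Hypothetical helper: replace only the backtick-wrapped default directory.
function patchAppDir(content, defaultDir, appDir) {
  const needle = '`' + defaultDir + '`';
  // A function replacement inserts appDir literally, even if it contains "$".
  return content.replaceAll(needle, () => '`' + appDir + '`');
}

// Sample text invented for illustration.
const sample = 'Run `npm run dev` in `src` directory. Sources live under src/.';
console.log(patchAppDir(sample, 'src', 'app'));
// Run `npm run dev` in `app` directory. Sources live under src/.
```

The function replacement is a choice made in this sketch, not something the diff does. The diff passes a plain string, in which sequences such as `$$` or `$&` carry special meaning for `replaceAll`.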
@@ -32,7 +32,7 @@ function formatCatchphrase(text) {
32
32
  const leftPad = Math.floor(padding / 2);
33
33
  const rightPad = padding - leftPad;
34
34
 
35
- return prefix + ''.repeat(leftPad) + text + ''.repeat(rightPad) + suffix;
35
+ return prefix + ' '.repeat(leftPad) + text + ' '.repeat(rightPad) + suffix;
36
36
  }
37
37
 
38
38
  // Catchphrases (text only, will be formatted)
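The one-line change in this hunk matters because `''.repeat(n)` always yields an empty string, so the old code added no padding at all. A minimal sketch of the centring arithmetic follows, assuming a hypothetical `centerText` helper with an explicit `width` parameter. The real function also adds a prefix and suffix and derives its padding from values not shown in this hunk.

```javascript
// Hypothetical helper: centre text within a fixed width using spaces.
function centerText(text, width) {
  // Clamp at zero so an over-long text does not make repeat() throw.
  const padding = Math.max(0, width - text.length);
  const leftPad = Math.floor(padding / 2);
  const rightPad = padding - leftPad; // any odd leftover goes to the right
  return ' '.repeat(leftPad) + text + ' '.repeat(rightPad);
}

console.log(JSON.stringify(centerText('hi', 6)));  // "  hi  "
console.log(JSON.stringify(centerText('abc', 6))); // " abc  "
```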
@@ -77,7 +77,7 @@ ${Y}██████╗ █████╗ ██╗ ██████
77
77
  ╚══════╝ ╚═════╝ ╚═════╝ ╚═╝${R}
78
78
 
79
79
  ${D}${emptyLine}${R}
80
- ${Y}${catchphrase}${R}
80
+ ${D}${catchphrase}${R}
81
81
  ${D}${emptyLine}${R}
82
82
  ${D}═══════════════════════════════════════════════════════════${R}
83
83
  ${C}Ralph Wiggum Loop${R} ${D}・${R} Long-running AI agents
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@pageai/ralph-loop",
3
- "version": "1.0.1",
3
+ "version": "1.2.0",
4
4
  "publishConfig": {
5
5
  "access": "public"
6
6
  },
@@ -61,5 +61,8 @@
61
61
  "bugs": {
62
62
  "url": "https://github.com/pageai-pro/ralph-loop/issues"
63
63
  },
64
- "homepage": "https://github.com/pageai-pro/ralph-loop#readme"
64
+ "homepage": "https://github.com/pageai-pro/ralph-loop#readme",
65
+ "dependencies": {
66
+ "@clack/prompts": "^1.0.0"
67
+ }
65
68
  }
package/ralph.sh CHANGED
@@ -98,7 +98,7 @@ $(cat $SCRIPT_DIR/.agent/PROMPT.md)"
98
98
  export PROMPT_CONTENT
99
99
  export DOCKER_DEFAULT_PLATFORM=linux/amd64 # Needed for Playwright.
100
100
 
101
- script -q "$OUTPUT_FILE" bash -c 'docker sandbox run --credentials host claude --model opus --output-format stream-json --verbose -p "$PROMPT_CONTENT"' >/dev/null 2>&1 &
101
+ script -q "$OUTPUT_FILE" bash -c 'docker sandbox run claude . -- --model opus --output-format stream-json --verbose -p "$PROMPT_CONTENT"' >/dev/null 2>&1 &
102
102
  AGENT_PID=$!
103
103
 
104
104
  # Track position in output file for incremental reading
@@ -176,7 +176,7 @@ $(cat $SCRIPT_DIR/.agent/PROMPT.md)"
176
176
  echo -e " Invalid API key. Please authenticate inside the Docker sandbox."
177
177
  echo -e ""
178
178
  echo -e " Run the following command and follow the login instructions:"
179
- echo -e " ${C}docker sandbox run --credentials host claude${R}"
179
+ echo -e " ${C}docker sandbox run claude . --${R}"
180
180
  echo -e "${RD}░░▒▒▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▒▒░░${R}"
181
181
  rm -f "$OUTPUT_FILE" "$FULL_OUTPUT_FILE"
182
182
  exit $EXIT_AUTH_ERROR
@@ -0,0 +1,27 @@
1
+ import { defineConfig, devices } from '@playwright/test';
2
+
3
+ /**
4
+ * See https://playwright.dev/docs/test-configuration.
5
+ */
6
+ export default defineConfig({
7
+ testDir: './tests',
8
+ fullyParallel: true,
9
+ globalTimeout: 30 * 60 * 1000,
10
+ forbidOnly: !!process.env.CI,
11
+ retries: process.env.CI ? 2 : 1,
12
+ workers: process.env.CI ? 3 : 6,
13
+ reporter: 'html',
14
+ use: {
15
+ baseURL: 'http://localhost:3000',
16
+ trace: 'on-first-retry',
17
+ },
18
+
19
+
20
+ // NB: only chromium will run in Docker (arm64).
21
+ projects: [
22
+ {
23
+ name: 'chromium',
24
+ use: { ...devices['Desktop Chrome'] },
25
+ }
26
+ ],
27
+ });
@@ -3,13 +3,14 @@
3
3
  # Step timing tracking and duration formatting
4
4
  # Dependencies: constants.sh
5
5
  #
6
- # Steps: Thinking, Reading code, Implementing, Writing tests, Testing, Linting,
7
- # Typechecking, Committing
6
+ # Steps: Thinking, Planning, Reading code, Web research, Implementing, Debugging,
7
+ # Writing tests, Testing, Linting, Typechecking, Installing, Verifying,
8
+ # Waiting, Committing
8
9
 
9
10
  # Step timing tracking (using indexed arrays for bash 3.x compatibility)
10
- STEP_NAMES=("Thinking" "Reading code" "Implementing" "Writing tests" "Testing" "Linting" "Typechecking" "Committing")
11
- ITERATION_STEP_VALUES=(0 0 0 0 0 0 0 0) # Step times for current iteration
12
- SESSION_STEP_VALUES=(0 0 0 0 0 0 0 0) # Accumulated step times across all iterations
11
+ STEP_NAMES=("Thinking" "Planning" "Reading code" "Web research" "Implementing" "Debugging" "Writing tests" "Testing" "Linting" "Typechecking" "Installing" "Verifying" "Waiting" "Committing")
12
+ ITERATION_STEP_VALUES=(0 0 0 0 0 0 0 0 0 0 0 0 0 0) # Step times for current iteration
13
+ SESSION_STEP_VALUES=(0 0 0 0 0 0 0 0 0 0 0 0 0 0) # Accumulated step times across all iterations
13
14
  CURRENT_STEP_NAME="" # Name of current step being timed
14
15
  CURRENT_STEP_START=0 # Timestamp when current step started
15
16
 
@@ -17,13 +18,19 @@ CURRENT_STEP_START=0 # Timestamp when current step started
17
18
  get_step_emoji() {
18
19
  case "$1" in
19
20
  "Thinking") echo "🤔" ;;
21
+ "Planning") echo "🗺️" ;;
20
22
  "Reading code") echo "📖" ;;
23
+ "Web research") echo "🌐" ;;
21
24
  "Implementing") echo "⚡" ;;
25
+ "Debugging") echo "🐛" ;;
22
26
  "Writing tests") echo "✍️" ;;
23
27
  "Testing") echo "🧪" ;;
24
28
  "Linting") echo "🧹" ;;
25
29
  "Typechecking") echo "📝" ;;
26
- "Committing") echo "📦" ;;
30
+ "Installing") echo "📦" ;;
31
+ "Verifying") echo "✅" ;;
32
+ "Waiting") echo "⏳" ;;
33
+ "Committing") echo "🚀" ;;
27
34
  *) echo "" ;;
28
35
  esac
29
36
  }
@@ -180,7 +187,7 @@ format_step_times() {
180
187
  # Usage: init_iteration_step_times
181
188
  init_iteration_step_times() {
182
189
  # Clear iteration step times (reset all to 0)
183
- ITERATION_STEP_VALUES=(0 0 0 0 0 0 0 0)
190
+ ITERATION_STEP_VALUES=(0 0 0 0 0 0 0 0 0 0 0 0 0 0)
184
191
 
185
192
  # Start timing with "Thinking" as default initial step
186
193
  CURRENT_STEP_NAME="Thinking"
@@ -198,25 +205,122 @@ display_session_step_totals() {
198
205
  }
199
206
 
200
207
  # Detect current step from output line
201
- # Returns: step name based on output patterns
208
+ # Returns: step name based on output patterns (exact match to STEP_NAMES)
209
+ # Priority: Implementing first, then other steps in order of specificity
202
210
  detect_step() {
203
211
  local line="$1"
204
212
 
205
- # Check patterns in order of specificity
206
- if echo "$line" | grep -qiE "(git commit|committing)"; then
207
- echo "Committing "
208
- elif echo "$line" | grep -qiE "(npm test|jest|vitest|testing|test.*pass|test.*fail)"; then
209
- echo "Testing "
210
- elif echo "$line" | grep -qiE "(eslint|lint|prettier|formatting)"; then
211
- echo "Linting "
212
- elif echo "$line" | grep -qiE "(npm run typecheck|tsc|typescript|typecheck)"; then
213
- echo "Typechecking "
214
- elif echo "$line" | grep -qiE "(\.test\.|\.spec\.|test file|writing test)"; then
215
- echo "Writing tests "
216
- elif echo "$line" | grep -qiE "(write|edit|creating|updating|modifying).*\.(ts|js|sh|json|md)"; then
217
- echo "Implementing "
218
- elif echo "$line" | grep -qiE "(read|glob|grep|searching|finding|looking)"; then
219
- echo "Reading code "
213
+ # IMPLEMENTING - highest priority (includes building)
214
+ # Tool-based: Write, Edit tools with file paths
215
+ # Natural: creating, writing, editing, modifying, updating files
216
+ # Building: npm run build, vite build, compiling, bundling
217
+ if echo "$line" | grep -qiE "(Write|Edit).*file_path"; then
218
+ echo "Implementing"
219
+ elif echo "$line" | grep -qiE "(creating|writing new|editing|modifying|updating|changing).*\.(ts|tsx|js|jsx|sh|py|go|rs|json|yaml|yml|toml|css|scss|html)"; then
220
+ echo "Implementing"
221
+ elif echo "$line" | grep -qiE "(npm|yarn|pnpm|bun) run build"; then
222
+ echo "Implementing"
223
+ elif echo "$line" | grep -qiE "(vite|webpack|esbuild|rollup|turbo|tsc) build"; then
224
+ echo "Implementing"
225
+ elif echo "$line" | grep -qiE "(compiling|bundling|transpiling)"; then
226
+ echo "Implementing"
227
+
228
+ # COMMITTING - git operations
229
+ elif echo "$line" | grep -qiE "(git commit|git add|committing|staged for commit)"; then
230
+ echo "Committing"
231
+
232
+ # TESTING - running test suites
233
+ elif echo "$line" | grep -qiE "(npm|yarn|pnpm) (run )?(test|e2e|spec)"; then
234
+ echo "Testing"
235
+ elif echo "$line" | grep -qiE "(jest|vitest|playwright|cypress|mocha|pytest)"; then
236
+ echo "Testing"
237
+ elif echo "$line" | grep -qiE "(test|spec).*(pass|fail|skip|pending)"; then
238
+ echo "Testing"
239
+ elif echo "$line" | grep -qiE "(running|executing) (tests|test suite)"; then
240
+ echo "Testing"
241
+
242
+ # DEBUGGING - investigating errors, fixing issues
243
+ elif echo "$line" | grep -qiE "(the|this) (error|issue|problem|bug) (is|seems|appears|was)"; then
244
+ echo "Debugging"
245
+ elif echo "$line" | grep -qiE "(investigating|debugging|diagnosing|troubleshooting)"; then
246
+ echo "Debugging"
247
+ elif echo "$line" | grep -qiE "(fails|failed|failing|broken) because"; then
248
+ echo "Debugging"
249
+ elif echo "$line" | grep -qiE "(let me|i'll) (check|see|figure out|understand) why"; then
250
+ echo "Debugging"
251
+ elif echo "$line" | grep -qiE "(root cause|stack trace|traceback|exception)"; then
252
+ echo "Debugging"
253
+
254
+ # LINTING - code style and formatting
255
+ elif echo "$line" | grep -qiE "(eslint|biome|lint|prettier|formatting|stylelint)"; then
256
+ echo "Linting"
257
+
258
+ # TYPECHECKING - type validation
259
+ elif echo "$line" | grep -qiE "(npm run typecheck|tsc|typescript|type.?check|mypy|pyright)"; then
260
+ echo "Typechecking"
261
+
262
+ # WRITING TESTS - creating test files
263
+ elif echo "$line" | grep -qiE "(\.test\.|\.spec\.|test file|writing test|adding test)"; then
264
+ echo "Writing tests"
265
+ elif echo "$line" | grep -qiE "(creating|writing).*(test|spec)"; then
266
+ echo "Writing tests"
267
+
268
+ # INSTALLING - package/dependency management
269
+ elif echo "$line" | grep -qiE "(npm|yarn|pnpm|bun) (install|add|i )"; then
270
+ echo "Installing"
271
+ elif echo "$line" | grep -qiE "(pip|poetry|cargo|go get|brew) install"; then
272
+ echo "Installing"
273
+ elif echo "$line" | grep -qiE "(installing|adding|updating) (dependency|dependencies|package)"; then
274
+ echo "Installing"
275
+
276
+ # WEB RESEARCH - fetching docs, searching web
277
+ elif echo "$line" | grep -qiE "(WebFetch|WebSearch)"; then
278
+ echo "Web research"
279
+ elif echo "$line" | grep -qiE "(fetching|looking up|searching).*(docs|documentation|api|web)"; then
280
+ echo "Web research"
281
+ elif echo "$line" | grep -qiE "(let me|i'll) search (for|the web|online)"; then
282
+ echo "Web research"
283
+
284
+ # VERIFYING - checking work, validation
285
+ elif echo "$line" | grep -qiE "(verifying|confirming|validating) (the|that|it)"; then
286
+ echo "Verifying"
287
+ elif echo "$line" | grep -qiE "(let me|i'll) (verify|confirm|make sure|double.?check)"; then
288
+ echo "Verifying"
289
+ elif echo "$line" | grep -qiE "(looks correct|works as expected|successful)"; then
290
+ echo "Verifying"
291
+
292
+ # WAITING - blocked on user input
293
+ elif echo "$line" | grep -qiE "(AskUserQuestion|waiting for|blocked on)"; then
294
+ echo "Waiting"
295
+ elif echo "$line" | grep -qiE "(need|require|awaiting) (input|clarification|confirmation|response)"; then
296
+ echo "Waiting"
297
+
298
+ # PLANNING - designing approach
299
+ elif echo "$line" | grep -qiE "(EnterPlanMode|ExitPlanMode|plan mode)"; then
300
+ echo "Planning"
301
+ elif echo "$line" | grep -qiE "(let me|i'll|i need to) (plan|outline|design|architect)"; then
302
+ echo "Planning"
303
+ elif echo "$line" | grep -qiE "(my|the) (approach|strategy|plan) (is|will be)"; then
304
+ echo "Planning"
305
+ elif echo "$line" | grep -qiE "(step [0-9]|first,|second,|finally,).*(i should|i need|i will|we need)"; then
306
+ echo "Planning"
307
+
308
+ # READING CODE - exploring codebase
309
+ elif echo "$line" | grep -qiE "Read.*file_path|Glob.*pattern|Grep.*pattern"; then
310
+ echo "Reading code"
311
+ elif echo "$line" | grep -qiE "(let me|i'll) (read|examine|look at|inspect|check) (the|this)"; then
312
+ echo "Reading code"
313
+ elif echo "$line" | grep -qiE "(searching|finding|looking) (for|at|through)"; then
314
+ echo "Reading code"
315
+ elif echo "$line" | grep -qiE "(reviewing|scanning|analyzing) (the )?(code|file|implementation|codebase)"; then
316
+ echo "Reading code"
317
+
318
+ # THINKING - deliberation, analysis (lowest priority - default fallback category)
319
+ elif echo "$line" | grep -qiE "(let me|i need to|i'll) (think|consider|analyze|understand)"; then
320
+ echo "Thinking"
321
+ elif echo "$line" | grep -qiE "(hmm|interesting|notably|this suggests|the question is)"; then
322
+ echo "Thinking"
323
+
220
324
  else
221
325
  echo ""
222
326
  fi
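The rewritten `detect_step` is a first-match-wins chain: the order of the `elif` branches decides which step a line is attributed to when several patterns could match. The sketch below re-expresses a handful of those rules in JavaScript to show the idea. It is not the package's code: the names `STEP_RULES` and `detectStep` are hypothetical, only five of the patterns are copied over, and the sample lines are invented.

```javascript
// A small subset of the patterns from the diff, kept in the same order.
const STEP_RULES = [
  { step: 'Implementing', pattern: /(Write|Edit).*file_path/i },
  { step: 'Committing', pattern: /(git commit|git add|committing|staged for commit)/i },
  { step: 'Testing', pattern: /(jest|vitest|playwright|cypress|mocha|pytest)/i },
  { step: 'Linting', pattern: /(eslint|biome|lint|prettier|formatting|stylelint)/i },
  { step: 'Reading code', pattern: /Read.*file_path|Glob.*pattern|Grep.*pattern/i },
];

// Return the first matching step name, or '' when nothing matches.
function detectStep(line) {
  const rule = STEP_RULES.find(({ pattern }) => pattern.test(line));
  return rule ? rule.step : '';
}

// This line mentions vitest, but the Edit rule is checked first.
console.log(detectStep('{"name":"Edit","input":{"file_path":"vitest.config.ts"}}')); // Implementing
console.log(detectStep('npx vitest run')); // Testing
```

Keeping the rules in a single ordered list makes the priority explicit, which is the same property the shell version gets from the order of its branches.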