npm - @pageai/ralph-loop - Versions diffs - 1.0.1 → 1.2.0 - Mend

@pageai/ralph-loop 1.0.1 → 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/.agent/PROMPT.md +8 -9
package/.agent/skills/prd-creator/PRD.md +0 -2
package/README.md +174 -51
package/bin/cli.js +176 -5
package/bin/lib/display.js +2 -2
package/package.json +5 -2
package/ralph.sh +2 -2
package/scripts/assets/playwright.config.ts +27 -0
package/scripts/lib/timing.sh +127 -23

package/.agent/PROMPT.md CHANGED

@@ -1,21 +1,19 @@
-@.agent/tasks.json
 ## Overview
 You are implementing the project described in @.agent/prd/SUMMARY.md
-Tasks are listed in @.agent/tasks.json
 ## Required Setup
-Run `npm run dev` (as background process).
-App will be running at http://localhost:6006
+Run `npm run dev` (as background process) in `src` directory.
+App will be running at http://localhost:3000
 ## Before Starting
 Check @.agent/STEERING.md for critical work. Complete items in sequence, remove when done. Only proceed to implement tasks if no critical work pending.
-## Task Flow (ONE TASK ONLY)
+## Task Flow
+Tasks are listed in @.agent/tasks.json
 1. Pick highest-priority task with `passes: false` in `tasks.json`
 2. Read full spec: `.agent/tasks/TASK-${ID}.json`
@@ -31,14 +29,15 @@ Check @.agent/STEERING.md for critical work. Complete items in sequence, remove
 8. All tests must pass. Broke unrelated test? Fix it before proceeding.
 9. When tests pass, set `passes: true` in `tasks.json` for the task you completed.
 10. Log entry → `.agent/logs/LOG.md` (date, brief summary, screenshot path)
-11. Update `.agent/STRUCTURE.md` if dirs changed
+11. Update `.agent/STRUCTURE.md` if dirs changed. Exclude dotfiles, tests and config.
 12. Commit changes, using the Conventional Commit format.
 ## Rules
+- **IMPORTANT**: only work on one task at a time and **exit closing all background processes**. **DO NOT** start another task.
 - No git init/remote changes. **No git push**.
 - Check the last 5 tasks in `.agent/logs/LOG.md` for past work
-- When ALL tasks pass → output `<promise>COMPLETE</promise>` and **nothing else**.
+- **CRITICAL**: When ALL tasks pass → output `<promise>COMPLETE</promise>` and **nothing else**.
 ## Help Tags
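
Note on the task flow above: step 1 ("Pick highest-priority task with `passes: false`") can be sketched as a small selection function. This is only an illustration, not code from the package. The diff does not show the schema of `tasks.json`, so the array shape with `id`, `priority`, and `passes` fields, the rule that a lower `priority` number is more urgent, and the name `pickNextTask` are all assumptions.

```javascript
// Hypothetical sketch: pick the next task the agent should work on.
// Assumes tasks.json is an array of { id, priority, passes } objects and
// that a LOWER priority number means MORE urgent (both are assumptions).
function pickNextTask(tasks) {
  const pending = tasks.filter((task) => task.passes === false);
  if (pending.length === 0) {
    return null; // every task passes: the loop is complete
  }
  // Copy before sorting so the caller's array is left untouched.
  return [...pending].sort((a, b) => a.priority - b.priority)[0];
}

const exampleTasks = [
  { id: '001', priority: 2, passes: false },
  { id: '002', priority: 1, passes: false },
  { id: '003', priority: 0, passes: true },
];

console.log(pickNextTask(exampleTasks).id); // prints "002"
```

A `null` result corresponds to the state in which the prompt asks the agent to output `<promise>COMPLETE</promise>`.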

package/.agent/skills/prd-creator/PRD.md CHANGED

@@ -112,8 +112,6 @@ Include the following sections:
 - Security considerations
 - Development phases/milestones
 - Assumptions and dependencies
-- Potential challenges and solutions
-- Future expansion possibilities
 Save as: `PROJECT_ROOT/.agent/prd/PRD.md`

package/README.md CHANGED

@@ -1,56 +1,33 @@
-# Ralph Loop
+# A Ralph Wiggum Loop implementation that works™
-A long-running AI agent loop. Ralph automates software development tasks by iteratively working through a task list until completion.
+Ralph is a long-running AI agent loop. Ralph automates software development tasks by iteratively working through a task list until completion.
-This is a hackable script so you can configure it to your env and favorite agentic AI CLI. It's set up by default to use Claude Code in a Docker sandbox.
+This is an implementation that actually works, containing a hackable script so you can configure it to your env and favorite agentic AI CLI. It's set up by default to use Claude Code in a Docker sandbox.
 ![Ralph Wiggum Loop](https://github.com/user-attachments/assets/052d5290-7e83-4bfb-a6b5-6be761cbe890)
-## Quick Start
-```bash
-# Run the agent loop (default: 10 iterations)
-./ralph.sh
-# Run with custom iteration limit
-./ralph.sh 5
-./ralph.sh -n 5
-./ralph.sh --max-iterations 5
-# Run exactly one iteration
-./ralph.sh --once
-# Show help
-./ralph.sh --help
-```
-> NB: you might need to run `chmod +x ralph.sh` to make the script executable.
-## How It Works
-Each iteration, Ralph will:
-1. Find the highest-priority incomplete task from `.agent/tasks.json`
-2. Work through the task steps defined in `.agent/tasks/TASK-{ID}.json`
-3. Run tests, linting, and type checking
-4. Update task status and commit changes
-5. Repeat until all tasks pass or max iterations reached
-## How Is This Different from Other Ralphs?
-This was kept hackable so you can make it your own.<br/>
-The script follows the original concepts of the Ralph Wiggum Loop, working with fresh contexts and providing clear verifiable feedback.
-It also works generically with any task set.
-Besides that:
-- it allows you to dump unstructured requirements and have the agent create a PRD and task list for you.
-- it uses a task lookup table with individual detailed steps -> more scalable as you get 100s of tasks done.
-- it's sandboxed and more secure
-- it shows progress and stats so you can keep an eye on what's been done
-- it instructs the agent to write and run automated tests and screenshots per task
-- it provides observability and traceability of the agent's work, showing a stream of output and capturing full historical logs per iteration
+- [Getting Started](#getting-started)
+  - [Step 1: Install Ralph](#step-1-install-ralph)
+  - [Step 2: Create a PRD + task list](#step-2-create-a-prd--task-list)
+  - [Step 3: Set up the agent inside Docker sandbox](#step-3-set-up-the-agent-inside-docker-sandbox)
+  - [Step 4: Run Ralph](#step-4-run-ralph)
+  - [(optional) Adjusting to your language/framework](#optional-adjusting-to-your-languageframework)
+- [Run the loop](#run-the-loop)
+- [How It Works](#how-it-works)
+- [How Is This Different from Other Ralphs?](#how-is-this-different-from-other-ralphs)
+- [Steering the Agent](#steering-the-agent)
+- [Features](#features)
+- [Support](#support)
+  - [Promise Tags](#promise-tags)
+  - [Exit Codes](#exit-codes)
+- [Structure](#structure)
+- [Skills](#skills)
+  - [Available Skills](#available-skills)
+  - [Skills Directory Structure](#skills-directory-structure)
+- [Reference](#reference)
+  - [Playwright configuration](#playwright-configuration)
+  - [Vitest configuration](#vitest-configuration)
+- [License](#license)
 ## Getting Started
@@ -94,7 +71,7 @@ Then follow the Skill's instructions and verify the PRD and then tasks.<br/>
 Authenticate inside the Docker sandbox before running Ralph. Run:
 ```bash
-docker sandbox run --credentials host claude
+docker sandbox run claude .
 ```
 And follow the instructions to log in into Claude Code.
@@ -119,9 +96,9 @@ This script assumes the following are installed:
 I recommend using a CLI to bootstrap your project with the necessary tools and dependencies, e.g.:
 ```bash
-npx create-vite@latest my-app --template react-ts
+npx create-vite@latest src --template react-ts
 # or
-npx create-next-app@latest my-app
+npx create-next-app@latest src
 ```
 If you must start from a blank slate, which is not recommended, you can use the following commands to install the necessary tools and dependencies:
@@ -141,6 +118,51 @@ npm i @vitejs/plugin-react @testing-library/dom @testing-library/jest-dom @testi
 ⚠️ The default "mode" is "implementation". Depending on your use case, you might want to change `.agent/PROMPT.md` to a different mode, e.g. "refactor", "review", "test" etc.
+## Run the loop
+```bash
+# Run the agent loop (default: 10 iterations)
+./ralph.sh
+# Run with custom iteration limit
+./ralph.sh 5
+./ralph.sh -n 5
+./ralph.sh --max-iterations 5
+# Run exactly one iteration
+./ralph.sh --once
+# Show help
+./ralph.sh --help
+```
+> NB: you might need to run `chmod +x ralph.sh` to make the script executable.
+## How It Works
+Each iteration, Ralph will:
+1. Find the highest-priority incomplete task from `.agent/tasks.json`
+2. Work through the task steps defined in `.agent/tasks/TASK-{ID}.json`
+3. Run tests, linting, and type checking
+4. Update task status and commit changes
+5. Repeat until all tasks pass or max iterations reached
+## How Is This Different from Other Ralphs?
+This was kept hackable so you can make it your own.<br/>
+The script follows the original concepts of the Ralph Wiggum Loop, working with fresh contexts and providing clear verifiable feedback.
+It also works generically with any task set.
+Besides that:
+- it allows you to dump unstructured requirements and have the agent create a PRD and task list for you.
+- it uses a task lookup table with individual detailed steps -> more scalable as you get 100s of tasks done.
+- it's sandboxed and more secure
+- it shows progress and stats so you can keep an eye on what's been done
+- it instructs the agent to write and run automated tests and screenshots per task
+- it provides observability and traceability of the agent's work, showing a stream of output and capturing full historical logs per iteration
 ## Steering the Agent
 In some cases, you might notice the agent is having trouble, slowed down or struggling to overcome a blocker.
@@ -243,6 +265,107 @@ Skills are symlinked from `.agent/skills/` to multiple locations for cross-tool
 .cursor/skills/
 ```
+## Reference
+### Playwright configuration
+If you are using Playwright, here is a recommended configuration:
+```typescript:playwright.config.ts
+import { defineConfig, devices } from '@playwright/test';
+/**
+ * See https://playwright.dev/docs/test-configuration.
+ */
+export default defineConfig({
+  testDir: './tests',
+  /* Total timeout for the entire test run (30 minutes) */
+  globalTimeout: 30 * 60 * 1000,
+  /* Run tests in files in parallel */
+  fullyParallel: true,
+  /* Fail the build on CI if you accidentally left test.only in the source code. */
+  forbidOnly: !!process.env.CI,
+  /* Retry on failure - 2 on CI, 1 locally */
+  retries: process.env.CI ? 2 : 1,
+  /* Number of parallel workers */
+  workers: process.env.CI ? 1 : 6,
+  /* Reporter to use. See https://playwright.dev/docs/test-reporters */
+  reporter: 'html',
+  /* Shared settings for all the projects below. See https://playwright.dev/docs/api/class-testoptions. */
+  use: {
+    /* Base URL to use in actions like `await page.goto('')`. */
+    baseURL: 'http://localhost:3000',
+    /* Collect trace when retrying the failed test. See https://playwright.dev/docs/trace-viewer */
+    trace: 'on-first-retry',
+  },
+  // NB: Only test in Desktop Chrome and nothing else.
+  projects: [
+    {
+      name: 'chromium',
+      use: { ...devices['Desktop Chrome'] },
+    }
+  ],
+});
+```
+### Vitest configuration
+If you are using Vitest, here is a recommended configuration:
+```typescript:vitest.config.ts
+import { defineConfig } from 'vitest/config'
+import react from '@vitejs/plugin-react'
+import path from 'path'
+export default defineConfig({
+  plugins: [react()],
+  test: {
+    environment: 'jsdom',
+    globals: true,
+    setupFiles: ['./vitest.setup.ts'],
+    include: ['**/*.test.{ts,tsx}'],
+    exclude: ['node_modules', '.next', 'tests'],
+  },
+  resolve: {
+    alias: {
+      '@': path.resolve(__dirname, './'),
+    },
+  },
+})
+```
+And:
+```typescript:vitest.setup.ts
+import '@testing-library/jest-dom/vitest'
+import { vi } from 'vitest'
+import React from 'react'
+// Mock next/image
+vi.mock('next/image', () => ({
+  default: ({ src, alt, ...props }: { src: string; alt: string }) => {
+    return React.createElement('img', { src, alt, ...props })
+  },
+}))
+// Mock next/link
+vi.mock('next/link', () => ({
+  default: ({
+    children,
+    href,
+    ...props
+  }: {
+    children: React.ReactNode
+    href: string
+  }) => {
+    return React.createElement('a', { href, ...props }, children)
+  },
+}))
+```
 ## License
 MIT

package/bin/cli.js CHANGED

@@ -8,11 +8,34 @@
 const fs = require('fs');
 const path = require('path');
+const { execSync } = require('child_process');
 const display = require('./lib/display');
-const { copyFile, copyDir, mergeDir, exists } = require('./lib/copy');
+const { copyFile, copyDir, mergeDir, exists, ensureDir } = require('./lib/copy');
 const PACKAGE_ROOT = path.resolve(__dirname, '..');
 const TARGET_DIR = process.cwd();
+const DEFAULT_APP_DIR = 'src';
+// Directories to ensure exist (created even if source doesn't exist)
+const DIRS_TO_ENSURE = [
+  '.agent/history',
+  '.agents/skills',
+  '.claude/skills',
+  '.codex/skills',
+  '.cursor/skills',
+];
+// Symlinks to create in skills directories (relative to each skills dir)
+const SKILL_SYMLINKS = [
+  { name: 'component-refactoring', target: '../../.agent/skills/component-refactoring/' },
+  { name: 'e2e-tester', target: '../../.agent/skills/e2e-tester/' },
+  { name: 'frontend-code-review', target: '../../.agent/skills/frontend-code-review/' },
+  { name: 'frontend-testing', target: '../../.agent/skills/frontend-testing/' },
+  { name: 'prd-creator', target: '../../.agent/skills/prd-creator/' },
+  { name: 'skill-creator', target: '../../.agent/skills/skill-creator/' },
+  { name: 'vercel-react-best-practices', target: '../../.agent/skills/vercel-react-best-practices/' },
+  { name: 'web-design-guidelines', target: '../../.agent/skills/web-design-guidelines/' },
+];
 // Files to copy (always overwrite)
 const FILES_TO_COPY = [
@@ -27,7 +50,6 @@ const CONFIG_FILES = [
   '.agent/PROMPT.md',
   '.agent/STEERING.md',
   '.agent/tasks.json',
-  '.agent/history/.gitignore',
 ];
 // Directories to fully copy (overwrite)
@@ -56,8 +78,11 @@ const EXTRA_DIR_FILES = [
   { dir: '.claude', exclude: ['settings.local.json', 'skills', 'agents', 'commands', 'hooks'] },
 ];
-function main() {
+async function main() {
+  const clack = await import('@clack/prompts');
   display.showRalph();
+  clack.intro('Setting up your Ralph Loop');
   // Check if we're in the package directory itself
   if (path.resolve(TARGET_DIR) === path.resolve(PACKAGE_ROOT)) {
@@ -65,6 +90,52 @@ function main() {
     process.exit(1);
   }
+  // Prompt 1 — App source directory
+  const appDir = await clack.text({
+    message: 'Where does your app source code live? (e.g. src, public, etc.)',
+    placeholder: DEFAULT_APP_DIR,
+    defaultValue: DEFAULT_APP_DIR,
+  });
+  if (clack.isCancel(appDir)) {
+    clack.cancel('Setup cancelled.');
+    process.exit(0);
+  }
+  // Prompt 2 — Install Playwright
+  const installPlaywright = await clack.confirm({
+    message: 'Set up Playwright for E2E testing?',
+    initialValue: true,
+  });
+  if (clack.isCancel(installPlaywright)) {
+    clack.cancel('Setup cancelled.');
+    process.exit(0);
+  }
+  // Prompt 3 — Dev server address
+  const devServerRaw = await clack.text({
+    message: 'Where does your dev server run?',
+    placeholder: 'localhost:3000',
+    defaultValue: 'localhost:3000',
+  });
+  if (clack.isCancel(devServerRaw)) {
+    clack.cancel('Setup cancelled.');
+    process.exit(0);
+  }
+  // Normalize dev server URL
+  let devServerUrl = devServerRaw.trim();
+  if (!/^https?:\/\//.test(devServerUrl)) {
+    devServerUrl = `http://${devServerUrl}`;
+  }
+  // Create app directory if needed
+  if (appDir !== '.') {
+    ensureDir(path.join(TARGET_DIR, appDir));
+  }
   display.printLocation(TARGET_DIR);
   // Copy individual files
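
Note on the hunk above: the "Normalize dev server URL" step trims the prompt answer and prepends `http://` when no scheme was typed. A minimal standalone sketch of that behavior follows; the function name `normalizeDevServerUrl` is hypothetical and does not appear in the package.

```javascript
// Sketch of the normalization step: trim the answer and prepend "http://"
// when no http/https scheme was typed. The function name is hypothetical.
function normalizeDevServerUrl(raw) {
  const trimmed = raw.trim();
  return /^https?:\/\//.test(trimmed) ? trimmed : `http://${trimmed}`;
}

console.log(normalizeDevServerUrl('localhost:6006'));           // http://localhost:6006
console.log(normalizeDevServerUrl(' https://dev.local:8443 ')); // https://dev.local:8443
```

The scheme test is case-sensitive, so an answer such as `HTTP://host` would receive a second prefix, and an empty string would become `http://`. A caller of this sketch would need to supply a default before normalizing.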
@@ -124,7 +195,41 @@ function main() {
         display.printSuccess(`${dir}/`);
       }
     } else {
-      display.printWarning(`${dir}/ not found, skipping`);
+      // Source doesn't exist, but we'll create it in the ensure step
+      ensureDir(dest);
+      display.printSuccess(`${dir}/ (created)`);
+    }
+  }
+  // Ensure required directories exist and create symlinks
+  console.log();
+  display.printStep('🔧', 'Ensuring directories & symlinks');
+  for (const dir of DIRS_TO_ENSURE) {
+    const dest = path.join(TARGET_DIR, dir);
+    ensureDir(dest);
+    // Create symlinks for skills directories
+    if (dir.endsWith('/skills')) {
+      for (const link of SKILL_SYMLINKS) {
+        const linkPath = path.join(dest, link.name);
+        if (!exists(linkPath)) {
+          try {
+            fs.symlinkSync(link.target, linkPath);
+          } catch {
+            // Symlink might fail on some systems, that's ok
+          }
+        }
+      }
+      display.printSuccess(`${dir}/ (with symlinks)`);
+    } else if (dir === '.agent/history') {
+      // Create .gitignore for history folder
+      const gitignorePath = path.join(dest, '.gitignore');
+      if (!exists(gitignorePath)) {
+        fs.writeFileSync(gitignorePath, '*\n!.gitignore\n');
+      }
+      display.printSuccess(`${dir}/ (with .gitignore)`);
+    } else {
+      display.printSuccess(`${dir}/`);
     }
   }
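
Note on the hunk above: the loop creates each skill symlink only when nothing exists at the link path, and it tolerates failures. The sketch below shows one way to write that idempotent step in isolation. It is an illustration under assumptions, not the package's code: `ensureSymlink` is a hypothetical helper, and it uses `fs.lstatSync` for the presence check because the implementation of the package's `exists` helper is not shown in the diff.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: create a relative symlink unless something already
// occupies the link path. lstatSync inspects the link itself, so a dangling
// symlink still counts as "present" (fs.existsSync would report false).
function ensureSymlink(target, linkPath) {
  try {
    fs.lstatSync(linkPath);
    return 'kept';
  } catch (err) {
    if (err.code !== 'ENOENT') throw err;
  }
  try {
    fs.symlinkSync(target, linkPath);
    return 'created';
  } catch {
    // Symlink creation can fail (for example without privileges on Windows).
    return 'skipped';
  }
}

// Demo in a throwaway directory so the sketch never touches a real project.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'ralph-sketch-'));
fs.mkdirSync(path.join(root, '.agent/skills/prd-creator'), { recursive: true });
fs.mkdirSync(path.join(root, '.claude/skills'), { recursive: true });

const link = path.join(root, '.claude/skills/prd-creator');
const first = ensureSymlink('../../.agent/skills/prd-creator/', link);
const second = ensureSymlink('../../.agent/skills/prd-creator/', link);
console.log(first, second);
```

On systems that permit symlink creation, the first call should report `created` and the second `kept`, which is what makes re-running a setup script safe.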
@@ -149,8 +254,74 @@ function main() {
     }
   }
+  // Replace dev server URL in copied files if different from default
+  if (devServerUrl !== 'http://localhost:3000') {
+    const filesToPatch = [
+      path.join(TARGET_DIR, '.agent/PROMPT.md'),
+      path.join(TARGET_DIR, 'scripts/assets/playwright.config.ts'),
+    ];
+    console.log();
+    display.printStep('🌐', 'Dev server URL');
+    for (const file of filesToPatch) {
+      if (exists(file)) {
+        const content = fs.readFileSync(file, 'utf8');
+        const updated = content.replaceAll('http://localhost:3000', devServerUrl);
+        fs.writeFileSync(file, updated, 'utf8');
+        display.printSuccess(`${path.relative(TARGET_DIR, file)} → ${devServerUrl}`);
+      }
+    }
+  }
+  // Replace app directory in PROMPT.md if different from default
+  if (appDir !== DEFAULT_APP_DIR) {
+    const promptFile = path.join(TARGET_DIR, '.agent/PROMPT.md');
+    if (exists(promptFile)) {
+      console.log();
+      display.printStep('📂', 'App directory');
+      const content = fs.readFileSync(promptFile, 'utf8');
+      const updated = content.replaceAll('`' + DEFAULT_APP_DIR + '`', `\`${appDir}\``);
+      fs.writeFileSync(promptFile, updated, 'utf8');
+      display.printSuccess(`.agent/PROMPT.md → ${appDir}`);
+    }
+  }
+  // Playwright setup
+  if (installPlaywright) {
+    console.log();
+    display.printStep('🎭', 'Playwright setup');
+    // Copy playwright config to app directory
+    const playwrightSrc = path.join(TARGET_DIR, 'scripts/assets/playwright.config.ts');
+    const playwrightDest = path.join(TARGET_DIR, appDir, 'playwright.config.ts');
+    if (exists(playwrightSrc)) {
+      copyFile(playwrightSrc, playwrightDest);
+      display.printSuccess(`playwright.config.ts → ${appDir}/`);
+    }
+    // Install Playwright browsers
+    const s = clack.spinner();
+    s.start('Installing Playwright browsers (chromium)...');
+    try {
+      execSync('npx playwright install --with-deps chromium', {
+        cwd: path.join(TARGET_DIR, appDir),
+        stdio: 'pipe',
+      });
+      s.stop('Playwright browsers installed');
+    } catch (err) {
+      s.stop('Playwright browser install failed');
+      display.printWarning('Could not install Playwright browsers. Run manually:');
+      display.printWarning(`  cd ${appDir} && npx playwright install --with-deps chromium`);
+    }
+  }
   // Success message
   display.showComplete();
+  clack.outro('Happy looping!');
 }
-main();
+main().catch(err => {
+  display.printError(err.message);
+  process.exit(1);
+});
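
Note on the hunk above: the two replacement passes (dev server URL, then app directory) can be expressed as one pure function over file content, which makes the behavior easy to check without touching the disk. This is a sketch under assumptions: `applySetupAnswers` and `DEFAULT_URL` are hypothetical names, and the sample prompt text is taken from the `PROMPT.md` lines shown earlier in this diff.

```javascript
// Hypothetical pure helper mirroring the two replacement passes: swap the
// default dev server URL, then swap the backticked default app directory.
const DEFAULT_URL = 'http://localhost:3000';
const DEFAULT_APP_DIR = 'src';

function applySetupAnswers(content, { devServerUrl, appDir }) {
  let updated = content;
  if (devServerUrl !== DEFAULT_URL) {
    updated = updated.replaceAll(DEFAULT_URL, devServerUrl);
  }
  if (appDir !== DEFAULT_APP_DIR) {
    // Only the backticked form is replaced, so other occurrences of the
    // letters "src" in the prompt are left alone.
    updated = updated.replaceAll('`' + DEFAULT_APP_DIR + '`', '`' + appDir + '`');
  }
  return updated;
}

const prompt = [
  'Run `npm run dev` (as background process) in `src` directory.',
  'App will be running at http://localhost:3000',
].join('\n');

console.log(applySetupAnswers(prompt, { devServerUrl: 'http://localhost:5173', appDir: 'app' }));
```

`String.prototype.replaceAll` requires Node.js 15 or newer. When both answers equal the defaults, the function returns the content unchanged.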

package/bin/lib/display.js CHANGED

@@ -32,7 +32,7 @@ function formatCatchphrase(text) {
   const leftPad = Math.floor(padding / 2);
   const rightPad = padding - leftPad;
-  return prefix + '▒'.repeat(leftPad) + text + '▒'.repeat(rightPad) + suffix;
+  return prefix + ' '.repeat(leftPad) + text + ' '.repeat(rightPad) + suffix;
 }
 // Catchphrases (text only, will be formatted)
@@ -77,7 +77,7 @@ ${Y}██████╗  █████╗ ██╗     ██████
 ╚══════╝ ╚═════╝  ╚═════╝ ╚═╝${R}
 ${D}${emptyLine}${R}
-${Y}${catchphrase}${R}
+${D}${catchphrase}${R}
 ${D}${emptyLine}${R}
 ${D}═══════════════════════════════════════════════════════════${R}
  ${C}Ralph Wiggum Loop${R} ${D}・${R} Long-running AI agents

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@pageai/ralph-loop",
-  "version": "1.0.1",
+  "version": "1.2.0",
   "publishConfig": {
     "access": "public"
   },
@@ -61,5 +61,8 @@
   "bugs": {
     "url": "https://github.com/pageai-pro/ralph-loop/issues"
   },
-  "homepage": "https://github.com/pageai-pro/ralph-loop#readme"
+  "homepage": "https://github.com/pageai-pro/ralph-loop#readme",
+  "dependencies": {
+    "@clack/prompts": "^1.0.0"
+  }
 }

package/ralph.sh CHANGED

@@ -98,7 +98,7 @@ $(cat $SCRIPT_DIR/.agent/PROMPT.md)"
   export PROMPT_CONTENT
   export DOCKER_DEFAULT_PLATFORM=linux/amd64 # Needed for Playwright.
-  script -q "$OUTPUT_FILE" bash -c 'docker sandbox run --credentials host claude --model opus --output-format stream-json --verbose -p "$PROMPT_CONTENT"' >/dev/null 2>&1 &
+  script -q "$OUTPUT_FILE" bash -c 'docker sandbox run claude . -- --model opus --output-format stream-json --verbose -p "$PROMPT_CONTENT"' >/dev/null 2>&1 &
   AGENT_PID=$!
   # Track position in output file for incremental reading
@@ -176,7 +176,7 @@ $(cat $SCRIPT_DIR/.agent/PROMPT.md)"
     echo -e "  Invalid API key. Please authenticate inside the Docker sandbox."
     echo -e ""
     echo -e "  Run the following command and follow the login instructions:"
-    echo -e "  ${C}docker sandbox run --credentials host claude${R}"
+    echo -e "  ${C}docker sandbox run claude . --${R}"
     echo -e "${RD}░░▒▒▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▒▒░░${R}"
     rm -f "$OUTPUT_FILE" "$FULL_OUTPUT_FILE"
     exit $EXIT_AUTH_ERROR

package/scripts/assets/playwright.config.ts ADDED

@@ -0,0 +1,27 @@
+import { defineConfig, devices } from '@playwright/test';
+/**
+ * See https://playwright.dev/docs/test-configuration.
+ */
+export default defineConfig({
+  testDir: './tests',
+  fullyParallel: true,
+  globalTimeout: 30 * 60 * 1000,
+  forbidOnly: !!process.env.CI,
+  retries: process.env.CI ? 2 : 1,
+  workers: process.env.CI ? 3 : 6,
+  reporter: 'html',
+  use: {
+    baseURL: 'http://localhost:3000',
+    trace: 'on-first-retry',
+  },
+  // NB: only chromium will run in Docker (arm64).
+  projects: [
+    {
+      name: 'chromium',
+      use: { ...devices['Desktop Chrome'] },
+    }
+  ],
+});

package/scripts/lib/timing.sh CHANGED

@@ -3,13 +3,14 @@
 # Step timing tracking and duration formatting
 # Dependencies: constants.sh
 #
-# Steps: Thinking, Reading code, Implementing, Writing tests, Testing, Linting,
-#        Typechecking, Committing
+# Steps: Thinking, Planning, Reading code, Web research, Implementing, Debugging,
+#        Writing tests, Testing, Linting, Typechecking, Installing, Verifying,
+#        Waiting, Committing
 # Step timing tracking (using indexed arrays for bash 3.x compatibility)
-STEP_NAMES=("Thinking" "Reading code" "Implementing" "Writing tests" "Testing" "Linting" "Typechecking" "Committing")
-ITERATION_STEP_VALUES=(0 0 0 0 0 0 0 0)   # Step times for current iteration
-SESSION_STEP_VALUES=(0 0 0 0 0 0 0 0)     # Accumulated step times across all iterations
+STEP_NAMES=("Thinking" "Planning" "Reading code" "Web research" "Implementing" "Debugging" "Writing tests" "Testing" "Linting" "Typechecking" "Installing" "Verifying" "Waiting" "Committing")
+ITERATION_STEP_VALUES=(0 0 0 0 0 0 0 0 0 0 0 0 0 0)   # Step times for current iteration
+SESSION_STEP_VALUES=(0 0 0 0 0 0 0 0 0 0 0 0 0 0)     # Accumulated step times across all iterations
 CURRENT_STEP_NAME=""              # Name of current step being timed
 CURRENT_STEP_START=0              # Timestamp when current step started
@@ -17,13 +18,19 @@ CURRENT_STEP_START=0              # Timestamp when current step started
 get_step_emoji() {
   case "$1" in
     "Thinking") echo "🤔" ;;
+    "Planning") echo "🗺️" ;;
     "Reading code") echo "📖" ;;
+    "Web research") echo "🌐" ;;
     "Implementing") echo "⚡" ;;
+    "Debugging") echo "🐛" ;;
     "Writing tests") echo "✍️" ;;
     "Testing") echo "🧪" ;;
     "Linting") echo "🧹" ;;
     "Typechecking") echo "📝" ;;
-    "Committing") echo "📦" ;;
+    "Installing") echo "📦" ;;
+    "Verifying") echo "✅" ;;
+    "Waiting") echo "⏳" ;;
+    "Committing") echo "🚀" ;;
     *) echo "" ;;
   esac
 }
@@ -180,7 +187,7 @@ format_step_times() {
 # Usage: init_iteration_step_times
 init_iteration_step_times() {
   # Clear iteration step times (reset all to 0)
-  ITERATION_STEP_VALUES=(0 0 0 0 0 0 0 0)
+  ITERATION_STEP_VALUES=(0 0 0 0 0 0 0 0 0 0 0 0 0 0)
   # Start timing with "Thinking" as default initial step
   CURRENT_STEP_NAME="Thinking"
@@ -198,25 +205,122 @@ display_session_step_totals() {
 }
 # Detect current step from output line
-# Returns: step name based on output patterns
+# Returns: step name based on output patterns (exact match to STEP_NAMES)
+# Priority: Implementing first, then other steps in order of specificity
 detect_step() {
   local line="$1"
-  # Check patterns in order of specificity
-  if echo "$line" | grep -qiE "(git commit|committing)"; then
-    echo "Committing       "
-  elif echo "$line" | grep -qiE "(npm test|jest|vitest|testing|test.*pass|test.*fail)"; then
-    echo "Testing          "
-  elif echo "$line" | grep -qiE "(eslint|lint|prettier|formatting)"; then
-    echo "Linting          "
-  elif echo "$line" | grep -qiE "(npm run typecheck|tsc|typescript|typecheck)"; then
-    echo "Typechecking     "
-  elif echo "$line" | grep -qiE "(\.test\.|\.spec\.|test file|writing test)"; then
-    echo "Writing tests    "
-  elif echo "$line" | grep -qiE "(write|edit|creating|updating|modifying).*\.(ts|js|sh|json|md)"; then
-    echo "Implementing     "
-  elif echo "$line" | grep -qiE "(read|glob|grep|searching|finding|looking)"; then
-    echo "Reading code     "
+  # IMPLEMENTING - highest priority (includes building)
+  # Tool-based: Write, Edit tools with file paths
+  # Natural: creating, writing, editing, modifying, updating files
+  # Building: npm run build, vite build, compiling, bundling
+  if echo "$line" | grep -qiE "(Write|Edit).*file_path"; then
+    echo "Implementing"
+  elif echo "$line" | grep -qiE "(creating|writing new|editing|modifying|updating|changing).*\.(ts|tsx|js|jsx|sh|py|go|rs|json|yaml|yml|toml|css|scss|html)"; then
+    echo "Implementing"
+  elif echo "$line" | grep -qiE "(npm|yarn|pnpm|bun) run build"; then
+    echo "Implementing"
+  elif echo "$line" | grep -qiE "(vite|webpack|esbuild|rollup|turbo|tsc) build"; then
+    echo "Implementing"
+  elif echo "$line" | grep -qiE "(compiling|bundling|transpiling)"; then
+    echo "Implementing"
+  # COMMITTING - git operations
+  elif echo "$line" | grep -qiE "(git commit|git add|committing|staged for commit)"; then
+    echo "Committing"
+  # TESTING - running test suites
+  elif echo "$line" | grep -qiE "(npm|yarn|pnpm) (run )?(test|e2e|spec)"; then
+    echo "Testing"
+  elif echo "$line" | grep -qiE "(jest|vitest|playwright|cypress|mocha|pytest)"; then
+    echo "Testing"
+  elif echo "$line" | grep -qiE "(test|spec).*(pass|fail|skip|pending)"; then
+    echo "Testing"
+  elif echo "$line" | grep -qiE "(running|executing) (tests|test suite)"; then
+    echo "Testing"
+  # DEBUGGING - investigating errors, fixing issues
+  elif echo "$line" | grep -qiE "(the|this) (error|issue|problem|bug) (is|seems|appears|was)"; then
+    echo "Debugging"
+  elif echo "$line" | grep -qiE "(investigating|debugging|diagnosing|troubleshooting)"; then
+    echo "Debugging"
+  elif echo "$line" | grep -qiE "(fails|failed|failing|broken) because"; then
+    echo "Debugging"
+  elif echo "$line" | grep -qiE "(let me|i'll) (check|see|figure out|understand) why"; then
+    echo "Debugging"
+  elif echo "$line" | grep -qiE "(root cause|stack trace|traceback|exception)"; then
+    echo "Debugging"
+  # LINTING - code style and formatting
+  elif echo "$line" | grep -qiE "(eslint|biome|lint|prettier|formatting|stylelint)"; then
+    echo "Linting"
+  # TYPECHECKING - type validation
+  elif echo "$line" | grep -qiE "(npm run typecheck|tsc|typescript|type.?check|mypy|pyright)"; then
+    echo "Typechecking"
+  # WRITING TESTS - creating test files
+  elif echo "$line" | grep -qiE "(\.test\.|\.spec\.|test file|writing test|adding test)"; then
+    echo "Writing tests"
+  elif echo "$line" | grep -qiE "(creating|writing).*(test|spec)"; then
+    echo "Writing tests"
+  # INSTALLING - package/dependency management
+  elif echo "$line" | grep -qiE "(npm|yarn|pnpm|bun) (install|add|i )"; then
+    echo "Installing"
+  elif echo "$line" | grep -qiE "(pip|poetry|cargo|go get|brew) install"; then
+    echo "Installing"
+  elif echo "$line" | grep -qiE "(installing|adding|updating) (dependency|dependencies|package)"; then
+    echo "Installing"
+  # WEB RESEARCH - fetching docs, searching web
+  elif echo "$line" | grep -qiE "(WebFetch|WebSearch)"; then
+    echo "Web research"
+  elif echo "$line" | grep -qiE "(fetching|looking up|searching).*(docs|documentation|api|web)"; then
+    echo "Web research"
+  elif echo "$line" | grep -qiE "(let me|i'll) search (for|the web|online)"; then
+    echo "Web research"
+  # VERIFYING - checking work, validation
+  elif echo "$line" | grep -qiE "(verifying|confirming|validating) (the|that|it)"; then
+    echo "Verifying"
+  elif echo "$line" | grep -qiE "(let me|i'll) (verify|confirm|make sure|double.?check)"; then
+    echo "Verifying"
+  elif echo "$line" | grep -qiE "(looks correct|works as expected|successful)"; then
+    echo "Verifying"
+  # WAITING - blocked on user input
+  elif echo "$line" | grep -qiE "(AskUserQuestion|waiting for|blocked on)"; then
+    echo "Waiting"
+  elif echo "$line" | grep -qiE "(need|require|awaiting) (input|clarification|confirmation|response)"; then
+    echo "Waiting"
+  # PLANNING - designing approach
+  elif echo "$line" | grep -qiE "(EnterPlanMode|ExitPlanMode|plan mode)"; then
+    echo "Planning"
+  elif echo "$line" | grep -qiE "(let me|i'll|i need to) (plan|outline|design|architect)"; then
+    echo "Planning"
+  elif echo "$line" | grep -qiE "(my|the) (approach|strategy|plan) (is|will be)"; then
+    echo "Planning"
+  elif echo "$line" | grep -qiE "(step [0-9]|first,|second,|finally,).*(i should|i need|i will|we need)"; then
+    echo "Planning"
+  # READING CODE - exploring codebase
+  elif echo "$line" | grep -qiE "Read.*file_path|Glob.*pattern|Grep.*pattern"; then
+    echo "Reading code"
+  elif echo "$line" | grep -qiE "(let me|i'll) (read|examine|look at|inspect|check) (the|this)"; then
+    echo "Reading code"
+  elif echo "$line" | grep -qiE "(searching|finding|looking) (for|at|through)"; then
+    echo "Reading code"
+  elif echo "$line" | grep -qiE "(reviewing|scanning|analyzing) (the )?(code|file|implementation|codebase)"; then
+    echo "Reading code"
+  # THINKING - deliberation, analysis (lowest priority - default fallback category)
+  elif echo "$line" | grep -qiE "(let me|i need to|i'll) (think|consider|analyze|understand)"; then
+    echo "Thinking"
+  elif echo "$line" | grep -qiE "(hmm|interesting|notably|this suggests|the question is)"; then
+    echo "Thinking"
   else
     echo ""
   fi
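
Note on the hunk above: the rewritten `detect_step` is a first-match-wins classifier, so the order of the `elif` branches decides which step a line is attributed to. It also returns exact step names, where 1.0.1 padded them with trailing spaces. The sketch below renders the same idea in JavaScript for illustration only. `STEP_RULES` and `detectStep` are hypothetical names, and only a subset of the shell patterns is reproduced.

```javascript
// Hypothetical JavaScript rendering of the first-match-wins idea behind
// detect_step. Only a subset of the shell patterns is reproduced here.
const STEP_RULES = [
  ['Implementing', /(Write|Edit).*file_path/i],
  ['Implementing', /(npm|yarn|pnpm|bun) run build/i],
  ['Committing', /(git commit|git add|committing|staged for commit)/i],
  ['Testing', /(jest|vitest|playwright|cypress|mocha|pytest)/i],
  ['Linting', /(eslint|biome|lint|prettier|formatting|stylelint)/i],
  ['Typechecking', /(npm run typecheck|tsc|typescript|type.?check|mypy|pyright)/i],
  ['Reading code', /Read.*file_path|Glob.*pattern|Grep.*pattern/i],
];

// Return the name of the first rule whose pattern matches, or '' when no
// rule applies (the same empty-string fallback as the shell function).
function detectStep(line) {
  const rule = STEP_RULES.find(([, pattern]) => pattern.test(line));
  return rule ? rule[0] : '';
}

console.log(detectStep('Running npx vitest run'));
console.log(detectStep('git commit -m "feat: add login" after vitest passed'));
console.log(detectStep('Nothing recognisable here'));
```

The second sample line mentions both a commit and a test runner; it is classified as `Committing` because that rule comes earlier in the list. Keeping the rules in an ordered array makes such precedence visible and easy to rearrange.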