agentic-loop 3.22.1 → 3.23.0

Files changed (7)

package/.claude/skills/my-dna/SKILL.md +3 -1
package/.claude/skills/prd/SKILL.md +72 -1
package/README.md +12 -5
package/package.json +1 -1
package/ralph/prd-check.sh +31 -0
package/ralph/setup.sh +138 -12
package/templates/signs.json +7 -0

package/.claude/skills/my-dna/SKILL.md

@@ -95,8 +95,10 @@ Use a marker `<!-- my-dna -->` to identify the section. If marker exists, replac
 ### Core Values
 - [List their selected values]
-### Voice
+### Writing Style (responses and all file content)
 [Their style + any notes from writing sample]
+- Never use em dashes. Use commas, periods, or parentheses instead.
+Apply this style to everything: responses, code comments, docs, page copy, commit messages, and any content written to files.
 ### Project
 - **Priority:** [ship it / solid / beautiful / scale]

package/.claude/skills/prd/SKILL.md

@@ -95,6 +95,26 @@ If user chooses **'append'**:
 - **Always use TASK- prefix** for new stories (e.g., if highest is US-005 or TASK-005, new stories start at TASK-006)
 - New stories will be added after existing ones
+### Step 3.5: Read Existing Test Infrastructure
+Before writing stories, discover the project's existing test setup so stories reference real fixtures, helpers, and patterns:
+```bash
+# Find test config and fixtures
+ls tests/conftest.py tests/fixtures/ src/__tests__/ e2e/ 2>/dev/null
+cat tests/conftest.py 2>/dev/null | head -50
+cat e2e/*.config.ts 2>/dev/null | head -30
+# Find existing test patterns
+grep -r "def test_\|async def test_\|it(\|describe(" tests/ src/__tests__/ e2e/ 2>/dev/null | head -20
+```
+Use what you find to:
+- Reference correct fixture names in story `notes` (e.g., "Use `db_session` and `client` fixtures from `conftest.py`")
+- Match existing test file organization (e.g., `tests/domains/auth/` not `tests/test_auth.py`)
+- Include specific test scenarios in `notes` based on patterns you see in existing tests
+- Reference real helpers (e.g., "Use `MockRequest` from `test_auth.py` for request mocking")
 ### Step 4: Split into Stories
 Break the idea into small, executable stories:
@@ -162,6 +182,28 @@ Does acceptanceCriteria include:
 - Does `constraints` include any rules this story must follow?
 - For frontend: Is `testUrl` set?
 - For frontend: Is `mcp` set to `["playwright", "devtools"]`?
+- For frontend: Does `notes` include Playwright MCP visual verification instructions? (See "Playwright MCP for Visual Verification" section below)
+#### 6f. E2E Coverage (MANDATORY for user-facing features)
+If the feature has ANY frontend stories that add or modify user-facing UI:
+- There MUST be at least one story with `"e2e"` in its `testing.types`
+- That story MUST have Playwright test files in `testing.files.e2e`
+- That story's `testSteps` MUST include `npx playwright test ...`
+- The E2E story should be the LAST story (depends on all others) to test the full integrated flow
+- If no E2E story exists, CREATE one as the final story
+#### 6g. Test Scenario Specificity
+Every story's `notes` field MUST include **3+ specific test scenarios** that describe what to test and how. Vague notes like "Test the service methods" are not acceptable.
+Good example:
+```
+"notes": "Test scenarios: (1) Exchange valid auth code → returns JWT with correct claims. (2) Exchange expired code → returns 401 with 'code_expired' error. (3) Exchange code with wrong redirect_uri → returns 400. (4) Verify nonce mismatch is rejected. Use existing test fixtures: db_session, client from conftest.py."
+```
+Bad example:
+```
+"notes": "Test the authentication service methods with proper mocking."
+```
 **Fix any issues you find:**
@@ -175,8 +217,12 @@ Does acceptanceCriteria include:
 | Frontend missing contextFiles | Add idea file + styleguide paths |
 | Frontend missing testUrl | Add URL from config |
 | Frontend missing mcp | Add `"mcp": ["playwright", "devtools"]` |
+| Frontend notes missing Playwright MCP guidance | Add visual verification instructions to notes (see Playwright MCP section) |
 | Story missing techStack | Add relevant subset of detected tech |
 | Story missing constraints | Add applicable rules for this story |
+| testSteps use import-checks (`python -c "from X import Y"`) | Replace with curl, pytest, or real behavioral tests |
+| No E2E story for user-facing feature | Add a final E2E story with Playwright tests |
+| Story notes lack specific test scenarios | Add 3+ concrete scenarios with inputs, expected outputs, and fixture references |
 ### Step 7: Reorder if Needed
@@ -583,9 +629,30 @@ Specify which MCP tools Claude should use for verification:
 | `devtools` | Console errors, network inspection, DOM debugging |
 | `postgres` | Database verification (future) |
-**Frontend stories** default to `["playwright", "devtools"]`.
+**Frontend stories** MUST have `"mcp": ["playwright", "devtools"]`.
 **Backend-only stories** can use `[]` or omit.
+### Playwright MCP for Visual Verification
+Frontend stories should include guidance in `notes` for using Playwright MCP during implementation. This is how Ralph visually verifies that UI changes actually render correctly — screenshots catch layout bugs, missing elements, and broken styles that unit tests miss.
+**Every frontend story's `notes` should include Playwright MCP instructions like:**
+```
+Use Playwright MCP to verify:
+1. Navigate to {testUrl} and take a screenshot
+2. Verify [specific element] is visible and correctly styled
+3. Click [interactive element] and verify [expected behavior]
+4. Check browser console for errors after interactions
+```
+**Example for a login page SSO button story:**
+```json
+"notes": "Use Playwright MCP to verify: navigate to /login, screenshot the page, confirm 'Sign in with Okta' button is visible below the email/password form with a divider. Click the button and verify it redirects to /api/v1/auth/okta/authorize. Check devtools console for errors."
+```
+This is NOT a replacement for automated Playwright tests — it's additional visual verification that Ralph performs during the implementation step using the MCP browser tools.
 ---
 ## Skills Reference
@@ -697,11 +764,15 @@ Ralph reads `.ralph/config.json` and expands `{config.urls.backend}` before runn
   "grep -q 'function createUser' app/services/user.py",  // ❌ PASSES if code exists, even if broken
   "grep -q 'export default' src/components/Dashboard.tsx", // ❌ PASSES even if component crashes
   "test -f src/api/users.ts",                            // ❌ PASSES if file exists, even if empty
+  "python -c \"from app.services.auth import AuthService\"", // ❌ PASSES if import works, says nothing about behavior
+  "python -c \"hasattr(AuthService, 'login')\"",          // ❌ PASSES if method exists, even if completely broken
   "Visit http://localhost:3000/dashboard",                // ❌ Not executable
   "User can see the dashboard"                            // ❌ Not executable
 ]
 ```
+**NEVER use import-checks (`python -c "from X import Y"` or `hasattr`) as test steps.** These only verify a symbol exists — they don't test behavior, error handling, or integration. A function that raises on every call still passes an import check.
 **NEVER use grep/test to verify behavior.** These will mark stories as PASSED when the feature is broken.
 **If a step can't be automated**, put it in `acceptanceCriteria` instead. Claude will verify it visually using MCP tools.
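
The difference between an existence check and a behavioral check can be shown in a few lines of shell. This is a minimal sketch, not part of the package: the `broken_tool.sh` script and the temporary directory are hypothetical stand-ins for a feature that exists on disk but fails every time it runs.

```shell
#!/usr/bin/env bash
# Hypothetical illustration: existence checks vs. behavioral checks.
workdir=$(mktemp -d)

# A "feature" that exists on disk but fails on every invocation.
cat > "$workdir/broken_tool.sh" << 'EOF'
#!/usr/bin/env bash
exit 1
EOF

# Existence check: passes because the file is there.
if test -f "$workdir/broken_tool.sh"; then
  existence_result="pass"
else
  existence_result="fail"
fi

# Behavioral check: actually runs the code and inspects the exit status.
if bash "$workdir/broken_tool.sh"; then
  behavior_result="pass"
else
  behavior_result="fail"
fi

echo "existence check: $existence_result"
echo "behavioral check: $behavior_result"
rm -rf "$workdir"
```

The existence check reports `pass` and the behavioral check reports `fail`, which is the gap the new rule against import-checks is meant to close.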

package/README.md

@@ -15,15 +15,18 @@ You describe what you want to build. Claude Code writes a PRD (Product Requireme
 │  TERMINAL 1: Claude CLI                │  TERMINAL 2: Execute                    │
 ├────────────────────────────────────────┼─────────────────────────────────────────┤
 │                                        │                                         │
-│  claude --dangerously-skip-permissions │  npx agentic-loop run                   │
+│  claude                                │                                         │
+│  (--dangerously-skip-permissions)      │  npx agentic-loop run                   │
 │                                        │                                         │
 │  PLAN FEATURES                         │  ┌─ prd-check (once) ───────────────┐   │
 │  /idea 'your feature or bugfix'        │  │ Validate all stories upfront     │   │
 │  → Claude asks questions               │  │ Auto-fix missing test steps      │   │
 │  → Explores codebase                   │  └──────────────────────────────────┘   │
 │  → Generates PRD                       │            ↓                            │
-│                                        │  ┌─ loop (per story) ───────────────┐   │
-│  ENHANCE AS YOU LEARN                  │  │                                  │   │
+│  (using claude code plan mode?)        │                                         │
+│  → use /prd filename.md                │                                         │
+│    on the plan file generated by claude│                                         │
+│  ENHANCE AS YOU LEARN                  │   ──────────────────────────────────    │
 │  → Add signs when Ralph repeats        │  │ Read prd.json → get next story   │   │
 │    the same mistake                    │  │ Load PROMPT.md, signs, config    │   │
 │  → Tune timeouts, retries, checks      │  │ Load last_failure.txt (if retry) │   │
@@ -63,14 +66,18 @@ npx agentic-loop setup
 **Terminal 1 - Claude Code:**
 ```bash
-claude --dangerously-skip-permissions
+claude                   # Optionally add --dangerously-skip-permissions (not required)
 /tour                    # Guided walkthrough (recommended first time)
 /idea 'your feature'     # Generate a PRD
+---
+# Already have an idea/plan/PRD file?
+/prd                     # Run /prd to turn any file into a prd.json
 ```
 **Terminal 2 - Ralph Loop:**
 ```bash
-npx agentic-loop run         # Execute PRDs autonomously
+npx agentic-loop run         # Execute PRDs autonomously (Spins up claude -p)
 npx agentic-loop run --quiet # Same, but suppress activity feed
 ```

package/package.json

@@ -1,6 +1,6 @@
 {
   "name": "agentic-loop",
-  "version": "3.22.1",
+  "version": "3.23.0",
   "description": "Autonomous AI coding loop - PRD-driven development with Claude Code",
   "author": "Allie Jones <allie@allthrive.ai>",
   "license": "MIT",

package/ralph/prd-check.sh

@@ -405,6 +405,20 @@ _check_story_issues() {
     fi
   fi
+  # Import-check anti-pattern: python -c "from X import Y" or hasattr()
+  if [[ -n "$test_steps" ]] && echo "$test_steps" | grep -qE '(python[3]? -c .*(from |import |hasattr))'; then
+    echo "testSteps use import-checks (python -c 'from/import/hasattr') — replace with real behavioral tests"
+  fi
+  # Frontend stories must include Playwright MCP visual verification guidance in notes
+  if [[ "$story_type" == "frontend" ]]; then
+    local story_notes
+    story_notes=$(jq -r --arg id "$story_id" '.stories[] | select(.id==$id) | .notes // ""' "$prd_file")
+    if ! echo "$story_notes" | grep -qiE "(playwright.*mcp|mcp.*playwright|visual.*verif|screenshot|navigate.*screenshot)"; then
+      echo "frontend notes should include Playwright MCP visual verification guidance"
+    fi
+  fi
   # All testSteps are server-dependent
   if [[ -n "$test_steps" ]]; then
     local has_offline=false has_server=false
@@ -443,6 +457,8 @@ _validate_and_fix_stories() {
   local cnt_naming_convention=0 cnt_bare_pytest=0 cnt_bare_python=0
   local cnt_server_only=0
   local cnt_custom=0
+  local cnt_import_check=0
+  local cnt_playwright_notes=0
   echo "  Checking test coverage..."
@@ -508,6 +524,8 @@ _validate_and_fix_stories() {
         "missing pagination criteria") cnt_list_pagination=$((cnt_list_pagination + 1)) ;;
         "API consumer needs camelCase transformation note") cnt_naming_convention=$((cnt_naming_convention + 1)) ;;
         "all testSteps need a live server"*) cnt_server_only=$((cnt_server_only + 1)) ;;
+        "testSteps use import-checks"*) cnt_import_check=$((cnt_import_check + 1)) ;;
+        "frontend notes should include Playwright"*) cnt_playwright_notes=$((cnt_playwright_notes + 1)) ;;
       esac
     done <<< "$shared_issues"
@@ -539,6 +557,17 @@ _validate_and_fix_stories() {
     fi
   done <<< "$story_ids"
+  # Global check: if any frontend stories exist, at least one story should have E2E tests
+  local has_frontend_stories has_e2e_story
+  has_frontend_stories=$(jq -r '[.stories[] | select(.type == "frontend")] | length' "$prd_file" 2>/dev/null)
+  has_e2e_story=$(jq -r '[.stories[] | select(.testing.types[]? == "e2e")] | length' "$prd_file" 2>/dev/null)
+  if [[ "$has_frontend_stories" -gt 0 && "$has_e2e_story" == "0" ]]; then
+    echo ""
+    print_warning "No E2E story found — frontend features should have at least one Playwright E2E story"
+    echo "  Add a final story with testing.types: [\"e2e\"] and Playwright testSteps"
+    echo ""
+  fi
   # If issues found, show summary and attempt fix
   if [[ "$needs_fix" == "true" ]]; then
     echo "  Optimizing test coverage for $story_count stories..."
@@ -558,6 +587,8 @@ _validate_and_fix_stories() {
     [[ $cnt_bare_pytest -gt 0 ]] && echo "    ${cnt_bare_pytest}x use '${py_runner:-python3} pytest' not bare 'pytest'"
     [[ $cnt_bare_python -gt 0 ]] && echo "    ${cnt_bare_python}x use 'python3' not bare 'python' (macOS compatibility)"
     [[ $cnt_server_only -gt 0 ]] && echo "    ${cnt_server_only}x all testSteps need live server (add offline fallback)"
+    [[ $cnt_import_check -gt 0 ]] && echo "    ${cnt_import_check}x testSteps use import-checks (replace with real tests)"
+    [[ $cnt_playwright_notes -gt 0 ]] && echo "    ${cnt_playwright_notes}x frontend: add Playwright MCP visual verification to notes"
     [[ $cnt_custom -gt 0 ]] && echo "    ${cnt_custom} stories with custom check issues"
     # Skip auto-fix in dry-run mode
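
The two grep-based checks added to `_check_story_issues` can be tried in isolation. The sketch below copies the regular expressions from the hunks above into two helper functions. The function names (`is_import_check`, `has_visual_guidance`) and the sample strings are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers wrapping the regexes from the diff above.

# Returns 0 when a testStep looks like an import-check.
is_import_check() {
  printf '%s\n' "$1" | grep -qE '(python[3]? -c .*(from |import |hasattr))'
}

# Returns 0 when frontend notes mention visual verification.
has_visual_guidance() {
  printf '%s\n' "$1" | grep -qiE "(playwright.*mcp|mcp.*playwright|visual.*verif|screenshot|navigate.*screenshot)"
}

steps=(
  'python -c "from app.services.auth import AuthService"'
  "python3 -c \"hasattr(AuthService, 'login')\""
  'pytest tests/test_auth.py -q'
)

for step in "${steps[@]}"; do
  if is_import_check "$step"; then
    echo "flagged: $step"
  else
    echo "ok:      $step"
  fi
done

if has_visual_guidance "Use Playwright MCP to verify: navigate to /login"; then
  echo "notes include visual verification guidance"
fi
```

The first two steps are flagged and the `pytest` step is not. The notes pattern is deliberately loose: any mention of "screenshot" satisfies it, so it only checks that guidance is present, not that it is good.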

package/ralph/setup.sh

@@ -169,6 +169,7 @@ ralph_setup() {
   setup_custom_checks
   setup_gitignore
   setup_claude_hooks "$pkg_root"
+  setup_claude_hud
   setup_slash_commands "$pkg_root"
   setup_claude_md
   setup_mcp
@@ -446,6 +447,106 @@ setup_claude_hooks() {
   echo "  Configured .claude/settings.json"
 }
+# Install claude-hud statusline plugin for Claude Code
+setup_claude_hud() {
+  local hud_dir="$HOME/.claude/plugins/claude-hud"
+  local marketplaces_file="$HOME/.claude/plugins/known_marketplaces.json"
+  # Check dependencies
+  if ! command -v git &>/dev/null; then
+    print_warning "git not found — skipping claude-hud install"
+    return 0
+  fi
+  if ! command -v node &>/dev/null; then
+    print_warning "node not found — skipping claude-hud install"
+    return 0
+  fi
+  if ! command -v jq &>/dev/null; then
+    print_warning "jq not found — skipping claude-hud install"
+    return 0
+  fi
+  echo "Setting up claude-hud statusline..."
+  # Clone if not already present
+  if [[ ! -d "$hud_dir" ]]; then
+    mkdir -p "$HOME/.claude/plugins"
+    if ! git clone --depth 1 https://github.com/jarrodwatts/claude-hud.git "$hud_dir" 2>/dev/null; then
+      print_warning "Failed to clone claude-hud — skipping"
+      return 0
+    fi
+    echo "  Cloned claude-hud plugin"
+  else
+    echo "  claude-hud already installed"
+  fi
+  # Build if dist/index.js is missing
+  if [[ ! -f "$hud_dir/dist/index.js" ]]; then
+    echo "  Building claude-hud..."
+    if ! (cd "$hud_dir" && npm ci --production=false --ignore-scripts 2>/dev/null && npm run build 2>/dev/null); then
+      print_warning "Failed to build claude-hud — statusline may not work"
+      echo "    Try manually: cd $hud_dir && npm ci && npm run build"
+      return 0
+    fi
+    echo "  Built successfully"
+  fi
+  # Register in known_marketplaces.json
+  if [[ -f "$marketplaces_file" ]]; then
+    if ! jq -e '."claude-hud"' "$marketplaces_file" > /dev/null 2>&1; then
+      local tmp
+      tmp=$(mktemp)
+      jq '."claude-hud" = {
+        "source": {
+          "source": "github",
+          "repo": "jarrodwatts/claude-hud"
+        },
+        "installLocation": ($ENV.HOME + "/.claude/plugins/claude-hud"),
+        "lastUpdated": (now | todate)
+      }' "$marketplaces_file" > "$tmp" && mv "$tmp" "$marketplaces_file"
+      echo "  Registered in known_marketplaces.json"
+    fi
+  else
+    mkdir -p "$(dirname "$marketplaces_file")"
+    jq -n '{
+      "claude-hud": {
+        "source": {
+          "source": "github",
+          "repo": "jarrodwatts/claude-hud"
+        },
+        "installLocation": ($ENV.HOME + "/.claude/plugins/claude-hud"),
+        "lastUpdated": (now | todate)
+      }
+    }' > "$marketplaces_file"
+    echo "  Created known_marketplaces.json with claude-hud"
+  fi
+  # Write agentic-loop-tuned config
+  cat > "$hud_dir/config.json" << 'EOF'
+{
+  "lineLayout": "expanded",
+  "pathLevels": 2,
+  "gitStatus": {
+    "enabled": true,
+    "showDirty": true,
+    "showAheadBehind": true
+  },
+  "display": {
+    "showContextBar": true,
+    "contextValue": "percent",
+    "showModel": true,
+    "showDuration": true,
+    "showSpeed": false,
+    "showUsage": true,
+    "showTools": true,
+    "showAgents": true,
+    "showTodos": true
+  }
+}
+EOF
+  echo "  Configured claude-hud for agentic-loop (expanded layout, all panels)"
+}
 # Copy slash commands (skills format for Claude Code)
 setup_slash_commands() {
   local pkg_root="$1"
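
The three dependency guards at the top of `setup_claude_hud` (git, node, jq) share one shape. A loop can express the same check, as in this minimal sketch. The `missing_deps` helper is hypothetical and is not part of `setup.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of required commands that are not on PATH.
missing_deps() {
  local dep
  for dep in "$@"; do
    if ! command -v "$dep" > /dev/null 2>&1; then
      echo "$dep"
    fi
  done
}

# "sh" should exist on any POSIX system; the second name is deliberately bogus.
missing=$(missing_deps sh definitely-not-a-real-command-xyz)
echo "missing: ${missing:-none}"
```

A caller could warn and return early when the result is non-empty, which matches the skip-on-missing behavior in the hunk above.
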
@@ -569,6 +670,8 @@ ${python_runner:+- Python: Use \`$python_runner\` (not bare \`python\`)}
     framework_template="$pkg_root/templates/examples/CLAUDE-${framework_type}.md"
   fi
+  local style_marker="<!-- vibe-and-thrive-detected -->"
   if [[ -f "CLAUDE.md" ]]; then
     # Append framework template if it exists and not already included
     if [[ -n "$framework_template" && -f "$framework_template" ]]; then
@@ -581,6 +684,16 @@ ${python_runner:+- Python: Use \`$python_runner\` (not bare \`python\`)}
         echo "  Appended ${framework_type} conventions to CLAUDE.md"
       fi
     fi
+    # Add writing style defaults if not already present
+    if ! grep -q "$style_marker" "CLAUDE.md" 2>/dev/null; then
+      cat >> CLAUDE.md << EOF
+$style_marker
+## Writing Style
+- Active voice only. Never use passive voice.
+- Never use em dashes. Use commas, periods, or parentheses instead.
+EOF
+    fi
     echo "$detected_section" >> CLAUDE.md
     echo "  Updated CLAUDE.md"
   else
@@ -590,6 +703,11 @@ ${python_runner:+- Python: Use \`$python_runner\` (not bare \`python\`)}
 ## Your Rules
 <!-- Add your project-specific rules, patterns, and conventions here -->
+<!-- vibe-and-thrive-detected -->
+## Writing Style
+- Active voice only. Never use passive voice.
+- Never use em dashes. Use commas, periods, or parentheses instead.
 EOF
     # Include framework template if available
@@ -715,20 +833,28 @@ setup_precommit_hooks() {
   # Create .pre-commit-config.yaml if it doesn't exist
   if [[ ! -f ".pre-commit-config.yaml" ]]; then
     cat > .pre-commit-config.yaml << 'EOF'
+# Pre-commit hooks powered by agentic-loop vibe-check
+# Runs on staged files before each commit
 repos:
-  - repo: https://github.com/allierays/agentic-loop
-    rev: v1.0.0
+  - repo: local
+    hooks:
+      - id: vibe-check
+        name: Vibe Check
+        entry: npx vibe-check --fail-on error
+        language: node
+        types_or: [javascript, ts, python, json]
+        pass_filenames: true
+  # Standard hooks
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.5.0
     hooks:
-      - id: backup-db
-        name: Backup database before commit
-      - id: check-secrets
-        name: Check for hardcoded secrets
-      - id: check-hardcoded-urls
-        name: Check for hardcoded URLs
-      - id: check-debug
-        name: Check for debug statements
-      - id: check-signs-secrets
-        name: Check signs.json for credentials
+      - id: trailing-whitespace
+        exclude: '\.txt$'
+      - id: end-of-file-fixer
+        exclude: '\.txt$'
+      - id: check-yaml
+      - id: check-added-large-files
 EOF
     echo "  Created .pre-commit-config.yaml"
   else
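
The Writing Style block in the `CLAUDE.md` hunk above is guarded by a marker so that repeated setup runs do not append it twice. A minimal sketch of that pattern follows, assuming a hypothetical `append_once` helper and a temporary file standing in for `CLAUDE.md`. Unlike the diff, which calls `grep -q`, this sketch uses `grep -F` so the marker is matched as a literal string.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a block to a file only if its marker is absent.
append_once() {
  local file="$1" marker="$2" body="$3"
  # -F treats the marker as a literal string, not a regex.
  if ! grep -qF -- "$marker" "$file" 2>/dev/null; then
    printf '%s\n%s\n' "$marker" "$body" >> "$file"
  fi
}

target=$(mktemp)
marker="<!-- vibe-and-thrive-detected -->"
body="## Writing Style
- Never use em dashes. Use commas, periods, or parentheses instead."

# Running the helper twice still leaves exactly one marker in the file.
append_once "$target" "$marker" "$body"
append_once "$target" "$marker" "$body"
marker_count=$(grep -cF -- "$marker" "$target")
echo "marker occurrences: $marker_count"
rm -f "$target"
```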

package/templates/signs.json

@@ -48,6 +48,13 @@
       "category": "testing",
       "learnedFrom": null,
       "createdAt": "2026-01-25T00:00:00-08:00"
+    },
+    {
+      "id": "sign-008",
+      "pattern": "NEVER use import-checks as test steps (python -c 'from X import Y', hasattr, test -f). These only verify a symbol exists, not that it works. A function that raises on every call still passes an import check. Always use real behavioral tests: curl for API endpoints, pytest with actual function calls, or Playwright for UI.",
+      "category": "testing",
+      "learnedFrom": "Okta SSO PRD review — all 4 backend stories had import-check testSteps that would pass even with completely broken code",
+      "createdAt": "2026-02-17T00:00:00-08:00"
     }
   ]
 }