agentic-loop 3.22.1 → 3.23.0

@@ -95,8 +95,10 @@ Use a marker `<!-- my-dna -->` to identify the section. If marker exists, replac
95
95
  ### Core Values
96
96
  - [List their selected values]
97
97
 
98
- ### Voice
98
+ ### Writing Style (responses and all file content)
99
99
  [Their style + any notes from writing sample]
100
+ - Never use em dashes. Use commas, periods, or parentheses instead.
101
+ Apply this style to everything: responses, code comments, docs, page copy, commit messages, and any content written to files.
100
102
 
101
103
  ### Project
102
104
  - **Priority:** [ship it / solid / beautiful / scale]
@@ -95,6 +95,26 @@ If user chooses **'append'**:
95
95
  - **Always use TASK- prefix** for new stories (e.g., if highest is US-005 or TASK-005, new stories start at TASK-006)
96
96
  - New stories will be added after existing ones
97
97
 
98
+ ### Step 3.5: Read Existing Test Infrastructure
99
+
100
+ Before writing stories, discover the project's existing test setup so stories reference real fixtures, helpers, and patterns:
101
+
102
+ ```bash
103
+ # Find test config and fixtures
104
+ ls tests/conftest.py tests/fixtures/ src/__tests__/ e2e/ 2>/dev/null
105
+ cat tests/conftest.py 2>/dev/null | head -50
106
+ cat e2e/*.config.ts 2>/dev/null | head -30
107
+
108
+ # Find existing test patterns
109
+ grep -r "def test_\|async def test_\|it(\|describe(" tests/ src/__tests__/ e2e/ 2>/dev/null | head -20
110
+ ```
111
+
112
+ Use what you find to:
113
+ - Reference correct fixture names in story `notes` (e.g., "Use `db_session` and `client` fixtures from `conftest.py`")
114
+ - Match existing test file organization (e.g., `tests/domains/auth/` not `tests/test_auth.py`)
115
+ - Include specific test scenarios in `notes` based on patterns you see in existing tests
116
+ - Reference real helpers (e.g., "Use `MockRequest` from `test_auth.py` for request mocking")
117
+
98
118
  ### Step 4: Split into Stories
99
119
 
100
120
  Break the idea into small, executable stories:
@@ -162,6 +182,28 @@ Does acceptanceCriteria include:
162
182
  - Does `constraints` include any rules this story must follow?
163
183
  - For frontend: Is `testUrl` set?
164
184
  - For frontend: Is `mcp` set to `["playwright", "devtools"]`?
185
+ - For frontend: Does `notes` include Playwright MCP visual verification instructions? (See "Playwright MCP for Visual Verification" section below)
186
+
187
+ #### 6f. E2E Coverage (MANDATORY for user-facing features)
188
+ If the feature has ANY frontend stories that add or modify user-facing UI:
189
+ - There MUST be at least one story with `"e2e"` in its `testing.types`
190
+ - That story MUST have Playwright test files in `testing.files.e2e`
191
+ - That story's `testSteps` MUST include `npx playwright test ...`
192
+ - The E2E story should be the LAST story (depends on all others) to test the full integrated flow
193
+ - If no E2E story exists, CREATE one as the final story
194
+
195
+ #### 6g. Test Scenario Specificity
196
+ Every story's `notes` field MUST include **3+ specific test scenarios** that describe what to test and how. Vague notes like "Test the service methods" are not acceptable.
197
+
198
+ Good example:
199
+ ```
200
+ "notes": "Test scenarios: (1) Exchange valid auth code → returns JWT with correct claims. (2) Exchange expired code → returns 401 with 'code_expired' error. (3) Exchange code with wrong redirect_uri → returns 400. (4) Verify nonce mismatch is rejected. Use existing test fixtures: db_session, client from conftest.py."
201
+ ```
202
+
203
+ Bad example:
204
+ ```
205
+ "notes": "Test the authentication service methods with proper mocking."
206
+ ```
165
207
 
166
208
  **Fix any issues you find:**
167
209
 
@@ -175,8 +217,12 @@ Does acceptanceCriteria include:
175
217
  | Frontend missing contextFiles | Add idea file + styleguide paths |
176
218
  | Frontend missing testUrl | Add URL from config |
177
219
  | Frontend missing mcp | Add `"mcp": ["playwright", "devtools"]` |
220
+ | Frontend notes missing Playwright MCP guidance | Add visual verification instructions to notes (see Playwright MCP section) |
178
221
  | Story missing techStack | Add relevant subset of detected tech |
179
222
  | Story missing constraints | Add applicable rules for this story |
223
+ | testSteps use import-checks (`python -c "from X import Y"`) | Replace with curl, pytest, or real behavioral tests |
224
+ | No E2E story for user-facing feature | Add a final E2E story with Playwright tests |
225
+ | Story notes lack specific test scenarios | Add 3+ concrete scenarios with inputs, expected outputs, and fixture references |
180
226
 
181
227
  ### Step 7: Reorder if Needed
182
228
 
@@ -583,9 +629,30 @@ Specify which MCP tools Claude should use for verification:
583
629
  | `devtools` | Console errors, network inspection, DOM debugging |
584
630
  | `postgres` | Database verification (future) |
585
631
 
586
- **Frontend stories** default to `["playwright", "devtools"]`.
632
+ **Frontend stories** MUST have `"mcp": ["playwright", "devtools"]`.
587
633
  **Backend-only stories** can use `[]` or omit.
588
634
 
635
+ ### Playwright MCP for Visual Verification
636
+
637
+ Frontend stories should include guidance in `notes` for using Playwright MCP during implementation. This is how Ralph visually verifies that UI changes actually render correctly — screenshots catch layout bugs, missing elements, and broken styles that unit tests miss.
638
+
639
+ **Every frontend story's `notes` should include Playwright MCP instructions like:**
640
+
641
+ ```
642
+ Use Playwright MCP to verify:
643
+ 1. Navigate to {testUrl} and take a screenshot
644
+ 2. Verify [specific element] is visible and correctly styled
645
+ 3. Click [interactive element] and verify [expected behavior]
646
+ 4. Check browser console for errors after interactions
647
+ ```
648
+
649
+ **Example for a login page SSO button story:**
650
+ ```json
651
+ "notes": "Use Playwright MCP to verify: navigate to /login, screenshot the page, confirm 'Sign in with Okta' button is visible below the email/password form with a divider. Click the button and verify it redirects to /api/v1/auth/okta/authorize. Check devtools console for errors."
652
+ ```
653
+
654
+ This is NOT a replacement for automated Playwright tests — it's additional visual verification that Ralph performs during the implementation step using the MCP browser tools.
655
+
589
656
  ---
590
657
 
591
658
  ## Skills Reference
@@ -697,11 +764,15 @@ Ralph reads `.ralph/config.json` and expands `{config.urls.backend}` before runn
697
764
  "grep -q 'function createUser' app/services/user.py", // ❌ PASSES if code exists, even if broken
698
765
  "grep -q 'export default' src/components/Dashboard.tsx", // ❌ PASSES even if component crashes
699
766
  "test -f src/api/users.ts", // ❌ PASSES if file exists, even if empty
767
+ "python -c \"from app.services.auth import AuthService\"", // ❌ PASSES if import works, says nothing about behavior
768
+ "python -c \"hasattr(AuthService, 'login')\"", // ❌ PASSES if method exists, even if completely broken
700
769
  "Visit http://localhost:3000/dashboard", // ❌ Not executable
701
770
  "User can see the dashboard" // ❌ Not executable
702
771
  ]
703
772
  ```
704
773
 
774
+ **NEVER use import-checks (`python -c "from X import Y"` or `hasattr`) as test steps.** These only verify a symbol exists — they don't test behavior, error handling, or integration. A function that raises on every call still passes an import check.
775
+
705
776
  **NEVER use grep/test to verify behavior.** These will mark stories as PASSED when the feature is broken.
706
777
 
707
778
  **If a step can't be automated**, put it in `acceptanceCriteria` instead. Claude will verify it visually using MCP tools.
package/README.md CHANGED
@@ -15,15 +15,18 @@ You describe what you want to build. Claude Code writes a PRD (Product Requireme
15
15
  │ TERMINAL 1: Claude CLI │ TERMINAL 2: Execute │
16
16
  ├────────────────────────────────────────┼─────────────────────────────────────────┤
17
17
  │ │ │
18
- │ claude --dangerously-skip-permissions npx agentic-loop run
18
+ │ claude
19
+ |(--dangerously-skip-permissions) │ npx agentic-loop run │
19
20
  │ │ │
20
21
  │ PLAN FEATURES │ ┌─ prd-check (once) ───────────────┐ │
21
22
  │ /idea 'your feature or bugfix' │ │ Validate all stories upfront │ │
22
23
  │ → Claude asks questions │ │ Auto-fix missing test steps │ │
23
24
  │ → Explores codebase │ └──────────────────────────────────┘ │
24
25
  │ → Generates PRD │ ↓ │
25
- │ │ ┌─ loop (per story) ───────────────┐
26
- ENHANCE AS YOU LEARN │ │ │
26
+ | (using claude code plan mode?)
27
+ | use /prd filename.md
28
+ | on the plan file generated by claude|
29
+ │ ENHANCE AS YOU LEARN │ ────────────────────────────────── │
27
30
  │ → Add signs when Ralph repeats │ │ Read prd.json → get next story │ │
28
31
  │ the same mistake │ │ Load PROMPT.md, signs, config │ │
29
32
  │ → Tune timeouts, retries, checks │ │ Load last_failure.txt (if retry) │ │
@@ -63,14 +66,18 @@ npx agentic-loop setup
63
66
 
64
67
  **Terminal 1 - Claude Code:**
65
68
  ```bash
66
- claude --dangerously-skip-permissions
69
+ claude # optionally add --dangerously-skip-permissions (not required)
67
70
  /tour # Guided walkthrough (recommended first time)
68
71
  /idea 'your feature' # Generate a PRD
72
+
73
+ ---
74
+ # Already have an idea/plan/PRD file?
75
+ /prd # Run the /prd command to turn any file into a prd.json
69
76
  ```
70
77
 
71
78
  **Terminal 2 - Ralph Loop:**
72
79
  ```bash
73
- npx agentic-loop run # Execute PRDs autonomously
80
+ npx agentic-loop run # Execute PRDs autonomously (Spins up claude -p)
74
81
  npx agentic-loop run --quiet # Same, but suppress activity feed
75
82
  ```
76
83
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "agentic-loop",
3
- "version": "3.22.1",
3
+ "version": "3.23.0",
4
4
  "description": "Autonomous AI coding loop - PRD-driven development with Claude Code",
5
5
  "author": "Allie Jones <allie@allthrive.ai>",
6
6
  "license": "MIT",
@@ -405,6 +405,20 @@ _check_story_issues() {
405
405
  fi
406
406
  fi
407
407
 
408
+ # Import-check anti-pattern: python -c "from X import Y" or hasattr()
409
+ if [[ -n "$test_steps" ]] && echo "$test_steps" | grep -qE '(python[3]? -c .*(from |import |hasattr))'; then
410
+ echo "testSteps use import-checks (python -c 'from/import/hasattr') — replace with real behavioral tests"
411
+ fi
412
+
413
+ # Frontend stories must include Playwright MCP visual verification guidance in notes
414
+ if [[ "$story_type" == "frontend" ]]; then
415
+ local story_notes
416
+ story_notes=$(jq -r --arg id "$story_id" '.stories[] | select(.id==$id) | .notes // ""' "$prd_file")
417
+ if ! echo "$story_notes" | grep -qiE "(playwright.*mcp|mcp.*playwright|visual.*verif|screenshot|navigate.*screenshot)"; then
418
+ echo "frontend notes should include Playwright MCP visual verification guidance"
419
+ fi
420
+ fi
421
+
408
422
  # All testSteps are server-dependent
409
423
  if [[ -n "$test_steps" ]]; then
410
424
  local has_offline=false has_server=false
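
As a rough illustration of the import-check detection added in the hunk above, the following sketch runs the same `grep -E` pattern against a few sample test steps. Only the regular expression is copied from the diff. The `is_import_check` wrapper, the sample steps, the URL, and the test path are hypothetical and not part of the package.

```bash
#!/usr/bin/env bash

# Hypothetical wrapper around the pattern from the diff.
# Returns 0 when a test step looks like an import-check.
is_import_check() {
  printf '%s\n' "$1" | grep -qE '(python[3]? -c .*(from |import |hasattr))'
}

# Sample steps (illustrative only).
steps=(
  'python -c "from app.services.auth import AuthService"'
  "python3 -c \"hasattr(AuthService, 'login')\""
  'curl -sf http://localhost:8000/health'
  'python3 -m pytest tests/test_auth.py -q'
)

for step in "${steps[@]}"; do
  if is_import_check "$step"; then
    echo "FLAGGED: $step"
  else
    echo "ok:      $step"
  fi
done
```

Under this pattern, the first two steps are flagged and the last two are not. Note that any `python -c` one-liner containing `import ` matches, including one that goes on to call the imported code, so a behavioral one-liner would likely be reported as well and may need to move into a pytest file.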
@@ -443,6 +457,8 @@ _validate_and_fix_stories() {
443
457
  local cnt_naming_convention=0 cnt_bare_pytest=0 cnt_bare_python=0
444
458
  local cnt_server_only=0
445
459
  local cnt_custom=0
460
+ local cnt_import_check=0
461
+ local cnt_playwright_notes=0
446
462
 
447
463
  echo " Checking test coverage..."
448
464
 
@@ -508,6 +524,8 @@ _validate_and_fix_stories() {
508
524
  "missing pagination criteria") cnt_list_pagination=$((cnt_list_pagination + 1)) ;;
509
525
  "API consumer needs camelCase transformation note") cnt_naming_convention=$((cnt_naming_convention + 1)) ;;
510
526
  "all testSteps need a live server"*) cnt_server_only=$((cnt_server_only + 1)) ;;
527
+ "testSteps use import-checks"*) cnt_import_check=$((cnt_import_check + 1)) ;;
528
+ "frontend notes should include Playwright"*) cnt_playwright_notes=$((cnt_playwright_notes + 1)) ;;
511
529
  esac
512
530
  done <<< "$shared_issues"
513
531
 
@@ -539,6 +557,17 @@ _validate_and_fix_stories() {
539
557
  fi
540
558
  done <<< "$story_ids"
541
559
 
560
+ # Global check: if any frontend stories exist, at least one story should have E2E tests
561
+ local has_frontend_stories has_e2e_story
562
+ has_frontend_stories=$(jq -r '[.stories[] | select(.type == "frontend")] | length' "$prd_file" 2>/dev/null)
563
+ has_e2e_story=$(jq -r '[.stories[] | select(.testing.types[]? == "e2e")] | length' "$prd_file" 2>/dev/null)
564
+ if [[ "$has_frontend_stories" -gt 0 && "$has_e2e_story" == "0" ]]; then
565
+ echo ""
566
+ print_warning "No E2E story found — frontend features should have at least one Playwright E2E story"
567
+ echo " Add a final story with testing.types: [\"e2e\"] and Playwright testSteps"
568
+ echo ""
569
+ fi
570
+
542
571
  # If issues found, show summary and attempt fix
543
572
  if [[ "$needs_fix" == "true" ]]; then
544
573
  echo " Optimizing test coverage for $story_count stories..."
@@ -558,6 +587,8 @@ _validate_and_fix_stories() {
558
587
  [[ $cnt_bare_pytest -gt 0 ]] && echo " ${cnt_bare_pytest}x use '${py_runner:-python3} pytest' not bare 'pytest'"
559
588
  [[ $cnt_bare_python -gt 0 ]] && echo " ${cnt_bare_python}x use 'python3' not bare 'python' (macOS compatibility)"
560
589
  [[ $cnt_server_only -gt 0 ]] && echo " ${cnt_server_only}x all testSteps need live server (add offline fallback)"
590
+ [[ $cnt_import_check -gt 0 ]] && echo " ${cnt_import_check}x testSteps use import-checks (replace with real tests)"
591
+ [[ $cnt_playwright_notes -gt 0 ]] && echo " ${cnt_playwright_notes}x frontend: add Playwright MCP visual verification to notes"
561
592
  [[ $cnt_custom -gt 0 ]] && echo " ${cnt_custom} stories with custom check issues"
562
593
 
563
594
  # Skip auto-fix in dry-run mode
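
The frontend `notes` check added earlier in this file can be exercised in isolation. The sketch below reuses the case-insensitive pattern from the diff. The `has_visual_guidance` wrapper and both sample notes are hypothetical, for illustration only.

```bash
#!/usr/bin/env bash

# Hypothetical wrapper; only the pattern comes from the diff.
# Returns 0 when the notes mention Playwright MCP or visual verification.
has_visual_guidance() {
  printf '%s\n' "$1" | grep -qiE "(playwright.*mcp|mcp.*playwright|visual.*verif|screenshot|navigate.*screenshot)"
}

notes_ok="Use Playwright MCP to verify: navigate to /login, screenshot the page."
notes_bad="Add the SSO button below the email/password form."

for notes in "$notes_ok" "$notes_bad"; do
  if has_visual_guidance "$notes"; then
    echo "has guidance: $notes"
  else
    echo "frontend notes should include Playwright MCP visual verification guidance"
  fi
done
```

The word `screenshot` on its own satisfies the check, so the `navigate.*screenshot` alternative never changes the result. Notes that only describe the UI change, like the second sample, trigger the warning.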
package/ralph/setup.sh CHANGED
@@ -169,6 +169,7 @@ ralph_setup() {
169
169
  setup_custom_checks
170
170
  setup_gitignore
171
171
  setup_claude_hooks "$pkg_root"
172
+ setup_claude_hud
172
173
  setup_slash_commands "$pkg_root"
173
174
  setup_claude_md
174
175
  setup_mcp
@@ -446,6 +447,106 @@ setup_claude_hooks() {
446
447
  echo " Configured .claude/settings.json"
447
448
  }
448
449
 
450
+ # Install claude-hud statusline plugin for Claude Code
451
+ setup_claude_hud() {
452
+ local hud_dir="$HOME/.claude/plugins/claude-hud"
453
+ local marketplaces_file="$HOME/.claude/plugins/known_marketplaces.json"
454
+
455
+ # Check dependencies
456
+ if ! command -v git &>/dev/null; then
457
+ print_warning "git not found — skipping claude-hud install"
458
+ return 0
459
+ fi
460
+ if ! command -v node &>/dev/null; then
461
+ print_warning "node not found — skipping claude-hud install"
462
+ return 0
463
+ fi
464
+ if ! command -v jq &>/dev/null; then
465
+ print_warning "jq not found — skipping claude-hud install"
466
+ return 0
467
+ fi
468
+
469
+ echo "Setting up claude-hud statusline..."
470
+
471
+ # Clone if not already present
472
+ if [[ ! -d "$hud_dir" ]]; then
473
+ mkdir -p "$HOME/.claude/plugins"
474
+ if ! git clone --depth 1 https://github.com/jarrodwatts/claude-hud.git "$hud_dir" 2>/dev/null; then
475
+ print_warning "Failed to clone claude-hud — skipping"
476
+ return 0
477
+ fi
478
+ echo " Cloned claude-hud plugin"
479
+ else
480
+ echo " claude-hud already installed"
481
+ fi
482
+
483
+ # Build if dist/index.js is missing
484
+ if [[ ! -f "$hud_dir/dist/index.js" ]]; then
485
+ echo " Building claude-hud..."
486
+ if ! (cd "$hud_dir" && npm ci --production=false --ignore-scripts 2>/dev/null && npm run build 2>/dev/null); then
487
+ print_warning "Failed to build claude-hud — statusline may not work"
488
+ echo " Try manually: cd $hud_dir && npm ci && npm run build"
489
+ return 0
490
+ fi
491
+ echo " Built successfully"
492
+ fi
493
+
494
+ # Register in known_marketplaces.json
495
+ if [[ -f "$marketplaces_file" ]]; then
496
+ if ! jq -e '."claude-hud"' "$marketplaces_file" > /dev/null 2>&1; then
497
+ local tmp
498
+ tmp=$(mktemp)
499
+ jq '."claude-hud" = {
500
+ "source": {
501
+ "source": "github",
502
+ "repo": "jarrodwatts/claude-hud"
503
+ },
504
+ "installLocation": ($ENV.HOME + "/.claude/plugins/claude-hud"),
505
+ "lastUpdated": (now | todate)
506
+ }' "$marketplaces_file" > "$tmp" && mv "$tmp" "$marketplaces_file"
507
+ echo " Registered in known_marketplaces.json"
508
+ fi
509
+ else
510
+ mkdir -p "$(dirname "$marketplaces_file")"
511
+ jq -n '{
512
+ "claude-hud": {
513
+ "source": {
514
+ "source": "github",
515
+ "repo": "jarrodwatts/claude-hud"
516
+ },
517
+ "installLocation": ($ENV.HOME + "/.claude/plugins/claude-hud"),
518
+ "lastUpdated": (now | todate)
519
+ }
520
+ }' > "$marketplaces_file"
521
+ echo " Created known_marketplaces.json with claude-hud"
522
+ fi
523
+
524
+ # Write agentic-loop-tuned config
525
+ cat > "$hud_dir/config.json" << 'EOF'
526
+ {
527
+ "lineLayout": "expanded",
528
+ "pathLevels": 2,
529
+ "gitStatus": {
530
+ "enabled": true,
531
+ "showDirty": true,
532
+ "showAheadBehind": true
533
+ },
534
+ "display": {
535
+ "showContextBar": true,
536
+ "contextValue": "percent",
537
+ "showModel": true,
538
+ "showDuration": true,
539
+ "showSpeed": false,
540
+ "showUsage": true,
541
+ "showTools": true,
542
+ "showAgents": true,
543
+ "showTodos": true
544
+ }
545
+ }
546
+ EOF
547
+ echo " Configured claude-hud for agentic-loop (expanded layout, all panels)"
548
+ }
549
+
449
550
  # Copy slash commands (skills format for Claude Code)
450
551
  setup_slash_commands() {
451
552
  local pkg_root="$1"
@@ -569,6 +670,8 @@ ${python_runner:+- Python: Use \`$python_runner\` (not bare \`python\`)}
569
670
  framework_template="$pkg_root/templates/examples/CLAUDE-${framework_type}.md"
570
671
  fi
571
672
 
673
+ local style_marker="<!-- vibe-and-thrive-detected -->"
674
+
572
675
  if [[ -f "CLAUDE.md" ]]; then
573
676
  # Append framework template if it exists and not already included
574
677
  if [[ -n "$framework_template" && -f "$framework_template" ]]; then
@@ -581,6 +684,16 @@ ${python_runner:+- Python: Use \`$python_runner\` (not bare \`python\`)}
581
684
  echo " Appended ${framework_type} conventions to CLAUDE.md"
582
685
  fi
583
686
  fi
687
+ # Add writing style defaults if not already present
688
+ if ! grep -q "$style_marker" "CLAUDE.md" 2>/dev/null; then
689
+ cat >> CLAUDE.md << EOF
690
+
691
+ $style_marker
692
+ ## Writing Style
693
+ - Active voice only. Never use passive voice.
694
+ - Never use em dashes. Use commas, periods, or parentheses instead.
695
+ EOF
696
+ fi
584
697
  echo "$detected_section" >> CLAUDE.md
585
698
  echo " Updated CLAUDE.md"
586
699
  else
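
The marker guard in the hunk above keeps repeated setup runs from appending the Writing Style section more than once. A minimal sketch of that pattern follows, assuming a temporary file as a stand-in for `CLAUDE.md`. The `append_style_once` function name is hypothetical, and the sketch uses `grep -F` for a literal match where the diff uses plain `grep -q`.

```bash
#!/usr/bin/env bash

style_marker="<!-- vibe-and-thrive-detected -->"
target="$(mktemp)"   # hypothetical stand-in for CLAUDE.md

# Append the style section only when the marker is not already in the file.
append_style_once() {
  local file="$1"
  if ! grep -qF -- "$style_marker" "$file" 2>/dev/null; then
    cat >> "$file" << EOF

$style_marker
## Writing Style
- Active voice only. Never use passive voice.
- Never use em dashes. Use commas, periods, or parentheses instead.
EOF
  fi
}

append_style_once "$target"
append_style_once "$target"   # second call finds the marker and does nothing

marker_count=$(grep -cF -- "$style_marker" "$target")
echo "marker occurrences: $marker_count"   # prints "marker occurrences: 1"
```

Because the guard keys on the marker comment and not on the section text, a user can edit the style rules without a later setup run appending them again, as long as the marker line stays in the file.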
@@ -590,6 +703,11 @@ ${python_runner:+- Python: Use \`$python_runner\` (not bare \`python\`)}
590
703
 
591
704
  ## Your Rules
592
705
  <!-- Add your project-specific rules, patterns, and conventions here -->
706
+
707
+ <!-- vibe-and-thrive-detected -->
708
+ ## Writing Style
709
+ - Active voice only. Never use passive voice.
710
+ - Never use em dashes. Use commas, periods, or parentheses instead.
593
711
  EOF
594
712
 
595
713
  # Include framework template if available
@@ -715,20 +833,28 @@ setup_precommit_hooks() {
715
833
  # Create .pre-commit-config.yaml if it doesn't exist
716
834
  if [[ ! -f ".pre-commit-config.yaml" ]]; then
717
835
  cat > .pre-commit-config.yaml << 'EOF'
836
+ # Pre-commit hooks powered by agentic-loop vibe-check
837
+ # Runs on staged files before each commit
718
838
  repos:
719
- - repo: https://github.com/allierays/agentic-loop
720
- rev: v1.0.0
839
+ - repo: local
840
+ hooks:
841
+ - id: vibe-check
842
+ name: Vibe Check
843
+ entry: npx vibe-check --fail-on error
844
+ language: node
845
+ types_or: [javascript, ts, python, json]
846
+ pass_filenames: true
847
+
848
+ # Standard hooks
849
+ - repo: https://github.com/pre-commit/pre-commit-hooks
850
+ rev: v4.5.0
721
851
  hooks:
722
- - id: backup-db
723
- name: Backup database before commit
724
- - id: check-secrets
725
- name: Check for hardcoded secrets
726
- - id: check-hardcoded-urls
727
- name: Check for hardcoded URLs
728
- - id: check-debug
729
- name: Check for debug statements
730
- - id: check-signs-secrets
731
- name: Check signs.json for credentials
852
+ - id: trailing-whitespace
853
+ exclude: '\.txt$'
854
+ - id: end-of-file-fixer
855
+ exclude: '\.txt$'
856
+ - id: check-yaml
857
+ - id: check-added-large-files
732
858
  EOF
733
859
  echo " Created .pre-commit-config.yaml"
734
860
  else
@@ -48,6 +48,13 @@
48
48
  "category": "testing",
49
49
  "learnedFrom": null,
50
50
  "createdAt": "2026-01-25T00:00:00-08:00"
51
+ },
52
+ {
53
+ "id": "sign-008",
54
+ "pattern": "NEVER use import-checks as test steps (python -c 'from X import Y', hasattr, test -f). These only verify a symbol exists, not that it works. A function that raises on every call still passes an import check. Always use real behavioral tests: curl for API endpoints, pytest with actual function calls, or Playwright for UI.",
55
+ "category": "testing",
56
+ "learnedFrom": "Okta SSO PRD review — all 4 backend stories had import-check testSteps that would pass even with completely broken code",
57
+ "createdAt": "2026-02-17T00:00:00-08:00"
51
58
  }
52
59
  ]
53
60
  }