@graypark/loophaus 3.8.1 → 3.9.1

package/README.md CHANGED
@@ -10,37 +10,70 @@
  <a href="https://github.com/vcz-Gray/loophaus/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg?style=flat-square" alt="license" /></a>
  <img src="https://img.shields.io/badge/node-%3E%3D20-brightgreen.svg?style=flat-square" alt="node version" />
  <img src="https://img.shields.io/badge/platform-Claude%20Code%20%7C%20Codex%20CLI%20%7C%20Kiro%20CLI-purple.svg?style=flat-square" alt="platform" />
- <img src="https://img.shields.io/badge/tests-296%20passing-brightgreen.svg?style=flat-square" alt="tests" />
+ <img src="https://img.shields.io/badge/tests-367%20passing-brightgreen.svg?style=flat-square" alt="tests" />
  </p>
 
- <h3 align="center">Control plane for coding agents: iterative dev loops across Claude Code, Codex CLI, and Kiro CLI.</h3>
+ <h3 align="center">Run AI coding agents in autonomous loops: fresh context each iteration, PRD-tracked progress, automatic quality gates.</h3>
 
  <p align="center">
  <sub>Based on <a href="https://ghuntley.com/ralph/">Geoffrey Huntley's Ralph Wiggum technique</a></sub>
  </p>
 
+ <p align="center">
+ <img src="https://raw.githubusercontent.com/vcz-Gray/loophaus/main/assets/demo-video/demo.gif" alt="loophaus demo" width="700" />
+ </p>
+
  ---
 
- ## Why loophaus?
+ ## The Problem
 
- AI coding agents struggle with fundamental problems that get worse over long sessions:
+ AI coding agents struggle with long tasks:
 
- | Problem | What happens |
- |---------|-------------|
- | **Context rot** | Long conversations accumulate noise, the agent gets confused |
- | **No checkpoints** | All-or-nothing execution can't resume after interruption |
- | **Lost learnings** | Previous iterations' insights overwritten by new context |
- | **Completion ambiguity** | Agent says "done" but tests still fail |
- | **Platform lock-in** | Techniques that work in one agent don't transfer to others |
-
- loophaus solves this:
-
- - **Fresh context per iteration** — Each cycle reads PRD + progress from disk, zero degradation
- - **Git-enforced safety** — Atomic commits per story, rollback at any point
- - **Append-only learnings** — `progress.txt` accumulates knowledge across iterations
- - **Test-verified completion** — Agent can only exit when `<promise>COMPLETE</promise>` is genuinely true
+ - **Context rot**: agent gets confused after 10+ iterations
+ - **Goal drift**: agent forgets the spec and solves the wrong problem
+ - **No quality signal**: agent says "done" but tests still fail
+ - **Token waste**: you re-explain the same context every time
+
+ ## The Solution
+
+ - **Fresh context per iteration** — Each cycle reads PRD + progress from disk, zero degradation even after 20+ iterations
+ - **PRD-linked progress tracking** — Stories are tracked in `prd.json` with pass/fail state, not "I think I'm done"
+ - **Quality scoring with keep/discard** — Autoresearch-inspired refinement loop measures quality (0-100) and reverts regressions
  - **Universal stop hook** — One Node.js hook works across Claude Code, Codex CLI, and Kiro CLI
 
+ ## Quick Start
+
+ ```bash
+ npm install -g @graypark/loophaus
+ loophaus install
+ ```
+
+ > **Note:** `npx @graypark/loophaus install` may fail on some npm versions due to a bin resolution cache bug. Use the global install above for reliable setup.
+
+ The installer auto-detects your host (Claude Code, Codex CLI, or Kiro CLI) and sets up everything — stop hook, commands, and skills.
+
+ Then in your AI coding session:
+
+ ```
+ /loop-plan Add user authentication with JWT, bcrypt, and login UI
+ ```
+
+ That's it. The interview generates a PRD, activates the loop, and starts implementing story by story.
+
+ ## Safety
+
+ - Every iteration creates a **git checkpoint** — atomic revert anytime
+ - **Max iterations limit** (default 20, configurable)
+ - **Quality threshold = circuit breaker** — score < 80 triggers refine or stop
+ - **Cost tracking** with policy enforcement (max $5, max 30 min)
+ - `loophaus clean` for data lifecycle management
+
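As a rough illustration of the limits in the Safety list above, the TypeScript sketch below checks a loop's usage against a policy. `LoopPolicy`, `LoopUsage`, and `policyViolations` are hypothetical names, not part of any documented loophaus API. The numbers mirror the figures quoted in the list (20 iterations, $5, 30 min); treating the cost and time figures as defaults is an assumption.

```typescript
// Hypothetical policy shape; field names are illustrative only.
interface LoopPolicy {
  maxIterations: number; // README: default 20, configurable
  maxCostUsd: number; // README: max $5
  maxMinutes: number; // README: max 30 min
}

// Hypothetical snapshot of what a running loop has consumed so far.
interface LoopUsage {
  iteration: number;
  costUsd: number;
  elapsedMinutes: number;
}

const DEFAULT_POLICY: LoopPolicy = { maxIterations: 20, maxCostUsd: 5, maxMinutes: 30 };

// Returns the reasons the loop should stop; an empty array means "keep going".
function policyViolations(usage: LoopUsage, policy: LoopPolicy = DEFAULT_POLICY): string[] {
  const reasons: string[] = [];
  if (usage.iteration >= policy.maxIterations) reasons.push("max iterations reached");
  if (usage.costUsd >= policy.maxCostUsd) reasons.push("cost budget exhausted");
  if (usage.elapsedMinutes >= policy.maxMinutes) reasons.push("time budget exhausted");
  return reasons;
}
```

Returning a list of reasons rather than a single boolean lets a caller report every exceeded limit at once.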
+ ## Why not just script this?
+
+ 1. **Fresh context isolation** — no degradation after 20 iterations; each cycle starts from disk, not from a decaying conversation
+ 2. **PRD-linked progress tracking** — structured `prd.json` with pass/fail per story, not "I think I'm done"
+ 3. **Quality scoring with keep/discard** — autoresearch pattern: measure, keep improvements, revert regressions
+
  ## How it works
 
  An AI agent works on a task in a continuous loop. Each iteration starts with fresh context — reading the PRD and progress files to decide what to do next. The agent implements one story, commits, updates progress, and exits. The stop hook intercepts the exit and re-injects the prompt. Repeat until all stories pass.
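The decision the stop hook makes on each exit, as described above, can be sketched as a small pure function. This is a minimal illustration, assuming each story record carries a boolean `passes` field. The `Story` shape, `HookDecision`, and `decideOnExit` are hypothetical and may not match the actual `prd.json` schema or hook implementation.

```typescript
// Assumed story shape; the real prd.json schema may differ.
interface Story {
  id: string;
  title: string;
  passes: boolean;
}

// Either re-inject the prompt for the next story, or let the agent exit.
type HookDecision =
  | { action: "continue"; nextStory: Story }
  | { action: "stop"; reason: string };

function decideOnExit(stories: Story[], iteration: number, maxIterations = 20): HookDecision {
  // The first story that does not pass yet is the next unit of work.
  const next = stories.find((s) => !s.passes);
  if (!next) return { action: "stop", reason: "all stories pass" };
  if (iteration >= maxIterations) return { action: "stop", reason: "max iterations reached" };
  return { action: "continue", nextStory: next };
}
```

Because the function takes only the story list and the iteration count, nothing from the previous conversation is needed to choose the next step, which is the point of the fresh-context design.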
@@ -80,33 +113,53 @@ An AI agent works on a task in a continuous loop. Each iteration starts with fre
  └─────────────────────────────────┘
  ```
 
- ## Quick Start
+ ## Commands
 
- ```bash
- npm install -g @graypark/loophaus
- loophaus install
- ```
+ | Command | Description |
+ |---------|-------------|
+ | `/loop-plan` | Interactive interview — asks targeted questions, generates PRD, activates loop |
+ | `/loop` | Start iterative dev loop directly (when you already have a PRD or custom prompt) |
+ | `/loop-stop` | Stop the active loop immediately |
+ | `/loop-pulse` | Check current loop status, iteration count, and progress |
 
- > **Note:** `npx @graypark/loophaus install` may fail on some npm versions due to a bin resolution cache bug. Use the global install above for reliable setup.
+ ## Quality Loop (v3.4.0+)
 
- The installer auto-detects your host (Claude Code, Codex CLI, or Kiro CLI) and sets up everything — stop hook, commands, and skills.
+ loophaus v3.4.0 introduces the **Quality Loop**, inspired by [karpathy/autoresearch](https://github.com/karpathy/autoresearch)'s experiment-measure-keep/discard pattern.
 
- Then in your AI coding session:
+ Instead of simply marking a story as "done" when tests pass, `/loop-plan` now **measures quality** (0-100) and **iteratively refines** until the score meets the threshold.
 
  ```
- /loop-plan Add user authentication with JWT, bcrypt, and login UI
+ Phase 4: Implement
+
+ Phase 5: Evaluate (score 0-100)
+ ↓ ↑
+ Phase 6: Refine Loop
+ score improved? → keep (commit)
+ score declined? → discard (git reset)
+ max attempts reached? → move on
+
+ Phase 7: Report (with quality scores)
  ```
 
- That's it. The interview generates a PRD, activates the loop, and starts implementing story by story.
+ | autoresearch | loophaus |
+ |-------------|----------|
+ | `val_bpb` | quality score (weighted: tests, typecheck, lint, verify, diff, custom) |
+ | `results.tsv` | `.loophaus/results.tsv` |
+ | keep → advance | score improved → commit |
+ | discard → revert | score declined → `git reset --hard` |
+ | NEVER STOP | max 3 attempts per story (configurable) |
 
- ## Commands
+ ### Configuration
 
- | Command | Description |
- |---------|-------------|
- | `/loop-plan` | Interactive interview — asks targeted questions, generates PRD, activates loop |
- | `/loop` | Start iterative dev loop directly (when you already have a PRD or custom prompt) |
- | `/loop-stop` | Stop the active loop immediately |
- | `/loop-pulse` | Check current loop status, iteration count, and progress |
+ ```json
+ {
+ "qualityThreshold": 80,
+ "maxRefineAttempts": 3,
+ "qualityConfig": {
+ "weights": { "tests": 30, "typecheck": 25, "lint": 15, "verify": 15, "diff": 10, "custom": 5 }
+ }
+ }
+ ```
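To make the weighted score and the keep/discard rule concrete, here is a minimal TypeScript sketch that uses the weights from the configuration above. The README does not say how each check is turned into a number, so representing every check as a fraction between 0 and 1 is an assumption, as is treating an unchanged score as a discard. `qualityScore`, `refineAction`, and `shouldRefine` are hypothetical names, not loophaus functions.

```typescript
type Check = "tests" | "typecheck" | "lint" | "verify" | "diff" | "custom";

const CHECKS: Check[] = ["tests", "typecheck", "lint", "verify", "diff", "custom"];

// Weights copied from the configuration example; they sum to 100.
const WEIGHTS: Record<Check, number> = {
  tests: 30, typecheck: 25, lint: 15, verify: 15, diff: 10, custom: 5,
};

// Assumption: each result is a fraction in [0, 1] (for example, share of tests passing).
function qualityScore(results: Record<Check, number>, weights: Record<Check, number> = WEIGHTS): number {
  let weighted = 0;
  let total = 0;
  for (const check of CHECKS) {
    const clamped = Math.min(1, Math.max(0, results[check]));
    weighted += weights[check] * clamped;
    total += weights[check];
  }
  // Normalise so the result stays on a 0-100 scale even if weights do not sum to 100.
  return Math.round((weighted / total) * 100);
}

// Keep an attempt only if it strictly improves the score (assumption for ties: discard).
function refineAction(previous: number, current: number): "keep" | "discard" {
  return current > previous ? "keep" : "discard";
}

// Continue refining while below the threshold and attempts remain.
function shouldRefine(score: number, attempts: number, threshold = 80, maxAttempts = 3): boolean {
  return score < threshold && attempts < maxAttempts;
}
```

With these weights, a story whose tests only half pass but whose other checks are clean scores 85, which clears the default threshold of 80 and would end refinement.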
 
  ## Platform Support
 
@@ -165,55 +218,13 @@ loophaus install # Install to detected host
  loophaus status # Show current loop state and active host
  loophaus stats # Iteration history and completion metrics
  loophaus quality # Run quality scoring on current stories
+ loophaus demo # Run interactive demo
+ loophaus config # Show/edit configuration
+ loophaus update-check # Check for new versions
+ loophaus upgrade # Upgrade to latest version
  loophaus uninstall # Clean removal from all hosts
  ```
 
- ## Quality Loop (v3.4.0+)
-
- loophaus v3.4.0 introduces the **Quality Loop** — inspired by [karpathy/autoresearch](https://github.com/karpathy/autoresearch)'s experiment→measure→keep/discard pattern.
-
- Instead of simply marking a story as "done" when tests pass, `/loop-plan` now **measures quality** (0-100) and **iteratively refines** until the score meets the threshold.
-
- ```
- Phase 4: Implement
-
- Phase 5: Evaluate (score 0-100)
- ↓ ↑
- Phase 6: Refine Loop
- score improved? → keep (commit)
- score declined? → discard (git reset)
- max attempts reached? → move on
-
- Phase 7: Report (with quality scores)
- ```
-
- | autoresearch | loophaus |
- |-------------|----------|
- | `val_bpb` | quality score (weighted: tests, typecheck, lint, verify, diff, custom) |
- | `results.tsv` | `.loophaus/results.tsv` |
- | keep → advance | score improved → commit |
- | discard → revert | score declined → `git reset --hard` |
- | NEVER STOP | max 3 attempts per story (configurable) |
-
- ### Configuration
-
- ```json
- {
- "qualityThreshold": 80,
- "maxRefineAttempts": 3,
- "qualityConfig": {
- "weights": { "tests": 30, "typecheck": 25, "lint": 15, "verify": 15, "diff": 10, "custom": 5 }
- }
- }
- ```
-
- ### CLI
-
- ```bash
- loophaus quality # Score all stories
- loophaus quality --story US-001 # Score a specific story
- ```
-
  ## Architecture
 
  ```
@@ -253,7 +264,7 @@ loophaus/
  ├── .claude-plugin/
  │ └── plugin.json # Claude Code marketplace manifest
  ├── dist/ # Compiled output (tsc)
- └── tests/ # 296 test cases (vitest)
+ └── tests/ # 359 test cases (vitest)
  ```
 
  ## PRD Format
@@ -311,10 +322,10 @@ npm uninstall -g @graypark/loophaus
  git clone https://github.com/vcz-Gray/loophaus.git
  cd loophaus
  npm install
- npm test
- npm run typecheck # TypeScript strict mode
- npm run build # Compile to dist/
- npx vitest # watch mode
+ npm test # 359 tests
+ npm run typecheck # TypeScript strict mode
+ npm run build # Compile to dist/
+ npx vitest # watch mode
  ```
 
  ## License