@lnilluv/pi-ralph-loop 1.2.0 → 1.3.0
@@ -0,0 +1,326 @@
# Config Cookbook

Frontmatter recipes for common scenarios. Copy, adjust, and run.

## Minimal loop

The simplest useful loop: just a prompt and an iteration cap.

```yaml
---
max_iterations: 10
---
Read TODO.md and implement the next task.
Commit when done.
```

When to use: Quick tasks where you trust the agent to know when it's done, or you'll stop it manually with `/ralph-stop`.

## Self-healing loop

The workhorse pattern. Commands feed evidence, the completion gate stops the loop.

```yaml
---
commands:
  - name: tests
    run: npm test
    timeout: 60
  - name: git-log
    run: git log --oneline -10
max_iterations: 20
completion_promise: DONE
guardrails:
  block_commands:
    - 'git\s+push'
---

Fix the failing auth tests for {{ args.owner }}.

{{ commands.tests }}

{{ commands.git-log }}

If tests are failing, fix them before starting new work.
Stop with <promise>DONE</promise> when all tests pass.
```

When to use: Bug fixing, test writing, any task where command output shows current state.
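
The placeholder substitution used in the recipe above can be pictured with a minimal Python sketch. This is an illustration only, not the package's implementation: `render_prompt` and the sample values are hypothetical, and the real extension may resolve `{{ commands.* }}` and `{{ args.* }}` differently.

```python
import re

# Hypothetical matcher for {{ commands.<name> }} and {{ args.<name> }} placeholders.
PLACEHOLDER = re.compile(r"\{\{\s*(commands|args)\.([\w-]+)\s*\}\}")


def render_prompt(template: str, commands: dict[str, str], args: dict[str, str]) -> str:
    """Substitute known placeholders; leave unknown ones untouched."""
    sources = {"commands": commands, "args": args}

    def substitute(match: re.Match) -> str:
        kind, name = match.group(1), match.group(2)
        # Fall back to the original placeholder text when the name is unknown.
        return sources[kind].get(name, match.group(0))

    return PLACEHOLDER.sub(substitute, template)


template = "Fix the failing auth tests for {{ args.owner }}.\n\n{{ commands.tests }}"
rendered = render_prompt(
    template,
    commands={"tests": "2 failed, 14 passed"},  # made-up command output
    args={"owner": "platform-team"},            # made-up argument value
)
print(rendered)
```

In this sketch an undeclared name stays visible in the rendered prompt, which makes a misspelled command reference easy to spot.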

## Gated completion loop

Adds `required_outputs` to the completion gate. The loop won't stop until both the promise appears AND the files exist.

```yaml
---
commands:
  - name: build
    run: npm run build
    timeout: 60
  - name: tests
    run: npm test
    timeout: 120
max_iterations: 25
completion_promise: DONE
required_outputs:
  - MIGRATION_NOTES.md
---

Migrate one module per iteration from the legacy API to the new one.

{{ commands.build }}

{{ commands.tests }}

Stop with <promise>DONE</promise> only when all tests pass
and MIGRATION_NOTES.md exists with a summary of changes.
```

When to use: Migration, documentation, research: any task where "done" means a deliverable file exists, not just that the agent says it's done.
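
The two-part gate described above (promise AND files) can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions: `completion_gate_passes` is a hypothetical helper, and resolving `required_outputs` relative to the task directory is an assumption, not confirmed behavior of the package.

```python
import tempfile
from pathlib import Path


def completion_gate_passes(agent_output: str, promise: str,
                           required_outputs: list[str], task_dir: Path) -> bool:
    """Return True only if the promise tag was emitted AND every required file exists."""
    promise_seen = f"<promise>{promise}</promise>" in agent_output
    # Assumption: required outputs are looked up relative to task_dir.
    files_present = all((task_dir / name).is_file() for name in required_outputs)
    return promise_seen and files_present


with tempfile.TemporaryDirectory() as tmp:
    task_dir = Path(tmp)
    output = "All tests pass. <promise>DONE</promise>"
    # The promise alone is not enough while the deliverable is missing.
    before = completion_gate_passes(output, "DONE", ["MIGRATION_NOTES.md"], task_dir)
    (task_dir / "MIGRATION_NOTES.md").write_text("Summary of changes\n")
    after = completion_gate_passes(output, "DONE", ["MIGRATION_NOTES.md"], task_dir)

print(before, after)  # False True
```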

## Resilient loop

Continues past errors. Use when individual iterations may fail but the overall task should keep going.

```yaml
---
commands:
  - name: tests
    run: npm test
    timeout: 120
  - name: lint
    run: npm run lint
    timeout: 30
max_iterations: 30
completion_promise: DONE
stop_on_error: false
required_outputs:
  - REFACTOR_LOG.md
---

Refactor one module per iteration.

{{ commands.tests }}

{{ commands.lint }}

If an iteration fails, note it in REFACTOR_LOG.md and move to the next module.
Stop with <promise>DONE</promise> when REFACTOR_LOG.md covers all modules.
```

When to use: Migration across many files, batch operations where some items may fail, long-running tasks that need resilience.

## Parameterized loop

Accepts runtime arguments via `--arg`. Makes the loop reusable across different targets.

```yaml
---
args:
  - env
  - target
commands:
  - name: tests
    run: npm test -- --env={{ args.env }}
    timeout: 120
  - name: coverage
    run: npm run test:coverage -- --env={{ args.env }}
    timeout: 120
max_iterations: 15
completion_promise: DONE
guardrails:
  protected_files:
    - 'policy:secret-bearing-paths'
---

Environment: {{ args.env }}
Target: {{ args.target }}

{{ commands.tests }}

{{ commands.coverage }}

Increase test coverage for {{ args.target }}.
Stop with <promise>DONE</promise> when coverage exceeds 80%.
```

Run with: `/ralph --path my-task --arg env=staging --arg target="src/auth"`

When to use: Reusable loops for different environments, targets, or configurations.
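
To make the `--arg name=value` convention concrete, here is a small Python sketch that collects the pairs from the command line shown above. `parse_args` is a hypothetical helper written for illustration; how the extension itself parses its arguments is not shown in this document.

```python
import shlex


def parse_args(command_line: str) -> dict[str, str]:
    """Collect every `--arg name=value` pair from a command line into a dict."""
    tokens = shlex.split(command_line)  # handles the quoted "src/auth" value
    parsed: dict[str, str] = {}
    for flag, pair in zip(tokens, tokens[1:]):
        if flag == "--arg":
            name, _, value = pair.partition("=")
            parsed[name] = value
    return parsed


args = parse_args('/ralph --path my-task --arg env=staging --arg target="src/auth"')
print(args)  # {'env': 'staging', 'target': 'src/auth'}
```

Each resulting key corresponds to one `{{ args.<name> }}` placeholder in the prompt and in the `run` lines.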

## Security audit loop

Strict guardrails with evidence-driven improvement.

```yaml
---
commands:
  - name: scan
    run: npm audit --audit-level=moderate
    timeout: 60
  - name: tests
    run: npm test
    timeout: 120
  - name: git-log
    run: git log --oneline -10
max_iterations: 20
completion_promise: DONE
required_outputs:
  - SECURITY_FINDINGS.md
guardrails:
  block_commands:
    - 'git\s+push'
    - 'npm\s+publish'
  protected_files:
    - '.env*'
    - '*.pem'
    - '*.key'
    - 'policy:secret-bearing-paths'
---

Find and fix security vulnerabilities.

{{ commands.scan }}

{{ commands.tests }}

{{ commands.git-log }}

Pick one finding and fix it.
Log everything in SECURITY_FINDINGS.md.
Stop with <promise>DONE</promise> when the scan is clean and SECURITY_FINDINGS.md is complete.
```

When to use: Security audits, compliance checks, any task that handles sensitive data.

## Research loop

Long timeout, minimal commands, progress lives in files.

```yaml
---
commands:
  - name: git-log
    run: git log --oneline -15
max_iterations: 20
timeout: 300
completion_promise: DONE
required_outputs:
  - REPORT.md
---

Build a comprehensive research report on {{ args.topic }}.

{{ commands.git-log }}

Each iteration:
1. Read REPORT.md to see what exists
2. Identify the weakest section
3. Research and write findings
4. Commit your changes

Stop with <promise>DONE</promise> when REPORT.md exists
and all referenced sections have substantial content.
```

When to use: Deep research, documentation generation, knowledge base construction.

## High-autonomy loop

Trust the agent, minimize constraints. Use when you've verified the loop works correctly and want it to run freely.

```yaml
---
commands:
  - name: tests
    run: npm test
    timeout: 120
  - name: git-log
    run: git log --oneline -10
max_iterations: 50
completion_promise: DONE
stop_on_error: false
guardrails:
  block_commands:
    - 'git\s+push'
---

{{ commands.tests }}

{{ commands.git-log }}

Read TODO.md and implement the next task.
Stop with <promise>DONE</promise> when all items are complete.
```

When to use: Long autonomous runs where you want maximum flexibility. Always keep at least `git push` blocked.

## Low-autonomy loop

Strict constraints, stop on any error. Use when exploring or when mistakes are costly.

```yaml
---
commands:
  - name: tests
    run: npm test
    timeout: 60
  - name: lint
    run: npm run lint
    timeout: 30
max_iterations: 10
completion_promise: DONE
stop_on_error: true
guardrails:
  block_commands:
    - 'git\s+push'
    - 'npm\s+publish'
    - 'rm\s+-rf'
  protected_files:
    - '.env*'
    - '*.pem'
    - '*.key'
    - 'config/production.*'
    - 'policy:secret-bearing-paths'
---

Carefully improve one thing per iteration.

{{ commands.tests }}

{{ commands.lint }}

If anything is broken, fix it before doing anything else.
Stop with <promise>DONE</promise> when all tests and lint pass.
```

When to use: Sensitive codebases, production-adjacent work, first loops on a new project.

## Choosing stop_on_error

| Value | Behavior | When to use |
|---|---|---|
| `true` (default) | Stop on any RPC error or timeout | Bug fixing, single-target tasks, cautious loops |
| `false` | Continue past errors | Migration, batch operations, research, long autonomous runs |
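
The difference between the two values can be illustrated with a toy loop in Python. This is a sketch of the control flow the table describes, not the extension's actual loop: `run_loop` and `flaky` are hypothetical names, and a raised exception stands in for an RPC error or timeout.

```python
from typing import Callable


def run_loop(iterate: Callable[[int], str], max_iterations: int,
             stop_on_error: bool = True) -> list[str]:
    """Run iterations, recording 'ok', 'error', or 'done' for each one."""
    history: list[str] = []
    for i in range(1, max_iterations + 1):
        try:
            result = iterate(i)
        except Exception:
            history.append("error")
            if stop_on_error:
                break      # cautious mode: the first failure ends the run
            continue       # resilient mode: move on to the next iteration
        history.append(result)
        if result == "done":
            break
    return history


def flaky(i: int) -> str:
    # Hypothetical iteration: the second one fails, the fourth one finishes.
    if i == 2:
        raise TimeoutError("iteration timed out")
    return "done" if i == 4 else "ok"


strict = run_loop(flaky, max_iterations=10, stop_on_error=True)
resilient = run_loop(flaky, max_iterations=10, stop_on_error=False)
print(strict)     # ['ok', 'error']
print(resilient)  # ['ok', 'error', 'ok', 'done']
```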

## Choosing guardrails

| Guardrail | What it blocks | When to use |
|---|---|---|
| `git\s+push` | Pushes to remote | Almost always; prevents accidental pushes |
| `npm\s+publish` | Package publishes | When working on published packages |
| `rm\s+-rf\s+/` | Destructive root deletes | Always worth including |
| `.env*` | Environment files | Any task touching config or deployment |
| `*.pem`, `*.key` | Certificate and key files | Security-related tasks |
| `policy:secret-bearing-paths` | All secret-bearing paths | Default good practice |
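
To see what the patterns in the table would and would not catch, they can be tried out in Python. This sketch assumes `block_commands` entries are regular expressions searched anywhere in a command and `protected_files` entries are shell-style globs checked against both the full path and the file name. Both are assumptions for illustration; the extension's real matching rules may differ, and `policy:secret-bearing-paths` is left out because its expansion is not described here.

```python
import re
from fnmatch import fnmatch
from pathlib import PurePosixPath

BLOCK_COMMANDS = [r"git\s+push", r"npm\s+publish", r"rm\s+-rf\s+/"]
PROTECTED_FILES = [".env*", "*.pem", "*.key"]


def is_blocked(command: str) -> bool:
    """True if any block pattern occurs anywhere in the command."""
    return any(re.search(pattern, command) for pattern in BLOCK_COMMANDS)


def is_protected(path: str) -> bool:
    """True if the path or its file name matches any protected glob."""
    name = PurePosixPath(path).name
    return any(fnmatch(path, pattern) or fnmatch(name, pattern)
               for pattern in PROTECTED_FILES)


print(is_blocked("git   push origin main"))  # True: \s+ tolerates extra spaces
print(is_blocked("rm -rf build/"))           # False: this pattern needs a leading /
print(is_blocked("rm -rf /var/data"))        # True
print(is_protected("certs/server.pem"))      # True
```

Note that `rm\s+-rf\s+/` only catches deletes of absolute paths; the shorter `rm\s+-rf` blocks every recursive force delete.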

## Choosing required_outputs

| Scenario | required_outputs |
|---|---|
| Bug fixing | None: test pass is sufficient |
| Migration | `[MIGRATION_NOTES.md]`: deliverable summary |
| Documentation | `[DOCS_INDEX.md]`: proof of coverage |
| Security audit | `[SECURITY_FINDINGS.md]`: audit deliverable |
| Research | `[REPORT.md]`: final synthesis |
| Test coverage | None: coverage command shows progress |
@@ -0,0 +1,405 @@
# Prompt Patterns

Detailed patterns for writing effective RALPH.md prompts. Each pattern includes the structure, when to use it, and a fully annotated example.

## The five-section structure

Every effective prompt follows the same skeleton:

```markdown
---
frontmatter
---

## Orientation
Who you are and how the loop works.

## Evidence
{{ commands.* }}: current state.

## Task
One specific thing to do this iteration.

## Rules
Constraints on what you can and can't do.

## Completion
When to stop and what <promise>DONE</promise> means.
```

Not every section needs a heading. Short prompts fold rules into the task. But every effective prompt addresses all five.

## Pattern: Self-healing test loop

The most common pattern. Run tests, see failures, fix them, verify. The command output is both evidence and a natural stopping signal.

```yaml
---
commands:
  - name: tests
    run: npm test
    timeout: 60
  - name: git-log
    run: git log --oneline -10
max_iterations: 20
completion_promise: DONE
guardrails:
  block_commands:
    - 'git\s+push'
---

You are an autonomous coding agent running in a loop.
Each iteration starts with a fresh context.
Your progress lives in the code and git history.

## Test results

{{ commands.tests }}

## Recent commits

{{ commands.git-log }}

If tests are failing, fix them before starting new work.
Then pick the next task from TODO.md.

## Rules

- One task per iteration
- No placeholder code: full, working implementations only
- Run tests before committing
- Commit with descriptive messages: `fix: ...`, `feat: ...`, `test: ...`

## Completion

Stop with <promise>DONE</promise> when all tests pass and TODO.md has no remaining items.
```

**Why it works:**
- `tests` command gives the agent live evidence each iteration
- `git-log` command reminds the agent what it already did
- `completion_promise: DONE` gives a clear stop signal
- `guardrails` prevents accidental pushes

## Pattern: Ordered task list

When work can be decomposed into a checklist. The agent works through items one at a time, marking them done.

```yaml
---
commands:
  - name: build
    run: npm run build
    timeout: 60
  - name: tests
    run: npm test
    timeout: 120
max_iterations: 30
completion_promise: DONE
required_outputs:
  - MIGRATION_NOTES.md
stop_on_error: false
---

You are migrating from REST to GraphQL. Work through MIGRATION_TODO.md one item at a time.

## Build

{{ commands.build }}

## Tests

{{ commands.tests }}

If the build fails or tests fail, fix the issue before continuing migration.

## Each iteration

1. Read MIGRATION_TODO.md, pick the first incomplete item
2. Migrate that one endpoint
3. Verify the build passes and tests pass
4. Mark the item complete in MIGRATION_TODO.md
5. Commit: `refactor: migrate <endpoint> from REST to GraphQL`

## Completion

Stop with <promise>DONE</promise> when MIGRATION_TODO.md has no remaining items AND MIGRATION_NOTES.md exists with a summary of all changes.
```

**Why it works:**
- `stop_on_error: false`: migration often has transient failures; the loop should keep going
- `required_outputs` gates completion on an actual deliverable file
- The task list in MIGRATION_TODO.md is the progress memory

## Pattern: Evidence-driven improvement

When there's no fixed checklist, the agent discovers what to improve from command output.

```yaml
---
commands:
  - name: tests
    run: uv run pytest -x
    timeout: 120
  - name: coverage
    run: uv run pytest --cov=src --cov-report=term-missing -q
    timeout: 120
  - name: git-log
    run: git log --oneline -10
max_iterations: 15
completion_promise: DONE
args:
  - target
---

You are increasing test coverage for {{ args.target }}.

## Coverage report

{{ commands.coverage }}

## Test results

{{ commands.tests }}

## Recent commits

{{ commands.git-log }}

Pick the module with the most missing lines from the coverage report.
Read the source code, understand what it does, and write thorough tests.
Commit with `test: add coverage for <module>`.

## Rules

- One module per iteration
- Write tests that verify behavior, not just hit lines
- All existing tests must still pass
- Do not add `# pragma: no cover` comments

## Completion

Stop with <promise>DONE</promise> when coverage for {{ args.target }} exceeds 80%.
```

**Why it works:**
- Two evidence commands give the agent both test results and coverage data
- `args.target` makes the loop reusable for different modules
- The coverage report naturally focuses the agent on the biggest gap

## Pattern: Research and synthesis

For tasks that produce a document rather than code. The loop writes files, and `required_outputs` gates completion.

```yaml
---
commands:
  - name: git-log
    run: git log --oneline -15
max_iterations: 20
timeout: 300
completion_promise: DONE
required_outputs:
  - REPORT.md
---

You are a research agent building a comprehensive report.

## Recent changes

{{ commands.git-log }}

## Each iteration

1. Read REPORT.md to see what exists
2. Identify the weakest or most incomplete section
3. Research the topic using available tools
4. Write detailed findings into the appropriate section
5. Update the report outline if needed

## Rules

- Write to REPORT.md and section files in research/
- One section per iteration
- Cite sources with URLs
- Do not fabricate references

## Completion

Stop with <promise>DONE</promise> when REPORT.md exists and all sections referenced in its table of contents have corresponding files with substantial content (>500 words each).
```

**Why it works:**
- `timeout: 300`: research iterations need more time
- `required_outputs: [REPORT.md]`: completion gated on the deliverable
- Minimal commands: the agent does its own research each iteration
- Progress lives in the files, not in commands
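
The ">500 words each" criterion in the completion section is something the agent judges for itself, but it can also be checked mechanically. The following Python sketch shows one way to do that; `thin_sections` and the sample file names are hypothetical, and whitespace-separated tokens are used as a rough word count.

```python
def thin_sections(sections: dict[str, str], minimum_words: int = 500) -> list[str]:
    """Return the names of sections that do not exceed the minimum word count."""
    return [name for name, text in sections.items()
            if len(text.split()) <= minimum_words]


# Made-up section contents keyed by hypothetical file names.
sections = {
    "research/background.md": "word " * 650,
    "research/findings.md": "word " * 120,
}
print(thin_sections(sections))  # ['research/findings.md']
```

A script like this could be exposed as an extra evidence command so the agent sees which sections still need work.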

## Pattern: Security audit

Strict guardrails with self-healing.

```yaml
---
commands:
  - name: scan
    run: npx audit-ci --moderate
    timeout: 60
  - name: tests
    run: npm test
    timeout: 120
  - name: git-log
    run: git log --oneline -10
max_iterations: 20
completion_promise: DONE
required_outputs:
  - SECURITY_FINDINGS.md
guardrails:
  block_commands:
    - 'git\s+push'
    - 'npm\s+publish'
  protected_files:
    - '.env*'
    - '*.pem'
    - '*.key'
    - 'policy:secret-bearing-paths'
---

You are a security auditor. Find and fix vulnerabilities.

## Vulnerability scan

{{ commands.scan }}

## Test results

{{ commands.tests }}

## Recent commits

{{ commands.git-log }}

If tests are failing, fix them before addressing security findings.

## Each iteration

1. Review the vulnerability scan above
2. Pick one finding
3. Fix the underlying issue (do not suppress warnings)
4. Verify tests still pass
5. Log the finding in SECURITY_FINDINGS.md with: severity, location, description, resolution

## Rules

- One finding per iteration
- Fix root causes, never suppress warnings
- Never modify .env, .pem, or .key files
- Commit with `security: fix <description>`

## Completion

Stop with <promise>DONE</promise> when SECURITY_FINDINGS.md exists and the vulnerability scan reports no moderate or high issues.
```

**Why it works:**
- Heavy guardrails: this is a security task, so block pushes and protect secrets
- `policy:secret-bearing-paths` as a catch-all for credential files
- `required_outputs` gates on the audit deliverable
- Self-healing: if code changes break tests, fix those first

## Anti-patterns to avoid

### Vague goals

```markdown
# Bad
Improve the codebase.

# Good
Find the module with the lowest test coverage and write tests for it.
```

### Missing evidence

```markdown
# Bad: no commands, the agent is blind
---
max_iterations: 10
---
Fix the failing tests.

# Good: agent sees current state each iteration
---
commands:
  - name: tests
    run: npm test
    timeout: 60
---
{{ commands.tests }}

Fix the failing tests.
```

### No completion criteria

```markdown
# Bad: the loop never knows when it's done
---
max_iterations: 20
---
Write tests until coverage is good.

# Good: explicit completion gate
---
max_iterations: 20
completion_promise: DONE
required_outputs:
  - COVERAGE_REPORT.md
---
Stop with <promise>DONE</promise> when all tests pass
and COVERAGE_REPORT.md exists.
```

### Too many tasks per iteration

```markdown
# Bad: the agent tries to do everything at once
Fix all the bugs, write docs, and refactor the API.

# Good: one thing per iteration
Pick the highest-priority bug from BUGS.md and fix it.
Write a regression test that proves the fix.
```

### Missing progress memory

Without RALPH_PROGRESS.md, the agent re-does work across iterations. Add a section:

```markdown
## Progress

At the end of each iteration, append a one-line summary to RALPH_PROGRESS.md:
- What you did
- What files changed
- What still needs doing
```

### Walls of text

Keep prompts under 200 lines. The loop re-reads the entire prompt every iteration, so long prompts waste context window and dilute focus. If you're writing a novel, you're overthinking it.
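
The 200-line guideline is easy to check before starting a loop. The Python sketch below counts the prompt lines that follow the frontmatter. Whether the guideline is meant to include the frontmatter is not stated, so excluding it is an assumption, and `prompt_line_count` is a hypothetical helper.

```python
def prompt_line_count(ralph_md: str) -> int:
    """Count prompt lines after the closing frontmatter delimiter."""
    lines = ralph_md.splitlines()
    if lines and lines[0].strip() == "---":
        # Find the second '---' and drop everything up to and including it.
        closing = next((i for i, line in enumerate(lines[1:], start=1)
                        if line.strip() == "---"), None)
        if closing is not None:
            lines = lines[closing + 1:]
    return len(lines)


sample = (
    "---\n"
    "max_iterations: 10\n"
    "---\n"
    "Read TODO.md and implement the next task.\n"
    "Commit when done.\n"
)
count = prompt_line_count(sample)
print(count, "OK" if count < 200 else "too long")  # 2 OK
```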

## Command selection guide

| What you're measuring | Command to use |
|---|---|
| Test results | `npm test`, `pytest`, `go test ./...` |
| Type checking | `tsc --noEmit`, `mypy .`, `go vet ./...` |
| Linting | `eslint .`, `ruff check .`, `golangci-lint run` |
| Build status | `npm run build`, `cargo build`, `go build ./...` |
| Coverage | `pytest --cov`, `go test -cover` |
| Git history | `git log --oneline -10` |
| Changed files | `git diff --name-only HEAD~5` |
| Vulnerability scan | `npm audit`, `bandit -r src/` |
| Custom metrics | `./scripts/check-coverage.sh` |

Commands starting with `./` run from the task directory. Others run from the project root.
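
The working-directory rule above amounts to a single prefix check. A minimal Python sketch follows; the function name and the directory layout are hypothetical and serve only to restate the rule, not to show how the extension launches commands.

```python
from pathlib import Path


def working_directory(run: str, task_dir: Path, project_root: Path) -> Path:
    """Pick the directory a command runs from, based on its prefix."""
    return task_dir if run.startswith("./") else project_root


task_dir = Path("/repo/tasks/my-task")  # hypothetical layout
project_root = Path("/repo")

# A task-local script resolves against the task directory.
print(working_directory("./scripts/check-coverage.sh", task_dir, project_root))
# Anything else runs from the project root.
print(working_directory("npm test", task_dir, project_root))
```

In practice this means helper scripts can live next to the task's RALPH.md, while project tooling such as `npm test` keeps its usual root.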