@ghostwater/soulforge 0.7.0 → 0.8.0

@@ -9,15 +9,15 @@ description: |
  verifier checks → integration testing → PR → final review.
 
  defaults:
- executor: claude-code
- model: opus
+ executor: "{{executor:codex-cli}}"
+ model: "{{model:gpt-5.3-codex}}"
  timeout: 600
  max_retries: 2
 
  steps:
  - id: plan
- executor: claude-code
- model: opus
+ executor: "{{executor:codex-cli}}"
+ model: "{{model:gpt-5.3-codex}}"
  workdir: "{{workdir}}"
  output_schema:
  fields:
@@ -59,17 +59,17 @@ steps:
 
  Instructions:
  1. Explore the codebase to understand the stack, conventions, and patterns
- 2. Break the task into small user stories (max 20)
+ 2. Break the task into the minimum number of small user stories needed.
+ 1-3 stories is perfectly fine for simple tasks, and a single story is explicitly allowed when the task is simple.
+ Most tasks should be on the lower end (typically 1-5 stories).
+ 20 stories is an absolute hard ceiling, not a target.
+ Do NOT pad with documentation, logging, config, or cleanup stories unless explicitly requested.
  3. Order by dependency: schema/DB first, backend, frontend, integration
  4. Each story must fit in one developer session (one context window)
  5. Every acceptance criterion must be mechanically verifiable
  6. Always include "Typecheck passes" as the last criterion in every story
  7. Every story MUST include test criteria
 
- Reply with:
- STATUS: done
- STORIES_JSON: [ ... array of story objects ... ]
- expects: "STATUS: done"
 
  - id: review-plan
  executor: self
@@ -90,13 +90,13 @@ steps:
  STORIES: {{stories_json}}
 
  - id: implement
- executor: claude-code
- model: opus
+ executor: "{{executor:codex-cli}}"
+ model: "{{model:gpt-5.3-codex}}"
  workdir: "{{workdir}}"
  notify: [on_complete, on_fail]
  type: loop
  loop:
- over: stories
+ over: plan.stories
  completion: all_done
  fresh_session: true
  verify_each: true
@@ -120,8 +120,7 @@ steps:
 
  TASK (overall): {{task}}
  WORKDIR: {{workdir}}
- BUILD_CMD: {{build_cmd}}
- TEST_CMD: {{test_cmd}}
+ Build/Test commands: discover from AGENTS.md and repo scripts.
 
  CURRENT STORY:
  {{current_story}}
@@ -142,15 +141,10 @@ steps:
  5. Run tests
  6. Commit: feat: {{current_story_id}} - {{current_story_title}}
 
- Reply with:
- STATUS: done
- CHANGES: what you implemented
- TESTS: what tests you wrote
- expects: "STATUS: done"
 
  - id: verify
- executor: claude-code
- model: opus
+ executor: "{{executor:codex-cli}}"
+ model: "{{model:gpt-5.3-codex}}"
  workdir: "{{workdir}}"
  output_schema:
  fields:
@@ -175,7 +169,7 @@ steps:
 
  WORKDIR: {{workdir}}
  CHANGES: {{changes}}
- TEST_CMD: {{test_cmd}}
+ Build/Test commands: discover from AGENTS.md and repo scripts.
 
  CURRENT STORY:
  {{current_story}}
@@ -184,22 +178,13 @@ steps:
  1. Code exists (not just TODOs or placeholders)
  2. Each acceptance criterion is met
  3. Tests were written
- 4. Tests pass (run {{test_cmd}})
- 5. Typecheck passes
+ 4. Tests pass (using the project's standard test command)
+ 5. Typecheck/build passes (using the project's standard build/typecheck command)
 
- Reply with:
- STATUS: done
- VERIFIED: What you confirmed
-
- Or if incomplete:
- STATUS: retry
- ISSUES:
- - What's missing
- expects: "STATUS: done"
 
  - id: test
- executor: claude-code
- model: opus
+ executor: "{{executor:codex-cli}}"
+ model: "{{model:gpt-5.3-codex}}"
  workdir: "{{workdir}}"
  output_schema:
  fields:
@@ -216,20 +201,16 @@ steps:
 
  TASK: {{task}}
  WORKDIR: {{workdir}}
- TEST_CMD: {{test_cmd}}
+ Build/Test commands: discover from AGENTS.md and repo scripts.
 
  1. Run the full test suite
  2. Look for integration issues between stories
  3. Check error handling and edge cases
 
- Reply with:
- STATUS: done
- RESULTS: What you tested and outcomes
- expects: "STATUS: done"
 
  - id: pr
- executor: claude-code
- model: opus
+ executor: "{{executor:codex-cli}}"
+ model: "{{model:gpt-5.3-codex}}"
  workdir: "{{workdir}}"
  notify: on_complete
  output_schema:
@@ -255,16 +236,11 @@ steps:
 
  Create a PR with gh pr create. Capture the PR number.
 
- Reply with:
- STATUS: done
- PR: URL to the pull request
- PR_NUMBER: <number only, e.g. 42>
- expects: "STATUS: done"
 
  # ── Automated Review Loop ──────────────────────────────────────────
  - id: code-review
- executor: claude-code
- model: opus
+ executor: "{{executor:codex-cli}}"
+ model: "{{model:gpt-5.3-codex}}"
  workdir: "{{workdir}}"
  output_schema:
  fields:
@@ -285,15 +261,14 @@ steps:
 
  WORKDIR: {{workdir}}
  TASK: {{task}}
- BUILD_CMD: {{build_cmd}}
- TEST_CMD: {{test_cmd}}
+ Build/Test commands: discover from AGENTS.md and repo scripts.
 
  Instructions:
  1. Read ALL changed files in the PR
  2. Check for correctness, edge cases, potential bugs
  3. Check test quality and coverage
- 4. Run build: {{build_cmd}}
- 5. Run tests: {{test_cmd}}
+ 4. Run the project's standard build/typecheck command
+ 5. Run the project's standard test command
  6. Post ALL findings as a single comment on the PR:
  gh pr comment {{pr_number}} --body "<your review>"
  7. Flag severity for each finding (High/Medium/Low)
@@ -303,7 +278,6 @@ steps:
  Reply with exactly one of:
  REVIEW_DECISION: pass
  REVIEW_DECISION: fix
- expects: "REVIEW_DECISION:"
 
  - id: review-gate
  type: gate
@@ -345,8 +319,8 @@ steps:
  max_loops: 5
 
  - id: review-fix
- executor: claude-code
- model: opus
+ executor: "{{executor:codex-cli}}"
+ model: "{{model:gpt-5.3-codex}}"
  workdir: "{{workdir}}"
  output_schema:
  fields:
@@ -368,14 +342,13 @@ steps:
  gh pr view {{pr_number}} --comments --json comments --jq '.comments[-1].body'
 
  WORKDIR: {{workdir}}
- BUILD_CMD: {{build_cmd}}
- TEST_CMD: {{test_cmd}}
+ Build/Test commands: discover from AGENTS.md and repo scripts.
 
  Fix ONLY the items marked for fixing. Do not touch anything else.
 
  After fixing:
- 1. Run build: {{build_cmd}}
- 2. Run tests: {{test_cmd}}
+ 1. Run the project's standard build/typecheck command
+ 2. Run the project's standard test command
  3. Commit and push
 
  DO NOT:
@@ -383,10 +356,6 @@ steps:
  - Re-implement the original feature
  - Touch files not related to the review findings
 
- Reply with:
- STATUS: done
- CHANGES: what you fixed
- expects: "STATUS: done"
  next: code-review
 
  - id: final-review