ralphctl 0.3.0 → 0.4.0

@@ -29,10 +29,28 @@ verification criteria and the codebase?" If not, the task needs work.
 
  ### Task Sizing
 
- Completable in a single AI session: 1-3 primary files (up to 5-7 total with tests), ~50-200 lines of meaningful
- changes, one logical change per task. Split if too large, merge if too small.
+ The unit is **one coherent feature or vertical slice**: a change that can be picked up cold, implemented in a single
+ session, and verified end-to-end against its criteria. Size is driven by coherence, not line count. Modern agents are
+ capable; artificial fragmentation creates serial chains, duplicate context reloads, and merge conflicts that cost far
+ more than they save.
 
- Too granular (three tasks that should be one):
+ **Do not split when:**
+
+ - A utility and its first caller would be separated — create-and-use is always one task
+ - A feature and its tests would be separated
+ - The same pattern applies across N call sites — it is one refactor, not N tasks
+
+ **Do split when:**
+
+ - Two chunks can run in parallel (different `projectPath`, or independent files with no shared contract)
+ - A clean, verifiable boundary exists partway through (e.g. schema + migration land first, then consumer wiring — the
+ schema is independently testable and unblocks parallel consumers)
+ - The change spans multiple repositories — one task per repo, connected via `blockedBy`
+
+ **Soft ceiling, not a target:** if a task looks like it will touch more than ~10 files or ~500 lines of meaningful
+ change AND a natural split point exists, split it. No natural split point? Keep it whole.
+
+ Too granular (one task, not three):
 
  - "Create date formatting utility"
  - "Refactor experience module to use date utility"
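The soft-ceiling rule added in this hunk can be read as a two-part condition: size alone never forces a split, and a natural split point must also exist. The sketch below is only an illustration of that reading. The `TaskEstimate` shape, its field names, and `shouldSplitForSize` are hypothetical and do not come from ralphctl; the thresholds mirror the approximate "~10 files or ~500 lines" figures. The other split criteria (parallelism, multiple repositories, create-and-use) remain judgment calls and are not modeled here.

```typescript
// Hypothetical estimate of a planned task; these field names are illustrative only.
interface TaskEstimate {
  filesTouched: number;          // estimated number of files the task will modify
  meaningfulLines: number;       // estimated lines of meaningful change
  hasNaturalSplitPoint: boolean; // e.g. schema + migration can land before consumer wiring
}

// Approximate thresholds taken from the "~10 files or ~500 lines" guidance.
const SOFT_FILE_CEILING = 10;
const SOFT_LINE_CEILING = 500;

function shouldSplitForSize(task: TaskEstimate): boolean {
  const overCeiling =
    task.filesTouched > SOFT_FILE_CEILING ||
    task.meaningfulLines > SOFT_LINE_CEILING;
  // A large task with no natural split point is kept whole.
  return overCeiling && task.hasNaturalSplitPoint;
}

// Over the file ceiling with a clean boundary: split.
console.log(
  shouldSplitForSize({ filesTouched: 14, meaningfulLines: 300, hasNaturalSplitPoint: true })
);
// Over both ceilings but no boundary: keep whole.
console.log(
  shouldSplitForSize({ filesTouched: 14, meaningfulLines: 900, hasNaturalSplitPoint: false })
);
```

Treating the ceiling as a conjunction rather than a hard limit matches the "not a target" wording: a coherent 12-file change with no clean boundary stays one task.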
@@ -91,7 +109,8 @@ the evaluator will attempt visual verification using Playwright or browser tools
 
  1. **Outcome-oriented** — Each task delivers a testable result
  2. **Merge create+use** — Never separate "create X" from "use X" — that is one task
- 3. **Target 5-15 tasks** per scope, not 20-30 micro-tasks
+ 3. **Let scope drive task count**: do not aim for a specific number. Fewer, larger coherent tasks beat many
+ micro-tasks; split only when parallelism or a clean boundary justifies it
  4. **Merge serial chains** — If tasks only make sense when run in sequence, fold them into one task
 
  ### Anti-Patterns
@@ -11,7 +11,12 @@ When finished, emit a signal from the `<signals>` block below.
 
  - **Stay within scope** — fix only what the critique flags; keep edits local to the files and lines the critique
  calls out. Do not expand the task or refactor neighboring code.
- - **Fix, don't rewrite** — make minimal targeted changes; preserve the existing implementation structure where possible.
+ - **Default to minimal fix** — make targeted changes; preserve the existing implementation structure where possible.
+ - **Pivot when the critique is structural, not local** — if the findings point at a fundamentally wrong approach
+ (wrong abstraction, wrong data flow, wrong contract) rather than localized bugs, a patch over the existing
+ implementation will likely fail re-evaluation on related grounds. In that case, replace the affected section
+ with a correct approach instead of repeatedly patching it. Use this judgement sparingly — most critiques are
+ genuinely local.
  - **Treat reviewer findings as authoritative** — apply the fix they describe rather than rewriting the approach. If a
  finding is genuinely wrong, signal `<task-blocked>` so a human can decide; do not silently ignore it.
 
@@ -7,7 +7,7 @@ Before writing the JSON output, verify EVERY item:
  3. **Foundations before dependents** — tasks are ordered so prerequisites come first
  4. **Valid dependencies** — every `blockedBy` reference points to an earlier task with a real code dependency
  5. **Maximized parallelism** — independent tasks run in parallel; use `blockedBy` only when there is a genuine code dependency
- 6. **Precise steps** — every task has 3+ specific, actionable steps with file references
+ 6. **Precise steps** — every task has specific, actionable steps with file references — as many as the scope needs (a small task may have 2 steps, a larger coherent one may have 8+)
  7. **Verification steps** — every task ends with project-appropriate verification commands
  8. **`projectPath` assigned** — every task uses a path from the available repositories
  9. **Verification criteria** — every task has 2-4 `verificationCriteria` that are testable and unambiguous
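Several items in this checklist are mechanical enough to check in code. The sketch below covers items 3 and 4 (ordering of `blockedBy` references), item 8 (`projectPath`), and item 9 (criteria count). It is not ralphctl's own validator: the `PlannedTask` shape, the `id` field, and `validatePlan` are assumptions made for illustration, and only `blockedBy`, `projectPath`, and `verificationCriteria` are names taken from the checklist. Whether a dependency reflects a real code dependency still needs human or agent judgment.

```typescript
// Hypothetical task shape. Only `blockedBy`, `projectPath`, and
// `verificationCriteria` are names from the checklist; `id` is assumed.
interface PlannedTask {
  id: string;
  projectPath: string;
  verificationCriteria: string[];
  blockedBy?: string[];
}

// Returns a list of human-readable problems; an empty list means the
// mechanical checks passed.
function validatePlan(tasks: PlannedTask[], availablePaths: string[]): string[] {
  const problems: string[] = [];
  const knownPaths = new Set<string>(availablePaths);
  const earlierIds = new Set<string>();

  for (const task of tasks) {
    // Items 3 and 4: a dependency must refer to a task that appears earlier.
    for (const dep of task.blockedBy ?? []) {
      if (!earlierIds.has(dep)) {
        problems.push(`${task.id}: blockedBy "${dep}" is not an earlier task`);
      }
    }
    // Item 8: the path must be one of the available repositories.
    if (!knownPaths.has(task.projectPath)) {
      problems.push(`${task.id}: unknown projectPath "${task.projectPath}"`);
    }
    // Item 9: between 2 and 4 verification criteria.
    const count = task.verificationCriteria.length;
    if (count < 2 || count > 4) {
      problems.push(`${task.id}: expected 2-4 verificationCriteria, got ${count}`);
    }
    earlierIds.add(task.id);
  }
  return problems;
}
```

Tracking only the ids seen so far means a forward reference and a reference to a missing task are reported the same way, which is enough for a pre-output check.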
@@ -2,8 +2,8 @@
  import {
  parseSprintStartArgs,
  sprintStartCommand
- } from "./chunk-CDOPLXFK.mjs";
- import "./chunk-HL4ZMHCQ.mjs";
+ } from "./chunk-JYCGQA2D.mjs";
+ import "./chunk-JOQO4HMM.mjs";
  import "./chunk-CFUVE2BP.mjs";
  import "./chunk-747KW2RW.mjs";
  import "./chunk-YCDUVPRT.mjs";
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "ralphctl",
- "version": "0.3.0",
+ "version": "0.4.0",
  "description": "Agent harness for long-running AI coding tasks — orchestrates Claude Code & GitHub Copilot across repositories",
  "homepage": "https://github.com/lukas-grigis/ralphctl",
  "type": "module",