ralphctl 0.4.1 → 0.4.3

Files changed (36)
  1. package/README.md +13 -11
  2. package/dist/{add-CIM72NE3.mjs → add-MG26JWBP.mjs} +6 -6
  3. package/dist/{add-GX7P7XTT.mjs → add-ZZYL4BSF.mjs} +5 -4
  4. package/dist/chunk-2FT37OZX.mjs +1071 -0
  5. package/dist/{chunk-CTP2A436.mjs → chunk-D2HWXEHH.mjs} +9 -2
  6. package/dist/{chunk-JOQO4HMM.mjs → chunk-EGUFQNRB.mjs} +10 -10
  7. package/dist/{chunk-3HJNVQ7N.mjs → chunk-LCY32RW4.mjs} +621 -976
  8. package/dist/{chunk-NUYQK5MN.mjs → chunk-LDSG7G2T.mjs} +1 -1
  9. package/dist/{chunk-7JLZQICD.mjs → chunk-MDE6KPJQ.mjs} +6 -6
  10. package/dist/{chunk-3QBEBKMZ.mjs → chunk-Q4AVHUZL.mjs} +7 -7
  11. package/dist/{chunk-YCDUVPRT.mjs → chunk-RQGD5WS6.mjs} +4 -72
  12. package/dist/{chunk-D2YGPLIV.mjs → chunk-TDBEEHTS.mjs} +213 -8
  13. package/dist/{chunk-SM4GGZSU.mjs → chunk-WOMGKKZY.mjs} +152 -179
  14. package/dist/{chunk-FKMKOWLA.mjs → chunk-WZTY77GY.mjs} +75 -1
  15. package/dist/cli.mjs +68 -19
  16. package/dist/{create-7WFSCMP4.mjs → create-PQK6KKRD.mjs} +5 -5
  17. package/dist/{handle-BBAZJ44Y.mjs → handle-SYVCFI6Y.mjs} +1 -1
  18. package/dist/{mount-2N6H5CWA.mjs → mount-2ANLHHQE.mjs} +556 -318
  19. package/dist/{project-2IE7VWDB.mjs → project-JF47ZWMF.mjs} +2 -2
  20. package/dist/prompts/check-script-discover.md +69 -0
  21. package/dist/prompts/ideate-auto.md +26 -1
  22. package/dist/prompts/ideate.md +5 -1
  23. package/dist/prompts/plan-auto.md +30 -2
  24. package/dist/prompts/plan-common-examples.md +82 -0
  25. package/dist/prompts/plan-common.md +26 -78
  26. package/dist/prompts/plan-interactive.md +6 -2
  27. package/dist/prompts/repo-onboard.md +111 -0
  28. package/dist/prompts/sprint-feedback.md +6 -2
  29. package/dist/prompts/task-evaluation.md +25 -10
  30. package/dist/prompts/task-execution.md +13 -13
  31. package/dist/prompts/ticket-refine.md +4 -0
  32. package/dist/prompts/validation-checklist.md +4 -0
  33. package/dist/{resolver-EOE5WUMV.mjs → resolver-PG2DZEBX.mjs} +3 -3
  34. package/dist/{sprint-OGOFEJJH.mjs → sprint-54DOSIJK.mjs} +3 -3
  35. package/dist/{start-IUDCXIEA.mjs → start-2SZTBKGF.mjs} +7 -5
  36. package/package.json +6 -6
@@ -58,10 +58,16 @@ Now apply semantic judgment to what the computational checks cannot catch:
58
58
  2. **Read the changed files carefully** — understand the full implementation, not just the diff.
59
59
  3. **Read surrounding code** — check that the implementation follows existing patterns and conventions.
60
60
  4. **Augment the Project Tooling section above** — the section lists detected subagents, skills, and MCP servers.
61
- Additionally skim `package.json` scripts, `playwright.config.*`, `cypress.config.*`, `vitest.config.*`, `.storybook/`,
62
- `CLAUDE.md`, and `.github/copilot-instructions.md` for the test/verification stack and any conventions the section
63
- didn't surface. Note which application type this is (backend API / CLI / frontend SPA / fullstack / library) — it
64
- determines which verification methods apply.
61
+ Additionally skim repository config for the test/verification stack and any conventions the section didn't surface.
62
+ Note which application type this is (backend API / CLI / frontend SPA / fullstack / library); it determines which
63
+ verification methods apply.
64
+
65
+ <examples>
66
+ Representative files to scan when present — not an exhaustive list, adapt to the ecosystem:
67
+ `package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, `playwright.config.*`, `cypress.config.*`,
68
+ `vitest.config.*`, `.storybook/`, `CLAUDE.md`, `AGENTS.md`, `.github/copilot-instructions.md`.
69
+ </examples>
70
+
65
71
  5. **Run extended verification when the detected tooling makes it cheap and deterministic:**
66
72
  - **Frontend/UI tasks** — if Playwright or Cypress is configured, run a targeted e2e test or use a browser MCP to
67
73
  verify the changed UI renders correctly (console errors, layout, interactive behaviour).
@@ -78,14 +84,15 @@ Evaluate the implementation across the dimensions below. Each dimension is pass/
78
84
  dimension fails, the overall evaluation fails. The first four are the floor — every task is graded on them. The
79
85
  planner may have flagged additional task-specific dimensions; when present, they are graded on top of the floor.
80
86
 
81
- **Dimension 1 — Correctness**
87
+ <dimension name="Correctness" floor="true">
82
88
  Does the implementation do what the specification says? Check for:
83
89
 
84
90
  - Logical errors, off-by-one, race conditions, type issues
85
91
  - Behavior matches each verification criterion (grade each one explicitly)
86
92
  - Edge cases handled where specified
93
+ </dimension>
87
94
 
88
- **Dimension 2 — Completeness**
95
+ <dimension name="Completeness" floor="true">
89
96
  Is the full specification implemented? Check for:
90
97
 
91
98
  - Every verification criterion is satisfied (not just most)
@@ -93,25 +100,29 @@ Is the full specification implemented? Check for:
93
100
  - No TODO/FIXME/HACK markers left behind that indicate unfinished work
94
101
  - Uncommitted changes that look like incomplete work (WIP diffs, stashed edits) — committing is expected unless the
95
102
  task's contract says otherwise
103
+ </dimension>
96
104
 
97
- **Dimension 3 — Safety**
105
+ <dimension name="Safety" floor="true">
98
106
  Are there security or reliability issues? Check for:
99
107
 
100
108
  - Injection vulnerabilities (SQL, command, XSS)
101
109
  - Validation gaps on external input
102
110
  - Exposed secrets, hardcoded credentials
103
111
  - Unsafe error handling that leaks internals
112
+ </dimension>
104
113
 
105
- **Dimension 4 — Consistency**
114
+ <dimension name="Consistency" floor="true">
106
115
  Does the implementation fit the codebase? Check for:
107
116
 
108
117
  - Follows existing patterns and conventions (naming, structure, error handling)
109
118
  - Uses existing utilities instead of reinventing them
110
119
  - No unnecessary changes outside the task scope — spec drift
111
120
  - Test patterns match the project's existing test style
121
+ </dimension>
112
122
  {{EXTRA_DIMENSIONS_SECTION}}
113
- Evaluate only what was asked vs what was delivered — suggesting improvements beyond the task scope creates noise that
114
- distracts from the actual pass/fail decision.
123
+
124
+ Evaluate only what was asked vs what was delivered — suggesting improvements beyond the task scope creates noise that
125
+ distracts from the actual pass/fail decision.
115
126
 
116
127
  ### Pass Bar
117
128
 
@@ -165,6 +176,8 @@ Each issue must reference which dimension it violates.]
165
176
 
166
177
  ### Calibration Examples
167
178
 
179
+ <examples>
180
+
168
181
  **Example of a correct PASS:**
169
182
 
170
183
  > Task: "Add date validation to export endpoint"
@@ -193,6 +206,8 @@ Each issue must reference which dimension it violates.]
193
206
  > 2. [Safety] `src/repositories/users.ts:23` — `WHERE name LIKE '%${query}%'` is SQL injection. Use parameterized
194
207
  > query: `WHERE name LIKE $1` with `%${query}%` as parameter.
195
208
 
209
+ </examples>
210
+
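The calibration example above recommends `WHERE name LIKE $1` with `%${query}%` passed as a parameter. A minimal sketch of that shape follows, assuming a driver that accepts PostgreSQL-style `$1` placeholders alongside a separate values array. The `ParameterizedQuery` type, the `buildNameSearch` helper, and the `users` table columns are hypothetical and are not taken from the package.

```typescript
// Hypothetical query shape: SQL text with numbered placeholders plus a separate values array.
interface ParameterizedQuery {
  text: string;
  values: string[];
}

// Build the search query without ever concatenating user input into the SQL text.
// The wildcards are added to the parameter value, not to the statement.
function buildNameSearch(query: string): ParameterizedQuery {
  return {
    text: "SELECT id, name FROM users WHERE name LIKE $1", // assumed table and columns
    values: [`%${query}%`],
  };
}

// Hostile input stays inside the values array; the SQL text is unchanged.
const search = buildNameSearch("x'; DROP TABLE users; --");
console.log(search.text);
console.log(search.values);
```

Because the statement text is a constant, the input cannot change the query's structure; a driver that supports placeholders sends the value separately from the SQL.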
196
211
  Be direct and specific — point to files, lines, and concrete problems.
197
212
 
198
213
  {{SIGNALS}}
@@ -15,16 +15,17 @@ When finished, emit a signal from the `<signals>` block below.
15
15
  - **Respect task boundaries** — complete exactly the declared steps for this one task, then stop. Other agents may be
16
16
  working on neighboring tasks in parallel; skipping steps, improvising, or editing files outside the declared set
17
17
  causes merge conflicts with their work.
18
- - **Prefer fixing the code over the test** — a failing test usually indicates a bug in the implementation. Update a
19
- test only when the declared steps intentionally change the behaviour it asserts (e.g. a regression fix, a contract
20
- change). Do not remove, skip, or weaken a test to make a failure go away; that masks real bugs. If the right move
21
- is genuinely ambiguous, signal `<task-blocked>` so a human can decide.
18
+ - **Prefer fixing the code over the test** — a failing test usually indicates a bug in the implementation. Update
19
+ tests only when the declared steps intentionally change the asserted behaviour (e.g. a contract change, a regression
20
+ fix). If the right move is genuinely ambiguous, signal `<task-blocked>` so a human can decide; do not silently
21
+ weaken a test to make a failure go away.
22
22
  - **Verify before completing** — the harness runs a post-task check gate; unverified work will be caught and rejected.
23
23
  - **Append progress, never overwrite** — append each progress entry at the end of the progress file. Overwriting
24
24
  erases context that downstream tasks depend on.
25
25
  - **Leave {{CONTEXT_FILE}} and task definitions alone** — the context file is cleaned up by the harness (committing it
26
26
  pollutes the repo); the task name, description, steps, and other task files are immutable.
27
- {{COMMIT_CONSTRAINT}}
27
+
28
+ {{COMMIT_CONSTRAINT}}
28
29
 
29
30
  </constraints>
30
31
 
@@ -93,7 +94,8 @@ Complete these steps IN ORDER:
93
94
  1. **Confirm all steps done** — Every task step has been completed
94
95
  2. **Run ALL verification commands** — Execute every verification command (see Check Script section in the context file
95
96
  or project instructions). Fix any failures before proceeding. The harness runs the check script as a post-task
96
- gate — your task is not marked done unless it passes.{{COMMIT_STEP}}
97
+ gate — your task is not marked done unless it passes.
98
+ {{COMMIT_STEP}}
97
99
  3. **Update progress file** — Append to {{PROGRESS_FILE}} using this format:
98
100
 
99
101
  ```markdown
@@ -142,17 +144,15 @@ Complete these steps IN ORDER:
142
144
  - The WHERE clause builder in src/repositories/base.ts can be extended for future filters
143
145
  ```
144
146
 
145
- 4. **Output verification results:**
147
+ 4. **Output verification results** — use the actual commands the harness ran; the examples below are illustrative:
146
148
 
147
149
  <!-- prettier-ignore -->
148
150
  ```
149
151
  <task-verified>
150
- $ pnpm typecheck
151
- No type errors
152
- $ pnpm lint
153
- No lint errors
154
- $ pnpm test
155
- 47 tests passed
152
+ $ <check-command-1>
153
+ <output>
154
+ $ <check-command-2>
155
+ <output>
156
156
  </task-verified>
157
157
  ```
158
158
 
@@ -223,10 +223,14 @@ The `ref` field should match either:
223
223
  - The ticket's internal ID
224
224
  - The exact ticket title
225
225
 
226
+ <task-specification>
227
+
226
228
  ## Ticket to Refine
227
229
 
228
230
  {{TICKET}}
229
231
 
232
+ </task-specification>
233
+
230
234
  {{ISSUE_CONTEXT}}
231
235
 
232
236
  ---
@@ -1,3 +1,5 @@
1
+ <validation-checklist>
2
+
1
3
  ## Pre-Output Validation
2
4
 
3
5
  Before writing the JSON output, verify EVERY item:
@@ -12,3 +14,5 @@ Before writing the JSON output, verify EVERY item:
12
14
  8. **`projectPath` assigned** — every task uses a path from the available repositories
13
15
  9. **Verification criteria** — every task has 2-4 `verificationCriteria` that are testable and unambiguous
14
16
  10. **Raw JSON output** — the output is valid JSON matching the schema exactly; the harness parses the output directly as JSON, so emit it without markdown fences, commentary, or surrounding prose
17
+
18
+ </validation-checklist>
@@ -11,7 +11,7 @@ var dynamicResolvers = {
11
11
  "--project": async () => {
12
12
  const result = await wrapAsync(
13
13
  async () => {
14
- const { listProjects } = await import("./project-2IE7VWDB.mjs");
14
+ const { listProjects } = await import("./project-JF47ZWMF.mjs");
15
15
  return listProjects();
16
16
  },
17
17
  (err) => new IOError("Failed to load projects for completion", err instanceof Error ? err : void 0)
@@ -45,7 +45,7 @@ var configValueCompletions = {
45
45
  async function getSprintCompletions() {
46
46
  const result = await wrapAsync(
47
47
  async () => {
48
- const { listSprints } = await import("./sprint-OGOFEJJH.mjs");
48
+ const { listSprints } = await import("./sprint-54DOSIJK.mjs");
49
49
  return listSprints();
50
50
  },
51
51
  (err) => new IOError("Failed to load sprints for completion", err instanceof Error ? err : void 0)
@@ -133,7 +133,7 @@ async function resolveCompletions(program, ctx) {
133
133
  function getCommandPath(cmd) {
134
134
  const parts = [];
135
135
  let current = cmd;
136
- while (current?.parent) {
136
+ while (current.parent) {
137
137
  parts.unshift(current.name());
138
138
  current = current.parent;
139
139
  }
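The hunk above replaces `current?.parent` with `current.parent` in the loop condition. A minimal sketch of why the optional chain is redundant follows, assuming `current` always holds a non-null node and that the function returns the collected `parts` (the return is not shown in the diff). The `CommandNode` interface and the `makeNode` helper are hypothetical stand-ins for the CLI library's command type.

```typescript
// Hypothetical minimal shape of a command node; the real type is richer.
interface CommandNode {
  name(): string;
  parent: CommandNode | null;
}

// Collect command names from the given node up to, but excluding, the root.
function getCommandPath(cmd: CommandNode): string[] {
  const parts: string[] = [];
  let current: CommandNode = cmd;
  // `current` is only ever reassigned to a parent that was just checked to be
  // truthy, so it is never null and plain `current.parent` is enough.
  while (current.parent) {
    parts.unshift(current.name());
    current = current.parent;
  }
  return parts; // assumed return value
}

// Illustrative helper for building a parent chain.
function makeNode(name: string, parent: CommandNode | null): CommandNode {
  return { name: () => name, parent };
}

const root = makeNode("ralphctl", null);
const sprint = makeNode("sprint", root);
const start = makeNode("start", sprint);

console.log(getCommandPath(start)); // the path is ["sprint", "start"]
```

The root's own name is excluded because the loop stops as soon as it reaches a node without a parent.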
@@ -11,10 +11,10 @@ import {
11
11
  logSprintBaselines,
12
12
  resolveSprintId,
13
13
  saveSprint
14
- } from "./chunk-YCDUVPRT.mjs";
15
- import "./chunk-FKMKOWLA.mjs";
14
+ } from "./chunk-RQGD5WS6.mjs";
15
+ import "./chunk-WZTY77GY.mjs";
16
16
  import "./chunk-IWXBJD2D.mjs";
17
- import "./chunk-CTP2A436.mjs";
17
+ import "./chunk-D2HWXEHH.mjs";
18
18
  import {
19
19
  NoCurrentSprintError,
20
20
  SprintNotFoundError,
@@ -2,14 +2,16 @@
2
2
  import {
3
3
  parseSprintStartArgs,
4
4
  sprintStartCommand
5
- } from "./chunk-3HJNVQ7N.mjs";
6
- import "./chunk-JOQO4HMM.mjs";
5
+ } from "./chunk-LCY32RW4.mjs";
6
+ import "./chunk-2FT37OZX.mjs";
7
+ import "./chunk-EGUFQNRB.mjs";
7
8
  import "./chunk-CFUVE2BP.mjs";
8
9
  import "./chunk-747KW2RW.mjs";
9
- import "./chunk-YCDUVPRT.mjs";
10
- import "./chunk-FKMKOWLA.mjs";
10
+ import "./chunk-LDSG7G2T.mjs";
11
+ import "./chunk-RQGD5WS6.mjs";
12
+ import "./chunk-WZTY77GY.mjs";
11
13
  import "./chunk-IWXBJD2D.mjs";
12
- import "./chunk-CTP2A436.mjs";
14
+ import "./chunk-D2HWXEHH.mjs";
13
15
  import "./chunk-57UWLHRH.mjs";
14
16
  export {
15
17
  parseSprintStartArgs,
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "ralphctl",
3
- "version": "0.4.1",
3
+ "version": "0.4.3",
4
4
  "description": "Agent harness for long-running AI coding tasks — orchestrates Claude Code & GitHub Copilot across repositories",
5
5
  "homepage": "https://github.com/lukas-grigis/ralphctl",
6
6
  "type": "module",
@@ -53,8 +53,8 @@
53
53
  "@types/node": "^25.6.0",
54
54
  "@types/react": "^19.2.14",
55
55
  "@types/tabtab": "^3.0.4",
56
- "@vitest/coverage-v8": "^4.1.4",
57
- "eslint": "^10.2.0",
56
+ "@vitest/coverage-v8": "^4.1.5",
57
+ "eslint": "^10.2.1",
58
58
  "eslint-config-prettier": "^10.1.8",
59
59
  "globals": "^17.5.0",
60
60
  "husky": "^9.1.7",
@@ -64,11 +64,11 @@
64
64
  "tsup": "^8.5.1",
65
65
  "tsx": "^4.21.0",
66
66
  "typescript": "^5.9.3",
67
- "typescript-eslint": "^8.58.2",
68
- "vitest": "^4.1.4"
67
+ "typescript-eslint": "^8.59.0",
68
+ "vitest": "^4.1.5"
69
69
  },
70
70
  "lint-staged": {
71
- "*.ts": [
71
+ "*.{ts,tsx}": [
72
72
  "eslint --cache --fix",
73
73
  "prettier --write"
74
74
  ],
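The `lint-staged` key above changes from `*.ts` to `*.{ts,tsx}`, so staged `.tsx` files now go through the same ESLint and Prettier commands. A rough sketch of the difference follows, using hand-written regular expressions as approximations of the two globs applied to a file's basename. The real matching is done by lint-staged's glob library and may differ in edge cases such as dotfiles; the file names are hypothetical.

```typescript
// Hand-written approximations of the two globs, applied to the file's basename.
const oldPattern = /^[^/]*\.ts$/; // approximates "*.ts"
const newPattern = /^[^/]*\.(ts|tsx)$/; // approximates "*.{ts,tsx}"

// Strip any leading directories from a forward-slash path.
function basename(path: string): string {
  return path.slice(path.lastIndexOf("/") + 1);
}

function matches(pattern: RegExp, path: string): boolean {
  return pattern.test(basename(path));
}

// Hypothetical staged files.
const staged = ["src/cli.ts", "src/ui/App.tsx", "README.md"];
for (const file of staged) {
  console.log(file, matches(oldPattern, file), matches(newPattern, file));
}
```

Under these approximations only `src/cli.ts` matches the old pattern, while both `src/cli.ts` and `src/ui/App.tsx` match the new one.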