ralphctl 0.4.1 → 0.4.3
- package/README.md +13 -11
- package/dist/{add-CIM72NE3.mjs → add-MG26JWBP.mjs} +6 -6
- package/dist/{add-GX7P7XTT.mjs → add-ZZYL4BSF.mjs} +5 -4
- package/dist/chunk-2FT37OZX.mjs +1071 -0
- package/dist/{chunk-CTP2A436.mjs → chunk-D2HWXEHH.mjs} +9 -2
- package/dist/{chunk-JOQO4HMM.mjs → chunk-EGUFQNRB.mjs} +10 -10
- package/dist/{chunk-3HJNVQ7N.mjs → chunk-LCY32RW4.mjs} +621 -976
- package/dist/{chunk-NUYQK5MN.mjs → chunk-LDSG7G2T.mjs} +1 -1
- package/dist/{chunk-7JLZQICD.mjs → chunk-MDE6KPJQ.mjs} +6 -6
- package/dist/{chunk-3QBEBKMZ.mjs → chunk-Q4AVHUZL.mjs} +7 -7
- package/dist/{chunk-YCDUVPRT.mjs → chunk-RQGD5WS6.mjs} +4 -72
- package/dist/{chunk-D2YGPLIV.mjs → chunk-TDBEEHTS.mjs} +213 -8
- package/dist/{chunk-SM4GGZSU.mjs → chunk-WOMGKKZY.mjs} +152 -179
- package/dist/{chunk-FKMKOWLA.mjs → chunk-WZTY77GY.mjs} +75 -1
- package/dist/cli.mjs +68 -19
- package/dist/{create-7WFSCMP4.mjs → create-PQK6KKRD.mjs} +5 -5
- package/dist/{handle-BBAZJ44Y.mjs → handle-SYVCFI6Y.mjs} +1 -1
- package/dist/{mount-2N6H5CWA.mjs → mount-2ANLHHQE.mjs} +556 -318
- package/dist/{project-2IE7VWDB.mjs → project-JF47ZWMF.mjs} +2 -2
- package/dist/prompts/check-script-discover.md +69 -0
- package/dist/prompts/ideate-auto.md +26 -1
- package/dist/prompts/ideate.md +5 -1
- package/dist/prompts/plan-auto.md +30 -2
- package/dist/prompts/plan-common-examples.md +82 -0
- package/dist/prompts/plan-common.md +26 -78
- package/dist/prompts/plan-interactive.md +6 -2
- package/dist/prompts/repo-onboard.md +111 -0
- package/dist/prompts/sprint-feedback.md +6 -2
- package/dist/prompts/task-evaluation.md +25 -10
- package/dist/prompts/task-execution.md +13 -13
- package/dist/prompts/ticket-refine.md +4 -0
- package/dist/prompts/validation-checklist.md +4 -0
- package/dist/{resolver-EOE5WUMV.mjs → resolver-PG2DZEBX.mjs} +3 -3
- package/dist/{sprint-OGOFEJJH.mjs → sprint-54DOSIJK.mjs} +3 -3
- package/dist/{start-IUDCXIEA.mjs → start-2SZTBKGF.mjs} +7 -5
- package/package.json +6 -6
@@ -58,10 +58,16 @@ Now apply semantic judgment to what the computational checks cannot catch:
 2. **Read the changed files carefully** — understand the full implementation, not just the diff.
 3. **Read surrounding code** — check that the implementation follows existing patterns and conventions.
 4. **Augment the Project Tooling section above** — the section lists detected subagents, skills, and MCP servers.
-Additionally skim
-
-
-
+Additionally skim repository config for the test/verification stack and any conventions the section didn't surface.
+Note which application type this is (backend API / CLI / frontend SPA / fullstack / library) — it determines which
+verification methods apply.
+
+<examples>
+Representative files to scan when present — not an exhaustive list, adapt to the ecosystem:
+`package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, `playwright.config.*`, `cypress.config.*`,
+`vitest.config.*`, `.storybook/`, `CLAUDE.md`, `AGENTS.md`, `.github/copilot-instructions.md`.
+</examples>
+
 5. **Run extended verification when the detected tooling makes it cheap and deterministic:**
 - **Frontend/UI tasks** — if Playwright or Cypress is configured, run a targeted e2e test or use a browser MCP to
 verify the changed UI renders correctly (console errors, layout, interactive behaviour).
@@ -78,14 +84,15 @@ Evaluate the implementation across the dimensions below. Each dimension is pass/
 dimension fails, the overall evaluation fails. The first four are the floor — every task is graded on them. The
 planner may have flagged additional task-specific dimensions; when present, they are graded on top of the floor.
 
-
+<dimension name="Correctness" floor="true">
 Does the implementation do what the specification says? Check for:
 
 - Logical errors, off-by-one, race conditions, type issues
 - Behavior matches each verification criterion (grade each one explicitly)
 - Edge cases handled where specified
+</dimension>
 
-
+<dimension name="Completeness" floor="true">
 Is the full specification implemented? Check for:
 
 - Every verification criterion is satisfied (not just most)
@@ -93,25 +100,29 @@ Is the full specification implemented? Check for:
 - No TODO/FIXME/HACK markers left behind that indicate unfinished work
 - Uncommitted changes that look like incomplete work (WIP diffs, stashed edits) — committing is expected unless the
 task's contract says otherwise
+</dimension>
 
-
+<dimension name="Safety" floor="true">
 Are there security or reliability issues? Check for:
 
 - Injection vulnerabilities (SQL, command, XSS)
 - Validation gaps on external input
 - Exposed secrets, hardcoded credentials
 - Unsafe error handling that leaks internals
+</dimension>
 
-
+<dimension name="Consistency" floor="true">
 Does the implementation fit the codebase? Check for:
 
 - Follows existing patterns and conventions (naming, structure, error handling)
 - Uses existing utilities instead of reinventing them
 - No unnecessary changes outside the task scope — spec drift
 - Test patterns match the project's existing test style
+</dimension>
 {{EXTRA_DIMENSIONS_SECTION}}
-
-
+
+Evaluate only what was asked vs what was delivered — suggesting improvements beyond the task scope creates noise that
+distracts from the actual pass/fail decision.
 
 ### Pass Bar
 
@@ -165,6 +176,8 @@ Each issue must reference which dimension it violates.]
 
 ### Calibration Examples
 
+<examples>
+
 **Example of a correct PASS:**
 
 > Task: "Add date validation to export endpoint"
@@ -193,6 +206,8 @@ Each issue must reference which dimension it violates.]
 > 2. [Safety] `src/repositories/users.ts:23` — `WHERE name LIKE '%${query}%'` is SQL injection. Use parameterized
 > query: `WHERE name LIKE $1` with `%${query}%` as parameter.
 
+</examples>
+
 Be direct and specific — point to files, lines, and concrete problems.
 
 {{SIGNALS}}
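The calibration example in the hunk above recommends replacing string interpolation with a `$1` placeholder. A minimal sketch of that fix follows, assuming a PostgreSQL-style driver that accepts a `{ text, values }` query object; `buildUserSearchQuery` and the `users` table name are hypothetical and are not part of ralphctl.

```javascript
// Hypothetical helper: the user input travels as a bound parameter,
// so it is never spliced into the SQL text itself.
function buildUserSearchQuery(query) {
  return {
    text: "SELECT * FROM users WHERE name LIKE $1", // table name is an assumption
    values: [`%${query}%`],
  };
}

const hostile = "'; DROP TABLE users; --";
const q = buildUserSearchQuery(hostile);
console.log(q.text);
console.log(q.values);
```

The SQL text stays constant regardless of input, which is the property the `[Safety]` finding asks for. Note that `%` and `_` inside the user input still act as `LIKE` wildcards unless they are escaped separately.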
@@ -15,16 +15,17 @@ When finished, emit a signal from the `<signals>` block below.
 - **Respect task boundaries** — complete exactly the declared steps for this one task, then stop. Other agents may be
 working on neighboring tasks in parallel; skipping steps, improvising, or editing files outside the declared set
 causes merge conflicts with their work.
-- **Prefer fixing the code over the test** — a failing test usually indicates a bug in the implementation. Update
-
-
-
+- **Prefer fixing the code over the test** — a failing test usually indicates a bug in the implementation. Update
+tests only when the declared steps intentionally change the asserted behaviour (e.g. a contract change, a regression
+fix). If the right move is genuinely ambiguous, signal `<task-blocked>` so a human can decide — do not silently
+weaken a test to make a failure go away.
 - **Verify before completing** — the harness runs a post-task check gate; unverified work will be caught and rejected.
 - **Append progress, never overwrite** — append each progress entry at the end of the progress file. Overwriting
 erases context that downstream tasks depend on.
 - **Leave {{CONTEXT_FILE}} and task definitions alone** — the context file is cleaned up by the harness (committing it
 pollutes the repo); the task name, description, steps, and other task files are immutable.
-
+
+{{COMMIT_CONSTRAINT}}
 
 </constraints>
 
@@ -93,7 +94,8 @@ Complete these steps IN ORDER:
 1. **Confirm all steps done** — Every task step has been completed
 2. **Run ALL verification commands** — Execute every verification command (see Check Script section in the context file
 or project instructions). Fix any failures before proceeding. The harness runs the check script as a post-task
-gate — your task is not marked done unless it passes.
+gate — your task is not marked done unless it passes.
+{{COMMIT_STEP}}
 3. **Update progress file** — Append to {{PROGRESS_FILE}} using this format:
 
 ```markdown
@@ -142,17 +144,15 @@ Complete these steps IN ORDER:
 - The WHERE clause builder in src/repositories/base.ts can be extended for future filters
 ```
 
-4. **Output verification results
+4. **Output verification results** — use the actual commands the harness ran; the examples below are illustrative:
 
 <!-- prettier-ignore -->
 ```
 <task-verified>
-$
-
-$
-
-$ pnpm test
-47 tests passed
+$ <check-command-1>
+<output>
+$ <check-command-2>
+<output>
 </task-verified>
 ```
 
@@ -223,10 +223,14 @@ The `ref` field should match either:
 - The ticket's internal ID
 - The exact ticket title
 
+<task-specification>
+
 ## Ticket to Refine
 
 {{TICKET}}
 
+</task-specification>
+
 {{ISSUE_CONTEXT}}
 
 ---
@@ -1,3 +1,5 @@
+<validation-checklist>
+
 ## Pre-Output Validation
 
 Before writing the JSON output, verify EVERY item:
@@ -12,3 +14,5 @@ Before writing the JSON output, verify EVERY item:
 8. **`projectPath` assigned** — every task uses a path from the available repositories
 9. **Verification criteria** — every task has 2-4 `verificationCriteria` that are testable and unambiguous
 10. **Raw JSON output** — the output is valid JSON matching the schema exactly; the harness parses the output directly as JSON, so emit it without markdown fences, commentary, or surrounding prose
+
+</validation-checklist>
@@ -11,7 +11,7 @@ var dynamicResolvers = {
   "--project": async () => {
     const result = await wrapAsync(
       async () => {
-        const { listProjects } = await import("./project-
+        const { listProjects } = await import("./project-JF47ZWMF.mjs");
         return listProjects();
       },
       (err) => new IOError("Failed to load projects for completion", err instanceof Error ? err : void 0)
@@ -45,7 +45,7 @@ var configValueCompletions = {
 async function getSprintCompletions() {
   const result = await wrapAsync(
     async () => {
-      const { listSprints } = await import("./sprint-
+      const { listSprints } = await import("./sprint-54DOSIJK.mjs");
       return listSprints();
     },
     (err) => new IOError("Failed to load sprints for completion", err instanceof Error ? err : void 0)
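Both resolver hunks above pass a loader and an error mapper to `wrapAsync`. The diff does not show its implementation, so the sketch below is only one plausible shape: it assumes `wrapAsync` returns a result object with an `ok` flag, and that completion falls back to an empty list on failure. The `IOError` class and the `loadSprints` parameter here are hypothetical stand-ins.

```javascript
// Hypothetical stand-in for the IOError class referenced in the diff.
class IOError extends Error {
  constructor(message, cause) {
    super(message, cause ? { cause } : undefined);
    this.name = "IOError";
  }
}

// Assumed behaviour: run fn, convert a thrown error into a typed failure value.
async function wrapAsync(fn, mapError) {
  try {
    return { ok: true, value: await fn() };
  } catch (err) {
    return { ok: false, error: mapError(err) };
  }
}

// loadSprints replaces the dynamic import so the sketch stays self-contained.
async function getSprintCompletions(loadSprints) {
  const result = await wrapAsync(
    loadSprints,
    (err) => new IOError("Failed to load sprints for completion", err instanceof Error ? err : undefined)
  );
  // Assumption: a failed lookup yields no suggestions instead of an exception.
  return result.ok ? result.value : [];
}

getSprintCompletions(async () => ["sprint-a", "sprint-b"]).then((names) => console.log(names));
getSprintCompletions(async () => { throw new Error("disk unavailable"); }).then((names) => console.log(names));
```

Returning a value instead of throwing keeps the error path explicit at the call site, which suits shell completion, where an unhandled exception would print noise into the user's prompt.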
@@ -133,7 +133,7 @@ async function resolveCompletions(program, ctx) {
 function getCommandPath(cmd) {
   const parts = [];
   let current = cmd;
-  while (current
+  while (current.parent) {
     parts.unshift(current.name());
     current = current.parent;
   }
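The loop in `getCommandPath` walks from a command up through its `parent` links and stops at the root, so the root program name is left out of the path. The sketch below reproduces the new loop against plain objects. `makeCommand` is a hypothetical stand-in for a Commander-style command node, the final `return parts` is assumed because the hunk ends before it, and the command names are illustrative.

```javascript
// Hypothetical stand-in: exposes only the name() method and parent link the loop uses.
function makeCommand(name, parent = null) {
  return { name: () => name, parent };
}

function getCommandPath(cmd) {
  const parts = [];
  let current = cmd;
  // Stop once the root (which has no parent) is reached; its name is not recorded.
  while (current.parent) {
    parts.unshift(current.name());
    current = current.parent;
  }
  return parts; // assumed return value, not shown in the hunk
}

const root = makeCommand("ralphctl");
const sprint = makeCommand("sprint", root);
const start = makeCommand("start", sprint);

console.log(getCommandPath(start)); // [ 'sprint', 'start' ]
console.log(getCommandPath(root)); // []
```

Because the condition dereferences `current.parent` directly, this form expects `cmd` to be a command object; passing `null` or `undefined` would throw.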
@@ -11,10 +11,10 @@ import {
   logSprintBaselines,
   resolveSprintId,
   saveSprint
-} from "./chunk-
-import "./chunk-
+} from "./chunk-RQGD5WS6.mjs";
+import "./chunk-WZTY77GY.mjs";
 import "./chunk-IWXBJD2D.mjs";
-import "./chunk-
+import "./chunk-D2HWXEHH.mjs";
 import {
   NoCurrentSprintError,
   SprintNotFoundError,
@@ -2,14 +2,16 @@
 import {
   parseSprintStartArgs,
   sprintStartCommand
-} from "./chunk-
-import "./chunk-
+} from "./chunk-LCY32RW4.mjs";
+import "./chunk-2FT37OZX.mjs";
+import "./chunk-EGUFQNRB.mjs";
 import "./chunk-CFUVE2BP.mjs";
 import "./chunk-747KW2RW.mjs";
-import "./chunk-
-import "./chunk-
+import "./chunk-LDSG7G2T.mjs";
+import "./chunk-RQGD5WS6.mjs";
+import "./chunk-WZTY77GY.mjs";
 import "./chunk-IWXBJD2D.mjs";
-import "./chunk-
+import "./chunk-D2HWXEHH.mjs";
 import "./chunk-57UWLHRH.mjs";
 export {
   parseSprintStartArgs,
package/package.json
@@ -1,6 +1,6 @@
 {
   "name": "ralphctl",
-  "version": "0.4.
+  "version": "0.4.3",
   "description": "Agent harness for long-running AI coding tasks — orchestrates Claude Code & GitHub Copilot across repositories",
   "homepage": "https://github.com/lukas-grigis/ralphctl",
   "type": "module",
@@ -53,8 +53,8 @@
     "@types/node": "^25.6.0",
     "@types/react": "^19.2.14",
     "@types/tabtab": "^3.0.4",
-    "@vitest/coverage-v8": "^4.1.
-    "eslint": "^10.2.
+    "@vitest/coverage-v8": "^4.1.5",
+    "eslint": "^10.2.1",
     "eslint-config-prettier": "^10.1.8",
     "globals": "^17.5.0",
     "husky": "^9.1.7",
@@ -64,11 +64,11 @@
     "tsup": "^8.5.1",
     "tsx": "^4.21.0",
     "typescript": "^5.9.3",
-    "typescript-eslint": "^8.
-    "vitest": "^4.1.
+    "typescript-eslint": "^8.59.0",
+    "vitest": "^4.1.5"
   },
   "lint-staged": {
-    "*.ts": [
+    "*.{ts,tsx}": [
       "eslint --cache --fix",
       "prettier --write"
     ],