npm - ralphctl - Versions diffs - 0.4.1 → 0.4.3 - Mend

ralphctl 0.4.1 → 0.4.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (36)

package/README.md +13 -11
package/dist/{add-CIM72NE3.mjs → add-MG26JWBP.mjs} +6 -6
package/dist/{add-GX7P7XTT.mjs → add-ZZYL4BSF.mjs} +5 -4
package/dist/chunk-2FT37OZX.mjs +1071 -0
package/dist/{chunk-CTP2A436.mjs → chunk-D2HWXEHH.mjs} +9 -2
package/dist/{chunk-JOQO4HMM.mjs → chunk-EGUFQNRB.mjs} +10 -10
package/dist/{chunk-3HJNVQ7N.mjs → chunk-LCY32RW4.mjs} +621 -976
package/dist/{chunk-NUYQK5MN.mjs → chunk-LDSG7G2T.mjs} +1 -1
package/dist/{chunk-7JLZQICD.mjs → chunk-MDE6KPJQ.mjs} +6 -6
package/dist/{chunk-3QBEBKMZ.mjs → chunk-Q4AVHUZL.mjs} +7 -7
package/dist/{chunk-YCDUVPRT.mjs → chunk-RQGD5WS6.mjs} +4 -72
package/dist/{chunk-D2YGPLIV.mjs → chunk-TDBEEHTS.mjs} +213 -8
package/dist/{chunk-SM4GGZSU.mjs → chunk-WOMGKKZY.mjs} +152 -179
package/dist/{chunk-FKMKOWLA.mjs → chunk-WZTY77GY.mjs} +75 -1
package/dist/cli.mjs +68 -19
package/dist/{create-7WFSCMP4.mjs → create-PQK6KKRD.mjs} +5 -5
package/dist/{handle-BBAZJ44Y.mjs → handle-SYVCFI6Y.mjs} +1 -1
package/dist/{mount-2N6H5CWA.mjs → mount-2ANLHHQE.mjs} +556 -318
package/dist/{project-2IE7VWDB.mjs → project-JF47ZWMF.mjs} +2 -2
package/dist/prompts/check-script-discover.md +69 -0
package/dist/prompts/ideate-auto.md +26 -1
package/dist/prompts/ideate.md +5 -1
package/dist/prompts/plan-auto.md +30 -2
package/dist/prompts/plan-common-examples.md +82 -0
package/dist/prompts/plan-common.md +26 -78
package/dist/prompts/plan-interactive.md +6 -2
package/dist/prompts/repo-onboard.md +111 -0
package/dist/prompts/sprint-feedback.md +6 -2
package/dist/prompts/task-evaluation.md +25 -10
package/dist/prompts/task-execution.md +13 -13
package/dist/prompts/ticket-refine.md +4 -0
package/dist/prompts/validation-checklist.md +4 -0
package/dist/{resolver-EOE5WUMV.mjs → resolver-PG2DZEBX.mjs} +3 -3
package/dist/{sprint-OGOFEJJH.mjs → sprint-54DOSIJK.mjs} +3 -3
package/dist/{start-IUDCXIEA.mjs → start-2SZTBKGF.mjs} +7 -5
package/package.json +6 -6

package/dist/prompts/task-evaluation.md CHANGED

@@ -58,10 +58,16 @@ Now apply semantic judgment to what the computational checks cannot catch:
 2. **Read the changed files carefully** — understand the full implementation, not just the diff.
 3. **Read surrounding code** — check that the implementation follows existing patterns and conventions.
 4. **Augment the Project Tooling section above** — the section lists detected subagents, skills, and MCP servers.
-   Additionally skim `package.json` scripts, `playwright.config.*`, `cypress.config.*`, `vitest.config.*`, `.storybook/`,
-   `CLAUDE.md`, and `.github/copilot-instructions.md` for the test/verification stack and any conventions the section
-   didn't surface. Note which application type this is (backend API / CLI / frontend SPA / fullstack / library) — it
-   determines which verification methods apply.
+   Additionally skim repository config for the test/verification stack and any conventions the section didn't surface.
+   Note which application type this is (backend API / CLI / frontend SPA / fullstack / library) — it determines which
+   verification methods apply.
+   <examples>
+   Representative files to scan when present — not an exhaustive list, adapt to the ecosystem:
+   `package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, `playwright.config.*`, `cypress.config.*`,
+   `vitest.config.*`, `.storybook/`, `CLAUDE.md`, `AGENTS.md`, `.github/copilot-instructions.md`.
+   </examples>
 5. **Run extended verification when the detected tooling makes it cheap and deterministic:**
    - **Frontend/UI tasks** — if Playwright or Cypress is configured, run a targeted e2e test or use a browser MCP to
      verify the changed UI renders correctly (console errors, layout, interactive behaviour).
@@ -78,14 +84,15 @@ Evaluate the implementation across the dimensions below. Each dimension is pass/
 dimension fails, the overall evaluation fails. The first four are the floor — every task is graded on them. The
 planner may have flagged additional task-specific dimensions; when present, they are graded on top of the floor.
-**Dimension 1 — Correctness**
+<dimension name="Correctness" floor="true">
 Does the implementation do what the specification says? Check for:
 - Logical errors, off-by-one, race conditions, type issues
 - Behavior matches each verification criterion (grade each one explicitly)
 - Edge cases handled where specified
+  </dimension>
-**Dimension 2 — Completeness**
+<dimension name="Completeness" floor="true">
 Is the full specification implemented? Check for:
 - Every verification criterion is satisfied (not just most)
@@ -93,25 +100,29 @@ Is the full specification implemented? Check for:
 - No TODO/FIXME/HACK markers left behind that indicate unfinished work
 - Uncommitted changes that look like incomplete work (WIP diffs, stashed edits) — committing is expected unless the
   task's contract says otherwise
+  </dimension>
-**Dimension 3 — Safety**
+<dimension name="Safety" floor="true">
 Are there security or reliability issues? Check for:
 - Injection vulnerabilities (SQL, command, XSS)
 - Validation gaps on external input
 - Exposed secrets, hardcoded credentials
 - Unsafe error handling that leaks internals
+  </dimension>
-**Dimension 4 — Consistency**
+<dimension name="Consistency" floor="true">
 Does the implementation fit the codebase? Check for:
 - Follows existing patterns and conventions (naming, structure, error handling)
 - Uses existing utilities instead of reinventing them
 - No unnecessary changes outside the task scope — spec drift
 - Test patterns match the project's existing test style
+  </dimension>
   {{EXTRA_DIMENSIONS_SECTION}}
-  Evaluate only what was asked vs what was delivered — suggesting improvements beyond the task scope creates noise that
-  distracts from the actual pass/fail decision.
+Evaluate only what was asked vs what was delivered — suggesting improvements beyond the task scope creates noise that
+distracts from the actual pass/fail decision.
 ### Pass Bar
@@ -165,6 +176,8 @@ Each issue must reference which dimension it violates.]
 ### Calibration Examples
+<examples>
 **Example of a correct PASS:**
 > Task: "Add date validation to export endpoint"
@@ -193,6 +206,8 @@ Each issue must reference which dimension it violates.]
 > 2. [Safety] `src/repositories/users.ts:23` — `WHERE name LIKE '%${query}%'` is SQL injection. Use parameterized
 >    query: `WHERE name LIKE $1` with `%${query}%` as parameter.
+</examples>
 Be direct and specific — point to files, lines, and concrete problems.
 {{SIGNALS}}
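The prompt above states that each dimension is pass/fail and that a single failing dimension fails the overall evaluation. A minimal sketch of that aggregation rule follows. The function name `overallVerdict` and the `{ name, passed }` result shape are assumptions made for illustration; they do not appear in the package.

```javascript
// Hypothetical helper: the dimension names come from the prompt,
// but this function and its data shape are assumed, not taken from ralphctl.
function overallVerdict(results) {
  // Collect the names of every dimension that did not pass.
  const failed = results.filter((r) => !r.passed).map((r) => r.name);
  // The evaluation passes only when no dimension failed.
  return { passed: failed.length === 0, failed };
}

const verdict = overallVerdict([
  { name: "Correctness", passed: true },
  { name: "Completeness", passed: true },
  { name: "Safety", passed: false },
  { name: "Consistency", passed: true },
]);

console.log(verdict.passed, verdict.failed);
```

Task-specific dimensions flagged by the planner would be appended to the same list, so they are graded on top of the four floor dimensions without changing the rule.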

package/dist/prompts/task-execution.md CHANGED

@@ -15,16 +15,17 @@ When finished, emit a signal from the `<signals>` block below.
 - **Respect task boundaries** — complete exactly the declared steps for this one task, then stop. Other agents may be
   working on neighboring tasks in parallel; skipping steps, improvising, or editing files outside the declared set
   causes merge conflicts with their work.
-- **Prefer fixing the code over the test** — a failing test usually indicates a bug in the implementation. Update a
-  test only when the declared steps intentionally change the behaviour it asserts (e.g. a regression fix, a contract
-  change). Do not remove, skip, or weaken a test to make a failure go away — that masks real bugs. If the right move
-  is genuinely ambiguous, signal `<task-blocked>` so a human can decide.
+- **Prefer fixing the code over the test** — a failing test usually indicates a bug in the implementation. Update
+  tests only when the declared steps intentionally change the asserted behaviour (e.g. a contract change, a regression
+  fix). If the right move is genuinely ambiguous, signal `<task-blocked>` so a human can decide — do not silently
+  weaken a test to make a failure go away.
 - **Verify before completing** — the harness runs a post-task check gate; unverified work will be caught and rejected.
 - **Append progress, never overwrite** — append each progress entry at the end of the progress file. Overwriting
   erases context that downstream tasks depend on.
 - **Leave {{CONTEXT_FILE}} and task definitions alone** — the context file is cleaned up by the harness (committing it
   pollutes the repo); the task name, description, steps, and other task files are immutable.
-  {{COMMIT_CONSTRAINT}}
+{{COMMIT_CONSTRAINT}}
 </constraints>
@@ -93,7 +94,8 @@ Complete these steps IN ORDER:
 1. **Confirm all steps done** — Every task step has been completed
 2. **Run ALL verification commands** — Execute every verification command (see Check Script section in the context file
    or project instructions). Fix any failures before proceeding. The harness runs the check script as a post-task
-   gate — your task is not marked done unless it passes.{{COMMIT_STEP}}
+   gate — your task is not marked done unless it passes.
+   {{COMMIT_STEP}}
 3. **Update progress file** — Append to {{PROGRESS_FILE}} using this format:
    ```markdown
@@ -142,17 +144,15 @@ Complete these steps IN ORDER:
    - The WHERE clause builder in src/repositories/base.ts can be extended for future filters
    ```
-4. **Output verification results:**
+4. **Output verification results** — use the actual commands the harness ran; the examples below are illustrative:
 <!-- prettier-ignore -->
 ```
 <task-verified>
-$ pnpm typecheck
-No type errors
-$ pnpm lint
-No lint errors
-$ pnpm test
-47 tests passed
+$ <check-command-1>
+<output>
+$ <check-command-2>
+<output>
 </task-verified>
 ```
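The new template replaces the hard-coded `pnpm` commands with placeholders for whatever commands actually ran. As a rough sketch of how such a block could be assembled, the hypothetical helper below formats a list of command/output pairs into the `<task-verified>` shape. The helper name and the `{ command, output }` shape are assumptions; the sample commands are borrowed from the removed lines purely for illustration.

```javascript
// Hypothetical formatter: not part of ralphctl, only illustrates the block layout.
function formatTaskVerified(checks) {
  // Each check contributes a "$ command" line followed by its output.
  const body = checks.flatMap(({ command, output }) => [
    `$ ${command}`,
    output.trimEnd(),
  ]);
  return ["<task-verified>", ...body, "</task-verified>"].join("\n");
}

const block = formatTaskVerified([
  { command: "pnpm typecheck", output: "No type errors\n" },
  { command: "pnpm test", output: "47 tests passed\n" },
]);

console.log(block);
```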

package/dist/prompts/ticket-refine.md CHANGED

@@ -223,10 +223,14 @@ The `ref` field should match either:
 - The ticket's internal ID
 - The exact ticket title
+<task-specification>
 ## Ticket to Refine
 {{TICKET}}
+</task-specification>
 {{ISSUE_CONTEXT}}
 ---

package/dist/prompts/validation-checklist.md CHANGED

@@ -1,3 +1,5 @@
+<validation-checklist>
 ## Pre-Output Validation
 Before writing the JSON output, verify EVERY item:
@@ -12,3 +14,5 @@ Before writing the JSON output, verify EVERY item:
 8. **`projectPath` assigned** — every task uses a path from the available repositories
 9. **Verification criteria** — every task has 2-4 `verificationCriteria` that are testable and unambiguous
 10. **Raw JSON output** — the output is valid JSON matching the schema exactly; the harness parses the output directly as JSON, so emit it without markdown fences, commentary, or surrounding prose
+</validation-checklist>

package/dist/{resolver-EOE5WUMV.mjs → resolver-PG2DZEBX.mjs} RENAMED

@@ -11,7 +11,7 @@ var dynamicResolvers = {
   "--project": async () => {
     const result = await wrapAsync(
       async () => {
-        const { listProjects } = await import("./project-2IE7VWDB.mjs");
+        const { listProjects } = await import("./project-JF47ZWMF.mjs");
         return listProjects();
       },
       (err) => new IOError("Failed to load projects for completion", err instanceof Error ? err : void 0)
@@ -45,7 +45,7 @@ var configValueCompletions = {
 async function getSprintCompletions() {
   const result = await wrapAsync(
     async () => {
-      const { listSprints } = await import("./sprint-OGOFEJJH.mjs");
+      const { listSprints } = await import("./sprint-54DOSIJK.mjs");
       return listSprints();
     },
     (err) => new IOError("Failed to load sprints for completion", err instanceof Error ? err : void 0)
@@ -133,7 +133,7 @@ async function resolveCompletions(program, ctx) {
 function getCommandPath(cmd) {
   const parts = [];
   let current = cmd;
-  while (current?.parent) {
+  while (current.parent) {
     parts.unshift(current.name());
     current = current.parent;
   }
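The last hunk drops the optional chaining in `getCommandPath`, so the loop now assumes `cmd` is always defined. The sketch below mirrors the visible loop on stand-in command objects to show both the normal walk and the changed failure mode. `makeCommand`, `commandPathSketch`, the returned array, and the command names are all assumptions for illustration; the rest of the real function is not shown in the diff.

```javascript
// Stand-in for a CLI command node; only name() and parent are modelled.
function makeCommand(name, parent = null) {
  return { name: () => name, parent };
}

// Mirrors the loop from the hunk. Returning the array is an assumption,
// since the remainder of the real function is outside the diff.
function commandPathSketch(cmd) {
  const parts = [];
  let current = cmd;
  // Walk up until the root, which has no parent and is therefore excluded.
  while (current.parent) {
    parts.unshift(current.name());
    current = current.parent;
  }
  return parts;
}

const root = makeCommand("ralphctl");
const sprint = makeCommand("sprint", root);
const start = makeCommand("start", sprint);

const segments = commandPathSketch(start);
console.log(segments.join(" "));

// Without optional chaining, an undefined command raises a TypeError
// instead of quietly producing an empty path.
let threwOnUndefined = false;
try {
  commandPathSketch(undefined);
} catch (err) {
  threwOnUndefined = err instanceof TypeError;
}
console.log(threwOnUndefined);
```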

package/dist/{sprint-OGOFEJJH.mjs → sprint-54DOSIJK.mjs} RENAMED

@@ -11,10 +11,10 @@ import {
   logSprintBaselines,
   resolveSprintId,
   saveSprint
-} from "./chunk-YCDUVPRT.mjs";
-import "./chunk-FKMKOWLA.mjs";
+} from "./chunk-RQGD5WS6.mjs";
+import "./chunk-WZTY77GY.mjs";
 import "./chunk-IWXBJD2D.mjs";
-import "./chunk-CTP2A436.mjs";
+import "./chunk-D2HWXEHH.mjs";
 import {
   NoCurrentSprintError,
   SprintNotFoundError,

package/dist/{start-IUDCXIEA.mjs → start-2SZTBKGF.mjs} RENAMED

@@ -2,14 +2,16 @@
 import {
   parseSprintStartArgs,
   sprintStartCommand
-} from "./chunk-3HJNVQ7N.mjs";
-import "./chunk-JOQO4HMM.mjs";
+} from "./chunk-LCY32RW4.mjs";
+import "./chunk-2FT37OZX.mjs";
+import "./chunk-EGUFQNRB.mjs";
 import "./chunk-CFUVE2BP.mjs";
 import "./chunk-747KW2RW.mjs";
-import "./chunk-YCDUVPRT.mjs";
-import "./chunk-FKMKOWLA.mjs";
+import "./chunk-LDSG7G2T.mjs";
+import "./chunk-RQGD5WS6.mjs";
+import "./chunk-WZTY77GY.mjs";
 import "./chunk-IWXBJD2D.mjs";
-import "./chunk-CTP2A436.mjs";
+import "./chunk-D2HWXEHH.mjs";
 import "./chunk-57UWLHRH.mjs";
 export {
   parseSprintStartArgs,

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "ralphctl",
-  "version": "0.4.1",
+  "version": "0.4.3",
   "description": "Agent harness for long-running AI coding tasks — orchestrates Claude Code & GitHub Copilot across repositories",
   "homepage": "https://github.com/lukas-grigis/ralphctl",
   "type": "module",
@@ -53,8 +53,8 @@
     "@types/node": "^25.6.0",
     "@types/react": "^19.2.14",
     "@types/tabtab": "^3.0.4",
-    "@vitest/coverage-v8": "^4.1.4",
-    "eslint": "^10.2.0",
+    "@vitest/coverage-v8": "^4.1.5",
+    "eslint": "^10.2.1",
     "eslint-config-prettier": "^10.1.8",
     "globals": "^17.5.0",
     "husky": "^9.1.7",
@@ -64,11 +64,11 @@
     "tsup": "^8.5.1",
     "tsx": "^4.21.0",
     "typescript": "^5.9.3",
-    "typescript-eslint": "^8.58.2",
-    "vitest": "^4.1.4"
+    "typescript-eslint": "^8.59.0",
+    "vitest": "^4.1.5"
   },
   "lint-staged": {
-    "*.ts": [
+    "*.{ts,tsx}": [
       "eslint --cache --fix",
       "prettier --write"
     ],