npm - ralphctl - Versions diffs - 0.4.1 → 0.4.3 - Mend

ralphctl 0.4.1 → 0.4.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (36) hide show

package/README.md +13 -11
package/dist/{add-CIM72NE3.mjs → add-MG26JWBP.mjs} +6 -6
package/dist/{add-GX7P7XTT.mjs → add-ZZYL4BSF.mjs} +5 -4
package/dist/chunk-2FT37OZX.mjs +1071 -0
package/dist/{chunk-CTP2A436.mjs → chunk-D2HWXEHH.mjs} +9 -2
package/dist/{chunk-JOQO4HMM.mjs → chunk-EGUFQNRB.mjs} +10 -10
package/dist/{chunk-3HJNVQ7N.mjs → chunk-LCY32RW4.mjs} +621 -976
package/dist/{chunk-NUYQK5MN.mjs → chunk-LDSG7G2T.mjs} +1 -1
package/dist/{chunk-7JLZQICD.mjs → chunk-MDE6KPJQ.mjs} +6 -6
package/dist/{chunk-3QBEBKMZ.mjs → chunk-Q4AVHUZL.mjs} +7 -7
package/dist/{chunk-YCDUVPRT.mjs → chunk-RQGD5WS6.mjs} +4 -72
package/dist/{chunk-D2YGPLIV.mjs → chunk-TDBEEHTS.mjs} +213 -8
package/dist/{chunk-SM4GGZSU.mjs → chunk-WOMGKKZY.mjs} +152 -179
package/dist/{chunk-FKMKOWLA.mjs → chunk-WZTY77GY.mjs} +75 -1
package/dist/cli.mjs +68 -19
package/dist/{create-7WFSCMP4.mjs → create-PQK6KKRD.mjs} +5 -5
package/dist/{handle-BBAZJ44Y.mjs → handle-SYVCFI6Y.mjs} +1 -1
package/dist/{mount-2N6H5CWA.mjs → mount-2ANLHHQE.mjs} +556 -318
package/dist/{project-2IE7VWDB.mjs → project-JF47ZWMF.mjs} +2 -2
package/dist/prompts/check-script-discover.md +69 -0
package/dist/prompts/ideate-auto.md +26 -1
package/dist/prompts/ideate.md +5 -1
package/dist/prompts/plan-auto.md +30 -2
package/dist/prompts/plan-common-examples.md +82 -0
package/dist/prompts/plan-common.md +26 -78
package/dist/prompts/plan-interactive.md +6 -2
package/dist/prompts/repo-onboard.md +111 -0
package/dist/prompts/sprint-feedback.md +6 -2
package/dist/prompts/task-evaluation.md +25 -10
package/dist/prompts/task-execution.md +13 -13
package/dist/prompts/ticket-refine.md +4 -0
package/dist/prompts/validation-checklist.md +4 -0
package/dist/{resolver-EOE5WUMV.mjs → resolver-PG2DZEBX.mjs} +3 -3
package/dist/{sprint-OGOFEJJH.mjs → sprint-54DOSIJK.mjs} +3 -3
package/dist/{start-IUDCXIEA.mjs → start-2SZTBKGF.mjs} +7 -5
package/package.json +6 -6

package/dist/{project-2IE7VWDB.mjs → project-JF47ZWMF.mjs} RENAMED Viewed

@@ -11,8 +11,8 @@ import {
   removeProjectRepo,
   resolveRepoPath,
   updateProject
-} from "./chunk-NUYQK5MN.mjs";
-import "./chunk-CTP2A436.mjs";
+} from "./chunk-LDSG7G2T.mjs";
+import "./chunk-D2HWXEHH.mjs";
 import {
   ProjectExistsError,
   ProjectNotFoundError

package/dist/prompts/check-script-discover.md ADDED Viewed

@@ -0,0 +1,69 @@
+# Check-Script Discovery Protocol
+You are a build-system analyst. Inspect the repository at the path below and propose a single shell command that the
+harness can run after every AI task to verify the working tree still passes the project's own quality gates (typecheck
+/ lint / tests / build — whatever the project considers "green"). Static ecosystem detection has already returned
+nothing useful, which usually means the project is polyglot, custom, or uses an uncommon build tool.
+<harness-context>
+This invocation is read-only — do not modify the working tree, do not create files, do not run network calls, do not
+execute the candidate command. The harness owns execution; your only job is to read configuration files and produce a
+recommendation. The user will see your suggestion as an editable default and can accept, modify, or discard it.
+</harness-context>
+<context>
+**Repository path:** `{{REPO_PATH}}`
+</context>
+<constraints>
+- Inspect only the files explicitly listed below — do not crawl the entire tree, do not open source files, do not read
+  vendored or generated directories
+- Prefer commands that exit non-zero on failure and zero on success — that is the contract the harness relies on to
+  decide whether a task passes the post-task gate
+- Combine multiple gates with `&&` so the first failure aborts the chain — example shape: `<install> && <typecheck> &&
+<lint> && <test>` (substitute the project's actual tools)
+- If you find a single canonical entry point — a `Makefile` target like `make check`, a `mise` task, or a top-level
+  script in `scripts/` — prefer that over reconstructing the chain by hand
+- Never embed credentials, environment-specific paths, or commands that touch remote services
+- Output exactly one `<check-script>` block, on its own line, containing the bare command (no markdown fences, no
+  surrounding prose)
+- If the repo contains nothing actionable, emit `<check-script></check-script>` with empty content — the harness will
+  treat that as "no check script" and fall through to manual entry
+</constraints>
+<examples>
+- Polyglot Node + Python:
+  `<check-script>pnpm install && pnpm typecheck && pnpm test && uv run pytest</check-script>`
+- Makefile-driven:
+  `<check-script>make check</check-script>`
+- mise tasks:
+  `<check-script>mise run ci</check-script>`
+- Bare scripts directory:
+  `<check-script>./scripts/verify.sh</check-script>`
+</examples>
+## Files to Inspect
+Read whichever of these exist; ignore the rest:
+- `package.json` — `scripts` block (look for `test`, `typecheck`, `lint`, `check`, `ci`, `verify`)
+- `pyproject.toml` — `[tool.poetry.scripts]`, `[tool.uv]`, `[tool.hatch]`, `[project.scripts]`
+- `Makefile` — top-level targets (`check`, `test`, `ci`, `verify`, `all`)
+- `mise.toml` / `.mise.toml` — `[tasks]` block
+- `.tool-versions` — runtime hints only; combine with the above
+- `.github/workflows/*.yml` — CI definitions are the most authoritative source of "what passes"
+- `README.md` — explicit "running tests" / "development" sections, if present
+- `flake.nix` — `apps`, `checks`, `devShells.default.shellHook`
+- `WORKSPACE` / `BUILD` — Bazel target conventions (`bazel test //...`)
+- `scripts/` — top-level entries only (do not recurse); look for `check`, `verify`, `ci`, `test`
+## Output Contract
+After your inspection, emit a single `<check-script>…</check-script>` element on its own line. Nothing else — no
+preamble, no explanation, no markdown. The harness parses the first match with a strict regex.

package/dist/prompts/ideate-auto.md CHANGED Viewed

@@ -11,6 +11,27 @@ When finished, emit a signal from the `<signals>` block below.
 ## Two-Phase Protocol
+### Phase 0: Think Before Writing
+Before emitting any JSON, write your reasoning in a `<thinking>…</thinking>` block. Use it to interrogate the idea —
+surface hidden assumptions, identify the real user problem, sketch requirements, and reason about which repositories
+and dependencies the work touches. Explicit reasoning produces sharper output than jumping straight to JSON.
+The harness's JSON extractor skips everything before the first `{`, so the `<thinking>` block is stripped
+automatically — but the JSON object itself must still be emitted without markdown fences or commentary after it.
+```
+<thinking>
+The idea says "webhook notifications" but doesn't say which events. Reviewing the API, the natural candidates are
+task-status transitions. Scope = status-change webhooks only; other event types are out of scope.
+Acceptance: POST to configured URL with JSON payload on task status change; retries on 5xx.
+…
+</thinking>
+{
+  … JSON object …
+}
+```
 ### Phase 1: Refine Requirements (WHAT)
 Analyze the idea and produce complete, implementation-agnostic requirements:
@@ -87,6 +108,8 @@ If you cannot produce a valid plan, signal the issue instead of outputting incom
 - `<planning-blocked>reason</planning-blocked>`
+<context>
 ## Idea to Implement
 **Title:** {{IDEA_TITLE}}
@@ -107,6 +130,8 @@ You have access to these repositories:
 {{COMMON}}
+</context>
 {{VALIDATION}}
 ## Output Format
@@ -148,7 +173,7 @@ If you cannot produce a valid plan, output `<planning-blocked>reason</planning-b
         "Update src/repositories/export.ts findExports() to add WHERE clause for date filtering",
         "Add unit tests in src/schemas/__tests__/date-range.test.ts covering valid ranges, invalid formats, and reversed dates",
         "Add integration test in src/controllers/__tests__/export.test.ts for filtered and unfiltered queries",
-        "Run pnpm typecheck && pnpm lint && pnpm test — all pass"
+        "{{CHECK_GATE_EXAMPLE}}"
       ],
       "verificationCriteria": [
         "TypeScript compiles with no errors",

package/dist/prompts/ideate.md CHANGED Viewed

@@ -118,6 +118,8 @@ Focus: Determine HOW to implement the approved requirements
 {{VALIDATION}}
+<context>
 ## Idea to Refine and Plan
 **Title:** {{IDEA_TITLE}}
@@ -141,6 +143,8 @@ mention it as an observation.
 {{COMMON}}
+</context>
 ## Output Format
 When BOTH phases are approved by the user, write the JSON to: {{OUTPUT_FILE}}
@@ -169,7 +173,7 @@ Use this exact JSON Schema:
         "Update ExportController.getExport() in src/controllers/export.ts to parse and validate date range params",
         "Add date range filtering to ExportRepository.findRecords() in src/repositories/export.ts",
         "Write tests in src/controllers/__tests__/export.test.ts for: no dates, valid range, invalid range, start > end",
-        "Run pnpm typecheck && pnpm lint && pnpm test — all pass"
+        "{{CHECK_GATE_EXAMPLE}}"
       ],
       "verificationCriteria": [
         "TypeScript compiles with no errors",

package/dist/prompts/plan-auto.md CHANGED Viewed

@@ -12,6 +12,27 @@ When finished, emit a signal from the `<signals>` block below.
 ## Protocol
+### Step 0: Think Before Writing
+Before emitting any JSON, write your reasoning in a `<thinking>…</thinking>` block. Use it to work through the problem
+— map tickets to repositories, reason about dependencies, identify risks, and decide on task boundaries. Explicit
+reasoning produces sharper plans than jumping straight to output.
+The harness's JSON extractor skips everything before the first `[`, so the `<thinking>` block is stripped
+automatically — but the JSON array itself must still be emitted without markdown fences or commentary after it.
+```
+<thinking>
+Ticket 1 touches both the API and the worker repo — split into two tasks with a blockedBy edge.
+The shared schema change must land first so the worker can import it.
+Verification criterion for the API task: a contract test against the new schema.
+…
+</thinking>
+[
+  { … JSON array … }
+]
+```
 ### Step 1: Explore the Project
 Scope exploration to what will change the plan — read instruction files first, then only the specific files you need
@@ -55,10 +76,14 @@ The sprint contains:
 - **Existing Tasks**: Tasks from a previous planning run (your output replaces all existing tasks)
 - **Projects**: Each ticket belongs to a project which may have multiple repository paths
+<context>
 {{CONTEXT}}
 {{COMMON}}
+</context>
 ### Step 5: Handle Blockers
 If you cannot produce a valid task breakdown, signal the issue instead of outputting incomplete JSON:
@@ -73,6 +98,9 @@ If you cannot produce a valid task breakdown, signal the issue instead of output
 ## Output
+Your output MAY begin with a `<thinking>…</thinking>` block — the harness's JSON extractor skips everything before the
+first `[`. The JSON array itself must still be emitted without markdown fences or surrounding prose.
 Output only the JSON document matching the schema below — the harness parses your raw output directly as JSON, so emit
 it without markdown fences, commentary, or surrounding prose. If you cannot produce tasks, output a
 `<planning-blocked>` signal instead.
@@ -102,7 +130,7 @@ JSON Schema:
     "steps": [
       "Create src/utils/validation.ts with validateEmail(), validatePhone(), validateDateRange()",
       "Add corresponding unit tests in src/utils/__tests__/validation.test.ts covering valid inputs, invalid inputs, and edge cases (empty strings, unicode)",
-      "Run pnpm typecheck && pnpm lint && pnpm test — all pass"
+      "{{CHECK_GATE_EXAMPLE}}"
     ],
     "verificationCriteria": [
       "TypeScript compiles with no errors",
@@ -123,7 +151,7 @@ JSON Schema:
       "Wire up validation from src/utils/validation.ts with inline error messages",
       "Add form submission handler that calls POST /api/users",
       "Write component tests in src/components/__tests__/RegistrationForm.test.ts for valid submission, validation errors, and API failure",
-      "Run pnpm typecheck && pnpm lint && pnpm test — all pass"
+      "{{CHECK_GATE_EXAMPLE}}"
     ],
     "verificationCriteria": [
       "TypeScript compiles with no errors",

package/dist/prompts/plan-common-examples.md ADDED Viewed

@@ -0,0 +1,82 @@
+<examples>
+The illustrations below are non-normative — they show good/bad shapes for the rules stated in `plan-common.md`. Use
+them as calibration, not templates to copy literally.
+## Verification Criteria — good vs bad
+> **Good criteria (verifiable, unambiguous):**
+>
+> - "TypeScript compiles with no errors"
+> - "All existing tests pass plus new tests for the added feature"
+> - "GET /api/users returns 200 with paginated user list"
+> - "GET /api/users?page=-1 returns 400 with validation error"
+> - "Component renders without console errors in browser"
+> - "Playwright e2e: login flow completes without errors" _(UI tasks with Playwright configured)_
+> **Bad criteria (vague, not independently verifiable):**
+>
+> - "Code is clean and well-structured"
+> - "Error handling is appropriate"
+> - "Performance is acceptable"
+## Dependency Graph — good vs bad
+### Good Dependency Graph
+```
+Task 1: Add shared validation utilities       (no deps)
+Task 2: Implement user registration form       (blockedBy: [1])
+Task 3: Implement user profile editor          (blockedBy: [1])
+Task 4: Add form submission analytics          (blockedBy: [2, 3])
+```
+Tasks 2 and 3 run in parallel (both depend only on 1). Task 4 waits for both.
+### Bad Dependency Graph
+```
+Task 1: Add validation utilities               (no deps)
+Task 2: Implement registration form            (blockedBy: [1])
+Task 3: Implement profile editor               (blockedBy: [2])  <-- WRONG
+Task 4: Add submission analytics               (blockedBy: [3])  <-- WRONG
+```
+Task 3 does not actually need Task 2 — it only needs Task 1. This creates a false serial chain that prevents parallel
+execution.
+## Precise Steps — good vs bad
+Bad — vague steps that force the agent to guess:
+```json
+{
+  "name": "Add user authentication",
+  "steps": ["Implement auth", "Add tests", "Update docs"]
+}
+```
+Good — precise steps with file paths and pattern references:
+```json
+{
+  "name": "Add user authentication",
+  "projectPath": "/Users/dev/my-app",
+  "steps": [
+    "Create auth service in src/services/auth.ts with login(), logout(), getCurrentUser() — follow the pattern in src/services/user.ts for error handling and return types",
+    "Add AuthContext provider in src/contexts/AuthContext.tsx wrapping the app — follow existing ThemeContext pattern",
+    "Create useAuth hook in src/hooks/useAuth.ts exposing auth state and actions",
+    "Add ProtectedRoute wrapper component in src/components/ProtectedRoute.tsx",
+    "Write unit tests in src/services/__tests__/auth.test.ts — follow test patterns in src/services/__tests__/user.test.ts",
+    "{{CHECK_GATE_EXAMPLE}}"
+  ],
+  "verificationCriteria": [
+    "TypeScript compiles with no errors",
+    "All existing tests pass plus new auth tests",
+    "ProtectedRoute redirects unauthenticated users to /login",
+    "useAuth hook exposes isAuthenticated, user, login, and logout"
+  ]
+}
+```
+</examples>

package/dist/prompts/plan-common.md CHANGED Viewed

@@ -1,17 +1,22 @@
 ## Project Resources
-Each repository may ship with project-specific instruction files at its root and a `.claude/` configuration directory.
-Read them during exploration and reference them throughout planning:
+During exploration, check for project instruction files if present. Treat whichever files exist as authoritative for
+that codebase; skip silently when absent.
+**Instruction files (any ecosystem):**
+- **`CLAUDE.md` / `AGENTS.md`** — when present: project-level rules, conventions, and persistent memory
+- **`.github/copilot-instructions.md`** — when present: GitHub Copilot-specific repository instructions
+- **`README.md`** and manifest files (`package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, `pom.xml`, …) — setup,
+  scripts, and dependencies
+**Claude-specific configuration (only when the repo has a `.claude/` directory):**
-- **`CLAUDE.md` / `AGENTS.md`** — project-level rules, conventions, and persistent memory
-- **`.github/copilot-instructions.md`** — GitHub Copilot-specific repository instructions, when present
 - **`.mcp.json`** — MCP servers the project ships with (Playwright, database inspection, etc.)
 - **`.claude/agents/`** — subagent definitions for Task-tool delegation
 - **`.claude/skills/`** — custom skills invokable with the Skill tool for project-specific workflows
 - **`.claude/settings.json`** / **`.claude/settings.local.json`** — tool permissions, model preferences, hooks
-When repository instruction files exist, treat their instructions as authoritative for that codebase.
 ## What Makes a Great Task
 A great task can be picked up cold by an AI agent, implemented independently, and verified as done — by a _different_ AI
@@ -63,6 +68,8 @@ Right size (one task covering the full change):
 ### Verification Criteria (The Evaluator Contract)
+_See the `<examples>` block at the end of this page for good/bad pairs._
 Every task must include a `verificationCriteria` array — these are the **done contract** between the generator (task
 executor) and the evaluator (independent reviewer). The evaluator grades each criterion as pass/fail across four
 floor dimensions: correctness, completeness, safety, and consistency. If ANY dimension fails, the task fails
@@ -86,21 +93,6 @@ Write criteria that are:
 - **Unambiguous** — two reviewers would agree on pass/fail
 - **Outcome-oriented** — describe WHAT is true when done, not HOW to get there
-> **Good criteria (verifiable, unambiguous):**
->
-> - "TypeScript compiles with no errors"
-> - "All existing tests pass plus new tests for the added feature"
-> - "GET /api/users returns 200 with paginated user list"
-> - "GET /api/users?page=-1 returns 400 with validation error"
-> - "Component renders without console errors in browser"
-> - "Playwright e2e: login flow completes without errors" _(UI tasks with Playwright configured)_
-> **Bad criteria (vague, not independently verifiable):**
->
-> - "Code is clean and well-structured"
-> - "Error handling is appropriate"
-> - "Performance is acceptable"
 Aim for 2-4 criteria per task. Include at least one criterion that is computationally checkable (test pass, type check,
 lint clean). For **UI/frontend tasks**, if the project has Playwright configured, add a browser-verifiable criterion —
 the evaluator will attempt visual verification using Playwright or browser tools when the project supports it.
@@ -108,7 +100,8 @@ the evaluator will attempt visual verification using Playwright or browser tools
 ### Guidelines
 1. **Outcome-oriented** — Each task delivers a testable result
-2. **Merge create+use** — Never separate "create X" from "use X" — that is one task
+2. **Merge create+use** — Keep "create X" and "use X" in one task — except when a stable contract makes them
+   independently testable (e.g. schema + migration lands first, consumer wiring lands after)
 3. **Let scope drive task count** — do not aim for a specific number. Fewer, larger coherent tasks beat many
    micro-tasks; split only when parallelism or a clean boundary justifies it
 4. **Merge serial chains** — If tasks only make sense when run in sequence, fold them into one task
@@ -134,6 +127,8 @@ the evaluator will attempt visual verification using Playwright or browser tools
 ## Dependency Graph
+_See the `<examples>` block at the end of this page for good/bad pairs._
 Tasks execute in dependency order — foundations before dependents.
 ### Guidelines
@@ -143,29 +138,6 @@ Tasks execute in dependency order — foundations before dependents.
 3. **Maximize parallelism** — Only add `blockedBy` when there is a real code dependency
 4. **Validate the DAG** — No cycles; earlier tasks cannot depend on later ones
-### Good Dependency Graph
-```
-Task 1: Add shared validation utilities       (no deps)
-Task 2: Implement user registration form       (blockedBy: [1])
-Task 3: Implement user profile editor          (blockedBy: [1])
-Task 4: Add form submission analytics          (blockedBy: [2, 3])
-```
-Tasks 2 and 3 run in parallel (both depend only on 1). Task 4 waits for both.
-### Bad Dependency Graph
-```
-Task 1: Add validation utilities               (no deps)
-Task 2: Implement registration form            (blockedBy: [1])
-Task 3: Implement profile editor               (blockedBy: [2])  <-- WRONG
-Task 4: Add submission analytics               (blockedBy: [3])  <-- WRONG
-```
-Task 3 does not actually need Task 2 — it only needs Task 1. This creates a false serial chain that prevents parallel
-execution.
 **Dependency test**: For each `blockedBy` entry, ask: "Does this task literally use code produced by the blocker?" If
 not, remove the dependency.
@@ -177,10 +149,14 @@ Each task must specify which repository it executes in via `projectPath`:
 2. **Split by repo** — If a ticket affects multiple repos, create separate tasks per repo with dependencies
 3. **Use exact paths** — `projectPath` must be one of the absolute paths from the project's Repositories section
-Never create a task that modifies files in multiple repos — split it.
+Split cross-repo work into one task per repo with `blockedBy` — except when atomicity is genuinely required (a
+single commit must land in both repos to avoid broken state), in which case flag the task and surface the need for
+human coordination.
 ## Precise Step Declarations
+_See the `<examples>` block at the end of this page for good/bad pairs._
 Every task must include explicit, actionable steps — the implementation checklist.
 ### Step Requirements
@@ -194,38 +170,6 @@ Every task must include explicit, actionable steps — the implementation checkl
    instruction files
 5. **No ambiguity** — Another developer should be able to follow steps without guessing
-Bad — vague steps that force the agent to guess:
-```json
-{
-  "name": "Add user authentication",
-  "steps": ["Implement auth", "Add tests", "Update docs"]
-}
-```
-Good — precise steps with file paths and pattern references:
-```json
-{
-  "name": "Add user authentication",
-  "projectPath": "/Users/dev/my-app",
-  "steps": [
-    "Create auth service in src/services/auth.ts with login(), logout(), getCurrentUser() — follow the pattern in src/services/user.ts for error handling and return types",
-    "Add AuthContext provider in src/contexts/AuthContext.tsx wrapping the app — follow existing ThemeContext pattern",
-    "Create useAuth hook in src/hooks/useAuth.ts exposing auth state and actions",
-    "Add ProtectedRoute wrapper component in src/components/ProtectedRoute.tsx",
-    "Write unit tests in src/services/__tests__/auth.test.ts — follow test patterns in src/services/__tests__/user.test.ts",
-    "Run pnpm typecheck && pnpm lint && pnpm test — all pass"
-  ],
-  "verificationCriteria": [
-    "TypeScript compiles with no errors",
-    "All existing tests pass plus new auth tests",
-    "ProtectedRoute redirects unauthenticated users to /login",
-    "useAuth hook exposes isAuthenticated, user, login, and logout"
-  ]
-}
-```
 Use actual file paths discovered during exploration. Reference the repository instruction files for verification
 commands.
@@ -234,6 +178,10 @@ commands.
 Start with an action verb (Add, Create, Update, Fix, Refactor, Remove, Migrate). Include the feature/concept, not files.
 Keep under 60 characters. Avoid vague verbs (Improve, Enhance, Handle).
+See `<examples>` below for concrete good/bad pairs.
+{{PLAN_COMMON_EXAMPLES}}
 ## Delegation to Available Tooling
 The "Project Tooling" section below (when present) lists subagents, skills, and MCP servers detected in the target

package/dist/prompts/plan-interactive.md CHANGED Viewed

@@ -72,7 +72,7 @@ before the plan is finalized.
    **Steps:**
    1. Create src/utils/csvExport.ts with column formatters for date, number, and string types
    2. Add unit tests in src/utils/__tests__/csvExport.test.ts covering empty data, special characters, and large datasets
-   3. Run `pnpm typecheck && pnpm lint && pnpm test` — all pass
+   3. Run the project's check/test/build gate — all pass
    ```
 2. **Show the dependency graph** — Make it obvious which tasks run in parallel vs sequentially, and why each dependency
@@ -123,10 +123,14 @@ The sprint contains:
 - **Existing Tasks**: Tasks from a previous planning run (your output replaces all existing tasks)
 - **Projects**: Each ticket belongs to a project which may have multiple repository paths
+<context>
 {{CONTEXT}}
 {{COMMON}}
+</context>
 ### Repository Assignment
 Repositories have been pre-selected by the user. Only create tasks targeting these repositories — the harness executes
@@ -166,7 +170,7 @@ Use this exact JSON Schema:
     "Update ExportController.getExport() in src/controllers/export.ts to parse and validate date range params",
     "Add date range filtering to ExportRepository.findRecords() in src/repositories/export.ts",
     "Write tests in src/controllers/__tests__/export.test.ts for: no dates, valid range, invalid range, start > end",
-    "Run pnpm typecheck && pnpm lint && pnpm test — all pass"
+    "{{CHECK_GATE_EXAMPLE}}"
   ],
   "verificationCriteria": [
     "TypeScript compiles with no errors",

package/dist/prompts/repo-onboard.md ADDED Viewed

@@ -0,0 +1,111 @@
+# Repository Onboarding Protocol
+You are a senior engineer preparing a repository for agentic work. Your job is to produce a minimal, high-signal
+project context file, written to `{{FILE_NAME}}` at the repo root, that captures the _non-inferable_ facts an
+autonomous coding agent needs — custom tooling, non-standard commands, security constraints, and performance
+boundaries — and to suggest a single shell check command the harness can run as a post-task gate. Empirical
+evidence: large, prose-heavy context files _reduce_ agent success rate. Keep it small and surgical.
+<harness-context>
+This invocation is read-only — do not modify the working tree, do not create files, do not run network calls, do not
+execute the candidate command. The harness owns execution. The user reviews your proposal before anything is written.
+</harness-context>
+<context>
+**Repository path:** `{{REPO_PATH}}`
+**Target file:** `{{FILE_NAME}}` — the harness will write the body you emit to this path.
+**Mode:** `{{MODE}}` — one of `bootstrap` (no prior project context file), `adopt` (authored project context file
+exists, do not clobber), `update` (prior harness-managed project context file exists; propose a prune + augment).
+**Project type hint:** `{{PROJECT_TYPE}}`
+**Static check-script suggestion (may be empty):** `{{CHECK_SCRIPT_SUGGESTION}}`
+{{EXISTING_AGENTS_MD}}
+</context>
+<constraints>
+- Inspect only configuration and metadata files — `package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, `Makefile`,
+  `mise.toml`, `.tool-versions`, `.github/workflows/*.yml`, `README.md`, top-level `scripts/` entries, `flake.nix`.
+  Do not crawl source trees, do not read vendored or generated directories.
+- The proposed project context file MUST have exactly these H2 sections, in this order — omit none:
+  1. `## Project Overview` — one-paragraph description of what the repo is and who uses it.
+  2. `## Build & Run` — exact commands to install dependencies and run the project locally.
+  3. `## Testing` — exact commands to run unit / integration / end-to-end tests.
+  4. `## Architecture` — three to six bullets naming the top-level modules or layers, with a one-line role each.
+  5. `## Implementation Style` — conventions that can't be inferred from a file listing (naming, error handling,
+     logging, imports).
+  6. `## Security & Safety` — secrets / auth / network boundaries the agent must respect.
+  7. `## Performance Constraints` — hot paths, latency budgets, or memory limits the agent must honour.
+- Security & Safety and Performance Constraints are mandatory — when the repo offers no clues, prefix the body with
+  `LOW-CONFIDENCE:` and state what _is_ known (e.g. "LOW-CONFIDENCE: no explicit budgets; default to O(n) on request
+  hot paths"). Never drop these sections.
+- Implementation Style entries must reflect conventions demonstrably present in at least two files of the repository —
+  when you cannot cite at least two occurrences (mentally, not in the output), prefix the bullet with
+  `LOW-CONFIDENCE:`. Do not invent conventions.
+- Do not embed tool-specific slash commands, hooks, subagent definitions, MCP server configurations, or IDE settings
+  in this file. Those belong in tool-specific directories (e.g. `.claude/`, `.cursor/`). This file is facts about the
+  repository only.
+- Hard caps: exactly one H1, at most 7 H2 sections, no H4 or deeper headings, under 300 lines total. Prefer bullets
+  and short sentences — target a Flesch reading ease above 40.
+- Use the em-dash `—` (not `-`) for explanatory clauses in prose. Ordinary hyphens in identifiers and compound words
+  are fine.
+- Never embed credentials, user-specific paths, or commands that touch remote services.
+- Do not hardcode package-manager commands outside the tooling context — every command you cite must actually resolve
+  in this repository (e.g. only write `pnpm lint` when `package.json` has a `lint` script).
+- In `adopt` mode: treat the existing body as authoritative. Emit only the _additions_ you propose as new sections;
+  never rewrite or reorder the user's prose.
+- In `update` mode: emit the full replacement body AND a short `<changes>` block listing the non-obvious
+  prunes/augments (`- removed stale command "npm run foo"`, `- added missing Security section`).
+</constraints>
+<examples>
+- Minimal Node.js API:
+  ```
+  # Acme API
+  ## Project Overview
+  Internal REST service for order ingestion — consumed by the dashboard and the worker fleet.
+  ## Build & Run
+  - `pnpm install` then `pnpm dev` for local hot-reload on port 3000.
+  ## Testing
+  - `pnpm test` — unit + integration (Vitest).
+  ## Architecture
+  - `src/routes/` — HTTP surface, thin controllers.
+  - `src/services/` — business logic, pure where possible.
+  - `src/db/` — Drizzle schema and query builders.
+  ## Implementation Style
+  - Result<T, Err> at service boundaries, never throw for expected failures.
+  - Zod-validated request bodies, no untyped inputs.
+  ## Security & Safety
+  - All inbound requests are authenticated by upstream gateway; never trust the `X-User-Id` header directly.
+  - Do not log PII — scrub emails and phone numbers from error payloads.
+  ## Performance Constraints
+  - LOW-CONFIDENCE: no explicit budgets documented; default to p95 under 100 ms for read endpoints.
+  ```
+</examples>
+## Output Contract
+After your inspection, emit exactly two elements on their own lines — nothing else (no preamble, no summary):
+1. `<agents-md>…full project context file body…</agents-md>` — the proposed file, obeying every constraint above.
+2. `<check-script>…single shell command…</check-script>` — one command the harness can run as a post-task gate.
+   Empty content (`<check-script></check-script>`) is allowed when no gate can be inferred.
+In `update` mode, also emit a third element describing the delta:
+3. `<changes>…bullet list…</changes>` — one bullet per non-obvious prune or addition.
+No markdown fences around the elements. No commentary between them.

package/dist/prompts/sprint-feedback.md CHANGED Viewed

@@ -19,16 +19,20 @@ something entirely new (create a file, add a feature, tweak a script), do exactl
 ## User Feedback — Implement this
+<task-specification>
 {{FEEDBACK}}
+</task-specification>
 ## Protocol
 1. **Parse the feedback as an instruction** — Identify the concrete change(s) requested. If it says "create X", create
    X. If it says "change Y", change Y. Do not ask for clarification unless the instruction is genuinely contradictory.
 2. **Implement the change** — Create or edit the files required to satisfy the feedback. Make the smallest change that
    fully carries out the instruction.
-3. **Run verification** — If the project has a check script (e.g., `pnpm test`, `pnpm typecheck`), run it and confirm
-   it passes. If no check script is configured, skip this step.
+3. **Run verification** — If the project has a check script (test, typecheck, lint, or build command), run it and
+   confirm it passes. If no check script is configured, skip this step.
 4. **Output verification results** — Wrap any verification output in `<task-verified>...</task-verified>`. If you
    skipped step 3, emit `<task-verified>no check script configured; change applied</task-verified>`.
 5. **Commit your work** — Stage the modified files and create a git commit with a descriptive message summarising the