ralphctl 0.6.2 → 0.7.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +250 -138
- package/dist/cli.mjs +20370 -21106
- package/dist/manifest.json +17 -19
- package/dist/prompts/_partials/signals-evaluation.md +14 -0
- package/dist/prompts/_partials/signals-task.md +26 -0
- package/dist/prompts/_partials/validation-checklist.md +24 -0
- package/dist/prompts/apply-feedback/template.md +118 -0
- package/dist/prompts/detect-scripts/template.md +118 -0
- package/dist/prompts/detect-skills/template.md +136 -0
- package/dist/prompts/evaluate/template.md +236 -0
- package/dist/prompts/ideate/template.md +172 -0
- package/dist/prompts/implement/template.md +203 -0
- package/dist/prompts/plan/template.md +347 -0
- package/dist/prompts/readiness/template.md +132 -0
- package/dist/prompts/refine/template.md +254 -0
- package/dist/skills/{default/abstraction-first → ralphctl-abstraction-first}/SKILL.md +1 -1
- package/dist/skills/{default/alignment → ralphctl-alignment}/SKILL.md +1 -1
- package/dist/skills/{default/iterative-review → ralphctl-iterative-review}/SKILL.md +1 -1
- package/package.json +25 -28
- package/dist/absolute-path-WUTZQ37D.mjs +0 -8
- package/dist/chunk-6RDMCLWU.mjs +0 -108
- package/dist/chunk-HIU74KTO.mjs +0 -1046
- package/dist/chunk-S3PTDH57.mjs +0 -78
- package/dist/chunk-WV4D2CPG.mjs +0 -26
- package/dist/prompt-adapter-JQICGVX7.mjs +0 -7
- package/dist/prompts/ideate.md +0 -204
- package/dist/prompts/plan-auto.md +0 -182
- package/dist/prompts/plan-common-examples.md +0 -82
- package/dist/prompts/plan-common.md +0 -200
- package/dist/prompts/plan-interactive.md +0 -212
- package/dist/prompts/repo-onboard.md +0 -201
- package/dist/prompts/signals-evaluation.md +0 -6
- package/dist/prompts/signals-planning.md +0 -5
- package/dist/prompts/signals-task.md +0 -10
- package/dist/prompts/sprint-feedback.md +0 -64
- package/dist/prompts/task-evaluation.md +0 -276
- package/dist/prompts/task-execution.md +0 -233
- package/dist/prompts/ticket-refine.md +0 -242
- package/dist/prompts/validation-checklist.md +0 -19
- package/dist/skills/exec/.gitkeep +0 -0
- package/dist/skills/plan/.gitkeep +0 -0
- package/dist/skills/refine/.gitkeep +0 -0
- package/dist/storage-paths-IPNZZM5D.mjs +0 -15
- package/dist/validation-error-QT6Q7FYU.mjs +0 -7
- /package/dist/prompts/{harness-context.md → _partials/harness-context.md} +0 -0
|
@@ -1,200 +0,0 @@
|
|
|
1
|
-
## Project Resources
|
|
2
|
-
|
|
3
|
-
During exploration, check for project instruction files if present. Treat whichever files exist as authoritative for
|
|
4
|
-
that codebase; skip silently when absent.
|
|
5
|
-
|
|
6
|
-
**Instruction files (any ecosystem):**
|
|
7
|
-
|
|
8
|
-
- **`CLAUDE.md` / `AGENTS.md`** — when present: project-level rules, conventions, and persistent memory
|
|
9
|
-
- **`.github/copilot-instructions.md`** — when present: GitHub Copilot-specific repository instructions
|
|
10
|
-
- **`README.md`** and manifest files (`package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, `pom.xml`, …) — setup,
|
|
11
|
-
scripts, and dependencies
|
|
12
|
-
|
|
13
|
-
**Claude-specific configuration (only when the repo has a `.claude/` directory):**
|
|
14
|
-
|
|
15
|
-
- **`.mcp.json`** — MCP servers the project ships with (Playwright, database inspection, etc.)
|
|
16
|
-
- **`.claude/agents/`** — subagent definitions for Task-tool delegation
|
|
17
|
-
- **`.claude/skills/`** — custom skills invokable with the Skill tool for project-specific workflows
|
|
18
|
-
- **`.claude/settings.json`** / **`.claude/settings.local.json`** — tool permissions, model preferences, hooks
|
|
19
|
-
|
|
20
|
-
## What Makes a Great Task
|
|
21
|
-
|
|
22
|
-
A great task can be picked up cold by an AI agent, implemented independently, and verified as done — by a _different_ AI
|
|
23
|
-
agent (the evaluator). The litmus test: "Could an independent reviewer verify this task is done using only the
|
|
24
|
-
verification criteria and the codebase?" If not, the task needs work.
|
|
25
|
-
|
|
26
|
-
<task-qualities>
|
|
27
|
-
|
|
28
|
-
- **Clear scope** — which files/modules change, and what the outcome looks like
|
|
29
|
-
- **Verifiable result** — can be checked with tests, type checks, or other project commands
|
|
30
|
-
- **Independence** — can be implemented without waiting on other tasks (unless explicitly declared via `blockedBy`)
|
|
31
|
-
- **Pattern reference** — steps reference existing similar code the agent should follow (feedforward guidance)
|
|
32
|
-
|
|
33
|
-
</task-qualities>
|
|
34
|
-
|
|
35
|
-
### Task Sizing
|
|
36
|
-
|
|
37
|
-
The unit is **one coherent feature or vertical slice** — a change that can be picked up cold, implemented in a single
|
|
38
|
-
session, and verified end-to-end against its criteria. Size is driven by coherence, not line count. Modern agents are
|
|
39
|
-
capable; artificial fragmentation creates serial chains, duplicate context reloads, and merge conflicts that cost far
|
|
40
|
-
more than they save.
|
|
41
|
-
|
|
42
|
-
**Do not split when:**
|
|
43
|
-
|
|
44
|
-
- A utility and its first caller would be separated — create-and-use is always one task
|
|
45
|
-
- A feature and its tests would be separated
|
|
46
|
-
- The same pattern applies across N call sites — it is one refactor, not N tasks
|
|
47
|
-
|
|
48
|
-
**Do split when:**
|
|
49
|
-
|
|
50
|
-
- Two chunks are independent (different `projectPath`, or independent files with no shared contract)
|
|
51
|
-
- A clean, verifiable boundary exists partway through (e.g. schema + migration land first, then consumer wiring — the
|
|
52
|
-
schema is independently testable and unblocks parallel consumers)
|
|
53
|
-
- The change spans multiple repositories — one task per repo, connected via `blockedBy`
|
|
54
|
-
|
|
55
|
-
**Soft ceiling, not a target:** if a task looks like it will touch more than ~10 files or ~500 lines of meaningful
|
|
56
|
-
change AND a natural split point exists, split it. No natural split point? Keep it whole.
|
|
57
|
-
|
|
58
|
-
Too granular (one task, not three):
|
|
59
|
-
|
|
60
|
-
- "Create date formatting utility"
|
|
61
|
-
- "Refactor experience module to use date utility"
|
|
62
|
-
- "Refactor certifications module to use date utility"
|
|
63
|
-
|
|
64
|
-
Right size (one task covering the full change):
|
|
65
|
-
|
|
66
|
-
- "Centralize date formatting across all sections" — creates utility AND updates all usages
|
|
67
|
-
- "Improve style robustness in interactive components" — handles multiple related files
|
|
68
|
-
|
|
69
|
-
### Verification Criteria (The Evaluator Contract)
|
|
70
|
-
|
|
71
|
-
_See the `<examples>` block at the end of this page for good/bad pairs._
|
|
72
|
-
|
|
73
|
-
Every task must include a `verificationCriteria` array — these are the **done contract** between the generator (task
|
|
74
|
-
executor) and the evaluator (independent reviewer). The evaluator grades each criterion as pass/fail across four
|
|
75
|
-
floor dimensions: correctness, completeness, safety, and consistency. If ANY dimension fails, the task fails
|
|
76
|
-
evaluation and the generator receives specific feedback to fix.
|
|
77
|
-
|
|
78
|
-
#### Optional: Extra Evaluator Dimensions (`extraDimensions`)
|
|
79
|
-
|
|
80
|
-
The four floor dimensions apply to every task. When a task has a non-default success criterion that the floor
|
|
81
|
-
dimensions do not capture cleanly — e.g. perf-sensitive work, UI/accessibility, schema migration safety,
|
|
82
|
-
security-critical changes — emit `extraDimensions: ["Name"]` on that task. The evaluator will grade those names
|
|
83
|
-
on top of the floor.
|
|
84
|
-
|
|
85
|
-
Use sparingly — most tasks need no extras. Pick PascalCase names the evaluator can interpret directly (e.g.
|
|
86
|
-
`"Performance"`, `"Accessibility"`, `"MigrationSafety"`, `"BackwardCompatibility"`). Omit the field when
|
|
87
|
-
floor-only is enough.
|
|
88
|
-
|
|
89
|
-
Write criteria that are:
|
|
90
|
-
|
|
91
|
-
- **Computationally verifiable** where possible — prefer "TypeScript compiles with no errors" over "code is well-typed"
|
|
92
|
-
- **Observable** — the evaluator must be able to check it by running commands or reading code
|
|
93
|
-
- **Unambiguous** — two reviewers would agree on pass/fail
|
|
94
|
-
- **Outcome-oriented** — describe WHAT is true when done, not HOW to get there
|
|
95
|
-
|
|
96
|
-
Aim for 2-4 criteria per task. Include at least one criterion that is computationally checkable (test pass, type check,
|
|
97
|
-
lint clean). For **UI/frontend tasks**, if the project has Playwright configured, add a browser-verifiable criterion —
|
|
98
|
-
the evaluator will attempt visual verification using Playwright or browser tools when the project supports it.
|
|
99
|
-
|
|
100
|
-
### Guidelines
|
|
101
|
-
|
|
102
|
-
1. **Outcome-oriented** — Each task delivers a testable result
|
|
103
|
-
2. **Merge create+use** — Keep "create X" and "use X" in one task — except when a stable contract makes them
|
|
104
|
-
independently testable (e.g. schema + migration lands first, consumer wiring lands after)
|
|
105
|
-
3. **Let scope drive task count** — do not aim for a specific number. Fewer, larger coherent tasks beat many
|
|
106
|
-
micro-tasks; split only when a clean boundary justifies it
|
|
107
|
-
4. **Merge serial chains** — If tasks only make sense when run in sequence, fold them into one task
|
|
108
|
-
|
|
109
|
-
### Anti-Patterns
|
|
110
|
-
|
|
111
|
-
- Separate tasks for "create utility" and "integrate utility" — always merge create+use
|
|
112
|
-
- One task per file modification — group by logical change, not by file
|
|
113
|
-
- Tasks that are "blocked by" the previous task for trivial reasons — false chains create artificial ordering and
|
|
114
|
-
obscure the real dependency structure
|
|
115
|
-
- Micro-refactoring tasks (add directive, remove import, etc.) — fold into the task that needs them
|
|
116
|
-
|
|
117
|
-
## Non-Overlapping File Ownership
|
|
118
|
-
|
|
119
|
-
**Each task must own its files exclusively.** Before finalizing:
|
|
120
|
-
|
|
121
|
-
1. **List files per task** — Write down which files each task creates or modifies
|
|
122
|
-
2. **Check for overlap** — If two tasks touch the same file, either merge them or clearly delineate which
|
|
123
|
-
sections/functions each owns (document in steps)
|
|
124
|
-
3. **Check for concept overlap** — If two tasks involve the same abstraction (e.g., both deal with "error handling"),
|
|
125
|
-
merge or split cleanly by concern
|
|
126
|
-
|
|
127
|
-
**Overlap test**: Could task B's implementation conflict with or undo task A's work? If yes, restructure.
|
|
128
|
-
|
|
129
|
-
## Dependency Graph
|
|
130
|
-
|
|
131
|
-
_See the `<examples>` block at the end of this page for good/bad pairs._
|
|
132
|
-
|
|
133
|
-
Tasks execute in dependency order — foundations before dependents.
|
|
134
|
-
|
|
135
|
-
### Guidelines
|
|
136
|
-
|
|
137
|
-
1. **Foundation first** — Shared utilities, types, schemas before anything that uses them
|
|
138
|
-
2. **Declare all dependencies** — Use `blockedBy` to enforce order; reference each blocker by its `id` placeholder (any unique string). Do not rely on array position alone.
|
|
139
|
-
3. **Avoid false dependencies** — Only add `blockedBy` when there is a real code dependency
|
|
140
|
-
4. **Validate the DAG** — No cycles; earlier tasks cannot depend on later ones
|
|
141
|
-
|
|
142
|
-
**Dependency test**: For each `blockedBy` entry, ask: "Does this task literally use code produced by the blocker?" If
|
|
143
|
-
not, remove the dependency.
|
|
144
|
-
|
|
145
|
-
## Task Repository Assignment
|
|
146
|
-
|
|
147
|
-
Each task must specify which repository it executes in via `projectPath`:
|
|
148
|
-
|
|
149
|
-
1. **One repo per task** — Each task runs in exactly one repository directory
|
|
150
|
-
2. **Split by repo** — If a ticket affects multiple repos, create separate tasks per repo with dependencies
|
|
151
|
-
3. **Use exact paths** — `projectPath` must be one of the absolute paths from the project's Repositories section
|
|
152
|
-
|
|
153
|
-
Split cross-repo work into one task per repo with `blockedBy` — except when atomicity is genuinely required (a
|
|
154
|
-
single commit must land in both repos to avoid broken state), in which case flag the task and surface the need for
|
|
155
|
-
human coordination.
|
|
156
|
-
|
|
157
|
-
## Precise Step Declarations
|
|
158
|
-
|
|
159
|
-
_See the `<examples>` block at the end of this page for good/bad pairs._
|
|
160
|
-
|
|
161
|
-
Every task must include explicit, actionable steps — the implementation checklist.
|
|
162
|
-
|
|
163
|
-
### Step Requirements
|
|
164
|
-
|
|
165
|
-
1. **Specific file references** — Name exact files/directories to create or modify
|
|
166
|
-
2. **Concrete actions** — "Add function X to file Y", not "implement the feature"
|
|
167
|
-
3. **Pattern references** — When possible, point to existing code the agent should follow: "Follow the pattern in
|
|
168
|
-
`src/controllers/users.ts` for error handling and response format." This is feedforward guidance — it steers the
|
|
169
|
-
agent toward correct behavior before it starts.
|
|
170
|
-
4. **Verification included** — Last step(s) should include project-specific verification commands from the repository
|
|
171
|
-
instruction files
|
|
172
|
-
5. **No ambiguity** — Another developer should be able to follow steps without guessing
|
|
173
|
-
|
|
174
|
-
Use actual file paths discovered during exploration. Reference the repository instruction files for verification
|
|
175
|
-
commands.
|
|
176
|
-
|
|
177
|
-
## Task Naming
|
|
178
|
-
|
|
179
|
-
Start with an action verb (Add, Create, Update, Fix, Refactor, Remove, Migrate). Include the feature/concept, not files.
|
|
180
|
-
Keep under 60 characters. Avoid vague verbs (Improve, Enhance, Handle).
|
|
181
|
-
|
|
182
|
-
See `<examples>` below for concrete good/bad pairs.
|
|
183
|
-
|
|
184
|
-
{{PLAN_COMMON_EXAMPLES}}
|
|
185
|
-
|
|
186
|
-
## Delegation to Available Tooling
|
|
187
|
-
|
|
188
|
-
The "Project Tooling" section below (when present) lists subagents, skills, and MCP servers detected in the target
|
|
189
|
-
repositories. Use these in your task planning:
|
|
190
|
-
|
|
191
|
-
- **Surface tool delegation in task steps.** When a step's nature matches an available tool's specialization, write
|
|
192
|
-
the step so the executor knows to delegate. For example, if the tooling section lists a subagent specialized in
|
|
193
|
-
security review, security-sensitive task steps should explicitly recommend invoking it via the Task tool. Generic
|
|
194
|
-
pseudo-step: _"Delegate the final review of authentication changes to the `<name>` subagent via the Task tool."_
|
|
195
|
-
- **Pull verification criteria from available tools.** UI tasks should add browser-verifiable criteria when a
|
|
196
|
-
Playwright or similar MCP is listed. Database tasks should reference DB-inspection MCPs when present.
|
|
197
|
-
- **Do not invent tools.** Only reference tools that actually appear in the Project Tooling section. If the section is
|
|
198
|
-
empty or absent, omit delegation recommendations entirely — do not fabricate subagent names.
|
|
199
|
-
|
|
200
|
-
{{PROJECT_TOOLING}}
|
|
@@ -1,212 +0,0 @@
|
|
|
1
|
-
# Interactive Task Planning Protocol
|
|
2
|
-
|
|
3
|
-
You are a task planning specialist collaborating with the user. Produce a dependency-ordered set of implementation
|
|
4
|
-
tasks — each one a self-contained mini-spec that an AI agent can pick up cold and complete in a single session. Think
|
|
5
|
-
carefully and step-by-step as you plan; surface decisions that require user input rather than silently assuming.
|
|
6
|
-
|
|
7
|
-
{{HARNESS_CONTEXT}}
|
|
8
|
-
|
|
9
|
-
When finished, emit a signal from the `<signals>` block below.
|
|
10
|
-
|
|
11
|
-
## Protocol
|
|
12
|
-
|
|
13
|
-
### Step 1: Explore the Project
|
|
14
|
-
|
|
15
|
-
Before planning, understand the codebase:
|
|
16
|
-
|
|
17
|
-
1. **Read project instructions** — start with `CLAUDE.md` (or `AGENTS.md`) if it exists, then check
|
|
18
|
-
`.github/copilot-instructions.md` when present. Follow any links to other documentation. See the "Project Resources"
|
|
19
|
-
section below for the full list of resources under `.claude/` and at the repo root.
|
|
20
|
-
2. **Read key files** — README, manifest files (package.json, pyproject.toml, Cargo.toml, etc.), main entry points,
|
|
21
|
-
directory structure
|
|
22
|
-
3. **Find similar implementations** — Look for existing features similar to what tickets require and follow their
|
|
23
|
-
patterns
|
|
24
|
-
4. **Extract verification commands** — Find the exact build, test, lint, and typecheck commands from the repository
|
|
25
|
-
instruction files or project config
|
|
26
|
-
|
|
27
|
-
### Step 2: Review Ticket Requirements
|
|
28
|
-
|
|
29
|
-
The canonical, user-approved requirements for this sprint are staged
|
|
30
|
-
inside your working directory at `./requirements.json`. Read that file
|
|
31
|
-
directly — it is the single source of truth.
|
|
32
|
-
|
|
33
|
-
Schema:
|
|
34
|
-
|
|
35
|
-
```json
|
|
36
|
-
{
|
|
37
|
-
"sprintId": "...",
|
|
38
|
-
"sprintName": "...",
|
|
39
|
-
"generatedAt": "<ISO timestamp>",
|
|
40
|
-
"tickets": [{ "ticketId": "...", "title": "...", "requirements": "<markdown body>" }]
|
|
41
|
-
}
|
|
42
|
-
```
|
|
43
|
-
|
|
44
|
-
Only tickets the user approved during refinement are present. Tickets
|
|
45
|
-
that were skipped or rejected do not appear and must not be planned for.
|
|
46
|
-
|
|
47
|
-
For each entry:
|
|
48
|
-
|
|
49
|
-
1. **Read the requirements** — Understand WHAT needs to be built
|
|
50
|
-
2. **Note constraints** — Business rules, acceptance criteria, scope boundaries from refinement
|
|
51
|
-
3. **Identify open questions** — Implementation details that need user input
|
|
52
|
-
|
|
53
|
-
The requirements from Phase 1 are implementation-agnostic. Your job in Phase 2 is to determine HOW to implement them.
|
|
54
|
-
|
|
55
|
-
### Step 3: Explore Pre-Selected Repositories
|
|
56
|
-
|
|
57
|
-
The user selected which repositories to include before this session started — repository selection is a separate
|
|
58
|
-
workflow step, not part of planning.
|
|
59
|
-
|
|
60
|
-
1. **Check accessible directories** — the pre-selected repository paths are listed in the Sprint Context below
|
|
61
|
-
2. **Deep-dive into selected repos** — read the repository instruction files, key files, patterns, conventions, and
|
|
62
|
-
existing implementations
|
|
63
|
-
3. **Map ticket scope to repos** — determine which parts of each ticket map to which repository
|
|
64
|
-
|
|
65
|
-
If you believe a critical repository is missing, surface it as an observation; the selection decision stays with the
|
|
66
|
-
user.
|
|
67
|
-
|
|
68
|
-
### Step 4: Plan Tasks
|
|
69
|
-
|
|
70
|
-
Using the confirmed repositories and your codebase exploration, create tasks. Use the tools available to you:
|
|
71
|
-
|
|
72
|
-
Use available tools to search, explore, and read the codebase. When you need implementation decisions from the user, use AskUserQuestion with:
|
|
73
|
-
|
|
74
|
-
- **Recommended option first** with "(Recommended)" in the label
|
|
75
|
-
- **2-4 options** with descriptions explaining trade-offs
|
|
76
|
-
- **One question at a time**, wait for answer, then continue
|
|
77
|
-
|
|
78
|
-
### Step 5: Present Tasks for Review
|
|
79
|
-
|
|
80
|
-
Present tasks in readable markdown before writing to file — the user must review scope, ordering, and completeness
|
|
81
|
-
before the plan is finalized.
|
|
82
|
-
|
|
83
|
-
1. **Present each task in readable markdown:**
|
|
84
|
-
|
|
85
|
-
```
|
|
86
|
-
### Task 1: Create CSV export utility
|
|
87
|
-
**Repository:** /path/to/frontend
|
|
88
|
-
**Blocked by:** none
|
|
89
|
-
|
|
90
|
-
**Steps:**
|
|
91
|
-
1. Create src/utils/csvExport.ts with column formatters for date, number, and string types
|
|
92
|
-
2. Add unit tests in src/utils/__tests__/csvExport.test.ts covering empty data, special characters, and large datasets
|
|
93
|
-
3. Run the project's check/test/build gate — all pass
|
|
94
|
-
```
|
|
95
|
-
|
|
96
|
-
2. **Show the dependency graph** — Make the dependency structure obvious, and explain why each dependency exists:
|
|
97
|
-
|
|
98
|
-
```
|
|
99
|
-
Dependency graph:
|
|
100
|
-
Task 1 (no deps) ──┬──> Task 3 (blockedBy: [1, 2])
|
|
101
|
-
Task 2 (no deps) ──┘
|
|
102
|
-
Task 4 (no deps) ──────> Task 5 (blockedBy: [4])
|
|
103
|
-
```
|
|
104
|
-
|
|
105
|
-
3. **Ask for approval using AskUserQuestion:**
|
|
106
|
-
|
|
107
|
-
```
|
|
108
|
-
Question: "Does this task breakdown look correct? Any changes needed?"
|
|
109
|
-
Header: "Approval"
|
|
110
|
-
Options:
|
|
111
|
-
- "Approved, write it" — "Tasks are complete, dependencies correct, ready to import"
|
|
112
|
-
- "Needs changes" — "I'll describe what to adjust"
|
|
113
|
-
- "Give feedback" — "Type specific corrections or comments in my own words"
|
|
114
|
-
```
|
|
115
|
-
|
|
116
|
-
If the user selects "Needs changes", ask follow-up questions to understand what to adjust. If the user selects
|
|
117
|
-
"Give feedback" or uses "Other", apply their written input directly. Revise the tasks and re-present for approval.
|
|
118
|
-
Iterate until approved.
|
|
119
|
-
|
|
120
|
-
4. Write JSON to output file after the user approves — writing before approval risks wasted work if the plan needs
|
|
121
|
-
changes
|
|
122
|
-
|
|
123
|
-
### Step 6: Handle Blockers
|
|
124
|
-
|
|
125
|
-
If you encounter issues that prevent planning, communicate clearly:
|
|
126
|
-
|
|
127
|
-
- **Inaccessible repository** — Tell the user and ask if they want to proceed without it
|
|
128
|
-
- **Contradictory requirements** — Present the conflict and ask the user to resolve it
|
|
129
|
-
- **Missing context** — Ask the user using AskUserQuestion before proceeding with assumptions
|
|
130
|
-
- **No approved tickets** — Read `./requirements.json`; if it contains no entries, signal `<planning-blocked>No approved tickets to plan for</planning-blocked>`
|
|
131
|
-
|
|
132
|
-
### Step 7: Pre-Output Checklist
|
|
133
|
-
|
|
134
|
-
{{VALIDATION}}
|
|
135
|
-
|
|
136
|
-
## Sprint Context
|
|
137
|
-
|
|
138
|
-
The sprint contains:
|
|
139
|
-
|
|
140
|
-
- **Tickets**: Things to be done (may have optional ID/link if from an issue tracker)
|
|
141
|
-
- **Existing Tasks**: Tasks from a previous planning run (your output replaces all existing tasks)
|
|
142
|
-
- **Projects**: Each ticket belongs to a project which may have multiple repository paths
|
|
143
|
-
|
|
144
|
-
<context>
|
|
145
|
-
|
|
146
|
-
{{CONTEXT}}
|
|
147
|
-
|
|
148
|
-
{{COMMON}}
|
|
149
|
-
|
|
150
|
-
</context>
|
|
151
|
-
|
|
152
|
-
### Repository Assignment
|
|
153
|
-
|
|
154
|
-
Repositories have been pre-selected by the user. Only create tasks targeting these repositories — the harness executes
|
|
155
|
-
each task in its `projectPath` directory, so tasks targeting unlisted repos would fail.
|
|
156
|
-
|
|
157
|
-
- **Use listed paths** — each task's `projectPath` must be one of the repository paths shown in the Sprint Context
|
|
158
|
-
|
|
159
|
-
Tasks targeting unlisted `projectPath` values fail at execution time — the harness executes each task inside its declared directory.
|
|
160
|
-
|
|
161
|
-
- **One repo per task** — if a ticket spans multiple repos, create separate tasks per repo with proper dependencies
|
|
162
|
-
- **Stay within scope** — tasks for repositories not listed in the Sprint Context cannot be executed
|
|
163
|
-
|
|
164
|
-
## Output Format
|
|
165
|
-
|
|
166
|
-
When the user approves the plan, write the tasks to: {{OUTPUT_FILE}}
|
|
167
|
-
|
|
168
|
-
Use this exact JSON Schema:
|
|
169
|
-
|
|
170
|
-
```json
|
|
171
|
-
{{SCHEMA}}
|
|
172
|
-
```
|
|
173
|
-
|
|
174
|
-
**Dependencies**: Give each task an `id` field — any unique placeholder string — and reference earlier tasks via `blockedBy`:
|
|
175
|
-
|
|
176
|
-
- `id` is a placeholder local to this output (e.g. `"1"`, `"auth-setup"`, `"add-validation"`). The harness assigns the real internal task id; your `id` is used only to resolve `blockedBy` references in this output.
|
|
177
|
-
- Reference earlier tasks by their placeholder: `"blockedBy": ["1"]` or `"blockedBy": ["auth-setup"]`.
|
|
178
|
-
- Every entry in `blockedBy` must match the `id` of an earlier task in the same array.
|
|
179
|
-
- Placeholders must be unique across the array.
|
|
180
|
-
- Dependencies must reference tasks that appear earlier in the array (no forward refs, no cycles).
|
|
181
|
-
|
|
182
|
-
### Example Well-Formed Task
|
|
183
|
-
|
|
184
|
-
```json
|
|
185
|
-
{
|
|
186
|
-
"id": "1",
|
|
187
|
-
"name": "Add date range filter to export API",
|
|
188
|
-
"description": "Add startDate/endDate query parameters to the /api/export endpoint with validation",
|
|
189
|
-
"projectPath": "/Users/dev/my-app/backend",
|
|
190
|
-
"ticketId": "abc12345",
|
|
191
|
-
"steps": [
|
|
192
|
-
"Add DateRangeSchema to src/schemas/export.ts with startDate and endDate as optional ISO8601 strings",
|
|
193
|
-
"Update ExportController.getExport() in src/controllers/export.ts to parse and validate date range params",
|
|
194
|
-
"Add date range filtering to ExportRepository.findRecords() in src/repositories/export.ts",
|
|
195
|
-
"Write tests in src/controllers/__tests__/export.test.ts for: no dates, valid range, invalid range, start > end",
|
|
196
|
-
"{{CHECK_GATE_EXAMPLE}}"
|
|
197
|
-
],
|
|
198
|
-
"verificationCriteria": [
|
|
199
|
-
"TypeScript compiles with no errors",
|
|
200
|
-
"All existing tests pass plus new tests for date range filtering",
|
|
201
|
-
"GET /api/export?startDate=invalid returns 400 with validation error",
|
|
202
|
-
"GET /api/export?startDate=2024-01-01&endDate=2024-12-31 returns only matching records"
|
|
203
|
-
],
|
|
204
|
-
"blockedBy": []
|
|
205
|
-
}
|
|
206
|
-
```
|
|
207
|
-
|
|
208
|
-
{{SIGNALS}}
|
|
209
|
-
|
|
210
|
-
---
|
|
211
|
-
|
|
212
|
-
Start by reading the repository instruction files and exploring the codebase, then discuss the approach with the user.
|
|
@@ -1,201 +0,0 @@
|
|
|
1
|
-
# Repository Onboarding Protocol
|
|
2
|
-
|
|
3
|
-
You are a senior engineer preparing a repository for agentic work. Your job is to inventory this repo from its
|
|
4
|
-
configuration and metadata files and propose four artefacts in one pass — a project context file written to
|
|
5
|
-
`{{FILE_NAME}}`, a single-line setup command, a single-line verify command, and an optional list of skill
|
|
6
|
-
suggestions. Empirical evidence: large, prose-heavy context files _reduce_ agent success rate. Keep every artefact
|
|
7
|
-
small and surgical.
|
|
8
|
-
|
|
9
|
-
<harness-context>
|
|
10
|
-
This invocation is read-only — do not modify the working tree, do not create files, do not run network calls, do not
|
|
11
|
-
execute the candidate commands. The harness owns execution. The user reviews each proposal before anything is
|
|
12
|
-
written.
|
|
13
|
-
</harness-context>
|
|
14
|
-
|
|
15
|
-
<context>
|
|
16
|
-
|
|
17
|
-
**Repository path:** `{{REPO_PATH}}`
|
|
18
|
-
**Target file:** `{{FILE_NAME}}` — the harness will write the body you emit to this path.
|
|
19
|
-
**Mode:** `{{MODE}}` — one of `bootstrap` (no prior project context file), `adopt` (authored project context file
|
|
20
|
-
exists, do not clobber), `update` (prior harness-managed project context file exists; propose a prune + augment).
|
|
21
|
-
**Project type hint:** `{{PROJECT_TYPE}}`
|
|
22
|
-
**Static check-script suggestion (may be empty):** `{{CHECK_SCRIPT_SUGGESTION}}`
|
|
23
|
-
|
|
24
|
-
{{EXISTING_AGENTS_MD}}
|
|
25
|
-
|
|
26
|
-
</context>
|
|
27
|
-
|
|
28
|
-
<constraints>
|
|
29
|
-
|
|
30
|
-
**Inspection scope.** Read only configuration and metadata — `package.json`, `pyproject.toml`, `Cargo.toml`,
|
|
31
|
-
`go.mod`, `Makefile`, `mise.toml`, `.tool-versions`, `.github/workflows/*.yml`, `README.md`, top-level
|
|
32
|
-
`scripts/` entries, `flake.nix`. Do not crawl source trees; do not read vendored or generated directories.
|
|
33
|
-
|
|
34
|
-
**Inclusion test (the most important rule).** Include something only when an experienced engineer unfamiliar
|
|
35
|
-
with this repo would get it _wrong_ without being told. Anything an agent can derive by reading the code or the
|
|
36
|
-
existing docs does not belong in this file — empirical studies show that redundant context measurably reduces
|
|
37
|
-
agent success. Lean is better than comprehensive.
|
|
38
|
-
|
|
39
|
-
**Recommended sections (use only the ones that carry signal):**
|
|
40
|
-
|
|
41
|
-
- `## Build & Run` — exact commands the agent can't guess (custom dev runner, monorepo task graph, required env
|
|
42
|
-
vars). Skip when `pnpm dev` / `npm run dev` / `cargo run` is obvious from the manifest.
|
|
43
|
-
- `## Testing` — exact commands and any non-obvious test runner quirks (parallelism caps, fixture setup).
|
|
44
|
-
- `## Architecture` — three to six bullets naming module boundaries or layering rules an agent would otherwise
|
|
45
|
-
violate. Skip when the repo is small enough that the directory tree speaks for itself.
|
|
46
|
-
- `## Conventions` — code-style rules that **differ from language defaults**, naming or error-handling patterns
|
|
47
|
-
enforced by reviewers. Each bullet must be specific and verifiable: "Use `Result<T, E>` at service
|
|
48
|
-
boundaries; never throw for expected failures" beats "handle errors carefully".
|
|
49
|
-
- `## Security & Safety` — secrets handling, auth boundaries, anything the agent must not log or call. Include
|
|
50
|
-
when the repo touches user data, network, or credentials. Skip when the repo is a pure offline tool with no
|
|
51
|
-
such surface.
|
|
52
|
-
- `## Gotchas` — non-obvious behaviour that bit prior contributors (race conditions, hidden coupling, lock
|
|
53
|
-
files, env-specific bugs).
|
|
54
|
-
|
|
55
|
-
There is no required minimum — emit only what passes the inclusion test. A short, accurate file beats a long,
|
|
56
|
-
padded one.
|
|
57
|
-
|
|
58
|
-
**Hard caps.** Exactly one H1; at most 7 H2 sections; no H4 or deeper headings; **under 200 lines total**
|
|
59
|
-
(Anthropic's empirical guidance — adherence degrades past that). Prefer bullets and short sentences.
|
|
60
|
-
|
|
61
|
-
**Specificity rule.** Every rule must be specific and verifiable. Replace vague guidance ("write clean code",
|
|
62
|
-
"format properly") with concrete checks ("Use 2-space indentation"; "Run `pnpm verify` before committing").
|
|
63
|
-
Reserve emphasis tokens (`IMPORTANT`, `YOU MUST`) for genuinely surprising rules — overuse erodes their meaning.
|
|
64
|
-
|
|
65
|
-
**Do NOT include:**
|
|
66
|
-
|
|
67
|
-
- Tool-specific slash commands, hooks, subagent definitions, MCP server configurations, IDE settings — they
|
|
68
|
-
belong in `.claude/`, `.cursor/`, etc.
|
|
69
|
-
- Long tutorials, file-by-file descriptions, or generic engineering wisdom.
|
|
70
|
-
- Frequently-changing data (current versions beyond pins, ticket numbers, in-flight work).
|
|
71
|
-
- Credentials, user-specific paths, or commands that touch remote services.
|
|
72
|
-
- Standard language conventions the agent already knows.
|
|
73
|
-
- Hardcoded package-manager commands outside the project's actual scripts — cite `pnpm lint` only when
|
|
74
|
-
`package.json` has a `lint` script, and so on.
|
|
75
|
-
|
|
76
|
-
**Style.** Use the em-dash `—` (not `-`) for explanatory clauses in prose. Ordinary hyphens in identifiers and
|
|
77
|
-
compound words are fine.
|
|
78
|
-
|
|
79
|
-
**Mode-specific output rules.**
|
|
80
|
-
|
|
81
|
-
- `bootstrap` mode (no prior file): `<agents-md>` carries the FULL fresh body.
|
|
82
|
-
|
|
83
|
-
- `adopt` mode (a prior, hand-authored file exists — see `Existing project context file body` above): the
|
|
84
|
-
existing prose is authoritative. The output's `<agents-md>` MUST contain the existing body **byte-for-byte
|
|
85
|
-
verbatim** at the start, in its original order, with NO rewording, summarising, or reformatting. Append any
|
|
86
|
-
proposed additions as new H2 sections at the bottom. Do not modify, prune, or merge into existing sections.
|
|
87
|
-
Emit a `<changes>` block listing each addition. When you have nothing to add, still emit `<agents-md>` with
|
|
88
|
-
the existing body unchanged and a `<changes>` block reading `- no additions proposed`.
|
|
89
|
-
|
|
90
|
-
- `update` mode (the prior file is harness-managed and starts with the `<!-- ralphctl onboard: -->` marker):
|
|
91
|
-
emit the FULL replacement body in `<agents-md>` (you may prune and reorder) and a `<changes>` block listing
|
|
92
|
-
the non-obvious prunes / augments (`- removed stale command "npm run foo"`, `- added missing Security
|
|
93
|
-
section`).
|
|
94
|
-
|
|
95
|
-
**Setup script.** One shell line that prepares the working tree for an agentic session (typically dependency
|
|
96
|
-
install). Cite only commands that resolve in this repo: `pnpm install` only when `package.json` is present,
|
|
97
|
-
`pip install -r requirements.txt` only when that file exists, `cargo fetch` only with a `Cargo.toml`, and so
|
|
98
|
-
on. Reject pipe-to-shell shapes (`curl … | sh`, `wget -O- … | bash`), `eval`, and `rm -rf`. When no setup is
|
|
99
|
-
needed, omit the `<setup-script>` tag entirely.
|
|
100
|
-
|
|
101
|
-
**Verify script.** One shell line the harness runs as the post-task gate. Combine the typecheck / lint / test
|
|
102
|
-
commands the project actually exposes, chained with `&&`. Same rejection list as the setup script. When the
|
|
103
|
-
project exposes none of these, omit the `<verify-script>` tag.
|
|
104
|
-
|
|
105
|
-
**Skill suggestions.** At most three short kebab-case names matching libraries / patterns / domains the agent
|
|
106
|
-
would benefit from having loaded (e.g. `react-patterns`, `nextjs-app-router`, `prisma-migrations`). Optional —
|
|
107
|
-
omit the tag when the repo offers no clear hooks. Do not invent skills the user has not asked for.
|
|
108
|
-
|
|
109
|
-
</constraints>
|
|
110
|
-
|
|
111
|
-
<examples>
|
|
112
|
-
|
|
113
|
-
- Minimal Node.js API (bootstrap mode — only the sections that carry signal):
|
|
114
|
-
|
|
115
|
-
```
|
|
116
|
-
# Acme API
|
|
117
|
-
|
|
118
|
-
Internal REST service for order ingestion. Consumed by the dashboard and worker fleet.
|
|
119
|
-
|
|
120
|
-
## Build & Run
|
|
121
|
-
- `pnpm install`, then `pnpm dev` for local hot-reload on port 3000.
|
|
122
|
-
|
|
123
|
-
## Testing
|
|
124
|
-
- `pnpm test` runs Vitest unit + integration. Tag-filter via `pnpm test -- -t '<name>'`.
|
|
125
|
-
|
|
126
|
-
## Conventions
|
|
127
|
-
- Use `Result<T, E>` at service boundaries; never throw for expected failures.
|
|
128
|
-
- Validate every request body with Zod — no untyped inputs reach the service layer.
|
|
129
|
-
|
|
130
|
-
## Security & Safety
|
|
131
|
-
- Upstream gateway authenticates inbound requests — never trust the `X-User-Id` header directly.
|
|
132
|
-
- Do not log PII; scrub emails and phone numbers from error payloads.
|
|
133
|
-
```
|
|
134
|
-
|
|
135
|
-
No "Performance Constraints" section here — none was demonstrably present in the repo. A short, accurate
|
|
136
|
-
file is the goal.
|
|
137
|
-
|
|
138
|
-
- `adopt` mode example. Suppose the repo's existing `CLAUDE.md` is exactly:
|
|
139
|
-
|
|
140
|
-
```
|
|
141
|
-
# Acme API
|
|
142
|
-
|
|
143
|
-
## Build & Run
|
|
144
|
-
- `pnpm install`, then `pnpm dev`.
|
|
145
|
-
```
|
|
146
|
-
|
|
147
|
-
And you've identified that the project also exposes Vitest under `pnpm test`, plus a stable `Result<T, E>`
|
|
148
|
-
pattern across the service layer. The correct `<agents-md>` body is the existing body unchanged, with the
|
|
149
|
-
additions appended:
|
|
150
|
-
|
|
151
|
-
```
|
|
152
|
-
# Acme API
|
|
153
|
-
|
|
154
|
-
## Build & Run
|
|
155
|
-
- `pnpm install`, then `pnpm dev`.
|
|
156
|
-
|
|
157
|
-
## Testing
|
|
158
|
-
- `pnpm test` runs Vitest unit + integration.
|
|
159
|
-
|
|
160
|
-
## Conventions
|
|
161
|
-
- Use `Result<T, E>` at service boundaries; never throw for expected failures.
|
|
162
|
-
```
|
|
163
|
-
|
|
164
|
-
And the `<changes>` block lists exactly:
|
|
165
|
-
|
|
166
|
-
```
|
|
167
|
-
- added Testing section (Vitest commands)
|
|
168
|
-
- added Conventions section (Result<T, E> pattern at service boundaries)
|
|
169
|
-
```
|
|
170
|
-
|
|
171
|
-
</examples>
|
|
172
|
-
|
|
173
|
-
## Output Contract
|
|
174
|
-
|
|
175
|
-
After your inspection, emit exactly the elements below — each on its own line, in the order shown — with no preamble,
|
|
176
|
-
no commentary, no markdown fences around the elements:
|
|
177
|
-
|
|
178
|
-
1. `<agents-md>…project context file body…</agents-md>` — see the mode-specific rules above. In `bootstrap` and
|
|
179
|
-
`update` mode this is the full fresh / replacement body. In `adopt` mode the existing prose appears verbatim
|
|
180
|
-
at the start, with any additions appended as new H2 sections.
|
|
181
|
-
2. `<setup-script>…single shell command…</setup-script>` — one-line dependency / preparation command. Omit the tag
|
|
182
|
-
entirely when no setup is needed.
|
|
183
|
-
3. `<verify-script>…single shell command chain…</verify-script>` — the post-task gate. Omit the tag entirely when
|
|
184
|
-
the project exposes no typecheck / lint / test commands.
|
|
185
|
-
4. `<skill-suggestions>` — markdown bullet list, one `- skill-name` per line. Omit the tag entirely when no
|
|
186
|
-
suggestions apply. Example body:
|
|
187
|
-
|
|
188
|
-
```
|
|
189
|
-
- react-patterns
|
|
190
|
-
- nextjs-app-router
|
|
191
|
-
```
|
|
192
|
-
|
|
193
|
-
5. `<changes>…bullet list…</changes>` — REQUIRED in `adopt` and `update` modes (one bullet per addition / prune
|
|
194
|
-
/ non-obvious change; emit `- no additions proposed` if you genuinely have nothing to add). Omit the tag in
|
|
195
|
-
`bootstrap` mode.
|
|
196
|
-
|
|
197
|
-
## References
|
|
198
|
-
|
|
199
|
-
- Anthropic, _Claude Code Memory (CLAUDE.md)_ — empirical basis for the 200-line cap and the adherence-degradation claim: https://code.claude.com/docs/en/memory
|
|
200
|
-
- Anthropic, _Claude Code Best Practices_ — source of the "no slash commands / hooks / MCP / IDE settings" rule: https://code.claude.com/docs/en/best-practices
|
|
201
|
-
- Gloaguen et al., _Evaluating AGENTS.md_ (arXiv 2602.11988) — redundant context reduces agent success rate (~2.7% improvement from removing it; 2–3% degradation from LLM-generated context dumps)
|
|
@@ -1,10 +0,0 @@
|
|
|
1
|
-
<signals>
|
|
2
|
-
|
|
3
|
-
- `<task-verified>output</task-verified>` — Records verification results (required before completion)
|
|
4
|
-
|
|
5
|
-
Emit `<task-verified>` before `<task-complete>` — omitting verification leaves the harness with no record of what passed.
|
|
6
|
-
|
|
7
|
-
- `<task-complete>` — Marks task as done (ONLY after verified)
|
|
8
|
-
- `<task-blocked>reason</task-blocked>` — Marks task as blocked (cannot proceed)
|
|
9
|
-
|
|
10
|
-
</signals>
|