ralphctl 0.8.3 → 0.8.5
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dist/cli.mjs +639 -405
- package/dist/manifest.json +4 -2
- package/dist/prompts/_partials/conventions-agents-md.md +63 -0
- package/dist/prompts/_partials/conventions-claude-md.md +58 -0
- package/dist/prompts/_partials/conventions-copilot-instructions.md +53 -0
- package/dist/prompts/_partials/decisions.md +4 -0
- package/dist/prompts/_partials/harness-context.md +3 -3
- package/dist/prompts/_partials/validation-checklist.md +3 -2
- package/dist/prompts/apply-feedback/template.md +97 -78
- package/dist/prompts/create-pr/template.md +70 -49
- package/dist/prompts/detect-scripts/template.md +101 -36
- package/dist/prompts/detect-skills/template.md +120 -99
- package/dist/prompts/evaluate/template.md +350 -167
- package/dist/prompts/ideate/template.md +167 -134
- package/dist/prompts/implement/template.md +168 -122
- package/dist/prompts/plan/template.md +202 -168
- package/dist/prompts/readiness/template.md +115 -90
- package/dist/prompts/refine/template.md +104 -88
- package/dist/skills/ralphctl-abstraction-first/SKILL.md +3 -1
- package/dist/skills/ralphctl-alignment/SKILL.md +2 -1
- package/dist/skills/ralphctl-iterative-review/SKILL.md +3 -1
- package/package.json +1 -1
- package/dist/prompts/_partials/signals-feedback.md +0 -18
package/dist/manifest.json
CHANGED
|
@@ -1,10 +1,12 @@
|
|
|
1
1
|
{
|
|
2
2
|
"version": 1,
|
|
3
|
-
"generatedAt": "2026-05-
|
|
3
|
+
"generatedAt": "2026-05-29T08:41:52.856Z",
|
|
4
4
|
"assets": [
|
|
5
|
+
"prompts/_partials/conventions-agents-md.md",
|
|
6
|
+
"prompts/_partials/conventions-claude-md.md",
|
|
7
|
+
"prompts/_partials/conventions-copilot-instructions.md",
|
|
5
8
|
"prompts/_partials/decisions.md",
|
|
6
9
|
"prompts/_partials/harness-context.md",
|
|
7
|
-
"prompts/_partials/signals-feedback.md",
|
|
8
10
|
"prompts/_partials/validation-checklist.md",
|
|
9
11
|
"prompts/apply-feedback/template.md",
|
|
10
12
|
"prompts/create-pr/template.md",
|
|
@@ -0,0 +1,63 @@
|
|
|
1
|
+
## AGENTS.md conventions
|
|
2
|
+
|
|
3
|
+
`AGENTS.md` is the cross-tool agent context file — recognised by OpenAI Codex and increasingly by other
|
|
4
|
+
agent runtimes. It is format-loose: no required heading schema, no hard line cap. Write it as clear
|
|
5
|
+
prose or bullets, organised into named sections, so any agent runtime that loads it can navigate it
|
|
6
|
+
without tool-specific knowledge.
|
|
7
|
+
|
|
8
|
+
**Structure rules:**
|
|
9
|
+
|
|
10
|
+
- Open with a brief one- or two-sentence project description (no formal H1 required, though a title
|
|
11
|
+
heading is fine).
|
|
12
|
+
- Use H2 sections to group related rules — `## Build`, `## Testing`, `## Architecture`, etc.
|
|
13
|
+
- Keep sections short and scannable; avoid walls of prose. Bullets work well.
|
|
14
|
+
- No depth limit on headings, but rarely need more than `##`/`###`.
|
|
15
|
+
- No hard line cap, but keep the file under ~150 lines — longer files dilute the signal-to-noise
|
|
16
|
+
ratio for models with a limited context window.
|
|
17
|
+
|
|
18
|
+
**Tone and framing:**
|
|
19
|
+
|
|
20
|
+
Write `AGENTS.md` as a plain, tool-agnostic specification. Avoid Claude-specific vocabulary
|
|
21
|
+
(`<tool>`, `slash commands`, hooks, `CLAUDE.md` cross-references) and Copilot-specific vocabulary
|
|
22
|
+
(Chat context, `@workspace`). The content should read equally well regardless of which agent runtime
|
|
23
|
+
is consuming it.
|
|
24
|
+
|
|
25
|
+
**Inclusion test** — include a rule only when an agent would get it wrong without being told. Skip
|
|
26
|
+
anything derivable from the language, the manifest, or the directory structure. Skip generic
|
|
27
|
+
engineering advice the model already follows by default.
|
|
28
|
+
|
|
29
|
+
**Sample stub** (adapt; do not copy verbatim):
|
|
30
|
+
|
|
31
|
+
```markdown
|
|
32
|
+
# Project Name
|
|
33
|
+
|
|
34
|
+
TypeScript monorepo. Use the workspace-aware install command; individual-package installs
|
|
35
|
+
break the shared lockfile.
|
|
36
|
+
|
|
37
|
+
## Build
|
|
38
|
+
|
|
39
|
+
- `<build command>` — compiles all packages to `dist/`.
|
|
40
|
+
- Environment: copy `.env.example` to `.env` and fill in required values before running.
|
|
41
|
+
|
|
42
|
+
## Testing
|
|
43
|
+
|
|
44
|
+
- `<test command>` — runs unit + integration tests.
|
|
45
|
+
- Integration tests require a running database; start it with `<database start command>`.
|
|
46
|
+
- Do not mock the database layer — use real fixtures from `tests/fixtures/`.
|
|
47
|
+
|
|
48
|
+
## Architecture
|
|
49
|
+
|
|
50
|
+
- Clean Architecture: `domain → business → integration → application`. No reverse imports.
|
|
51
|
+
- No barrel files (`index.ts` re-exports) — every import names its symbol explicitly.
|
|
52
|
+
|
|
53
|
+
## Conventions
|
|
54
|
+
|
|
55
|
+
- Business operations return a `Result` type — do not throw for domain errors.
|
|
56
|
+
- All file writes go through the atomic-write helper in `business/io/`; direct `fs.writeFile`
|
|
57
|
+
is banned from business code.
|
|
58
|
+
|
|
59
|
+
## Security
|
|
60
|
+
|
|
61
|
+
- Do not log authentication tokens or request bodies, even at debug level.
|
|
62
|
+
- Never commit `.env` files — they contain real credentials for development services.
|
|
63
|
+
```
|
|
@@ -0,0 +1,58 @@
|
|
|
1
|
+
## CLAUDE.md conventions
|
|
2
|
+
|
|
3
|
+
`CLAUDE.md` is Claude Code's native project context file — loaded automatically from the repository root
|
|
4
|
+
at the start of every session. Write it as a compact reference document an agent re-reads on every
|
|
5
|
+
invocation, not as a tutorial or README.
|
|
6
|
+
|
|
7
|
+
**Structure rules:**
|
|
8
|
+
|
|
9
|
+
- Exactly one H1 (`# Project Name`) as the opening line.
|
|
10
|
+
- At most seven H2 sections (`## Build & Run`, `## Testing`, `## Architecture`, …). Fewer is better.
|
|
11
|
+
- No H4 headings or deeper — three heading levels (`#`, `##`, `###`) is the practical maximum.
|
|
12
|
+
- Prefer tight bullet lists over prose paragraphs; each bullet should be one verifiable claim.
|
|
13
|
+
- Hard line cap: 200 lines. Claude Code truncates longer files, so brevity is load-bearing.
|
|
14
|
+
|
|
15
|
+
**"Read on demand" pattern** — for sections that an agent rarely needs mid-task, list them under a
|
|
16
|
+
`## References` heading with paths rather than embedding the content inline:
|
|
17
|
+
|
|
18
|
+
```
|
|
19
|
+
## References
|
|
20
|
+
|
|
21
|
+
- `.claude/docs/ARCHITECTURE.md` — module layout and layering rules
|
|
22
|
+
- `.claude/docs/DESIGN-SYSTEM.md` — TUI tokens and component copy rules
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
This keeps the primary file short while keeping the information reachable.
|
|
26
|
+
|
|
27
|
+
**Inclusion test** — include a rule only when an agent would get it wrong without being told. Skip
|
|
28
|
+
language conventions the model already knows. Skip anything derivable from directory structure or
|
|
29
|
+
manifest files.
|
|
30
|
+
|
|
31
|
+
**Sample stub** (adapt; do not copy verbatim):
|
|
32
|
+
|
|
33
|
+
```markdown
|
|
34
|
+
# Project Name
|
|
35
|
+
|
|
36
|
+
Node.js 20 + TypeScript. Run `<install command>` once, then `<dev command>` to start.
|
|
37
|
+
|
|
38
|
+
## Build & Run
|
|
39
|
+
|
|
40
|
+
- `<build command>` — compiles to `dist/`.
|
|
41
|
+
- Required env: `DATABASE_URL` (Postgres connection string).
|
|
42
|
+
|
|
43
|
+
## Testing
|
|
44
|
+
|
|
45
|
+
- `<test command>` — unit + integration. Integration tests require a running database.
|
|
46
|
+
- Do not mock the database layer — prior incidents show mock/prod divergence is a real risk.
|
|
47
|
+
|
|
48
|
+
## Architecture
|
|
49
|
+
|
|
50
|
+
- Four-module Clean Architecture: `domain → business → integration → application`.
|
|
51
|
+
- No barrel `index.ts` files under `src/` — name every import explicitly.
|
|
52
|
+
- Business code must not import I/O-bearing `node:*` modules — pure `node:path` / `node:url` ok.
|
|
53
|
+
|
|
54
|
+
## Conventions
|
|
55
|
+
|
|
56
|
+
- Return `Result<T, DomainError>` from every business operation — do not throw.
|
|
57
|
+
- Em-dash (`—`) for explanatory clauses in comments and docs.
|
|
58
|
+
```
|
|
@@ -0,0 +1,53 @@
|
|
|
1
|
+
## .github/copilot-instructions.md conventions
|
|
2
|
+
|
|
3
|
+
`.github/copilot-instructions.md` is GitHub Copilot's native project context file — injected into every
|
|
4
|
+
Copilot Chat and Copilot Coding Agent session. Write it as a set of **decision-rationale pairs**: each
|
|
5
|
+
rule states what to do and, on the same line or the next bullet, _why_ — because Copilot performs better
|
|
6
|
+
when it understands the motivation behind a constraint, not just the constraint itself.
|
|
7
|
+
|
|
8
|
+
**Structure rules:**
|
|
9
|
+
|
|
10
|
+
- No required heading schema — free prose and bullets both work. H2 sections help scanability.
|
|
11
|
+
- Lead every non-obvious rule with a "what + why" pair. Example: "Never mock the database layer
|
|
12
|
+
in integration tests — prior incidents show mock/prod divergence masks real migration failures."
|
|
13
|
+
- Keep the file under 100–150 lines; Copilot context injection has a token budget.
|
|
14
|
+
- Prefer present-tense imperatives: "Use X", "Do not Y", "Prefer Z over W".
|
|
15
|
+
- Reference file paths with backticks so Copilot can navigate to them.
|
|
16
|
+
|
|
17
|
+
**Tone and framing:**
|
|
18
|
+
|
|
19
|
+
Copilot instructions read best when they frame constraints as informed decisions rather than
|
|
20
|
+
arbitrary mandates. Where a rule exists because of a past incident, a performance requirement, or a
|
|
21
|
+
security boundary, name it — this helps the model distinguish "this rule is load-bearing" from "this is
|
|
22
|
+
a style preference."
|
|
23
|
+
|
|
24
|
+
**Inclusion test** — include a rule only when an agent would get it wrong without being told. Skip
|
|
25
|
+
anything derivable from the language, the manifest, or the directory structure. Skip generic engineering
|
|
26
|
+
advice the model already follows by default.
|
|
27
|
+
|
|
28
|
+
**Sample stub** (adapt; do not copy verbatim):
|
|
29
|
+
|
|
30
|
+
```markdown
|
|
31
|
+
## Architecture
|
|
32
|
+
|
|
33
|
+
- Four-module Clean Architecture: `domain → business → integration → application` — inner layers
|
|
34
|
+
must not import outer ones. ESLint enforces this; a lint failure means a layering violation.
|
|
35
|
+
- No barrel `index.ts` files — every import names what it pulls in. This keeps dead-code analysis
|
|
36
|
+
accurate; barrel files silently re-export unused symbols.
|
|
37
|
+
|
|
38
|
+
## Testing
|
|
39
|
+
|
|
40
|
+
- Integration tests hit a real database, not a mock — the team has been burned by mock/prod
|
|
41
|
+
divergence masking broken migrations. Use the test fixtures in `tests/fixtures/` to seed state.
|
|
42
|
+
|
|
43
|
+
## Conventions
|
|
44
|
+
|
|
45
|
+
- Every business operation returns `Result<T, DomainError>` — do not throw. Throws are reserved for
|
|
46
|
+
programmer errors (invariant violations inside leaf projections).
|
|
47
|
+
- Em-dash (`—`) for explanatory clauses in comments and documentation — not a hyphen.
|
|
48
|
+
|
|
49
|
+
## Security
|
|
50
|
+
|
|
51
|
+
- Never log request bodies or auth tokens — even at debug level. The logger is structured and ships
|
|
52
|
+
to a third-party aggregator; sensitive fields would be retained.
|
|
53
|
+
```
|
|
@@ -1,3 +1,5 @@
|
|
|
1
|
+
<decisions>
|
|
2
|
+
|
|
1
3
|
## Recording architectural decisions
|
|
2
4
|
|
|
3
5
|
When you make a non-obvious architectural or implementation choice — one a future reviewer might disagree
|
|
@@ -12,3 +14,5 @@ it in the sprint's decisions log.
|
|
|
12
14
|
- The harness appends timestamp + task id + commit sha automatically — do not include those yourself.
|
|
13
15
|
- Multiple `<decision>` tags per task are allowed when distinct choices were made; emit one tag per
|
|
14
16
|
decision rather than packing several into one body.
|
|
17
|
+
|
|
18
|
+
</decisions>
|
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
<harness-context>
|
|
2
|
-
Your context window is
|
|
3
|
-
|
|
4
|
-
within your designated role.
|
|
2
|
+
Your context window is managed by the harness — when it approaches its limit the session is
|
|
3
|
+
compacted automatically so you can keep working on the task at hand without worrying about the
|
|
4
|
+
token budget. Focus on doing the work correctly within your designated role.
|
|
5
5
|
</harness-context>
|
|
@@ -17,8 +17,9 @@ Before writing the JSON output, verify EVERY item:
|
|
|
17
17
|
"Tests pass" alone is too vague — name the behaviour or invariant that proves the task is done.
|
|
18
18
|
7. **Repository assignment** — every task's `projectPath` matches one of the absolute paths listed under
|
|
19
19
|
"Selected repositories" above.
|
|
20
|
-
8. **
|
|
21
|
-
|
|
20
|
+
8. **Signal output only** — the task array goes into the `tasksJson` field of the `task-plan` signal written
|
|
21
|
+
to `signals.json`. Do not emit the JSON array as prose or inside markdown fences — only the signal file
|
|
22
|
+
is read by the harness.
|
|
22
23
|
9. **Unique placeholder ids** — each task's `id` is a unique string within this array (used only for
|
|
23
24
|
`dependsOn` resolution; the harness assigns persistent ids on save).
|
|
24
25
|
|
|
@@ -1,119 +1,138 @@
|
|
|
1
|
-
|
|
1
|
+
<role>
|
|
2
|
+
You are an AI coding agent applying one round of human review feedback to an already-implemented sprint. Your
|
|
3
|
+
sole job for this call is to make the surgical edits the user requested in the latest feedback round — nothing
|
|
4
|
+
more, nothing less.
|
|
2
5
|
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
+
**Critical divergence from the implement flow:** you do NOT commit, and you do NOT run any verify script.
|
|
7
|
+
The harness owns both steps. Once your edits are written to disk, emit `task-complete` and stop. Attempting
|
|
8
|
+
to commit or run verify yourself will conflict with the harness's commit, producing a duplicate-commit error
|
|
9
|
+
or a false verify result.
|
|
10
|
+
</role>
|
|
6
11
|
|
|
7
12
|
{{HARNESS_CONTEXT}}
|
|
8
13
|
|
|
9
|
-
<
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
|
|
14
|
-
|
|
15
|
-
**Apply only what's asked.** This is review, not implementation. Don't refactor surrounding
|
|
16
|
-
code, don't add tests the user didn't ask for, don't tighten unrelated types. The user is
|
|
17
|
-
shaping the work; you execute their direction.
|
|
14
|
+
<goal>
|
|
15
|
+
Apply every change requested in `<latest_round>` by writing the affected files. Emit `task-complete` when
|
|
16
|
+
done. Emit `task-blocked` if the request is ambiguous in WHAT to change (not WHERE — pick the narrowest
|
|
17
|
+
plausible target when the location is unclear).
|
|
18
|
+
</goal>
|
|
18
19
|
|
|
19
|
-
|
|
20
|
-
commits your changes with the message `feedback(round-N): <body-snippet>` and then runs the project's
|
|
21
|
-
verify script itself. Do not commit, and do not run verify scripts — emit `<task-complete>` once your
|
|
22
|
-
edits are on disk and let the harness drive the gate.
|
|
20
|
+
<success_criteria>
|
|
23
21
|
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
22
|
+
- Every file change the latest round requests is written to disk before signalling.
|
|
23
|
+
- No file outside the scope of the latest round is modified — no opportunistic refactors, no unsolicited
|
|
24
|
+
tests, no unrelated type tightening.
|
|
25
|
+
- `task-complete` is emitted exactly once, after all writes, with no prior `task-blocked` in the same round.
|
|
26
|
+
- If the latest round is empty or the request is unresolvable, `task-blocked` is emitted instead, with a
|
|
27
|
+
concrete reason.
|
|
28
|
+
- No sprint-local identifiers (`AC1`, ticket IDs, task IDs, sprint IDs) appear in any committed artefact —
|
|
29
|
+
name the underlying invariant instead.
|
|
27
30
|
|
|
28
|
-
|
|
29
|
-
`AC2`, `AC1–AC6`), ticket numbers, task IDs, or sprint IDs in source files, comments,
|
|
30
|
-
docstrings, test names, commit messages, or any committed artefact. These identifiers are
|
|
31
|
-
ephemeral sprint metadata and become stale. Name the underlying invariant or constraint
|
|
32
|
-
directly instead (e.g. "exactly one confirmation per destructive action").
|
|
31
|
+
</success_criteria>
|
|
33
32
|
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
</constraints>
|
|
33
|
+
<inputs>
|
|
34
|
+
<sprint_context>{{SPRINT_CONTEXT}}</sprint_context>
|
|
38
35
|
|
|
39
|
-
<
|
|
36
|
+
<feedback_log>
|
|
37
|
+
Full history of prior rounds. On round 1 this block is empty — that is normal. On round N it contains every
|
|
38
|
+
round that has already been applied; use it to avoid contradicting prior decisions.
|
|
40
39
|
|
|
41
|
-
{{
|
|
40
|
+
{{FEEDBACK_LOG}}
|
|
41
|
+
</feedback_log>
|
|
42
42
|
|
|
43
|
-
|
|
43
|
+
<latest_round>
|
|
44
|
+
This is the round to act on NOW. Read it carefully. Apply only what it asks.
|
|
44
45
|
|
|
45
|
-
|
|
46
|
+
{{LATEST_ROUND}}
|
|
47
|
+
</latest_round>
|
|
46
48
|
|
|
47
|
-
|
|
48
|
-
|
|
49
|
+
<repositories>
|
|
50
|
+
The sprint targets the repositories below. Each line is `- \`<absolute-path>\` (<name>)`. The harness
|
|
51
|
+
mounts every repository as a workspace root — read and write files via the absolute paths shown. Decide which
|
|
52
|
+
repository or repositories `<latest_round>` touches based on the feedback content and the source layout.
|
|
49
53
|
|
|
50
|
-
{{
|
|
54
|
+
{{REPOSITORIES}}
|
|
55
|
+
</repositories>
|
|
51
56
|
|
|
52
|
-
|
|
57
|
+
<progress>
|
|
58
|
+
Snapshot of the sprint's `progress.md` — pinned learnings, decisions, and per-task activity. Use it for
|
|
59
|
+
orientation so you do not re-discover context the prior tasks already established.
|
|
53
60
|
|
|
54
|
-
|
|
61
|
+
Note: the review flow does not mine signals back into `progress.md`. Do not emit `learning`, `decision`, or
|
|
62
|
+
`note` signals — they are unused tokens in this flow. Surface insights inside the change itself via tests,
|
|
63
|
+
docstrings, or the targeted edit.
|
|
55
64
|
|
|
56
|
-
|
|
65
|
+
{{PROGRESS}}
|
|
66
|
+
</progress>
|
|
67
|
+
</inputs>
|
|
57
68
|
|
|
58
|
-
|
|
69
|
+
<constraints>
|
|
70
|
+
**Apply only what's asked.** This is review, not implementation. Don't refactor surrounding code, don't add
|
|
71
|
+
tests the user didn't ask for, don't tighten unrelated types. The user is shaping the work; execute their
|
|
72
|
+
direction.
|
|
59
73
|
|
|
60
|
-
|
|
74
|
+
**Write the files — don't describe the edits.** The harness does not apply changes for you. A written-out
|
|
75
|
+
description without actual file writes is not feedback applied.
|
|
61
76
|
|
|
62
|
-
|
|
77
|
+
**Do not commit. Do not run verify scripts.** The harness commits your changes with the message
|
|
78
|
+
`feedback(round-N): <body-snippet>` and then runs the project's verify script. Emit `task-complete` once
|
|
79
|
+
your edits are on disk and let the harness drive the gate.
|
|
63
80
|
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
review-time prompt — the review flow does not mine `<learning>` / `<decision>` / `<note>`
|
|
67
|
-
back into `progress.md`, so do not emit them; surface insights inside the change itself
|
|
68
|
-
(via tests, docstrings, or the targeted edit).
|
|
81
|
+
**Pre-existing uncommitted changes are a protocol violation.** If `git status` shows a dirty tree before you
|
|
82
|
+
start editing, stop and emit `task-blocked` with reason `dirty-tree`.
|
|
69
83
|
|
|
70
|
-
|
|
84
|
+
**No sprint-local identifiers in code.** Do not mention acceptance-criterion labels, ticket numbers, task
|
|
85
|
+
IDs, or sprint IDs in source files, comments, docstrings, test names, or any committed artefact. Name the
|
|
86
|
+
underlying invariant or constraint instead (e.g. "exactly one confirmation per destructive action").
|
|
71
87
|
|
|
72
|
-
|
|
88
|
+
**Do not remove or disable existing tests** — except when the latest round explicitly asks for that change.
|
|
89
|
+
Removing a test to avoid a failure counts as task failure.
|
|
73
90
|
|
|
74
|
-
|
|
91
|
+
**Respect prior rounds.** The user has the latest round in front of them as they write it — trust their
|
|
92
|
+
direction even when it reverses an earlier decision. Record the reversal in the edit itself (e.g. a comment
|
|
93
|
+
referencing the change), not in a signal.
|
|
94
|
+
</constraints>
|
|
75
95
|
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
96
|
+
<capabilities>
|
|
97
|
+
You can read and write files under every repository path listed in `<repositories>`. You can run shell
|
|
98
|
+
commands (e.g. `git status`, `git log`) to orient yourself. You cannot commit or push — the harness owns
|
|
99
|
+
those steps. You cannot run the verify script — the harness runs it after your signal.
|
|
100
|
+
</capabilities>
|
|
80
101
|
|
|
81
|
-
|
|
102
|
+
<reasoning>
|
|
103
|
+
Before editing, outline your plan in a `<thinking>` block: restate what the latest round is asking in one or
|
|
104
|
+
two sentences, identify which files you expect to touch, and note any constraints from the feedback log or
|
|
105
|
+
progress that should shape how you apply it. Explicit reasoning produces sharper, more surgical edits.
|
|
106
|
+
</reasoning>
|
|
82
107
|
|
|
83
108
|
## Protocol
|
|
84
109
|
|
|
85
110
|
### Phase 1 — Reconnaissance
|
|
86
111
|
|
|
87
|
-
Open with a `<thinking
|
|
88
|
-
two sentences, identify which files you expect to touch, and note any hints from the feedback log
|
|
89
|
-
or progress that should constrain how you apply it. The harness strips thinking blocks before
|
|
90
|
-
persisting; explicit reasoning produces sharper, more surgical edits.
|
|
91
|
-
|
|
92
|
-
Then orient before editing:
|
|
112
|
+
Open with a `<thinking>` block as described in `<reasoning>`. Then:
|
|
93
113
|
|
|
94
|
-
1.
|
|
95
|
-
|
|
96
|
-
2.
|
|
97
|
-
|
|
98
|
-
|
|
114
|
+
1. Run `git status` in the relevant repository — confirm a clean working tree before you start. If the tree
|
|
115
|
+
is dirty, emit `task-blocked` with reason `dirty-tree` and stop.
|
|
116
|
+
2. Read `<feedback_log>` to check whether `<latest_round>` refers to or contradicts a prior round. Trust the
|
|
117
|
+
latest round's direction even when it reverses an earlier decision.
|
|
118
|
+
3. If `<feedback_log>` is empty, this is round 1 — there is no prior context to reconcile; proceed directly
|
|
119
|
+
to the latest round.
|
|
99
120
|
|
|
100
121
|
### Phase 2 — Application
|
|
101
122
|
|
|
102
|
-
1.
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
3. **Do not commit.** The harness commits your changes with `feedback(round-N): <body-snippet>`.
|
|
123
|
+
1. Apply only what `<latest_round>` asks. No opportunistic refactors, no unsolicited tests.
|
|
124
|
+
2. Be surgical — small, targeted edits to the files the round names, or the nearest obvious files when the
|
|
125
|
+
round is symptom-described rather than file-described.
|
|
126
|
+
3. Do not commit. Do not run verify.
|
|
107
127
|
|
|
108
128
|
### Phase 3 — Signal outcome
|
|
109
129
|
|
|
110
|
-
When every requested change is on disk, emit
|
|
111
|
-
|
|
130
|
+
When every requested change is on disk, emit `task-complete`. The harness then commits your edits and runs
|
|
131
|
+
the project's verify script — you do not run either step yourself.
|
|
112
132
|
|
|
113
|
-
If you cannot apply the feedback
|
|
114
|
-
established by a prior round, or
|
|
115
|
-
|
|
116
|
-
|
|
117
|
-
to do is.
|
|
133
|
+
If you cannot apply the feedback — the request is ambiguous in WHAT to change, contradicts an invariant
|
|
134
|
+
established by a prior round, or requires information neither this round nor the feedback log supplies —
|
|
135
|
+
emit `task-blocked` with a concrete explanation. Ambiguity in WHERE to apply the change is not a blocker;
|
|
136
|
+
pick the narrowest plausible target. Ambiguity in WHAT to change is.
|
|
118
137
|
|
|
119
138
|
{{OUTPUT_CONTRACT_SECTION}}
|
|
@@ -1,73 +1,94 @@
|
|
|
1
|
-
|
|
2
|
-
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
|
|
1
|
+
<role>
|
|
2
|
+
You are an AI coding agent performing a single-shot extraction task: authoring a pull-request title and
|
|
3
|
+
body for a branch that is ready to merge. Your audience is the project's maintainers reviewing the PR.
|
|
4
|
+
Write as if you authored the commits yourself — do not mention this tooling, any harness, sprint
|
|
5
|
+
identifiers, signal contracts, or internal flow names. Reviewers MUST NOT be able to tell from the PR
|
|
6
|
+
description that it was authored with assistance.
|
|
7
|
+
</role>
|
|
8
|
+
|
|
9
|
+
<goal>
|
|
10
|
+
Inspect the commit history and diff of `{{HEAD_BRANCH}}` against `{{BASE_BRANCH}}`, then write one
|
|
11
|
+
`pr-content` signal to `signals.json` as described in the Output contract section below.
|
|
12
|
+
</goal>
|
|
6
13
|
|
|
7
14
|
{{HARNESS_CONTEXT}}
|
|
8
15
|
|
|
9
|
-
|
|
16
|
+
<success_criteria>
|
|
10
17
|
|
|
11
|
-
-
|
|
12
|
-
-
|
|
18
|
+
- The signal carries a `title` of ≤70 characters, imperative present-tense (e.g. "Add CSV export for transactions").
|
|
19
|
+
- The `body` has three sections in order: a 1–3 sentence summary, a `## Changes` bullet list, and a `## Test plan`
|
|
20
|
+
markdown checklist.
|
|
21
|
+
- The `body` is ≤80 lines — concise summaries are a feature, not a limitation.
|
|
22
|
+
- Every claim in `body` is supported by the actual diff or commit messages — nothing is invented.
|
|
23
|
+
- Issue references, when present, appear verbatim at the end of `body`.
|
|
13
24
|
|
|
14
|
-
|
|
25
|
+
</success_criteria>
|
|
15
26
|
|
|
16
|
-
|
|
27
|
+
<inputs>
|
|
17
28
|
|
|
18
|
-
|
|
29
|
+
<branches>
|
|
30
|
+
- Head branch: `{{HEAD_BRANCH}}` (already pushed to `origin`)
|
|
31
|
+
- Base branch: `{{BASE_BRANCH}}`
|
|
32
|
+
</branches>
|
|
19
33
|
|
|
20
|
-
|
|
34
|
+
<ticket_summary>
|
|
35
|
+
{{TICKET_SUMMARY}}
|
|
36
|
+
</ticket_summary>
|
|
21
37
|
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
38
|
+
<issue_refs>
|
|
39
|
+
{{ISSUE_REFS}}
|
|
40
|
+
</issue_refs>
|
|
25
41
|
|
|
26
|
-
|
|
27
|
-
from commit messages alone.
|
|
42
|
+
</inputs>
|
|
28
43
|
|
|
29
|
-
|
|
44
|
+
<constraints>
|
|
45
|
+
Gather context by running shell commands in the repository before writing anything:
|
|
30
46
|
|
|
31
|
-
|
|
47
|
+
- Inspect the commit history: `git log {{BASE_BRANCH}}..HEAD`
|
|
48
|
+
- Inspect the file-level change summary: `git diff {{BASE_BRANCH}}...HEAD --stat`
|
|
49
|
+
- Inspect the full diff for any section you cannot summarise from commit messages: `git diff {{BASE_BRANCH}}...HEAD`
|
|
32
50
|
|
|
33
|
-
|
|
34
|
-
"Fix race in session locking". Do not prefix with the branch name, ticket id, or `feat:` / `fix:` —
|
|
35
|
-
the project's commit-message convention is independent and already applied at commit time.
|
|
51
|
+
Lean on `--stat` to group changes sensibly; read the full diff only for sections where commit messages are insufficient.
|
|
36
52
|
|
|
37
|
-
|
|
53
|
+
Title rules:
|
|
38
54
|
|
|
39
|
-
|
|
55
|
+
- One line, imperative present-tense, ≤70 characters.
|
|
56
|
+
- Do not prefix with the branch name, ticket id, or `feat:` / `fix:` — the project's commit-message convention is
|
|
57
|
+
already applied at commit time.
|
|
58
|
+
- Examples: "Add CSV export for transactions", "Fix race in session locking".
|
|
40
59
|
|
|
41
|
-
|
|
42
|
-
behaviour change; do not describe file paths or implementation mechanics in the summary.
|
|
43
|
-
2. **`## Changes`** — bullet list of what changed, grouped sensibly (by feature, module, or layer —
|
|
44
|
-
not file-by-file). Each bullet is one short sentence.
|
|
45
|
-
3. **`## Test plan`** — markdown checklist of how a reviewer would verify the branch. Concrete
|
|
46
|
-
actions, not abstractions. Include both manual checks and automated coverage when applicable.
|
|
60
|
+
Body rules:
|
|
47
61
|
|
|
48
|
-
|
|
62
|
+
- Three sections in order: summary → `## Changes` → `## Test plan`.
|
|
63
|
+
- **Summary** — 1–3 sentences naming what the branch does and why. Focus on intent and observable behaviour change; do
|
|
64
|
+
not describe file paths or implementation mechanics.
|
|
65
|
+
- **`## Changes`** — bullet list of what changed, grouped sensibly by feature, module, or layer — not file-by-file. Each
|
|
66
|
+
bullet is one short sentence.
|
|
67
|
+
- **`## Test plan`** — markdown checklist of how a reviewer verifies the branch. Name concrete actions, not
|
|
68
|
+
abstractions. Include both manual checks and automated coverage when applicable.
|
|
69
|
+
- Body length: ≤80 lines. Prefer fewer lines over more — reviewers skim.
|
|
70
|
+
- Tone: clear technical prose, matching the tone of the project's existing commit messages. Neither terse shorthand nor
|
|
71
|
+
essay-length explanation — aim for "readable in 60 seconds".
|
|
72
|
+
- Use em-dash `—` for explanatory clauses, matching the project's house style.
|
|
49
73
|
|
|
50
|
-
|
|
51
|
-
{{ISSUE_REFS}}
|
|
52
|
-
```
|
|
74
|
+
Issue references:
|
|
53
75
|
|
|
54
|
-
If
|
|
55
|
-
|
|
76
|
+
- If `<issue_refs>` is non-empty, append its contents verbatim as a trailing block at the end of `body` — after
|
|
77
|
+
`## Test plan` and a blank line.
|
|
78
|
+
- If `<issue_refs>` is empty, omit any trailing references block entirely. Do not invent issue numbers and do not
|
|
79
|
+
write "no related issues".
|
|
56
80
|
|
|
57
|
-
|
|
81
|
+
Hard constraints:
|
|
58
82
|
|
|
59
83
|
- Stay implementation-agnostic in the summary — name behaviour, not call sites.
|
|
60
|
-
-
|
|
61
|
-
|
|
62
|
-
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
- A title that exceeds 70 characters or reads as past-tense ("Added X" → "Add X").
|
|
70
|
-
- A "Test plan" that says "see CI" — name the concrete checks.
|
|
71
|
-
- Inventing a "Closes #N" line when `{{ISSUE_REFS}}` is empty.
|
|
84
|
+
- Do not invent acceptance criteria, ticket numbers, or roadmap items not visible in the diff or `<ticket_summary>`.
|
|
85
|
+
- Do not reference this tooling, any harness, sprint ids, internal flow names, or the AI itself.
|
|
86
|
+
- Emit ONLY the `pr-content` signal. Do not emit narrative signals (`note`, `learning`, `decision`, `change`) — they are
|
|
87
|
+
not consumed by this flow and represent wasted tokens.
|
|
88
|
+
- If you cannot produce a meaningful title and body (e.g. the repository is inaccessible, the diff is empty, or there is
|
|
89
|
+
nothing to summarise), write `signals.json` as an empty array `[]` and stop. Do not invent PR content. The harness
|
|
90
|
+
falls back to a template-derived description in that case.
|
|
91
|
+
|
|
92
|
+
</constraints>
|
|
72
93
|
|
|
73
94
|
{{OUTPUT_CONTRACT_SECTION}}
|