@xenonbyte/req-2-plan 0.6.0 → 0.7.0
- package/README.md +81 -29
- package/README.zh-CN.md +70 -26
- package/package.json +1 -1
- package/tools/r2p-task-brief +10 -0
- package/tools/workflow_cli/agent_shortcuts.py +20 -0
- package/tools/workflow_cli/agent_templates/claude/commands/r2p-execute.md +101 -12
- package/tools/workflow_cli/agent_templates/codex/skills/r2p-execute/SKILL.md +101 -12
- package/tools/workflow_cli/cli.py +92 -3
- package/tools/workflow_cli/install.py +2 -1
- package/tools/workflow_cli/version.py +1 -1
package/README.md
CHANGED
|
@@ -6,12 +6,16 @@ English | [简体中文](README.zh-CN.md)
|
|
|
6
6
|
[](https://nodejs.org)
|
|
7
7
|
[](./LICENSE)
|
|
8
8
|
|
|
9
|
-
> Turn a raw requirement into an approved, executor-neutral implementation PLAN across Claude Code, Codex, Gemini, and opencode.
|
|
9
|
+
> Turn a raw requirement into an approved, executor-neutral implementation PLAN - then execute that PLAN in place - across Claude Code, Codex, Gemini, and opencode.
|
|
10
10
|
|
|
11
|
-
`req-2-plan` installs the `r2p` workflow for AI coding agents
|
|
12
|
-
requirement through a staged, gated
|
|
13
|
-
discovery**, **DESIGN**, **SPEC**, and
|
|
14
|
-
|
|
11
|
+
`req-2-plan` installs the `r2p` workflow for AI coding agents, and it works in
|
|
12
|
+
two phases. **Plan:** it takes a rough requirement through a staged, gated
|
|
13
|
+
process - **requirement brief**, **risk discovery**, **DESIGN**, **SPEC**, and
|
|
14
|
+
**PLAN** - so the final plan is grounded, reviewed, and ready to execute.
|
|
15
|
+
**Execute:** `r2p-execute` then drives that approved PLAN through an in-place,
|
|
16
|
+
subagent-orchestrated implementation loop on your current branch - one
|
|
17
|
+
implementer per task, a reviewer after each, a whole-branch final review, then
|
|
18
|
+
auto-archive - so the same tool that planned the change can also land it.
|
|
15
19
|
|
|
16
20
|
The npm package is the lifecycle installer. It currently supports four agent
|
|
17
21
|
platforms - **Claude Code**, **Codex**, **Gemini**, and **opencode**. From one
|
|
@@ -19,7 +23,7 @@ shared source it generates platform-specific agent surfaces, installs the shared
|
|
|
19
23
|
`r2p-*` wrappers, and keeps an owned manifest so uninstall only removes files
|
|
20
24
|
managed by `r2p`.
|
|
21
25
|
|
|
22
|
-
**Contents:** [Why r2p](#why-r2p) · [Features](#features) · [Installation](#installation) · [Quick start](#quick-start) · [Workflow commands](#workflow-commands) · [Development](#development)
|
|
26
|
+
**Contents:** [Why r2p](#why-r2p) · [Features](#features) · [Installation](#installation) · [Quick start](#quick-start) · [Workflow commands](#workflow-commands) · [Executing a PLAN](#executing-a-plan) · [Development](#development)
|
|
23
27
|
|
|
24
28
|
## Why r2p
|
|
25
29
|
|
|
@@ -31,7 +35,7 @@ planning phase explicit:
|
|
|
31
35
|
- risks and unknowns are surfaced before implementation planning;
|
|
32
36
|
- DESIGN, SPEC, and PLAN each pass structural quality gates;
|
|
33
37
|
- human decisions are recorded instead of guessed;
|
|
34
|
-
- execution
|
|
38
|
+
- execution runs straight from the PLAN - by hand or via `r2p-execute` - without re-deciding scope.
|
|
35
39
|
|
|
36
40
|
Use it when the requirement is more than a one-line edit, when a change touches
|
|
37
41
|
important behavior, or when you want a durable handoff between agents.
|
|
@@ -43,9 +47,9 @@ important behavior, or when you want a durable handoff between agents.
|
|
|
43
47
|
- **Four supported platforms**: installs matching surfaces for Claude Code (`claude`), Codex (`codex`), Gemini (`gemini`), and opencode (`opencode`).
|
|
44
48
|
- **One lifecycle CLI**: `r2p install`, `r2p uninstall`, `r2p status`, `r2p version`, and `r2p help`.
|
|
45
49
|
- **Manifest-backed install safety**: pre-existing files are backed up, and uninstall removes only managed paths.
|
|
46
|
-
- **Project Context Pack**: `--repo-path
|
|
50
|
+
- **Project Context Pack**: real repository facts (the current directory by default, or `--repo-path <dir>`) ground tiering and PLAN checks.
|
|
47
51
|
- **Repair paths**: reopen closed runs, route upstream gaps, and resolve repaired decisions.
|
|
48
|
-
- **
|
|
52
|
+
- **In-place PLAN execution**: `r2p-execute` runs the approved PLAN on your current branch through a subagent-driven SDD loop - a fresh implementer per task, a task-reviewer and fix loop after each, then a whole-branch final review that re-runs the full verification suite before the run auto-archives. Subagent dispatch is required, and it never pushes or opens pull requests.
|
|
49
53
|
|
|
50
54
|
## Supported platforms
|
|
51
55
|
|
|
@@ -109,20 +113,19 @@ r2p install --platform claude,codex,gemini,opencode
|
|
|
109
113
|
Install the platform skills, then start a workflow from your agent:
|
|
110
114
|
|
|
111
115
|
```text
|
|
112
|
-
/r2p-start
|
|
116
|
+
/r2p-start "Add rate limiting"
|
|
113
117
|
/r2p-continue
|
|
114
118
|
```
|
|
115
119
|
|
|
116
120
|
Start from a requirement file instead of inline text:
|
|
117
121
|
|
|
118
122
|
```text
|
|
119
|
-
/r2p-start --
|
|
123
|
+
/r2p-start --file change-req.md
|
|
120
124
|
```
|
|
121
125
|
|
|
122
|
-
|
|
123
|
-
|
|
124
|
-
|
|
125
|
-
checks.
|
|
126
|
+
Tier estimation and the Project Context Pack are grounded in the current
|
|
127
|
+
directory by default. Pass `--repo-path <dir>` to ground them in a different
|
|
128
|
+
repository instead - for example, a target repository for cross-project work.
|
|
126
129
|
|
|
127
130
|
The workflow stops whenever it needs a human or agent action: tier lock,
|
|
128
131
|
artifact content, quality-gate repair, checkpoint approval, subagent review, or
|
|
@@ -139,20 +142,43 @@ gap resolution. Run the printed `next:` command exactly, then resume with
|
|
|
139
142
|
## Workflow commands
|
|
140
143
|
|
|
141
144
|
After installation, the agent-facing commands call shared wrappers under
|
|
142
|
-
`~/.req-to-plan/bin`.
|
|
143
|
-
|
|
144
|
-
|
|
145
|
-
|
|
146
|
-
|
|
147
|
-
| `r2p-
|
|
148
|
-
|
|
|
149
|
-
|
|
|
150
|
-
| `r2p-
|
|
151
|
-
| `r2p-
|
|
152
|
-
| `r2p-
|
|
153
|
-
| `r2p-
|
|
154
|
-
|
|
|
155
|
-
|
|
|
+`~/.req-to-plan/bin`. Each command, its purpose, and its parameters - optional
+parameters show their default; `—` means you must supply the value:
+
+| Command | Purpose | Parameter | Required / optional | Default |
+|---|---|---|---|---|
+| `r2p-start` | Start a new run and propose a tier from a repo scan. | `<requirement>` or `--file <path>` | one required | — |
+| | | `--repo-path <dir>` | optional | current directory |
+| | | `--separate` | optional | off |
+| `r2p-continue` | Advance the active run and print the exact `next:` action. | *(none)* | — | — |
+| `r2p-status` | Inspect runs without changing state. | `--all` | optional | off |
+| `r2p-switch` | Point the active-run marker at another run. | `--work-id <id>` | required | — |
+| `r2p-tier-lock` | Lock the active run's complexity tier. | `--work-id <id>` | required | — |
+| | | `--base light\|standard` | required | — |
+| | | `--confirm` | required | — |
+| | | `--modifiers <a,b,…>` | optional | none |
+| | | `--override-floor` | optional | off |
+| `r2p-reopen` | Reopen a closed or executing run to repair an upstream artifact. | `--from <work-id>` | required | — |
+| | | `--stage <stage>` | required | — |
+| | | `--reason <text>` | required | — |
+| `r2p-gap-open` | Route an upstream decision gap on an open run back to its owner stage. | `--work-id <id>` | required | — |
+| | | `--owner-stage <stage>` | required | — |
+| | | `--required-action "<text>"` | required | — |
+| `r2p-gap-resolve` | Close a repaired upstream-gap route. | `--work-id <id>` | required | — |
+| | | `--route-id <id>` | required | — |
+| `r2p-archive` | Archive a closed run out of the active workspace. | `--work-id <id>` | optional | active run |
+| | | `--force` | optional | off |
+| `r2p-execute` | Execute a closed PLAN in place, run a whole-branch review, then archive. | `--work-id <id>` | optional | active run |
+
+Notes: `--modifiers` takes a comma-separated subset of `migration`,
+`cross_project`, `safety`, `dependency`, `scope_expanding`. `--stage` and
+`--owner-stage` take a pipeline stage (`raw_requirement` … `plan`); a gap's
+`--owner-stage` must be strictly upstream of the current stage, and
+`--required-action` must be a single line. `--confirm` is what makes the tier
+lock take effect, and `--override-floor` allows locking below the computed
+floor. `--separate` starts a parallel run while another is still open. `--force`
+lets `r2p-archive` archive an executing run whose PLAN-TASKs are not all checked
+off.
|
|
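The parameter rules in the notes above lend themselves to a mechanical check. The following is a minimal sketch, not `r2p` code: `parse_modifiers` and `check_required_action` are hypothetical helper names, and only the five modifier names and the single-line rule are taken from the README text. The stage list is elided in the README, so stage validation is deliberately left out.

```python
# Hypothetical validation helpers; these are NOT part of the r2p CLI.
# Only the modifier names and the single-line rule come from the README notes.
ALLOWED_MODIFIERS = (
    "migration",
    "cross_project",
    "safety",
    "dependency",
    "scope_expanding",
)


def parse_modifiers(raw: str) -> list:
    """Split a comma-separated --modifiers value and reject unknown names."""
    names = [part.strip() for part in raw.split(",") if part.strip()]
    unknown = [name for name in names if name not in ALLOWED_MODIFIERS]
    if unknown:
        raise ValueError("unknown modifiers: " + ", ".join(unknown))
    return names


def check_required_action(text: str) -> str:
    """Enforce the single-line rule for --required-action."""
    if "\n" in text or "\r" in text:
        raise ValueError("--required-action must be a single line")
    return text
```

An empty `--modifiers` value parses to an empty list here, which matches the table's default of "none"; whether the real CLI accepts an empty string is an assumption.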
156
182
|
|
|
157
183
|
Most runs only need `r2p-start` and repeated `r2p-continue`. Use the specialized
|
|
158
184
|
commands when the workflow prints them or when you intentionally need to switch,
|
|
@@ -212,6 +238,32 @@ present. If a later stage discovers an upstream decision gap, use
|
|
|
212
238
|
`r2p-gap-open`, repair the owner stage, then close the route with
|
|
213
239
|
`r2p-gap-resolve`.
|
|
214
240
|
|
|
241
|
+
## Executing a PLAN
|
|
242
|
+
|
|
243
|
+
`r2p` does not stop at the PLAN. Once a run is closed at the PLAN checkpoint,
|
|
244
|
+
`r2p-execute` implements it in place on your current branch - no new branch, no
|
|
245
|
+
worktree, no push. It assumes the host agent can dispatch subagents and fails
|
|
246
|
+
explicitly if it cannot.
|
|
247
|
+
|
|
248
|
+
The loop is Spec-Driven Development (SDD):
|
|
249
|
+
|
|
250
|
+
- **Pre-flight**: read the PLAN once and batch any contradiction or defect to you
|
|
251
|
+
before work starts; an upstream defect routes back to a stage reopen, never a
|
|
252
|
+
patch in execution.
|
|
253
|
+
- **Per task**: a fresh implementer subagent builds exactly one PLAN-TASK under
|
|
254
|
+
TDD, commits only its own files, and reports back; a task-reviewer checks it
|
|
255
|
+
against the SPEC and the task's verification criteria, and a fix loop clears
|
|
256
|
+
Critical and Important findings before the task's checkbox flips.
|
|
257
|
+
- **Whole-branch review**: once every task is done, a final reviewer on the most
|
|
258
|
+
capable model re-runs the full verification suite over the entire execution
|
|
259
|
+
range and walks the PLAN as a checklist. This review is the merge gate.
|
|
260
|
+
- **Auto-archive**: a clean `Verdict: Approved` final review lets `r2p-execute`
|
|
261
|
+
archive the run. Commits stay on your current branch; `push` and pull requests
|
|
262
|
+
remain a separate, explicit request.
|
|
263
|
+
|
|
264
|
+
Progress is tracked durably in `execution/progress.md`, so an interrupted run
|
|
265
|
+
resumes from the first unchecked task instead of restarting.
|
|
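The resume behaviour described above (continue from the first unchecked task) can be illustrated with a small sketch. The exact layout of `execution/progress.md` is not shown in this diff, so the Markdown checkbox shape `- [ ] PLAN-TASK-NNN <title>` used below is an assumption, and `first_unchecked_task` is a hypothetical helper name.

```python
import re
from typing import Optional

# Assumed ledger line shape (not confirmed by the diff):
#   "- [x] PLAN-TASK-001 <title>"  checked
#   "- [ ] PLAN-TASK-002 <title>"  unchecked
TASK_LINE = re.compile(r"^\s*[-*] \[( |x|X)\] (PLAN-TASK-\d+)\b")


def first_unchecked_task(ledger_text: str) -> Optional[str]:
    """Return the ID of the first unchecked PLAN-TASK, or None if all are checked."""
    for line in ledger_text.splitlines():
        match = TASK_LINE.match(line)
        if match and match.group(1) == " ":
            return match.group(2)
    return None
```

Lines that are not task checkboxes, such as an `Execution BASE:` record, are simply skipped by the pattern.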
266
|
+
|
|
215
267
|
## Development
|
|
216
268
|
|
|
217
269
|
Install development dependencies:
|
package/README.zh-CN.md
CHANGED
|
@@ -6,17 +6,20 @@
|
|
|
6
6
|
[](https://nodejs.org)
|
|
7
7
|
[](./LICENSE)
|
|
8
8
|
|
|
9
|
-
> 把原始需求变成一份获批、执行器中立的实现 PLAN
|
|
9
|
+
> 把原始需求变成一份获批、执行器中立的实现 PLAN,再在原地把这份 PLAN 执行落地,并在 Claude Code、Codex、Gemini、opencode 上一致运行。
|
|
10
10
|
|
|
11
|
-
`req-2-plan` 为 AI coding agent 安装 `r2p`
|
|
12
|
-
|
|
13
|
-
**PLAN
|
|
11
|
+
`req-2-plan` 为 AI coding agent 安装 `r2p` 工作流,分两个阶段工作。**规划(Plan):**
|
|
12
|
+
把粗略需求推进到一条分阶段、门控的流程中——**requirement brief**、**risk discovery**、
|
|
13
|
+
**DESIGN**、**SPEC**、**PLAN**——让最终计划有上下文、有审查记录、可直接执行。
|
|
14
|
+
**执行(Execute):** 随后 `r2p-execute` 把这份获批 PLAN 接入当前分支上、由 subagent
|
|
15
|
+
编排的原地实现循环——每个任务一个 implementer、每个任务后一次 review、最后一次整分支
|
|
16
|
+
评审,然后自动归档——让规划这次改动的工具也能把它落地。
|
|
14
17
|
|
|
15
18
|
这个 npm 包是生命周期安装器。目前它支持 4 个 agent 平台:**Claude Code**、**Codex**、
|
|
16
19
|
**Gemini**、**opencode**。它从一份共享源生成各平台的 agent 入口,安装共享的
|
|
17
20
|
`r2p-*` wrapper,并维护 owned manifest,确保卸载时只移除 `r2p` 自己管理的文件。
|
|
18
21
|
|
|
19
|
-
**Contents:** [Why r2p](#why-r2p) · [Features](#features) · [Installation](#installation) · [Quick start](#quick-start) · [Workflow commands](#workflow-commands) · [Development](#development)
|
|
22
|
+
**Contents:** [Why r2p](#why-r2p) · [Features](#features) · [Installation](#installation) · [Quick start](#quick-start) · [Workflow commands](#workflow-commands) · [Executing a PLAN](#executing-a-plan) · [Development](#development)
|
|
20
23
|
|
|
21
24
|
## Why r2p
|
|
22
25
|
|
|
@@ -27,7 +30,7 @@ AI agent 执行很快,但模糊需求容易变成含糊计划、隐藏范围
|
|
|
27
30
|
- 风险和未知点会在实现计划前暴露;
|
|
28
31
|
- DESIGN、SPEC、PLAN 都要通过结构化 quality gate;
|
|
29
32
|
- 必须由人选择的决定会被记录,而不是由 agent 猜;
|
|
30
|
-
-
|
|
33
|
+
- 执行可以直接从 PLAN 开始——手动执行,或用 `r2p-execute`——不需要重新决定范围。
|
|
31
34
|
|
|
32
35
|
当需求不只是单行修改、会影响重要行为,或需要在多个 agent 之间做稳定交接时,适合使用它。
|
|
33
36
|
|
|
@@ -38,9 +41,9 @@ AI agent 执行很快,但模糊需求容易变成含糊计划、隐藏范围
|
|
|
38
41
|
- **支持 4 个平台**:为 Claude Code(`claude`)、Codex(`codex`)、Gemini(`gemini`)、opencode(`opencode`)安装匹配入口。
|
|
39
42
|
- **单一生命周期 CLI**:`r2p install`、`r2p uninstall`、`r2p status`、`r2p version`、`r2p help`。
|
|
40
43
|
- **Manifest-backed 安装安全**:覆盖前备份已存在文件,卸载只删除受管路径。
|
|
41
|
-
- **Project Context Pack
|
|
44
|
+
- **Project Context Pack**:以真实仓库事实(默认当前目录,或用 `--repo-path <dir>`)支撑 tier 估算和 PLAN 校验。
|
|
42
45
|
- **修复路径**:可重开 closed run、路由上游缺口,并关闭已修复的决策路线。
|
|
43
|
-
-
|
|
46
|
+
- **原地执行 PLAN**:`r2p-execute` 在当前分支上通过 subagent 驱动的 SDD 循环执行获批 PLAN——每个任务一个全新 implementer、每个任务后接 task-reviewer 与 fix 循环,最后一次整分支评审会重跑完整验证套件,通过后 run 自动归档。它要求宿主能派发 subagent,且从不 push 或开 PR。
|
|
44
47
|
|
|
45
48
|
## Supported platforms
|
|
46
49
|
|
|
@@ -101,18 +104,18 @@ r2p install --platform claude,codex,gemini,opencode
|
|
|
101
104
|
安装平台 skill 后,在 agent 里启动一次工作流:
|
|
102
105
|
|
|
103
106
|
```text
|
|
104
|
-
/r2p-start
|
|
107
|
+
/r2p-start "Add rate limiting"
|
|
105
108
|
/r2p-continue
|
|
106
109
|
```
|
|
107
110
|
|
|
108
111
|
也可以从需求文件启动,而不是传内联文本:
|
|
109
112
|
|
|
110
113
|
```text
|
|
111
|
-
/r2p-start --
|
|
114
|
+
/r2p-start --file change-req.md
|
|
112
115
|
```
|
|
113
116
|
|
|
114
|
-
|
|
115
|
-
|
|
117
|
+
tier 估算和 Project Context Pack 默认以当前目录为基准。传 `--repo-path <dir>`
|
|
118
|
+
可改为以另一个仓库为基准——例如跨项目需求里的目标仓库。
|
|
116
119
|
|
|
117
120
|
工作流会在需要人或 agent 动作时停下:锁定 tier、填写 artifact、修复 quality gate、
|
|
118
121
|
批准 checkpoint、执行 subagent review,或解决 gap。按输出里的 `next:` 命令执行,
|
|
@@ -127,20 +130,40 @@ r2p install --platform claude,codex,gemini,opencode
|
|
|
127
130
|
|
|
128
131
|
## Workflow commands
|
|
129
132
|
|
|
130
|
-
安装后,面向 agent 的命令会调用 `~/.req-to-plan/bin` 下的共享 wrapper
|
|
131
|
-
|
|
132
|
-
|
|
133
|
-
|
|
134
|
-
|
|
135
|
-
| `r2p-
|
|
136
|
-
|
|
|
137
|
-
| `
|
|
138
|
-
| `r2p-
|
|
139
|
-
| `r2p-
|
|
140
|
-
| `r2p-
|
|
141
|
-
| `r2p-
|
|
142
|
-
|
|
|
143
|
-
| `
|
|
133
|
+
安装后,面向 agent 的命令会调用 `~/.req-to-plan/bin` 下的共享 wrapper。各命令的用途与
|
|
134
|
+
参数如下——可选参数标注默认值,`—` 表示必须自行提供:
|
|
135
|
+
|
|
136
|
+
| Command | Purpose | Parameter | 必填 / 可选 | 默认值 |
|
|
137
|
+
|---|---|---|---|---|
|
|
138
|
+
| `r2p-start` | 启动新 run,并基于仓库扫描给出 tier 建议。 | `<requirement>` 或 `--file <path>` | 二选一必填 | — |
|
|
139
|
+
| | | `--repo-path <dir>` | 可选 | 当前目录 |
|
|
140
|
+
| | | `--separate` | 可选 | 关 |
|
|
141
|
+
| `r2p-continue` | 把活动 run 推进到下一个停点,并打印精确的 `next:` 动作。 | *(无)* | — | — |
|
|
142
|
+
| `r2p-status` | 只读查看 run,不改状态。 | `--all` | 可选 | 关 |
|
|
143
|
+
| `r2p-switch` | 把活动 run 标记指向另一个 run。 | `--work-id <id>` | 必填 | — |
|
|
144
|
+
| `r2p-tier-lock` | 锁定活动 run 的复杂度 tier。 | `--work-id <id>` | 必填 | — |
|
|
145
|
+
| | | `--base light\|standard` | 必填 | — |
|
|
146
|
+
| | | `--confirm` | 必填 | — |
|
|
147
|
+
| | | `--modifiers <a,b,…>` | 可选 | 无 |
|
|
148
|
+
| | | `--override-floor` | 可选 | 关 |
|
|
149
|
+
| `r2p-reopen` | 重开一个 closed 或 executing run 以修复上游 artifact。 | `--from <work-id>` | 必填 | — |
|
|
150
|
+
| | | `--stage <stage>` | 必填 | — |
|
|
151
|
+
| | | `--reason <text>` | 必填 | — |
|
|
152
|
+
| `r2p-gap-open` | 在 open run 上把上游决策缺口路由回 owner stage。 | `--work-id <id>` | 必填 | — |
|
|
153
|
+
| | | `--owner-stage <stage>` | 必填 | — |
|
|
154
|
+
| | | `--required-action "<text>"` | 必填 | — |
|
|
155
|
+
| `r2p-gap-resolve` | 关闭已修复的上游缺口 route。 | `--work-id <id>` | 必填 | — |
|
|
156
|
+
| | | `--route-id <id>` | 必填 | — |
|
|
157
|
+
| `r2p-archive` | 把 closed run 归档出活动工作区。 | `--work-id <id>` | 可选 | 活动 run |
|
|
158
|
+
| | | `--force` | 可选 | 关 |
|
|
159
|
+
| `r2p-execute` | 原地执行 closed PLAN,做整分支评审,然后归档。 | `--work-id <id>` | 可选 | 活动 run |
|
|
160
|
+
|
|
161
|
+
说明:`--modifiers` 接受 `migration`、`cross_project`、`safety`、`dependency`、
|
|
162
|
+
`scope_expanding` 的逗号分隔子集。`--stage` 与 `--owner-stage` 取流水线阶段
|
|
163
|
+
(`raw_requirement` … `plan`);gap 的 `--owner-stage` 须严格位于当前阶段上游,
|
|
164
|
+
`--required-action` 须为单行。`--confirm` 才会让 tier 锁定真正生效,`--override-floor`
|
|
165
|
+
允许锁到计算下限以下。`--separate` 可在已有 run 仍打开时另起一个并行 run。`--force`
|
|
166
|
+
让 `r2p-archive` 能归档 PLAN-TASK 未全部勾选的 executing run。
|
|
144
167
|
|
|
145
168
|
大多数 run 只需要 `r2p-start`,然后反复 `r2p-continue`。当工作流输出这些命令,
|
|
146
169
|
或你明确需要切换、修复、重开、执行、归档时,再使用对应的专用命令。
|
|
@@ -195,6 +218,27 @@ Standard tier 的 DESIGN/SPEC/PLAN 阶段可能要求 subagent review,尤其
|
|
|
195
218
|
`migration`、`safety`、`cross_project` 等 tier modifier 时。如果后续阶段发现上游决策缺口,
|
|
196
219
|
用 `r2p-gap-open` 路由回 owner stage,修复后再用 `r2p-gap-resolve` 关闭 route。
|
|
197
220
|
|
|
221
|
+
## Executing a PLAN
|
|
222
|
+
|
|
223
|
+
`r2p` 不止步于 PLAN。run 在 PLAN checkpoint 关闭后,`r2p-execute` 会在当前分支上原地
|
|
224
|
+
实现它——不开新分支、不建 worktree、不 push。它假设宿主 agent 能派发 subagent,否则
|
|
225
|
+
显式失败。
|
|
226
|
+
|
|
227
|
+
这个循环采用 Spec-Driven Development(SDD):
|
|
228
|
+
|
|
229
|
+
- **Pre-flight**:先通读一遍 PLAN,在动工前把任何矛盾或缺陷一次性抛给你;上游缺陷会
|
|
230
|
+
路由回某个阶段重开,而不是在执行里打补丁。
|
|
231
|
+
- **逐任务**:一个全新 implementer subagent 在 TDD 下只实现一个 PLAN-TASK,只提交自己
|
|
232
|
+
改动的文件并回报;随后 task-reviewer 对照 SPEC 与该任务的验证标准检查,fix 循环清掉
|
|
233
|
+
Critical 与 Important 发现后才勾选该任务。
|
|
234
|
+
- **整分支评审**:全部任务完成后,由最强模型上的 final reviewer 在整个执行区间上重跑
|
|
235
|
+
完整验证套件,并把 PLAN 当作清单逐条核对。这次评审就是合并门槛。
|
|
236
|
+
- **自动归档**:final review 干净地给出 `Verdict: Approved`,`r2p-execute` 才会归档该
|
|
237
|
+
run。commit 留在当前分支;`push` 和 pull request 仍需你单独显式请求。
|
|
238
|
+
|
|
239
|
+
进度持久记录在 `execution/progress.md`,因此被打断的 run 会从第一个未勾选的任务继续,
|
|
240
|
+
而不是从头再来。
|
|
241
|
+
|
|
198
242
|
## Development
|
|
199
243
|
|
|
200
244
|
安装开发依赖:
|
package/package.json
CHANGED
|
package/tools/r2p-task-brief
CHANGED
|
@@ -0,0 +1,10 @@
+#!/usr/bin/env bash
+set -euo pipefail
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
+export PYTHONPATH="$REPO_ROOT${PYTHONPATH:+:$PYTHONPATH}"
+if command -v python3 >/dev/null 2>&1; then
+  exec python3 -m tools.workflow_cli.agent_shortcuts task-brief "$@"
+else
+  exec python -m tools.workflow_cli.agent_shortcuts task-brief "$@"
+fi
|
|
package/tools/workflow_cli/agent_shortcuts.py
CHANGED
|
@@ -968,6 +968,21 @@ def _cmd_execute(ns: argparse.Namespace, base_path: Path) -> None:
         sys.exit(EXIT_CONFLICT)
 
 
+def _cmd_task_brief(ns: argparse.Namespace, base_path: Path) -> None:
+    work_id = ns.work_id
+    if not work_id:
+        pointer = read_active_pointer(base_path)
+        if not pointer:
+            print("no_selected_run: true\nnext: r2p-task-brief --work-id <id> --task <N>\n")
+            sys.exit(1)
+        work_id = pointer["selected_work_id"]
+    work_id = _validate_work_id(work_id)
+    sys.exit(_run_cli(
+        ["plan-task-brief", "--work-id", work_id, "--task", str(ns.task)],
+        base_path,
+    ))
+
+
 def _cmd_gap_open(ns: argparse.Namespace, base_path: Path) -> None:
     work_id = _validate_work_id(ns.work_id)
     args = [
@@ -1065,6 +1080,10 @@ def _build_parser() -> argparse.ArgumentParser:
     p_tier_lock.add_argument("--override-floor", action="store_true")
     p_tier_lock.add_argument("--confirm", action="store_true")
 
+    p_task_brief = sub.add_parser("task-brief")
+    p_task_brief.add_argument("--work-id", dest="work_id", default=None)
+    p_task_brief.add_argument("--task", type=int, required=True)
+
     p_gap_open = sub.add_parser("gap-open")
     p_gap_open.add_argument("--work-id", dest="work_id", required=True)
     p_gap_open.add_argument("--owner-stage", dest="owner_stage", required=True)
@@ -1104,6 +1123,7 @@ def main(args: list[str] | None = None, base_path: Path | None = None) -> None:
         "gap-open": _cmd_gap_open,
         "gap-resolve": _cmd_gap_resolve,
         "execute": _cmd_execute,
+        "task-brief": _cmd_task_brief,
     }
     handlers[ns.subcommand](ns, bp)
     sys.exit(0)
|
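To make the new `task-brief` path easier to follow, here is a self-contained sketch of the two pieces the hunks above add: the argument parser entry and the work-id fallback. It is an illustration under assumptions, not the package's code: `build_parser` and `resolve_work_id` are hypothetical stand-ins, the active pointer is passed in as a plain dict instead of being read via `read_active_pointer`, validation via `_validate_work_id` is omitted, and `demo-run` is a made-up work id.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Parser with only the task-brief subcommand, using the same arguments as the hunk."""
    parser = argparse.ArgumentParser(prog="agent_shortcuts")
    sub = parser.add_subparsers(dest="subcommand", required=True)
    p_task_brief = sub.add_parser("task-brief")
    p_task_brief.add_argument("--work-id", dest="work_id", default=None)
    p_task_brief.add_argument("--task", type=int, required=True)
    return parser


def resolve_work_id(explicit: Optional[str], pointer: Optional[dict]) -> Optional[str]:
    """Prefer an explicit --work-id; otherwise fall back to the active pointer."""
    if explicit:
        return explicit
    if not pointer:
        # The real handler prints a `next:` hint and exits with status 1 here.
        return None
    return pointer["selected_work_id"]


# Example: no --work-id given, so the (hypothetical) active pointer supplies it.
ns = build_parser().parse_args(["task-brief", "--task", "2"])
work_id = resolve_work_id(ns.work_id, {"selected_work_id": "demo-run"})
forwarded = ["plan-task-brief", "--work-id", work_id, "--task", str(ns.task)]
```

The `forwarded` list has the same shape as the argument list the handler passes to `_run_cli`; note that `--task` is parsed as an integer and converted back to a string when forwarded.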
|
package/tools/workflow_cli/agent_templates/claude/commands/r2p-execute.md
CHANGED
|
@@ -38,26 +38,97 @@ Use the least powerful model that can handle each role:
|
|
|
38
38
|
- **Integration / judgment / debugging** (multi-file coordination, pattern matching): standard model.
|
|
39
39
|
- **Architecture / design AND the final whole-branch review**: most capable model.
|
|
40
40
|
- Always specify the model explicitly when dispatching; an omitted model inherits the session model.
|
|
41
|
+
- Read the task brief on demand when sizing the implementer model; do not paste or retain a rewritten copy of it.
|
|
41
42
|
- **Turn count beats token price**: use a mid-tier floor for reviewers and for implementers working from prose descriptions; drop to cheapest only for complete-code/single-file mechanical tasks.
|
|
42
43
|
|
|
44
|
+
## Controller Narration Discipline
|
|
45
|
+
|
|
46
|
+
Between tool calls, the controller narrates at most one short line. Use the prefix `Narration:` for these inter-call notes (e.g. `Narration: implementer returned DONE, writing diff`). Never paste a subagent's returned text — report body, diff content, or review findings — into a later dispatch; reviewer findings move through `review_report_path`, not pasted text. What the controller restates into its own context is bounded to `status`, `report_path` or diff path, `review_report_path`, `commit_range`, `test_summary`, and `concerns`.
|
|
47
|
+
|
|
48
|
+
## Authoritative Context Set
|
|
49
|
+
|
|
50
|
+
Each subagent (implementer, reviewer, fix) receives these run-dir paths and reads them directly — bodies are never pasted:
|
|
51
|
+
|
|
52
|
+
| Path | Domain |
|
|
53
|
+
|---|---|
|
|
54
|
+
| `02-project-context.md` | planning-time repository baseline |
|
|
55
|
+
| `03-requirement-brief.md` | goal / scope / non-goals / acceptance |
|
|
56
|
+
| `04-risk-discovery.md` | cross-task risks and mitigations |
|
|
57
|
+
| `05-design.md` | chosen design and rejected alternatives |
|
|
58
|
+
| `06-spec.md` | full behavior / interface / data / error / test contracts |
|
|
59
|
+
| `execution/progress.md` | execution ledger (task ID + title list), read-only to subagents |
|
|
60
|
+
|
|
61
|
+
`07-plan.md`, `00-raw-requirement.md`, and `01-intake-brief.md` are **not** in the set. No generated `task-N-context.md` bundle, `context-manifest.json`, sha256/content-hash, or drift gate is introduced.
|
|
62
|
+
|
|
63
|
+
### Authority Responsibility Matrix
|
|
64
|
+
|
|
65
|
+
Each authority owns exactly its domain — this is a matrix, not a priority order:
|
|
66
|
+
|
|
67
|
+
| Authority | Owns |
|
|
68
|
+
|---|---|
|
|
69
|
+
| Task brief | Current task execution scope / files / steps / verification |
|
|
70
|
+
| `03-requirement-brief.md` | Goal / scope / non-goals / acceptance |
|
|
71
|
+
| `04-risk-discovery.md` | Risk constraints and mitigations |
|
|
72
|
+
| `05-design.md` | Chosen architecture and rejected alternatives |
|
|
73
|
+
| `06-spec.md` | Behavior / interface / data / error / test contracts |
|
|
74
|
+
| `## Global Constraints` | Plan-wide execution constraints |
|
|
75
|
+
| `02-project-context.md` | Planning-time repository baseline |
|
|
76
|
+
| Current working tree / HEAD | Operational truth about code that exists now |
|
|
77
|
+
| `00-raw-requirement.md` / `01-intake-brief.md` | Provenance only — never an execution authority |
|
|
78
|
+
|
|
79
|
+
**`02` baseline-vs-working-tree rule**: a predecessor task's legitimate repo change makes the working tree the operational truth; an unexplained or conflicting difference → `BLOCKED`.
|
|
80
|
+
|
|
81
|
+
**Ledger read-only rule**: only the controller flips checkboxes or appends `Resolved:`/`Gap:`/`Unresolved:`/`Minor:` records to `execution/progress.md`; subagents read but do not write to the ledger.
|
|
82
|
+
|
|
83
|
+
### Conflict Rule
|
|
84
|
+
|
|
85
|
+
No artifact silently overrides another outside its domain. If a task cannot satisfy all applicable authorities simultaneously, the subagent returns `BLOCKED`, names the conflicting files/IDs, and asks the human to **reopen** the owning stage — no guessing, no picking a winner, no patching around an upstream defect.
|
|
86
|
+
|
|
87
|
+
### Required Consumption
|
|
88
|
+
|
|
89
|
+
Each implementer, reviewer, and fix subagent must **read the full Authoritative Context Set before acting** — availability is not consumption. Subagents may skip the embedded `(read-only)` Upstream Summary / Project Context blocks within those files. On-demand depth applies only to the current codebase, git history, and prior task reports/reviews — never to whether to read an approved artifact. This rule is orthogonal to Model Selection: a cheap model on a mechanical task still reads the set.
|
|
90
|
+
|
|
91
|
+
### Ledger Ownership and Sibling Escalation
|
|
92
|
+
|
|
93
|
+
Default position/ownership derives from the ledger's `PLAN-TASK-NNN <title>` list (stay within the brief's Files/Steps; treat every other listed task ID as owned elsewhere). The full `07-plan.md` is not handed to subagents; whole-plan reasoning stays with the controller's Pre-flight read. An unclear sibling boundary → the subagent returns `NEEDS_CONTEXT` / `BLOCKED`; the controller resolves it or hands a specific `r2p-task-brief --task <M>` single-task brief and re-dispatches — never a whole-plan read or a guess from the title.
|
|
94
|
+
|
|
95
|
+
### Path Delivery and Fail-Closed Preflight
|
|
96
|
+
|
|
97
|
+
The controller derives `run_dir = parent(plan)` from the execute output and hands absolute paths by default, or repository-root-relative paths paired with an explicit `repo_root`. Each subagent runs a preflight before acting:
|
|
98
|
+
1. Input paths must already exist and be readable. For each role, inputs include the Authoritative Context Set paths, the task brief path, the ledger path, and any handed report/review/diff path the subagent is meant to consume.
|
|
99
|
+
2. Output paths do not need to exist at preflight. For each role, treat generated output paths as destination paths: implementer `execution/task-N-report.md`, task-reviewer `execution/task-N-review.md`, and final reviewer `execution/final-review-report.md`. Their parent directories must resolve under the same `run_dir` / `work_id` and be writable before the subagent writes.
|
|
100
|
+
3. Every handed path resolves under the same `run_dir` / `work_id`.
|
|
101
|
+
4. Any repo-root-relative path was resolved against the handed `repo_root`, not the process cwd.
|
|
102
|
+
5. If any input path is missing or unreadable, any output parent is missing or unwritable, or any path is unresolved or wired to a different run → `BLOCKED` — no silent continue on a partial/mixed set.
|
|
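The five preflight rules above could be expressed roughly as follows. This is a sketch under assumptions rather than anything shipped in the package: `preflight` and `resolve_handed` are hypothetical names, "resolves under the same `run_dir`" is interpreted as a resolved-path ancestry check, and a non-empty result stands for a `BLOCKED` return.

```python
import os
from pathlib import Path
from typing import Iterable, List


def resolve_handed(path: str, repo_root: Path) -> Path:
    """Resolve a repo-root-relative path against the handed repo_root, never the cwd."""
    candidate = Path(path)
    return candidate if candidate.is_absolute() else repo_root / candidate


def preflight(run_dir: Path, inputs: Iterable[Path], outputs: Iterable[Path]) -> List[str]:
    """Return a list of problems; an empty list means the subagent may proceed."""
    problems: List[str] = []
    run_root = run_dir.resolve()

    def under_run_dir(path: Path) -> bool:
        resolved = path.resolve()
        return resolved == run_root or run_root in resolved.parents

    # Inputs must already exist, be readable, and belong to this run.
    for path in inputs:
        if not under_run_dir(path):
            problems.append(f"input outside run_dir: {path}")
        elif not (path.is_file() and os.access(path, os.R_OK)):
            problems.append(f"input missing or unreadable: {path}")

    # Outputs need not exist yet, but their parent must exist and be writable.
    for path in outputs:
        parent = path.parent
        if not under_run_dir(path):
            problems.append(f"output outside run_dir: {path}")
        elif not (parent.is_dir() and os.access(parent, os.W_OK)):
            problems.append(f"output parent missing or unwritable: {parent}")

    return problems
```

Collecting every problem instead of stopping at the first one is a design choice of this sketch; it lets a `BLOCKED` report name all offending paths at once.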
103
|
+
|
|
43
104
|
## Per-Task Loop
|
|
44
105
|
|
|
45
106
|
For each PLAN-TASK (in order):
|
|
46
107
|
|
|
47
|
-
### 1.
|
|
108
|
+
### 1. Run `r2p-task-brief` and obtain the task-brief path
|
|
48
109
|
|
|
49
|
-
|
|
110
|
+
Run `{{R2P_BIN_DIR}}/r2p-task-brief --work-id <work-id> --task <N>` (where `<N>` is the task's integer, e.g. `2` for `PLAN-TASK-002`) for the current task. This installed wrapper delegates to the internal `plan-task-brief` CLI command. The command returns a `brief_path` pointing to a scoped brief file that contains the task's `Skeleton`, `Steps`, `Spec References`, and `Verification` criteria. Pass the `brief_path` as the handoff pointer to both the implementer and the reviewer — not pasted task text from `07-plan.md`. The controller uses the returned `brief_path` without eager-reading the full task body into its own context; the implementer and reviewer read the task-brief on demand.
|
|
50
111
|
|
|
51
112
|
### 2. Dispatch a fresh implementer subagent
|
|
52
113
|
|
|
53
114
|
Record BASE (`git rev-parse HEAD`) BEFORE dispatching the implementer — **never use `HEAD~1`** as BASE (it drops all but the last commit of a multi-commit task). For Task 1, this BASE is also `<execution-base-commit>` for the final whole-branch review. Persist the Task 1 BASE immediately in tracked execution state by adding `Execution BASE: <execution-base-commit>` to `execution/progress.md`.
|
|
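The BASE rule above (capture `HEAD` before dispatch, never `HEAD~1`) can be sketched like this. The helper names `current_head` and `commit_range` are hypothetical; treating `<base7>` as the first seven characters of the hash, and reporting `none` when `HEAD` has not moved, are assumptions drawn from the return-contract wording rather than confirmed behaviour.

```python
import subprocess


def current_head(repo_dir: str = ".") -> str:
    """Return the full hash of HEAD; call this BEFORE dispatching the implementer."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def commit_range(base: str, head: str) -> str:
    """Format `<base7>..<head7>`, or `none` when the task created no commit."""
    if base == head:
        return "none"
    return f"{base[:7]}..{head[:7]}"
```

Capturing BASE up front keeps a multi-commit task intact: the range then spans every commit the implementer made, which a fixed `HEAD~1` would not.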
54
115
|
|
|
55
116
|
Provide the subagent with:
|
|
56
|
-
- The task text
|
|
57
|
-
-
|
|
117
|
+
- The `brief_path` returned by `r2p-task-brief` (not pasted task text from `07-plan.md`)
|
|
118
|
+
- **Read the Authoritative Context Set before acting**: the `02-project-context.md` entry supplies the project/dependency/architecture baseline deterministically
|
|
119
|
+
- Global Constraints from the PLAN (`## Global Constraints`), copied verbatim when present — the brief carries only the task body, so the implementer does not otherwise see plan-level constraints
|
|
58
120
|
- TDD instructions: follow `Skeleton`/`Steps`; prove `Verification` with evidence
|
|
59
121
|
- A report file path (`execution/task-N-report.md`)
|
|
60
122
|
|
|
123
|
+
The implementer return contract is minimal and inline:
|
|
124
|
+
- `status`: DONE / DONE_WITH_CONCERNS / NEEDS_CONTEXT / BLOCKED
|
|
125
|
+
- `report_path`: the report file path
|
|
126
|
+
- `commit_range`: `<base7>..<head7>` for committed task work, or `none` if no commit was created
|
|
127
|
+
- `test_summary`: one-line test summary, or `not run: <reason>`
|
|
128
|
+
- `concerns`: `none` or a concise list of decision-relevant concerns, missing context, or blockers
|
|
129
|
+
|
|
130
|
+
The controller uses these fields to decide whether to continue without opening the full report. The controller does not ask the implementer to restate the task.
|
|
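One way to picture the implementer return contract above is as a small validated record. The sketch below is illustrative only: the class name `ImplementerReturn` is hypothetical, and the `can_continue_without_report` policy (only a clean `DONE` with `concerns` of `none` skips the report) is an assumption about how a controller might use the fields, not a rule stated in the template.

```python
from dataclasses import dataclass

IMPLEMENTER_STATUSES = {"DONE", "DONE_WITH_CONCERNS", "NEEDS_CONTEXT", "BLOCKED"}


@dataclass(frozen=True)
class ImplementerReturn:
    status: str
    report_path: str
    commit_range: str  # "<base7>..<head7>" or "none"
    test_summary: str  # one line, or "not run: <reason>"
    concerns: str      # "none" or a concise list

    def __post_init__(self) -> None:
        if self.status not in IMPLEMENTER_STATUSES:
            raise ValueError(f"unknown status: {self.status}")
        if "\n" in self.test_summary:
            raise ValueError("test_summary must be one line")

    def can_continue_without_report(self) -> bool:
        """Assumed policy: only a clean DONE lets the controller skip the full report."""
        return self.status == "DONE" and self.concerns.strip().lower() == "none"
```

Keeping the record this small mirrors the intent of the contract: the controller restates a handful of fields and leaves the report body on disk.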
131
|
+
|
|
61
132
|
The implementer must:
|
|
62
133
|
1. Implement exactly what the task specifies, following TDD
|
|
63
134
|
2. Satisfy the task's `Verification` criteria and attach evidence (test output, assertions)
|
|
@@ -81,20 +152,37 @@ The fresh implementer subagent verifies-then-removes ambiguity by evidence and T
 After the implementer reports DONE:
 1. `mkdir -p .req-to-plan/<work-id>/logs` then `git diff -U10 <base-commit> HEAD > .req-to-plan/<work-id>/logs/task-N-diff.md`. Keep diff scratch under `logs/` (gitignored), never under `execution/`.
 2. Dispatch a task-reviewer subagent with:
--
-- The
+- **Read the Authoritative Context Set before acting**: the reviewer checks `Spec References` IDs against the full `06-spec.md` text, not the IDs alone
+- The `brief_path` returned by `r2p-task-brief` (not pasted task text). The reviewer reads `Spec References` from the task brief. Do not pass separate `Spec References`.
+- The implementer report file path (`execution/task-N-report.md`)
 - The diff file path (`.req-to-plan/<work-id>/logs/task-N-diff.md`)
+- A review report file path (`execution/task-N-review.md`)
 - Global constraints from the plan (copy verbatim from `## Global Constraints`); never pre-judge a finding's severity; never paste prior-task summaries into a later dispatch
 
-The task-reviewer returns
--
+The task-reviewer writes detailed findings, if any, to `execution/task-N-review.md` and returns only this inline summary:
+- `status`: APPROVED / CHANGES_REQUESTED / NEEDS_CONTEXT / BLOCKED
+- `review_report_path`: the review report file path
+- `test_summary`: one-line test summary, or `not run: <reason>`
+- `concerns`: `none` or a concise list of decision-relevant concerns, missing context, or blockers
+
+Surface in `concerns` every ⚠️ "cannot verify from diff" item and every unfixed Minor finding — do not leave them only in the report. This is how the controller learns there is something to adjudicate (§6) without opening the report on a clean task.
+
+The review report records:
+- **Spec compliance**: checked against the task brief's `Spec References` and `Verification`
 - **Code quality**: clean, tested, maintainable
+- **⚠️ DEFER items**: explicit `cannot verify from diff` warnings for requirements satisfied by unchanged code, by sibling task work, or by evidence outside the task diff
 
 ### 6. Fix loop
 
-- Dispatch fix subagents for Critical and Important findings
--
--
+- Dispatch fix subagents for Critical and Important findings. Pass the `review_report_path` to the fix subagent with the instruction: Fix all Critical and Important findings in the review report. Do not paste the finding bodies into the dispatch. Also hand: **Read the Authoritative Context Set before acting**, the task brief path (`brief_path`), and the current task diff path (`logs/task-N-diff.md`).
+- After each fix wave: the fix subagent commits only its intentionally-changed files (staging only files changed for this task, exactly as the §2 implementer does); then the loop regenerates `logs/task-N-diff.md` from the task's BASE to `HEAD` (`git diff -U10 <base-commit> HEAD > .req-to-plan/<work-id>/logs/task-N-diff.md`) — commit-then-diff — before re-dispatching the task-reviewer. The re-review must not run against an uncommitted working tree.
+- Re-dispatch the task-reviewer after each fix wave with the refreshed diff path
+- Before flipping the checkbox, adjudicate each reviewer "cannot verify from diff" warning. When `concerns` lists ⚠️ items, open `review_report_path` to adjudicate each; a `none`/empty `concerns` means no ⚠️ remains to adjudicate. Record one line per finding in `execution/progress.md`:
+- `Resolved: <finding>` — clears the warning; a `Resolved:` claim about unchanged code must cite implementation and test evidence
+- `Gap: <finding>` — blocks the flip and cannot be overridden on the controller's own judgment
+- `Unresolved: <finding>` — blocks the flip and cannot be overridden on the controller's own judgment
+- Minor findings not fixed within a task: record each as `Minor: <finding>` in `execution/progress.md` and carry them into the final whole-branch review input rather than dropping them per task.
+- Only when the task-reviewer is clean (both spec ✅ and quality Approved, and `Verification` satisfied, and no open `Gap:` or `Unresolved:` entries), update the matching `execution/progress.md` checkbox from `- [ ] PLAN-TASK-NNN ...` to `- [x] PLAN-TASK-NNN ...` and append one line:
 `Task N: complete (commits <base7>..<head7>, review clean)`
 
 **Continuous execution**: execute all PLAN-TASKs without pausing to ask "should I continue?" between tasks. Stop only on: unresolvable `BLOCKED`, upstream defect requiring repair, dirty-tree block, or all tasks complete. `Verification` requires fresh command output; "should pass" / "looks correct" is not evidence; do not report `DONE` without it.
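The checkbox gate in the fix loop can be sketched as a small pure function. This is an illustration under stated assumptions, not the package's code: `flip_checkbox`, its parameters, and the idea of passing the still-open adjudication lines as a list are hypothetical. Only the `Gap:` / `Unresolved:` prefixes, the `APPROVED` status, the checkbox shapes, and the completion line come from the template text.

```python
import re

# Adjudication prefixes that block the flip, per the fix-loop rules.
BLOCKING_PREFIXES = ("Gap:", "Unresolved:")


def flip_checkbox(progress_text, task_id, commit_range, review_status, adjudications):
    """Return the updated ledger text, or raise ValueError if the flip is not allowed.

    `adjudications` holds the currently open adjudication lines for this task
    (a hypothetical input shape chosen for this sketch).
    """
    if review_status != "APPROVED":
        raise ValueError(f"{task_id}: reviewer status is {review_status}, not APPROVED")
    if any(line.strip().startswith(BLOCKING_PREFIXES) for line in adjudications):
        raise ValueError(f"{task_id}: open Gap/Unresolved entries block the flip")

    # Match exactly one unchecked line for this task ID.
    pattern = re.compile(rf"^- \[ \] ({re.escape(task_id)}\b.*)$", re.MULTILINE)
    flipped, count = pattern.subn(r"- [x] \1", progress_text, count=1)
    if count == 0:
        raise ValueError(f"{task_id}: no unchecked box found in the ledger")

    task_number = int(task_id.rsplit("-", 1)[1])  # "PLAN-TASK-002" -> 2
    completion = f"Task {task_number}: complete (commits {commit_range}, review clean)"
    return flipped.rstrip("\n") + "\n" + completion + "\n"
```

Keeping the gate free of I/O makes it easy to check that a `Gap:` entry really does prevent the flip, whoever calls it.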
@@ -105,9 +193,10 @@ After all tasks complete, dispatch a final whole-branch review subagent on the *
 - First create the whole-branch diff: `mkdir -p .req-to-plan/<work-id>/logs` then `git diff -U10 <execution-base-commit> HEAD > .req-to-plan/<work-id>/logs/final-diff.md`
 - Scope: review the complete execution range `git diff -U10 <execution-base-commit> HEAD`, where `<execution-base-commit>` is the Task 1 BASE captured before dispatching the first implementer
 - Include the diff file path (`.req-to-plan/<work-id>/logs/final-diff.md`) in the reviewer dispatch; do not ask the reviewer to infer the changed range
+- Provide a final review report path (`execution/final-review-report.md`) and require detailed findings there
 - **re-run the full verification suite** on the final HEAD and attach the fresh output (per-task greens do not catch cross-task regressions)
 - Walk the PLAN task-by-task as a line-by-line requirements checklist; report any gap
-- Dispatch ONE fix subagent carrying the complete findings list (not one fixer per finding)
+- Dispatch ONE fix subagent carrying the complete findings list by passing `execution/final-review-report.md`, not pasted findings (not one fixer per finding)
 - This whole-branch review is the merge gate
 
 After the review settles, write `execution/final-review.md` recording the reviewed range, a one-line summary, and the verdict:
@@ -39,26 +39,97 @@ Use the least powerful model that can handle each role:
 - **Integration / judgment / debugging** (multi-file coordination, pattern matching): standard model.
 - **Architecture / design AND the final whole-branch review**: most capable model.
 - Always specify the model explicitly when dispatching; an omitted model inherits the session model.
+- Read the task brief on demand when sizing the implementer model; do not paste or retain a rewritten copy of it.
 - **Turn count beats token price**: use a mid-tier floor for reviewers and for implementers working from prose descriptions; drop to cheapest only for complete-code/single-file mechanical tasks.
 
+## Controller Narration Discipline
+
+Between tool calls, the controller narrates at most one short line. Use the prefix `Narration:` for these inter-call notes (e.g. `Narration: implementer returned DONE, writing diff`). Never paste a subagent's returned text — report body, diff content, or review findings — into a later dispatch; reviewer findings move through `review_report_path`, not pasted text. What the controller restates into its own context is bounded to `status`, `report_path` or diff path, `review_report_path`, `commit_range`, `test_summary`, and `concerns`.
+
+## Authoritative Context Set
+
+Each subagent (implementer, reviewer, fix) receives these run-dir paths and reads them directly — bodies are never pasted:
+
+| Path | Domain |
+|---|---|
+| `02-project-context.md` | planning-time repository baseline |
+| `03-requirement-brief.md` | goal / scope / non-goals / acceptance |
+| `04-risk-discovery.md` | cross-task risks and mitigations |
+| `05-design.md` | chosen design and rejected alternatives |
+| `06-spec.md` | full behavior / interface / data / error / test contracts |
+| `execution/progress.md` | execution ledger (task ID + title list), read-only to subagents |
+
+`07-plan.md`, `00-raw-requirement.md`, and `01-intake-brief.md` are **not** in the set. No generated `task-N-context.md` bundle, `context-manifest.json`, sha256/content-hash, or drift gate is introduced.
+
+### Authority Responsibility Matrix
+
+Each authority owns exactly its domain — this is a matrix, not a priority order:
+
+| Authority | Owns |
+|---|---|
+| Task brief | Current task execution scope / files / steps / verification |
+| `03-requirement-brief.md` | Goal / scope / non-goals / acceptance |
+| `04-risk-discovery.md` | Risk constraints and mitigations |
+| `05-design.md` | Chosen architecture and rejected alternatives |
+| `06-spec.md` | Behavior / interface / data / error / test contracts |
+| `## Global Constraints` | Plan-wide execution constraints |
+| `02-project-context.md` | Planning-time repository baseline |
+| Current working tree / HEAD | Operational truth about code that exists now |
+| `00-raw-requirement.md` / `01-intake-brief.md` | Provenance only — never an execution authority |
+
+**`02` baseline-vs-working-tree rule**: a predecessor task's legitimate repo change makes the working tree the operational truth; an unexplained or conflicting difference → `BLOCKED`.
+
+**Ledger read-only rule**: only the controller flips checkboxes or appends `Resolved:`/`Gap:`/`Unresolved:`/`Minor:` records to `execution/progress.md`; subagents read but do not write to the ledger.
+
+### Conflict Rule
+
+No artifact silently overrides another outside its domain. If a task cannot satisfy all applicable authorities simultaneously, the subagent returns `BLOCKED`, names the conflicting files/IDs, and asks the human to **reopen** the owning stage — no guessing, no picking a winner, no patching around an upstream defect.
+
+### Required Consumption
+
+Each implementer, reviewer, and fix subagent must **read the full Authoritative Context Set before acting** — availability is not consumption. Subagents may skip the embedded `(read-only)` Upstream Summary / Project Context blocks within those files. On-demand depth applies only to the current codebase, git history, and prior task reports/reviews — never to whether to read an approved artifact. This rule is orthogonal to Model Selection: a cheap model on a mechanical task still reads the set.
+
+### Ledger Ownership and Sibling Escalation
+
+Default position/ownership derives from the ledger's `PLAN-TASK-NNN <title>` list (stay within the brief's Files/Steps; treat every other listed task ID as owned elsewhere). The full `07-plan.md` is not handed to subagents; whole-plan reasoning stays with the controller's Pre-flight read. An unclear sibling boundary → the subagent returns `NEEDS_CONTEXT` / `BLOCKED`; the controller resolves it or hands a specific `r2p-task-brief --task <M>` single-task brief and re-dispatches — never a whole-plan read or a guess from the title.
+
+### Path Delivery and Fail-Closed Preflight
+
+The controller derives `run_dir = parent(plan)` from the execute output and hands absolute paths by default, or repository-root-relative paths paired with an explicit `repo_root`. Each subagent runs a preflight before acting:
+1. Input paths must already exist and be readable. For each role, inputs include the Authoritative Context Set paths, the task brief path, the ledger path, and any handed report/review/diff path the subagent is meant to consume.
+2. Output paths do not need to exist at preflight. For each role, treat generated output paths as destination paths: implementer `execution/task-N-report.md`, task-reviewer `execution/task-N-review.md`, and final reviewer `execution/final-review-report.md`. Their parent directories must resolve under the same `run_dir` / `work_id` and be writable before the subagent writes.
+3. Every handed path resolves under the same `run_dir` / `work_id`.
+4. Any repo-root-relative path was resolved against the handed `repo_root`, not the process cwd.
+5. If any input path is missing or unreadable, any output parent is missing or unwritable, or any path is unresolved or wired to a different run → `BLOCKED` — no silent continue on a partial/mixed set.
+
 ## Per-Task Loop
 
 For each PLAN-TASK (in order):
 
-### 1.
+### 1. Run `r2p-task-brief` and obtain the task-brief path
 
-
+Run `{{R2P_BIN_DIR}}/r2p-task-brief --work-id <work-id> --task <N>` (where `<N>` is the task's integer, e.g. `2` for `PLAN-TASK-002`) for the current task. This installed wrapper delegates to the internal `plan-task-brief` CLI command. The command returns a `brief_path` pointing to a scoped brief file that contains the task's `Skeleton`, `Steps`, `Spec References`, and `Verification` criteria. Pass the `brief_path` as the handoff pointer to both the implementer and the reviewer — not pasted task text from `07-plan.md`. The controller uses the returned `brief_path` without eager-reading the full task body into its own context; the implementer and reviewer read the task-brief on demand.
 
 ### 2. Dispatch a fresh implementer subagent
 
 Record BASE (`git rev-parse HEAD`) BEFORE dispatching the implementer — **never use `HEAD~1`** as BASE (it drops all but the last commit of a multi-commit task). For Task 1, this BASE is also `<execution-base-commit>` for the final whole-branch review. Persist the Task 1 BASE immediately in tracked execution state by adding `Execution BASE: <execution-base-commit>` to `execution/progress.md`.
 
 Provide the subagent with:
-- The task text
--
+- The `brief_path` returned by `r2p-task-brief` (not pasted task text from `07-plan.md`)
+- **Read the Authoritative Context Set before acting**: the `02-project-context.md` entry supplies the project/dependency/architecture baseline deterministically
+- Global Constraints from the PLAN (`## Global Constraints`), copied verbatim when present — the brief carries only the task body, so the implementer does not otherwise see plan-level constraints
 - TDD instructions: follow `Skeleton`/`Steps`; prove `Verification` with evidence
 - A report file path (`execution/task-N-report.md`)
 
+The implementer return contract is minimal and inline:
+- `status`: DONE / DONE_WITH_CONCERNS / NEEDS_CONTEXT / BLOCKED
+- `report_path`: the report file path
+- `commit_range`: `<base7>..<head7>` for committed task work, or `none` if no commit was created
+- `test_summary`: one-line test summary, or `not run: <reason>`
+- `concerns`: `none` or a concise list of decision-relevant concerns, missing context, or blockers
+
+The controller uses these fields to decide whether to continue without opening the full report. The controller does not ask the implementer to restate the task.
+
 The implementer must:
 1. Implement exactly what the task specifies, following TDD
 2. Satisfy the task's `Verification` criteria and attach evidence (test output, assertions)
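The five fail-closed preflight rules above can be approximated with a short path check. This is a sketch under assumptions: the `preflight` function, its argument names, and the returned list of problem strings are all hypothetical, and a real subagent would map any non-empty result to `BLOCKED`.

```python
import os
from pathlib import Path


def _inside(path: Path, root: Path) -> bool:
    """True when `path` resolves to a location under `root`."""
    try:
        path.resolve().relative_to(root.resolve())
        return True
    except ValueError:
        return False


def preflight(run_dir, input_paths, output_paths, repo_root=None):
    """Return a list of problems; an empty list means the handed set is usable."""
    run_dir = Path(run_dir)
    problems = []

    def absolute(raw):
        p = Path(raw)
        if p.is_absolute():
            return p
        if repo_root is None:
            problems.append(f"relative path without repo_root: {raw}")
            return None
        # Rule 4: resolve against the handed repo_root, never the process cwd.
        return Path(repo_root) / p

    for raw in input_paths:
        p = absolute(raw)
        if p is None:
            continue
        if not _inside(p, run_dir):  # rule 3: same run_dir / work_id
            problems.append(f"input outside run_dir: {raw}")
        elif not (p.is_file() and os.access(p, os.R_OK)):  # rule 1
            problems.append(f"input missing or unreadable: {raw}")

    for raw in output_paths:
        p = absolute(raw)
        if p is None:
            continue
        if not _inside(p, run_dir):
            problems.append(f"output outside run_dir: {raw}")
        elif not (p.parent.is_dir() and os.access(p.parent, os.W_OK)):  # rule 2
            problems.append(f"output parent missing or unwritable: {raw}")

    return problems  # rule 5: any entry here means BLOCKED
```

Collecting every problem before deciding, instead of stopping at the first one, lets the `BLOCKED` return name all the offending paths at once.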
@@ -82,20 +153,37 @@ The fresh implementer subagent verifies-then-removes ambiguity by evidence and T
 After the implementer reports DONE:
 1. `mkdir -p .req-to-plan/<work-id>/logs` then `git diff -U10 <base-commit> HEAD > .req-to-plan/<work-id>/logs/task-N-diff.md`. Keep diff scratch under `logs/` (gitignored), never under `execution/`.
 2. Dispatch a task-reviewer subagent with:
--
-- The
+- **Read the Authoritative Context Set before acting**: the reviewer checks `Spec References` IDs against the full `06-spec.md` text, not the IDs alone
+- The `brief_path` returned by `r2p-task-brief` (not pasted task text). The reviewer reads `Spec References` from the task brief. Do not pass separate `Spec References`.
+- The implementer report file path (`execution/task-N-report.md`)
 - The diff file path (`.req-to-plan/<work-id>/logs/task-N-diff.md`)
+- A review report file path (`execution/task-N-review.md`)
 - Global constraints from the plan (copy verbatim from `## Global Constraints`); never pre-judge a finding's severity; never paste prior-task summaries into a later dispatch
 
-The task-reviewer returns
--
+The task-reviewer writes detailed findings, if any, to `execution/task-N-review.md` and returns only this inline summary:
+- `status`: APPROVED / CHANGES_REQUESTED / NEEDS_CONTEXT / BLOCKED
+- `review_report_path`: the review report file path
+- `test_summary`: one-line test summary, or `not run: <reason>`
+- `concerns`: `none` or a concise list of decision-relevant concerns, missing context, or blockers
+
+Surface in `concerns` every ⚠️ "cannot verify from diff" item and every unfixed Minor finding — do not leave them only in the report. This is how the controller learns there is something to adjudicate (§6) without opening the report on a clean task.
+
+The review report records:
+- **Spec compliance**: checked against the task brief's `Spec References` and `Verification`
 - **Code quality**: clean, tested, maintainable
+- **⚠️ DEFER items**: explicit `cannot verify from diff` warnings for requirements satisfied by unchanged code, by sibling task work, or by evidence outside the task diff
 
 ### 6. Fix loop
 
-- Dispatch fix subagents for Critical and Important findings
--
--
+- Dispatch fix subagents for Critical and Important findings. Pass the `review_report_path` to the fix subagent with the instruction: Fix all Critical and Important findings in the review report. Do not paste the finding bodies into the dispatch. Also hand: **Read the Authoritative Context Set before acting**, the task brief path (`brief_path`), and the current task diff path (`logs/task-N-diff.md`).
+- After each fix wave: the fix subagent commits only its intentionally-changed files (staging only files changed for this task, exactly as the §2 implementer does); then the loop regenerates `logs/task-N-diff.md` from the task's BASE to `HEAD` (`git diff -U10 <base-commit> HEAD > .req-to-plan/<work-id>/logs/task-N-diff.md`) — commit-then-diff — before re-dispatching the task-reviewer. The re-review must not run against an uncommitted working tree.
+- Re-dispatch the task-reviewer after each fix wave with the refreshed diff path
+- Before flipping the checkbox, adjudicate each reviewer "cannot verify from diff" warning. When `concerns` lists ⚠️ items, open `review_report_path` to adjudicate each; a `none`/empty `concerns` means no ⚠️ remains to adjudicate. Record one line per finding in `execution/progress.md`:
+- `Resolved: <finding>` — clears the warning; a `Resolved:` claim about unchanged code must cite implementation and test evidence
+- `Gap: <finding>` — blocks the flip and cannot be overridden on the controller's own judgment
+- `Unresolved: <finding>` — blocks the flip and cannot be overridden on the controller's own judgment
+- Minor findings not fixed within a task: record each as `Minor: <finding>` in `execution/progress.md` and carry them into the final whole-branch review input rather than dropping them per task.
+- Only when the task-reviewer is clean (both spec ✅ and quality Approved, and `Verification` satisfied, and no open `Gap:` or `Unresolved:` entries), update the matching `execution/progress.md` checkbox from `- [ ] PLAN-TASK-NNN ...` to `- [x] PLAN-TASK-NNN ...` and append one line:
 `Task N: complete (commits <base7>..<head7>, review clean)`
 
 **Continuous execution**: execute all PLAN-TASKs without pausing to ask "should I continue?" between tasks. Stop only on: unresolvable `BLOCKED`, upstream defect requiring repair, dirty-tree block, or all tasks complete. `Verification` requires fresh command output; "should pass" / "looks correct" is not evidence; do not report `DONE` without it.
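The commit-then-diff step (regenerate `logs/task-N-diff.md` from the task's recorded BASE to `HEAD`) might be scripted roughly as follows. The helper names `task_diff_plan` and `write_task_diff` are hypothetical; the `git diff -U10 <base-commit> HEAD` command and the output location are taken from the template text, and the `HEAD~` guard mirrors its "never use `HEAD~1`" rule.

```python
import subprocess
from pathlib import Path


def task_diff_plan(work_id: str, task_number: int, base_commit: str):
    """Return (argv, relative_output_path) for the per-task diff."""
    if base_commit.startswith("HEAD~"):
        # BASE must be the commit recorded before dispatch, not a relative ref.
        raise ValueError("BASE must be a recorded commit, not HEAD~N")
    out = Path(".req-to-plan") / work_id / "logs" / f"task-{task_number}-diff.md"
    return ["git", "diff", "-U10", base_commit, "HEAD"], out


def write_task_diff(repo_root, work_id, task_number, base_commit):
    """Run the diff inside repo_root and write it under logs/ (needs git on PATH)."""
    argv, rel_out = task_diff_plan(work_id, task_number, base_commit)
    out = Path(repo_root) / rel_out
    out.parent.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p .../logs`
    result = subprocess.run(argv, cwd=repo_root, check=True, capture_output=True, text=True)
    out.write_text(result.stdout, encoding="utf-8")
    return out
```

Because the diff always runs from the same recorded BASE, a re-review after a fix wave sees the task's full committed range, not just the latest fix commit.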
@@ -106,9 +194,10 @@ After all tasks complete, dispatch a final whole-branch review subagent on the *
 - First create the whole-branch diff: `mkdir -p .req-to-plan/<work-id>/logs` then `git diff -U10 <execution-base-commit> HEAD > .req-to-plan/<work-id>/logs/final-diff.md`
 - Scope: review the complete execution range `git diff -U10 <execution-base-commit> HEAD`, where `<execution-base-commit>` is the Task 1 BASE captured before dispatching the first implementer
 - Include the diff file path (`.req-to-plan/<work-id>/logs/final-diff.md`) in the reviewer dispatch; do not ask the reviewer to infer the changed range
+- Provide a final review report path (`execution/final-review-report.md`) and require detailed findings there
 - **re-run the full verification suite** on the final HEAD and attach the fresh output (per-task greens do not catch cross-task regressions)
 - Walk the PLAN task-by-task as a line-by-line requirements checklist; report any gap
-- Dispatch ONE fix subagent carrying the complete findings list (not one fixer per finding)
+- Dispatch ONE fix subagent carrying the complete findings list by passing `execution/final-review-report.md`, not pasted findings (not one fixer per finding)
 - This whole-branch review is the merge gate
 
 After the review settles, write `execution/final-review.md` recording the reviewed range, a one-line summary, and the verdict:
@@ -68,7 +68,12 @@ from tools.workflow_cli.output import (
 from tools.workflow_cli.tier import estimate_tier, scan_keywords
 from tools.workflow_cli.workspace import ensure_workspace_gitignore, commit_requirement_dir
 from tools.workflow_cli.atomic import atomic_write_text
-from tools.workflow_cli.markdown import
+from tools.workflow_cli.markdown import (
+    PLAN_TASK_ANCHOR_RE,
+    heading_bounded_bodies,
+    plan_task_anchors,
+    strip_readonly_sections,
+)
 
 
 # ---------------------------------------------------------------------------
@@ -936,9 +941,9 @@ def _cmd_gap_open(args):
         )
         mgr.save(record)
     except Exception as e:
-        run_md_path
+        atomic_write_text(run_md_path, run_md_before)
         for _d, _aa, _artifact_file, artifact_path, artifact_before in reversed(affected):
-            artifact_path
+            atomic_write_text(artifact_path, artifact_before)
         print_and_exit(
             format_error(
                 f"Cannot gap-open: failed to mark downstream stale atomically ({e})",
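The rollback pattern in this hunk (snapshot each file before mutating it, then restore the snapshots in reverse order on failure) can be shown in isolation. This is a sketch, not the `_cmd_gap_open` implementation: `apply_with_rollback` and its `write` parameter are hypothetical, and the default writer here is a plain non-atomic write standing in for the package's `atomic_write_text`.

```python
from pathlib import Path


def apply_with_rollback(edits, write=None):
    """Apply {path: new_text} edits; on any failure restore every touched file.

    `write(path, text)` is a stand-in for an atomic writer; the default is a
    plain write and is used only to keep the sketch self-contained.
    """
    if write is None:
        def write(path, text):
            Path(path).write_text(text, encoding="utf-8")

    touched = []  # (path, text_before), in application order
    try:
        for path, new_text in edits.items():
            before = Path(path).read_text(encoding="utf-8")
            touched.append((path, before))
            write(path, new_text)
    except Exception:
        # Undo in reverse order so the earliest file is restored last.
        for path, before in reversed(touched):
            write(path, before)
        raise
```

The restore step only helps if each individual write is all-or-nothing, which is presumably why the hunk swaps the restore calls over to `atomic_write_text`.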
@@ -1624,6 +1629,76 @@ def _cmd_stage_ready(args):
     )
 
 
+# ---------------------------------------------------------------------------
+# plan-task-brief
+# ---------------------------------------------------------------------------
+
+
+def _positive_int(raw: str) -> int:
+    """argparse type: positive integer (>= 1); raises ArgumentTypeError → exit 2."""
+    try:
+        n = int(raw)
+    except ValueError:
+        raise argparse.ArgumentTypeError(f"expected a positive integer, got {raw!r}")
+    if n < 1:
+        raise argparse.ArgumentTypeError(f"task number must be >= 1, got {n}")
+    return n
+
+
+def _cmd_plan_task_brief(args):
+    """Write a read-only task brief for one PLAN-TASK-NNN to logs/task-N-brief.md."""
+    record, mgr, run_dir = _load_run(args.work_id, args.base_path)
+    if record.status != RunStatus.EXECUTING:
+        print_and_exit(
+            format_error(
+                "plan-task-brief requires an EXECUTING run",
+                exit_code=EXIT_CONFLICT,
+            ),
+            EXIT_CONFLICT,
+        )
+    try:
+        plan_text = read_artifact(run_dir, Stage.PLAN)
+    except FileNotFoundError:
+        print_and_exit(
+            format_error("PLAN artifact not found", exit_code=EXIT_NOT_FOUND),
+            EXIT_NOT_FOUND,
+        )
+    stripped = strip_readonly_sections(plan_text)
+    anchors = plan_task_anchors(stripped)
+    bodies = list(heading_bounded_bodies(stripped, PLAN_TASK_ANCHOR_RE.match))
+    target_idx = None
+    for i, (tid, _title) in enumerate(anchors):
+        m = re.match(r"PLAN-TASK-(\d+)", tid)
+        if m and int(m.group(1)) == args.task:
+            target_idx = i
+            break
+    if target_idx is None:
+        print_and_exit(
+            format_error(f"task {args.task} not found in PLAN", exit_code=EXIT_NOT_FOUND),
+            EXIT_NOT_FOUND,
+        )
+    task_id = anchors[target_idx][0]
+    body = bodies[target_idx]
+    logs_dir = run_dir / "logs"
+    _reject_symlink_or_exit(logs_dir, "logs dir is a symlink")
+    brief_path = logs_dir / f"task-{args.task}-brief.md"
+    _reject_symlink_or_exit(brief_path, "brief target is a symlink")
+    logs_dir.mkdir(parents=True, exist_ok=True)
+    atomic_write_text(brief_path, body)
+    rel = brief_path.relative_to(run_dir.parent.parent)
+    print_and_exit(
+        format_success(
+            {
+                "work_id": args.work_id,
+                "task_id": task_id,
+                "brief_path": rel.as_posix(),
+            },
+            message="task brief written",
+        ),
+        EXIT_OK,
+    )
+
+
 # ---------------------------------------------------------------------------
 # Subparser registration
 # ---------------------------------------------------------------------------
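The task lookup in `_cmd_plan_task_brief` compares the integer value of the `PLAN-TASK-NNN` suffix with `--task`, so `--task 2` selects `PLAN-TASK-002`. A standalone sketch of that idea follows. The helpers `plan_task_anchors`, `heading_bounded_bodies`, and `PLAN_TASK_ANCHOR_RE` are not shown in this diff, so `HEADING_RE`, the assumed heading shape (`### PLAN-TASK-002: Title`), and `extract_task_body` are illustrative stand-ins and may differ from the real helpers, in particular in where a task body ends.

```python
import re

# Assumed heading shape; the package's real PLAN_TASK_ANCHOR_RE is not shown.
HEADING_RE = re.compile(r"^#{2,6}\s+(PLAN-TASK-(\d+))\b")


def extract_task_body(plan_text: str, task_number: int):
    """Return (task_id, body) for the task whose numeric suffix equals task_number.

    In this sketch the body runs from the task heading up to the next task
    heading or the end of the text. Returns None when no task matches.
    """
    lines = plan_text.splitlines()
    start = None
    task_id = None
    for i, line in enumerate(lines):
        m = HEADING_RE.match(line)
        if not m:
            continue
        if start is not None:
            # The next task heading closes the body being collected.
            return task_id, "\n".join(lines[start:i]).rstrip() + "\n"
        if int(m.group(2)) == task_number:  # integer compare: 2 matches "002"
            start, task_id = i, m.group(1)
    if start is None:
        return None
    return task_id, "\n".join(lines[start:]).rstrip() + "\n"
```

Comparing integers instead of strings is what lets zero-padded IDs and plain task numbers line up without the caller having to know the padding width.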
@@ -1684,6 +1759,20 @@ def _register_run_commands(subparsers):
     p.add_argument("--work-id", required=True)
     p.set_defaults(func=_cmd_run_execute_start)
 
+    # plan-task-brief
+    p = subparsers.add_parser(
+        "plan-task-brief",
+        help="Write a read-only brief for one PLAN task to logs/task-N-brief.md",
+    )
+    p.add_argument("--work-id", required=True, help="Workflow run ID")
+    p.add_argument(
+        "--task",
+        required=True,
+        type=_positive_int,
+        help="Task number to extract (positive integer, e.g. 2 for PLAN-TASK-002)",
+    )
+    p.set_defaults(func=_cmd_plan_task_brief)
+
 
 def _register_tier_commands(subparsers):
     # tier-estimate
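A small self-contained sketch of how the `_positive_int` argparse type behaves once it is wired to `--task`. The `_positive_int` body is copied from the diff; `build_parser` and the `workflow-cli-sketch` program name are hypothetical scaffolding, and the real CLI registers many more subcommands and options.

```python
import argparse


def _positive_int(raw: str) -> int:
    """argparse type: positive integer (>= 1)."""
    try:
        n = int(raw)
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected a positive integer, got {raw!r}")
    if n < 1:
        raise argparse.ArgumentTypeError(f"task number must be >= 1, got {n}")
    return n


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical minimal parser with only the plan-task-brief subcommand."""
    parser = argparse.ArgumentParser(prog="workflow-cli-sketch")
    sub = parser.add_subparsers(dest="command", required=True)
    p = sub.add_parser("plan-task-brief")
    p.add_argument("--work-id", required=True)
    p.add_argument("--task", required=True, type=_positive_int)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["plan-task-brief", "--work-id", "w1", "--task", "2"])
    print(args.work_id, args.task)
```

When the type callable raises `ArgumentTypeError`, argparse reports a usage error and exits with status 2, which matches the "exit 2" note in the original docstring.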
@@ -18,6 +18,7 @@ from pathlib import Path
 from typing import Any
 
 from tools.workflow_cli.version import R2P_VERSION
+from tools.workflow_cli.atomic import atomic_write_text
 
 
 # ---------------------------------------------------------------------------
@@ -1120,6 +1121,6 @@ def _safe_write(
 backup = _backup_path(backup_dir, dest)
 shutil.copy2(str(dest), str(backup))
 backups.append({"target": str(dest), "backup": str(backup)})
-dest
+atomic_write_text(dest, content)
 installed_paths.append(str(dest))
 written.append(dest)
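The diff replaces direct writes with `atomic_write_text` from `tools.workflow_cli.atomic`, whose implementation is not part of this diff. One common way to get that behavior, shown here as an assumption and not as the package's code, is to write to a temporary file in the target directory and then rename it over the destination. The name `atomic_write_text_sketch` is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text_sketch(path, text, encoding="utf-8"):
    """Write `text` to a temp file beside `path`, then rename it into place."""
    path = Path(path)
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=f".{path.name}.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, path)  # replaces any existing file in one step
    except BaseException:
        # Leave no stray temp file behind if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

With this approach a reader sees either the old content or the new content, never a half-written file, which suits both the installer's `_safe_write` and the rollback path in `gap-open`.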
@@ -1 +1 @@
-R2P_VERSION = "0.6.
+R2P_VERSION = "0.6.1"