gsd-lite 0.4.1 → 0.5.0

@@ -13,7 +13,7 @@
13
13
  "name": "gsd",
14
14
  "source": "./",
15
15
  "description": "AI orchestration tool — GSD management shell + Superpowers quality core. 5 commands, 4 agents, 5 workflows, MCP server, context monitoring.",
16
- "version": "0.4.1",
16
+ "version": "0.5.0",
17
17
  "keywords": [
18
18
  "orchestration",
19
19
  "mcp",
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gsd",
3
- "version": "0.4.1",
3
+ "version": "0.5.0",
4
4
  "description": "AI orchestration tool for Claude Code — GSD management shell + Superpowers quality core",
5
5
  "author": {
6
6
  "name": "sdsrss",
@@ -55,7 +55,10 @@ tools: Read, Write, Edit, Bash, Grep, Glob
55
55
  "decisions": ["[DECISION] use optimistic locking by version column"],
56
56
  "blockers": [],
57
57
  "contract_changed": true,
58
- "evidence": ["ev:test:users-update", "ev:typecheck:phase-2"]
58
+ "evidence": [
59
+ {"id": "ev:test:users-update", "scope": "task:2.3"},
60
+ {"id": "ev:typecheck:phase-2", "scope": "task:2.3"}
61
+ ]
59
62
  }
60
63
  `contract_changed` decision guide:
61
64
  - Changed a function/method signature (parameters, return type) → true
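The hunk above changes `evidence` from bare ID strings to objects carrying `id` and `scope`. A minimal sketch of how a consumer might accept both shapes during a migration; `normalizeEvidence` and its `defaultScope` parameter are hypothetical and not part of the package:

```javascript
// Hypothetical helper: upgrade legacy string entries to the { id, scope } shape.
function normalizeEvidence(entries, defaultScope) {
  return (entries || [])
    .map((ev) => (typeof ev === 'string' ? { id: ev, scope: defaultScope } : ev))
    // Keep only entries that carry both string fields.
    .filter((ev) => ev && typeof ev === 'object' && typeof ev.id === 'string' && typeof ev.scope === 'string');
}

const normalized = normalizeEvidence(
  ['ev:test:users-update', { id: 'ev:typecheck:phase-2', scope: 'task:2.3' }, 42],
  'task:2.3',
);
console.log(normalized);
```

The orchestrator hunks later in this diff only store entries that already have string `id` and `scope` fields, so plain strings appear to be skipped there rather than upgraded.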
@@ -92,7 +92,7 @@ Minor = suggested fix (naming/style)
92
92
  ```json
93
93
  {
94
94
  "scope": "task | phase",
95
- "scope_id": "2.3 | phase-2",
95
+ "scope_id": "2.3 (task scope: string ID) | 2 (phase scope: number ID)",
96
96
  "review_level": "L2 | L1-batch",
97
97
  "spec_passed": true,
98
98
  "quality_passed": false,
@@ -107,7 +107,10 @@ Minor = suggested fix (naming/style)
107
107
  "minor_issues": [],
108
108
  "accepted_tasks": [],
109
109
  "rework_tasks": ["2.3", "2.4"],
110
- "evidence": ["ev:test:phase-2", "ev:lint:phase-2"]
110
+ "evidence": [
111
+ {"id": "ev:test:phase-2", "scope": "task:2.3"},
112
+ {"id": "ev:lint:phase-2", "scope": "task:2.3"}
113
+ ]
111
114
  }
112
115
  ```
113
116
 
package/commands/prd.md CHANGED
@@ -51,99 +51,9 @@ argument-hint: File path to requirements doc, or inline description text
51
51
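  - Use the questioning techniques in references/questioning.md (if available)
52
52
  - After the user answers, ask follow-up questions as needed until the requirements are clear
53
53
 
54
- <!-- STEP 5-12 below: start.md -->
54
+ <!-- STEP 5-12: shared execution flow; editing workflows/execution-flow.md updates every entry point -->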
55
55
 
56
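- ## STEP 5: Decide whether research is needed
57
-
58
- - New project / involves a new tech stack → research is mandatory
59
- - Simple bug fix / research exists and has not expired → skip
60
- - User explicitly requests it → research
61
- - When needed → dispatch the researcher sub-agent → present key findings
62
- - When not needed → skip and go to the next step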
63
-
64
- ## STEP 6: Deep thinking
65
-
66
- - If the sequential-thinking MCP is available → invoke it for deep thinking
67
- - Otherwise skip; the flow is not affected
68
-
69
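- ## STEP 7: Generate a phased plan
70
-
71
- - phases handle management and acceptance; tasks handle execution
72
- - Keep each phase to 5-8 tasks (eases phase-level closeout)
73
- - Each task = an atomic todo (with files, operations, verification conditions)
74
- - Add metadata to each task: `requires` / `review_required` / `research_basis`
75
- - Review level is set by impact: L0 (no runtime semantic change) / L1 (normal) / L2 (high risk)
76
- - Mark parallelizable task groups [PARALLEL] (currently only a marker for a future upgrade)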
77
-
78
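- ## STEP 8: Plan self-review
79
-
80
- Lightweight replacement for plan-checker:
81
- - Check: are any requirement points missing?
82
- - Check: is the phase breakdown reasonable? (split a phase that is too large)
83
- - Check: are the task dependencies correct?
84
- - Check: are the verification conditions executable?
85
- - For a high-risk project → escalate to the enhanced plan review:
86
-
87
- <enhanced_plan_review>
88
- Trigger: involves auth / payment / security / public API / DB migration / core architecture changes
89
-
90
- Review dimensions:
91
- 1. Requirement coverage: does every point of the original requirements map to at least one task?
92
- 2. Risk ordering: are high-risk tasks scheduled first? (fail-fast principle)
93
- 3. Dependency safety: do all downstream tasks of L2 tasks use gate:accepted?
94
- 4. Verification sufficiency: do all auth/payment tasks have explicit security verification conditions?
95
- 5. Pitfall avoidance: does every pitfall in research/PITFALLS.md have a corresponding defensive task or verification condition?
96
-
97
- Output: pass / revise (with specific corrections)
98
- Rounds: at most 2 rounds of self-review corrections; if problems remain after 2 rounds → flag the risks and show them to the user
99
- </enhanced_plan_review>
100
-
101
- → Show the plan to the user only after the self-review corrections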
102
-
103
- <HARD-GATE id="plan-confirmation">
104
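- ## STEP 9: Present the plan and wait for user confirmation
105
-
106
- - Present the complete phased plan
107
- - User points out problems → adjust → present again
108
- - User confirms → continue
109
-
110
- ⛔ Do not run STEP 10-12 before the user confirms. Not confirmed = no file writes, no code execution.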
111
- </HARD-GATE>
112
-
113
- ## STEP 10: Generate documents
114
-
115
- - Create the .gsd/ directory
116
- - Write state.json + plan.md + phases/*.md
117
- - Initialize `workflow_mode` / `current_task` / `current_review` / phase status and handoff information
118
- - If research exists: write it to .gsd/research/
119
-
120
- <HARD-GATE id="docs-written">
121
- □ state.json is written and contains all canonical fields
122
- □ plan.md is written
123
- □ phases/*.md are written (one file per phase)
124
- □ Every task has lifecycle / level / requires / review_required
125
- → Continue only when all are satisfied
126
- </HARD-GATE>
127
-
128
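- ## STEP 11 - Automatic execution main path
129
-
130
- Enter the main execution loop. phase = management boundary, task = execution boundary.
131
-
132
- <execution_loop>
133
- See `references/execution-loop.md` for the complete 9-step execution loop specification (11.1-11.9) and the dependency gate semantics.
134
-
135
- The orchestrator must follow the step order in that reference document strictly:
136
- load phase → select task → build context → dispatch executor → handle result → review → phase handoff → batch update → context check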
137
- </execution_loop>
138
-
139
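- ## STEP 12 - All complete
140
-
141
- After all phases are complete, output the final report:
142
- - Project summary
143
- - Completion status of each phase
144
- - Summary of key decisions
145
- - Summary of verification evidence
146
- - Open issues / follow-up suggestions (if any)
56
+ Use the Read tool to read `workflows/execution-flow.md` and follow STEP 5-12 in it exactly.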
147
57
 
148
58
  </process>
149
59
 
@@ -212,6 +212,79 @@ description: Resume project execution from saved state with workspace validation
212
212
 
213
213
  All displayed data is derived in real time from canonical fields; derived fields are not used.
214
214
 
215
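+ ## STEP 5: Automatic execution loop
216
+
217
+ <HARD-GATE id="auto-execution-loop">
218
+ After STEP 3 completes the initial recovery, enter the automatic execution loop. This is the core of the orchestrator: do not stop at any step to wait for the user unless a termination condition is met.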
219
+
220
+ ```
221
+ Loop entry:
222
+ 1. Call MCP tool `orchestrator-resume` to get the action
223
+ 2. Dispatch on the action:
224
+
225
+ dispatch_executor:
226
+ → Use the Agent tool to dispatch the executor sub-agent (subagent_type: gsd:executor)
227
+ → Pass in the executor_context returned by the orchestrator
228
+ → After receiving the executor result → call MCP tool `orchestrator-handle-executor-result`
229
+ → Go back to step 1
230
+
231
+ dispatch_reviewer:
232
+ → Use the Agent tool to dispatch the reviewer sub-agent (subagent_type: gsd:reviewer)
233
+ → Pass in review_targets / current_review
234
+ → After receiving the reviewer result → call MCP tool `orchestrator-handle-reviewer-result`
235
+ → Go back to step 1
236
+
237
+ dispatch_researcher:
238
+ → Use the Agent tool to dispatch the researcher sub-agent (subagent_type: gsd:researcher)
239
+ → Pass in the expired research information
240
+ → After receiving the researcher result → call MCP tool `orchestrator-handle-researcher-result`
241
+ → Go back to step 1
242
+
243
+ dispatch_debugger:
244
+ → Use the Agent tool to dispatch the debugger sub-agent (subagent_type: gsd:debugger)
245
+ → Pass in debug_target
246
+ → After receiving the debugger result → call MCP tool `orchestrator-handle-debugger-result`
247
+ → Go back to step 1
248
+
249
+ trigger_review:
250
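+ → Dispatch the reviewer directly; take scope and targets from the action response
251
+ → Go back to step 1
252
+
253
+ complete_phase:
254
+ → Call MCP tool `phase-complete` with phase_id + run_verify: true
255
+ → Go back to step 1 (the orchestrator advances to the next phase automatically)
256
+
257
+ retry_executor:
258
+ → Call orchestrator-resume again to get the updated executor context
259
+ → Go back to step 1
260
+
261
+ rollback_to_dirty_phase:
262
+ → The orchestrator has already rolled back current_phase automatically; output a rollback notice
263
+ → Go back to step 1
264
+
265
+ continue_execution:
266
+ → Go straight back to step 1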
267
+
268
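+ 3. Termination conditions: exit the loop on any of the following actions:
269
+
270
+ idle → output "No runnable tasks", stop
271
+ awaiting_user → show blockers / drift information, wait for user input
272
+ await_manual_intervention → show the information that needs manual intervention, stop
273
+ noop (completed) → show the completion report, stop
274
+ await_recovery_decision (failed) → show the failure information and recovery options, stop
275
+
276
+ 4. Context safety valve:
277
+ Check context health before every loop iteration
278
+ remaining <= 35% → save state + output "Please run /clear, then /gsd:resume" → exit the loop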
279
+ ```
280
+
281
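+ **Key principles:**
282
+ - The loop is continuous: dispatch → handle result → resume → dispatch → ...
283
+ - Do not stop at intermediate steps to wait for user confirmation (unless it is a termination condition)
284
+ - Resume immediately after every handle result and let the orchestrator decide the next step
285
+ - After the phase review passes → complete_phase → advance to the next phase automatically → continue executing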
286
+ </HARD-GATE>
287
+
215
288
  </process>
216
289
 
217
290
  <EXTREMELY-IMPORTANT>
@@ -221,4 +294,5 @@ description: Resume project execution from saved state with workspace validation
221
294
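  - Do not auto-execute code in awaiting_user / reconcile_workspace / replan_required modes
222
295
  - Only the orchestrator writes state.json; sub-agents never write it directly
223
296
  - Context < 35% → save state + workflow_mode = awaiting_clear + stop execution
297
+ - **Once inside the automatic execution loop, do not stop mid-loop to wait for the user; let the orchestrator drive**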
224
298
  </EXTREMELY-IMPORTANT>
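The STEP 5 loop added to the resume command is a dispatch-until-terminal pattern. A minimal synchronous sketch follows. `runLoop`, `resume`, `handlers`, `remainingPct` and `maxSteps` are hypothetical names, and the real MCP tool calls (which are asynchronous) are replaced here by injected callbacks so the example runs on its own:

```javascript
// Terminal actions taken from the STEP 5 list; every other action loops.
const TERMINAL_ACTIONS = new Set([
  'idle', 'awaiting_user', 'await_manual_intervention', 'noop', 'await_recovery_decision',
]);

// `resume` stands in for the orchestrator-resume call; `handlers` maps an
// action name to a callback that dispatches a sub-agent and reports back.
function runLoop(resume, handlers, { remainingPct = () => 100, maxSteps = 100 } = {}) {
  const trace = [];
  for (let step = 0; step < maxSteps; step++) {
    // Context safety valve: checked before every iteration.
    if (remainingPct() <= 35) {
      trace.push('context_low');
      return trace;
    }
    const { action } = resume();
    trace.push(action);
    if (TERMINAL_ACTIONS.has(action)) return trace;
    const handler = handlers[action];
    if (handler) handler();
    // No handler (for example continue_execution): just resume again.
  }
  trace.push('max_steps');
  return trace;
}

// Usage with a scripted orchestrator.
const script = ['dispatch_executor', 'continue_execution', 'dispatch_reviewer', 'complete_phase', 'noop'];
let cursor = 0;
const handled = [];
const trace = runLoop(
  () => ({ action: script[cursor++] }),
  {
    dispatch_executor: () => handled.push('executor'),
    dispatch_reviewer: () => handled.push('reviewer'),
    complete_phase: () => handled.push('phase-complete'),
  },
);
console.log(trace.join(' -> '));
```

The `maxSteps` guard is an addition of this sketch; the command text itself relies only on the terminal actions and the 35% context threshold to end the loop.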
package/commands/start.md CHANGED
@@ -44,133 +44,8 @@ argument-hint: Optional feature or project description
44
44
  └── No → ask a follow-up question
45
45
  ```
46
46
 
47
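- ## STEP 5 - Intelligent research decision
47
+ <!-- STEP 5-12: shared execution flow; editing workflows/execution-flow.md updates every entry point -->
48
48
 
49
- Decide whether research is needed:
50
- ```
51
- ├── New project → research is mandatory
52
- ├── Involves a new tech stack → research is mandatory
53
- ├── Simple bug fix / small feature → skip research
54
- ├── .gsd/research/ exists and has not expired → skip research
55
- ├── User explicitly requests it → research
56
- └── Research exists but the requirement direction changed → incremental research (research only the new direction)
57
- ```
58
-
59
- When research is needed:
60
- 1. Dispatch the `researcher` sub-agent (fresh context)
61
- 2. Write the research output to `.gsd/research/` (STACK.md, ARCHITECTURE.md, PITFALLS.md, SUMMARY.md)
62
- 3. Present key findings to the user: tech stack recommendation + pitfall warnings + ⭐ recommended approach
63
-
64
- When not needed: skip and go directly to STEP 6.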
65
-
66
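- ## STEP 6 - Deep thinking
67
-
68
- If the `sequential-thinking` MCP is available → invoke it for deep thinking:
69
- - Input: requirements summary + codebase analysis + research results (if any)
70
- - Purpose: systematic architectural thinking before generating the plan
71
-
72
- If the `sequential-thinking` MCP is unavailable → fall back to inline thinking and continue.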
73
-
74
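- ## STEP 7 - Generate a phased plan
75
-
76
- Generate plan.md + phases/*.md:
77
- - **phases** handle management and acceptance; **tasks** handle execution
78
- - Keep each phase to **5-8 tasks** (eases phase-level closeout)
79
- - Each task = an atomic todo (with files, operations, verification conditions)
80
- - Add metadata to each task:
81
- - `requires`: dependency list (including gate type)
82
- - `review_required`: whether review is needed
83
- - `research_basis`: the referenced research decision id
84
- - Review level is set by impact:
85
- - **L0**: no runtime semantic change (docs/config/style)
86
- - **L1**: normal coding task (default)
87
- - **L2**: high risk (auth/payment/public API/DB migration/core architecture)
88
- - Mark parallelizable task groups `[PARALLEL]` (currently only a marker for a future upgrade)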
89
-
90
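- ## STEP 8 - Plan self-review
91
-
92
- Lightweight self-review (performed by the orchestrator itself; no sub-agent is dispatched):
93
-
94
- ### Basic review (all projects)
95
- - [ ] Are any requirement points missing?
96
- - [ ] Is the phase breakdown reasonable? (split a phase that is too large)
97
- - [ ] Are the task dependencies correct?
98
- - [ ] Are the verification conditions executable?
99
-
100
- ### Enhanced review (high-risk projects)
101
-
102
- Trigger: the project involves auth / payment / security / public API / DB migration / core architecture changes
103
-
104
- Dimensions:
105
- 1. **Requirement coverage:** does every point of the original requirements map to at least one task?
106
- 2. **Risk ordering:** are high-risk tasks scheduled first? (fail-fast principle)
107
- 3. **Dependency safety:** do all downstream tasks of L2 tasks use `gate:accepted`?
108
- 4. **Verification sufficiency:** do all auth/payment tasks have explicit security verification conditions?
109
- 5. **Pitfall avoidance:** does every pitfall in `research/PITFALLS.md` have a corresponding defensive task or verification condition?
110
-
111
- Output: `pass` / `revise` (with specific corrections)
112
- Rounds: at most 2 rounds of self-review corrections; if problems remain after 2 rounds → flag the risks and show them to the user
113
-
114
- → Show the plan to the user only after the self-review corrections.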
115
-
116
- <HARD-GATE id="plan-confirmation">
117
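- ## STEP 9 - User confirms the plan
118
-
119
- Show the plan to the user and wait for confirmation:
120
- - User points out problems → adjust the plan → present again
121
- - User confirms → continue
122
-
123
- ⛔ Do not run STEP 10-12 before the user confirms. Not confirmed = no file writes, no code execution.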
124
- </HARD-GATE>
125
-
126
- <HARD-GATE id="docs-written">
127
- ## STEP 10 - Generate documents
128
-
129
- 1. Create the `.gsd/` directory
130
- 2. Write `state.json`:
131
- - Initialize `workflow_mode: "executing_task"`
132
- - Initialize `current_phase: 1`
133
- - Initialize `current_task: null` (filled in by the execution loop)
134
- - Initialize `current_review: null`
135
- - Initialize every phase lifecycle = `pending` (the first = `active`)
136
- - Initialize every task lifecycle = `pending`
137
- - Initialize phase_handoff information
138
- - Initialize `decisions: []`
139
- - Initialize `context.remaining_percentage`
140
- 3. Write `plan.md`: project overview index (no task-level detail)
141
- 4. Write `phases/*.md`: detailed task specs for each phase (source of truth)
142
- 5. If research exists: confirm `.gsd/research/` has been written
143
-
144
- Rules:
145
- - `plan.md` is a read-only index: it is not modified after generation (except on replan)
146
- - `phases/*.md` is the sole source of truth for task specs
147
- - `plan.md` contains no task-level detail, to avoid duplicating `phases/*.md`
148
-
149
- □ state.json is written and contains all canonical fields
150
- □ plan.md is written
151
- □ phases/*.md are written (one file per phase)
152
- □ Every task has lifecycle / level / requires / review_required
153
- → Continue only when all are satisfied
154
- </HARD-GATE>
155
-
156
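- ## STEP 11 - Automatic execution main path
157
-
158
- Enter the main execution loop. phase = management boundary, task = execution boundary.
159
-
160
- <execution_loop>
161
- See `references/execution-loop.md` for the complete 9-step execution loop specification (11.1-11.9) and the dependency gate semantics.
162
-
163
- The orchestrator must follow the step order in that reference document strictly:
164
- load phase → select task → build context → dispatch executor → handle result → review → phase handoff → batch update → context check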
165
- </execution_loop>
166
-
167
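- ## STEP 12 - Final report
168
-
169
- After all phases are complete, output the final report:
170
- - Project summary
171
- - Completion status of each phase
172
- - Summary of key decisions
173
- - Summary of verification evidence
174
- - Open issues / follow-up suggestions (if any)
49
+ Use the Read tool to read `workflows/execution-flow.md` and follow STEP 5-12 in it exactly.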
175
50
 
176
51
  </process>
@@ -56,8 +56,8 @@ process.stdin.on('end', () => {
56
56
  }
57
57
 
58
58
  // Context window display (USED percentage scaled to usable context)
59
- // Claude Code reserves ~16.5% for autocompact buffer
60
- const AUTO_COMPACT_BUFFER_PCT = 16.5;
59
+ // Claude Code reserves ~16.5% for autocompact buffer (configurable via env)
60
+ const AUTO_COMPACT_BUFFER_PCT = Number(process.env.GSD_AUTOCOMPACT_BUFFER) || 16.5;
61
61
  let ctx = '';
62
62
  if (remaining != null) {
63
63
  const usableRemaining = Math.max(0, ((remaining - AUTO_COMPACT_BUFFER_PCT) / (100 - AUTO_COMPACT_BUFFER_PCT)) * 100);
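The hunk above makes the autocompact buffer configurable through `GSD_AUTOCOMPACT_BUFFER` and then rescales the raw remaining percentage onto the usable window. A small sketch of that arithmetic, assuming a stricter parser than the one in the diff; `parseBufferPct` and the standalone `usableRemaining` function are hypothetical names for illustration:

```javascript
// Hypothetical stricter parser: unlike `Number(x) || 16.5`, an explicit "0" is kept.
function parseBufferPct(raw, fallback = 16.5) {
  if (typeof raw !== 'string' || raw.trim() === '') return fallback;
  const n = Number(raw);
  // Reject NaN and values that would make the divisor zero or negative.
  return Number.isFinite(n) && n >= 0 && n < 100 ? n : fallback;
}

// Same scaling as the hunk: map raw remaining % onto the usable context window.
function usableRemaining(remaining, bufferPct) {
  return Math.max(0, ((remaining - bufferPct) / (100 - bufferPct)) * 100);
}

const bufferPct = parseBufferPct(process.env.GSD_AUTOCOMPACT_BUFFER);
console.log(usableRemaining(58.25, bufferPct));
```

With the expression in the diff, a setting of `0` is falsy and falls back to 16.5, and a setting of `100` would make the divisor zero. Guarding both cases is a choice of this sketch, not something the package is known to do.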
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gsd-lite",
3
- "version": "0.4.1",
3
+ "version": "0.5.0",
4
4
  "description": "AI orchestration tool for Claude Code — GSD management shell + Superpowers quality core",
5
5
  "type": "module",
6
6
  "bin": {
package/src/schema.js CHANGED
@@ -19,7 +19,7 @@ export const WORKFLOW_MODES = [
19
19
 
20
20
  export const TASK_LIFECYCLE = {
21
21
  pending: ['running', 'blocked'],
22
- running: ['checkpointed', 'blocked', 'failed'],
22
+ running: ['checkpointed', 'blocked', 'failed', 'accepted'], // accepted: auto-accept for L0/review_required=false (atomic, skips checkpointed)
23
23
  checkpointed: ['accepted', 'needs_revalidation'],
24
24
  accepted: ['needs_revalidation'],
25
25
  blocked: ['pending'],
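The new `running → accepted` edge can be checked with a simple table lookup. The sketch below copies only the transitions visible in this hunk (states that are not shown, such as `failed`, are left out) and adds a hypothetical `canTransition` helper that is not taken from the package:

```javascript
// Transition table limited to the entries visible in the hunk above.
const TASK_LIFECYCLE = {
  pending: ['running', 'blocked'],
  running: ['checkpointed', 'blocked', 'failed', 'accepted'],
  checkpointed: ['accepted', 'needs_revalidation'],
  accepted: ['needs_revalidation'],
  blocked: ['pending'],
};

// Hypothetical helper: true when `to` is listed as a successor of `from`.
function canTransition(from, to) {
  return (TASK_LIFECYCLE[from] || []).includes(to);
}

console.log(canTransition('running', 'accepted'));
console.log(canTransition('pending', 'accepted'));
```

Under the 0.4.1 table the first call would have been rejected, which is why auto-accept previously needed two separate persists.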
@@ -106,7 +106,7 @@ export function validateResearchDecisionIndex(decisionIndex, requiredIds = []) {
106
106
  return { valid: errors.length === 0, errors };
107
107
  }
108
108
 
109
- export function validateResearchArtifacts(artifacts, { decisionIds = [], volatility, expiresAt } = {}) {
109
+ export function validateResearchArtifacts(artifacts) {
110
110
  const errors = [];
111
111
  if (!isPlainObject(artifacts)) {
112
112
  return { valid: false, errors: ['artifacts must be an object'] };
@@ -119,19 +119,6 @@ export function validateResearchArtifacts(artifacts, { decisionIds = [], volatil
119
119
  }
120
120
  }
121
121
 
122
- const summary = typeof artifacts['SUMMARY.md'] === 'string' ? artifacts['SUMMARY.md'] : '';
123
- if (volatility && !summary.includes(volatility)) {
124
- errors.push('artifacts.SUMMARY.md must mention volatility');
125
- }
126
- if (expiresAt && !summary.includes(expiresAt)) {
127
- errors.push('artifacts.SUMMARY.md must mention expires_at');
128
- }
129
- for (const id of decisionIds) {
130
- if (!summary.includes(id)) {
131
- errors.push(`artifacts.SUMMARY.md must mention decision id ${id}`);
132
- }
133
- }
134
-
135
122
  return { valid: errors.length === 0, errors };
136
123
  }
137
124
 
@@ -452,6 +439,13 @@ export function validateState(state) {
452
439
  if (!Number.isFinite(phase.done)) {
453
440
  errors.push(`Phase ${phase.id}: done must be a finite number`);
454
441
  }
442
+ // Cross-validate done against actual accepted tasks
443
+ if (Number.isFinite(phase.done) && Array.isArray(phase.todo)) {
444
+ const acceptedCount = phase.todo.filter(t => t.lifecycle === 'accepted').length;
445
+ if (phase.done !== acceptedCount) {
446
+ errors.push(`Phase ${phase.id}: done (${phase.done}) does not match accepted task count (${acceptedCount})`);
447
+ }
448
+ }
455
449
  if (!Array.isArray(phase.todo)) {
456
450
  errors.push(`Phase ${phase.id}: todo must be an array`);
457
451
  continue;
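The cross-validation added to `validateState` compares the stored `done` counter with the number of tasks whose lifecycle is `accepted`. A standalone sketch of the same check; `checkDone` is a hypothetical wrapper, while the condition and message follow the hunk:

```javascript
// Hypothetical wrapper around the check added in the hunk above.
function checkDone(phase) {
  const errors = [];
  if (Number.isFinite(phase.done) && Array.isArray(phase.todo)) {
    const acceptedCount = phase.todo.filter((t) => t.lifecycle === 'accepted').length;
    if (phase.done !== acceptedCount) {
      errors.push(`Phase ${phase.id}: done (${phase.done}) does not match accepted task count (${acceptedCount})`);
    }
  }
  return errors;
}

const driftedPhase = {
  id: 2,
  done: 2,
  todo: [
    { id: '2.1', lifecycle: 'accepted' },
    { id: '2.2', lifecycle: 'needs_revalidation' },
  ],
};
console.log(checkDone(driftedPhase));
```

A state file written by 0.4.1 whose counter has drifted would presumably start failing validation under this rule until `done` is recomputed.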
@@ -4,7 +4,6 @@ import {
4
4
  read,
5
5
  storeResearch,
6
6
  update,
7
- addEvidence,
8
7
  selectRunnableTask,
9
8
  buildExecutorContext,
10
9
  matchDecisionForBlocker,
@@ -278,6 +277,19 @@ function buildDecisionEntries(decisions, phaseId, taskId, existingCount = 0) {
278
277
  .filter(Boolean);
279
278
  }
280
279
 
280
+ function buildErrorFingerprint(result) {
281
+ const parts = [];
282
+ if (result.blockers?.length > 0) {
283
+ const b = result.blockers[0];
284
+ parts.push(typeof b === 'string' ? b : (b.reason || b.type || ''));
285
+ }
286
+ if (result.files_changed?.length > 0) {
287
+ parts.push([...result.files_changed].sort().join(','));
288
+ }
289
+ const combined = parts.filter(Boolean).join('|');
290
+ return combined.length > 0 ? combined.slice(0, 120) : result.summary.slice(0, 80);
291
+ }
292
+
281
293
  function getBlockedReasonFromResult(result) {
282
294
  const firstBlocker = (result.blockers || [])[0];
283
295
  if (!firstBlocker) return { blocked_reason: result.summary, unblock_condition: null };
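To see what `buildErrorFingerprint` produces, the function from the hunk above is repeated here unchanged and run on three made-up executor results (the sample blockers, file names and summaries are illustrative only):

```javascript
// Copied from the hunk above.
function buildErrorFingerprint(result) {
  const parts = [];
  if (result.blockers?.length > 0) {
    const b = result.blockers[0];
    parts.push(typeof b === 'string' ? b : (b.reason || b.type || ''));
  }
  if (result.files_changed?.length > 0) {
    parts.push([...result.files_changed].sort().join(','));
  }
  const combined = parts.filter(Boolean).join('|');
  return combined.length > 0 ? combined.slice(0, 120) : result.summary.slice(0, 80);
}

// First blocker plus sorted file list.
const fpBoth = buildErrorFingerprint({ blockers: ['missing env'], files_changed: ['b.js', 'a.js'], summary: 'x' });
// Object blocker without `reason` falls back to `type`.
const fpType = buildErrorFingerprint({ blockers: [{ type: 'timeout' }], summary: 's' });
// No blockers and no files: the summary prefix is used, as in 0.4.1.
const fpSummary = buildErrorFingerprint({ summary: 'npm test failed' });
console.log(fpBoth, fpType, fpSummary);
```

Because the file list is sorted, two failures that touch the same files in a different order yield the same fingerprint. Note that the summary fallback still assumes `result.summary` is a string.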
@@ -705,15 +717,44 @@ export async function resumeWorkflow({ basePath = process.cwd(), _depth = 0 } =
705
717
  message: 'Project is paused. Confirm to resume execution.',
706
718
  };
707
719
  case 'planning':
708
- case 'reconcile_workspace':
709
- case 'replan_required':
710
- case 'research_refresh_needed':
711
720
  return {
712
721
  success: true,
713
722
  action: 'await_manual_intervention',
714
723
  workflow_mode: state.workflow_mode,
715
- message: `workflow_mode "${state.workflow_mode}" is recognized but not yet automated by the orchestrator`,
724
+ guidance: 'Complete planning and call state-init to initialize the project',
725
+ message: 'Project is in planning mode; complete the plan and initialize with state-init',
726
+ };
727
+ case 'reconcile_workspace': {
728
+ const reconGitHead = await getGitHead(basePath);
729
+ return {
730
+ success: true,
731
+ action: 'reconcile_workspace',
732
+ workflow_mode: state.workflow_mode,
733
+ expected_head: state.git_head,
734
+ actual_head: reconGitHead,
735
+ guidance: 'Workspace git HEAD has diverged. Verify changes and update git_head via state-update, then set workflow_mode to executing_task',
736
+ message: `Git HEAD mismatch: saved=${state.git_head}, current=${reconGitHead}`,
737
+ };
738
+ }
739
+ case 'replan_required':
740
+ return {
741
+ success: true,
742
+ action: 'replan_required',
743
+ workflow_mode: state.workflow_mode,
744
+ guidance: 'Plan files modified since last session. Review changes, update the plan if needed, then set workflow_mode to executing_task via state-update',
745
+ message: 'Plan artifacts modified since last session; review and re-align before resuming',
716
746
  };
747
+ case 'research_refresh_needed': {
748
+ const expiredResearch = collectExpiredResearch(state);
749
+ return {
750
+ success: true,
751
+ action: 'dispatch_researcher',
752
+ workflow_mode: state.workflow_mode,
753
+ expired_research: expiredResearch,
754
+ guidance: 'Research cache expired. Dispatch researcher sub-agent to refresh, then call orchestrator-handle-researcher-result',
755
+ message: 'Research has expired and must be refreshed before execution can resume',
756
+ };
757
+ }
717
758
  default:
718
759
  return {
719
760
  error: true,
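The four workflow modes that 0.4.1 answered with a single `await_manual_intervention` now map to distinct actions. The mapping read off the switch cases above can be summarized as a lookup table; `MODE_ACTIONS` and `actionForMode` are hypothetical names used only for this sketch:

```javascript
// Mapping read off the switch cases in the hunk above (only these four modes).
const MODE_ACTIONS = {
  planning: 'await_manual_intervention',
  reconcile_workspace: 'reconcile_workspace',
  replan_required: 'replan_required',
  research_refresh_needed: 'dispatch_researcher',
};

// Hypothetical helper: returns null for modes this table does not cover.
function actionForMode(mode) {
  return Object.prototype.hasOwnProperty.call(MODE_ACTIONS, mode) ? MODE_ACTIONS[mode] : null;
}

console.log(actionForMode('research_refresh_needed'));
```

Callers that previously matched only on `await_manual_intervention` would need to recognize `reconcile_workspace` and `replan_required` as action values as well.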
@@ -746,55 +787,47 @@ export async function handleExecutorResult({ result, basePath = process.cwd() }
746
787
  if (result.outcome === 'checkpointed') {
747
788
  const reviewLevel = reclassifyReviewLevel(task, result);
748
789
  const isL0 = reviewLevel === 'L0';
790
+ const autoAccept = isL0 || task.review_required === false;
749
791
 
750
792
  const current_review = !isL0 && reviewLevel === 'L2' && task.review_required !== false
751
793
  ? { scope: 'task', scope_id: task.id, stage: 'spec' }
752
794
  : null;
753
795
  const workflow_mode = current_review ? 'reviewing_task' : 'executing_task';
754
796
 
755
- // First persist: checkpoint the task (running → checkpointed)
797
+ // Single atomic persist: auto-accept goes directly running → accepted,
798
+ // otherwise running → checkpointed (awaiting review)
799
+ const taskPatch = {
800
+ id: task.id,
801
+ lifecycle: autoAccept ? 'accepted' : 'checkpointed',
802
+ checkpoint_commit: result.checkpoint_commit,
803
+ files_changed: result.files_changed || [],
804
+ evidence_refs: result.evidence || [],
805
+ level: reviewLevel,
806
+ blocked_reason: null,
807
+ unblock_condition: null,
808
+ debug_context: null,
809
+ };
810
+ const phasePatch = { id: phase.id, todo: [taskPatch] };
811
+ // done is auto-recomputed by update() — no manual increment needed
812
+
813
+ // Bundle evidence into the same atomic persist to prevent inconsistency
814
+ const evidenceUpdates = {};
815
+ for (const ev of (result.evidence || [])) {
816
+ if (ev && typeof ev === 'object' && typeof ev.id === 'string' && typeof ev.scope === 'string') {
817
+ evidenceUpdates[ev.id] = ev;
818
+ }
819
+ }
820
+
756
821
  const persistError = await persist(basePath, {
757
822
  workflow_mode,
758
823
  current_task: null,
759
824
  current_review,
760
825
  decisions,
761
- phases: [{
762
- id: phase.id,
763
- todo: [{
764
- id: task.id,
765
- lifecycle: 'checkpointed',
766
- checkpoint_commit: result.checkpoint_commit,
767
- files_changed: result.files_changed || [],
768
- evidence_refs: result.evidence || [],
769
- level: reviewLevel,
770
- blocked_reason: null,
771
- unblock_condition: null,
772
- debug_context: null,
773
- }],
774
- }],
826
+ phases: [phasePatch],
827
+ ...(Object.keys(evidenceUpdates).length > 0 ? { evidence: evidenceUpdates } : {}),
775
828
  });
776
829
  if (persistError) return persistError;
777
830
 
778
- // Store structured evidence entries
779
- for (const ev of (result.evidence || [])) {
780
- if (ev && typeof ev === 'object' && typeof ev.id === 'string' && typeof ev.scope === 'string') {
781
- await addEvidence({ id: ev.id, data: ev, basePath });
782
- }
783
- }
784
-
785
- // Auto-accept: L0 tasks or tasks with review_required: false
786
- const autoAccept = isL0 || task.review_required === false;
787
- if (autoAccept) {
788
- const acceptError = await persist(basePath, {
789
- phases: [{
790
- id: phase.id,
791
- done: (phase.done || 0) + 1,
792
- todo: [{ id: task.id, lifecycle: 'accepted' }],
793
- }],
794
- });
795
- if (acceptError) return acceptError;
796
- }
797
-
798
831
  return {
799
832
  success: true,
800
833
  action: current_review ? 'dispatch_reviewer' : 'continue_execution',
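The rewritten checkpoint branch decides the target lifecycle and collects structured evidence before a single `persist` call. A condensed sketch of just that decision; `planCheckpoint` is a hypothetical helper, and the real code also sets review state, commit and file fields that are omitted here:

```javascript
// Hypothetical condensed version of the decision made in the hunk above.
function planCheckpoint(task, reviewLevel, evidence) {
  const autoAccept = reviewLevel === 'L0' || task.review_required === false;
  const evidenceUpdates = {};
  for (const ev of evidence || []) {
    // Only structured entries with string id and scope are stored.
    if (ev && typeof ev === 'object' && typeof ev.id === 'string' && typeof ev.scope === 'string') {
      evidenceUpdates[ev.id] = ev;
    }
  }
  return {
    lifecycle: autoAccept ? 'accepted' : 'checkpointed',
    // Add the evidence key only when there is something to merge, as the diff does.
    ...(Object.keys(evidenceUpdates).length > 0 ? { evidence: evidenceUpdates } : {}),
  };
}

const reviewed = planCheckpoint(
  { id: '2.3', review_required: true },
  'L2',
  [{ id: 'ev:test:users-update', scope: 'task:2.3' }, 'legacy-string'],
);
const trivial = planCheckpoint({ id: '1.1' }, 'L0', []);
console.log(reviewed, trivial);
```

Writing everything in one update means a crash between the old two persists can no longer leave a task `checkpointed` with its evidence missing or its `accepted` state unwritten.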
@@ -841,7 +874,7 @@ export async function handleExecutorResult({ result, basePath = process.cwd() }
841
874
  const retry_count = (task.retry_count || 0) + 1;
842
875
  const error_fingerprint = typeof result.error_fingerprint === 'string' && result.error_fingerprint.length > 0
843
876
  ? result.error_fingerprint
844
- : result.summary.slice(0, 80);
877
+ : buildErrorFingerprint(result);
845
878
  const shouldDebug = retry_count >= MAX_DEBUG_RETRY;
846
879
  const current_review = shouldDebug
847
880
  ? {
@@ -989,8 +1022,6 @@ export async function handleReviewerResult({ result, basePath = process.cwd() }
989
1022
  }
990
1023
 
991
1024
  const taskPatches = [];
992
- let doneIncrement = 0;
993
- let doneDecrement = 0;
994
1025
 
995
1026
  // Accept tasks
996
1027
  for (const taskId of (result.accepted_tasks || [])) {
@@ -998,7 +1029,6 @@ export async function handleReviewerResult({ result, basePath = process.cwd() }
998
1029
  if (!task) continue;
999
1030
  if (task.lifecycle === 'checkpointed') {
1000
1031
  taskPatches.push({ id: taskId, lifecycle: 'accepted' });
1001
- doneIncrement += 1;
1002
1032
  }
1003
1033
  }
1004
1034
 
@@ -1007,19 +1037,10 @@ export async function handleReviewerResult({ result, basePath = process.cwd() }
1007
1037
  const task = getTaskById(phase, taskId);
1008
1038
  if (!task) continue;
1009
1039
  if (task.lifecycle === 'checkpointed' || task.lifecycle === 'accepted') {
1010
- if (task.lifecycle === 'accepted') doneDecrement += 1;
1011
1040
  taskPatches.push({ id: taskId, lifecycle: 'needs_revalidation', evidence_refs: [] });
1012
1041
  }
1013
1042
  }
1014
1043
 
1015
- // Snapshot accepted task IDs before propagation (for done counter adjustment).
1016
- // Note: rework_tasks patches above are NOT yet applied in-memory, so tasks demoted
1017
- // by the rework loop are still 'accepted' here. The guard below
1018
- // `!taskPatches.some(p => p.id === task.id)` prevents double-counting.
1019
- const acceptedBeforePropagation = new Set(
1020
- (phase.todo || []).filter(t => t.lifecycle === 'accepted').map(t => t.id),
1021
- );
1022
-
1023
1044
  // Propagation for critical issues with invalidates_downstream
1024
1045
  for (const issue of (result.critical_issues || [])) {
1025
1046
  if (issue.invalidates_downstream && issue.task_id) {
@@ -1031,18 +1052,15 @@ export async function handleReviewerResult({ result, basePath = process.cwd() }
1031
1052
  for (const task of (phase.todo || [])) {
1032
1053
  if (task.lifecycle === 'needs_revalidation' && !taskPatches.some((p) => p.id === task.id)) {
1033
1054
  taskPatches.push({ id: task.id, lifecycle: 'needs_revalidation', evidence_refs: [] });
1034
- if (acceptedBeforePropagation.has(task.id)) {
1035
- doneDecrement += 1;
1036
- }
1037
1055
  }
1038
1056
  }
1039
1057
 
1040
1058
  const hasCritical = (result.critical_issues || []).length > 0;
1041
1059
  const reviewStatus = hasCritical ? 'rework_required' : 'accepted';
1042
1060
 
1061
+ // done is auto-recomputed by update() — no manual tracking needed
1043
1062
  const phaseUpdates = {
1044
1063
  id: phase.id,
1045
- done: Math.max(0, (phase.done || 0) + doneIncrement - doneDecrement),
1046
1064
  phase_review: {
1047
1065
  status: reviewStatus,
1048
1066
  ...(hasCritical
@@ -1063,20 +1081,22 @@ export async function handleReviewerResult({ result, basePath = process.cwd() }
1063
1081
 
1064
1082
  const workflowMode = 'executing_task';
1065
1083
 
1084
+ // Bundle evidence into the same atomic persist
1085
+ const evidenceUpdates = {};
1086
+ for (const ev of (result.evidence || [])) {
1087
+ if (ev && typeof ev === 'object' && typeof ev.id === 'string' && typeof ev.scope === 'string') {
1088
+ evidenceUpdates[ev.id] = ev;
1089
+ }
1090
+ }
1091
+
1066
1092
  const persistError = await persist(basePath, {
1067
1093
  workflow_mode: workflowMode,
1068
1094
  current_review: null,
1069
1095
  phases: [phaseUpdates],
1096
+ ...(Object.keys(evidenceUpdates).length > 0 ? { evidence: evidenceUpdates } : {}),
1070
1097
  });
1071
1098
  if (persistError) return persistError;
1072
1099
 
1073
- // Store evidence entries if provided
1074
- for (const ev of (result.evidence || [])) {
1075
- if (ev && typeof ev === 'object' && typeof ev.id === 'string' && typeof ev.scope === 'string') {
1076
- await addEvidence({ id: ev.id, data: ev, basePath });
1077
- }
1078
- }
1079
-
1080
1100
  return {
1081
1101
  success: true,
1082
1102
  action: hasCritical ? 'rework_required' : 'review_accepted',
@@ -42,6 +42,16 @@ export function setLockPath(lockPath) {
42
42
  _fileLockPath = lockPath;
43
43
  }
44
44
 
45
+ /**
46
+ * Ensure _fileLockPath is set from a known state path.
47
+ * Must be called before withStateLock in all mutation paths.
48
+ */
49
+ function ensureLockPathFromStatePath(statePath) {
50
+ if (!_fileLockPath && statePath) {
51
+ _fileLockPath = join(dirname(statePath), 'state.lock');
52
+ }
53
+ }
54
+
45
55
  function withStateLock(fn) {
46
56
  const p = _mutationQueue.then(() => {
47
57
  if (_fileLockPath) {
@@ -80,6 +90,7 @@ export async function init({ project, phases, research, force = false, basePath
80
90
  }
81
91
  const gsdDir = join(basePath, '.gsd');
82
92
  const statePath = join(gsdDir, 'state.json');
93
+ ensureLockPathFromStatePath(statePath);
83
94
 
84
95
  return withStateLock(async () => {
85
96
  // Guard: reject re-initialization unless force is set
@@ -129,6 +140,9 @@ export async function init({ project, phases, research, force = false, basePath
129
140
  ...state.phases.map((phase) => join(phasesDir, `phase-${phase.id}.md`)),
130
141
  ];
131
142
  const mtimes = await Promise.all(trackedFiles.map(async (filePath) => (await stat(filePath)).mtimeMs));
143
+ // Math.ceil is required: mtimeMs has sub-millisecond precision (float), but
144
+ // Date.toISOString() truncates to milliseconds. Without ceil, the stored timestamp
145
+ // can be slightly less than the file's actual mtime, causing false plan-drift detection.
132
146
  state.context.last_session = new Date(Math.ceil(Math.max(...mtimes))).toISOString();
133
147
  await writeJson(statePath, state);
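The comment added above explains why the maximum mtime is rounded up before it is stored. A short demonstration with a made-up fractional timestamp (the value is illustrative, not taken from a real file):

```javascript
// A float mtime such as fs.Stats.mtimeMs can carry a fractional millisecond.
const mtimeMs = 1700000000000.25;

// Date drops the fraction, so the stored value lands before the real mtime.
const truncated = new Date(mtimeMs).getTime();
// Rounding up first keeps the stored value at or after the real mtime.
const ceiled = new Date(Math.ceil(mtimeMs)).getTime();

console.log(truncated < mtimeMs);
console.log(ceiled >= mtimeMs);
```

A later comparison of the form "file mtime is newer than last_session" would be true for the truncated value even though the file was never touched, which matches the false plan-drift case the comment describes.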
134
148
 
@@ -195,8 +209,7 @@ export async function update({ updates, basePath = process.cwd() } = {}) {
195
209
  if (!statePath) {
196
210
  return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: 'No .gsd directory found' };
197
211
  }
198
- // C-2: Initialize cross-process lock path on first mutation
199
- if (!_fileLockPath) _fileLockPath = join(dirname(statePath), 'state.lock');
212
+ ensureLockPathFromStatePath(statePath);
200
213
 
201
214
  return withStateLock(async () => {
202
215
  const result = await readJson(statePath);
@@ -242,6 +255,12 @@ export async function update({ updates, basePath = process.cwd() } = {}) {
242
255
 
243
256
  // Deep merge phases by ID instead of shallow replace [I-1]
244
257
  const merged = { ...state, ...updates };
258
+
259
+ // Deep merge evidence by key (preserves existing entries)
260
+ if (updates.evidence && isPlainObject(updates.evidence)) {
261
+ merged.evidence = { ...(state.evidence || {}), ...updates.evidence };
262
+ }
263
+
245
264
  if (updates.phases && Array.isArray(updates.phases)) {
246
265
  merged.phases = state.phases.map(oldPhase => {
247
266
  const newPhase = updates.phases.find(p => p.id === oldPhase.id);
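The evidence merge added above is needed because `{ ...state, ...updates }` replaces the whole `evidence` object. A minimal sketch of the difference; `mergeEvidence` is a hypothetical wrapper and `isPlainObject` is a simplified stand-in for the package's own helper:

```javascript
// Simplified stand-in for the package's isPlainObject helper (assumption).
function isPlainObject(v) {
  return v !== null && typeof v === 'object' && !Array.isArray(v);
}

// Hypothetical wrapper around the merge logic shown in the hunk above.
function mergeEvidence(state, updates) {
  const merged = { ...state, ...updates };
  if (updates.evidence && isPlainObject(updates.evidence)) {
    // Merge by key so existing entries survive the update.
    merged.evidence = { ...(state.evidence || {}), ...updates.evidence };
  }
  return merged;
}

const before = { evidence: { 'ev:lint:phase-1': { id: 'ev:lint:phase-1', scope: 'task:1.2' } } };
const patch = { evidence: { 'ev:test:phase-2': { id: 'ev:test:phase-2', scope: 'task:2.3' } } };

const shallow = { ...before, ...patch };
const deep = mergeEvidence(before, patch);
console.log(Object.keys(shallow.evidence), Object.keys(deep.evidence));
```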
@@ -276,6 +295,21 @@ export async function update({ updates, basePath = process.cwd() } = {}) {
276
295
  }
277
296
  }
278
297
 
298
+ // Recompute `done` from actual accepted tasks (prevents counter drift)
299
+ if (updates.phases && Array.isArray(updates.phases)) {
300
+ for (const phase of merged.phases) {
301
+ if (Array.isArray(phase.todo)) {
302
+ phase.done = phase.todo.filter(t => t.lifecycle === 'accepted').length;
303
+ }
304
+ }
305
+ }
306
+
307
+ // Auto-prune evidence when entries exceed limit
308
+ if (merged.evidence && Object.keys(merged.evidence).length > MAX_EVIDENCE_ENTRIES) {
309
+ const gsdDir = dirname(statePath);
310
+ await _pruneEvidenceFromState(merged, merged.current_phase, gsdDir);
311
+ }
312
+
279
313
  // Use incremental validation for simple updates (no phases changes)
280
314
  const validation = !updates.phases
281
315
  ? validateStateUpdate(state, updates)
@@ -336,11 +370,12 @@ export async function phaseComplete({
336
370
  if (!statePath) {
337
371
  return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: 'No .gsd directory found' };
338
372
  }
373
+ ensureLockPathFromStatePath(statePath);
339
374
 
340
375
  return withStateLock(async () => {
341
376
  const result = await readJson(statePath);
342
377
  if (!result.ok) {
343
- return { error: true, message: result.error };
378
+ return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: result.error };
344
379
  }
345
380
  const state = result.data;
346
381
 
@@ -484,11 +519,12 @@ export async function addEvidence({ id, data, basePath = process.cwd() }) {
484
519
  if (!statePath) {
485
520
  return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: 'No .gsd directory found' };
486
521
  }
522
+ ensureLockPathFromStatePath(statePath);
487
523
 
488
524
  return withStateLock(async () => {
489
525
  const result = await readJson(statePath);
490
526
  if (!result.ok) {
491
- return { error: true, message: result.error };
527
+ return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: result.error };
492
528
  }
493
529
  const state = result.data;
494
530
 
@@ -563,11 +599,12 @@ export async function pruneEvidence({ currentPhase, basePath = process.cwd() })
563
599
  if (!statePath) {
564
600
  return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: 'No .gsd directory found' };
565
601
  }
602
+ ensureLockPathFromStatePath(statePath);
566
603
 
567
604
  return withStateLock(async () => {
568
605
  const result = await readJson(statePath);
569
606
  if (!result.ok) {
570
- return { error: true, message: result.error };
607
+ return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: result.error };
571
608
  }
572
609
  const state = result.data;
573
610
 
@@ -831,16 +868,27 @@ export function reclassifyReviewLevel(task, executorResult) {
831
868
  const MIN_TOKEN_LENGTH = 2;
832
869
  const MIN_OVERLAP = 2;
833
870
 
871
+ // High-frequency words too generic for meaningful keyword matching
872
+ const STOPWORDS = new Set([
873
+ 'the', 'and', 'for', 'with', 'this', 'that', 'from', 'have', 'not',
874
+ 'but', 'are', 'was', 'been', 'will', 'can', 'should', 'would', 'could',
875
+ 'use', 'using', 'need', 'needs', 'into', 'also', 'when', 'then',
876
+ 'than', 'more', 'some', 'does', 'did', 'its', 'has', 'all', 'any',
877
+ 'error', 'data', 'type', 'value', 'file', 'code', 'function',
878
+ 'return', 'null', 'true', 'false', 'undefined', 'object', 'string',
879
+ 'number', 'array', 'list', 'map', 'set', 'key', 'name',
880
+ ]);
881
+
834
882
  /**
835
883
  * Tokenize a string into lowercase tokens, splitting on whitespace and punctuation.
836
- * Filters out short tokens (< MIN_TOKEN_LENGTH).
884
+ * Filters out short tokens (< MIN_TOKEN_LENGTH) and stopwords.
837
885
  */
838
886
  function tokenize(text) {
839
887
  if (!text) return [];
840
888
  return text
841
889
  .toLowerCase()
842
890
  .split(/[\s,.:;!?()[\]{}<>/\\|@#$%^&*+=~`'",。:;!?()【】、]+/)
843
- .filter(t => t.length >= MIN_TOKEN_LENGTH);
891
+ .filter(t => t.length >= MIN_TOKEN_LENGTH && !STOPWORDS.has(t));
844
892
  }
845
893
 
846
894
  /**
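The stopword filter added in this hunk can be tried in isolation. The following is a minimal sketch, assuming a much shorter stopword set than the one in the diff and an ASCII-only separator pattern; `overlapCount` is a hypothetical helper added only to show how filtered tokens might feed an overlap check, and it is not taken from the package.

```javascript
// Sketch only: a reduced stopword set; the list in the diff above is longer.
const MIN_TOKEN_LENGTH = 2;
const STOPWORDS = new Set(['the', 'and', 'for', 'error', 'data', 'file', 'code']);

// Lowercase, split on whitespace/punctuation, drop short tokens and stopwords.
function tokenize(text) {
  if (!text) return [];
  return text
    .toLowerCase()
    .split(/[\s,.:;!?()[\]{}<>/\\|@#$%^&*+=~`'"]+/)
    .filter(t => t.length >= MIN_TOKEN_LENGTH && !STOPWORDS.has(t));
}

// Hypothetical helper: number of distinct tokens shared by two strings.
function overlapCount(a, b) {
  const left = new Set(tokenize(a));
  return new Set(tokenize(b).filter(t => left.has(t))).size;
}

console.log(tokenize('Fix the error in the auth token refresh'));
console.log(overlapCount('auth token refresh fails', 'refresh the auth token'));
```

Because generic words such as `error` or `the` no longer count as tokens, two texts need to share more specific vocabulary before an overlap threshold like `MIN_OVERLAP` is reached.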
@@ -949,11 +997,7 @@ export async function storeResearch({ result, artifacts, decision_index, basePat
949
997
  return { error: true, code: ERROR_CODES.VALIDATION_FAILED, message: `Invalid researcher result: ${resultValidation.errors.join('; ')}` };
950
998
  }
951
999
 
952
- const artifactsValidation = validateResearchArtifacts(artifacts, {
953
- decisionIds: result.decision_ids,
954
- volatility: result.volatility,
955
- expiresAt: result.expires_at,
956
- });
1000
+ const artifactsValidation = validateResearchArtifacts(artifacts);
957
1001
  if (!artifactsValidation.valid) {
958
1002
  return { error: true, code: ERROR_CODES.VALIDATION_FAILED, message: `Invalid research artifacts: ${artifactsValidation.errors.join('; ')}` };
959
1003
  }
@@ -967,11 +1011,12 @@ export async function storeResearch({ result, artifacts, decision_index, basePat
967
1011
  if (!statePath) {
968
1012
  return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: 'No .gsd directory found' };
969
1013
  }
1014
+ ensureLockPathFromStatePath(statePath);
970
1015
 
971
1016
  return withStateLock(async () => {
972
1017
  const current = await readJson(statePath);
973
1018
  if (!current.ok) {
974
- return { error: true, message: current.error };
1019
+ return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: current.error };
975
1020
  }
976
1021
 
977
1022
  const state = current.data;
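With `code` now present on these read-failure results, a caller could branch on it instead of parsing `message`. The sketch below illustrates that idea only; the string values in `ERROR_CODES` and the `describeFailure` helper are assumptions, since the package's real error-code map is not shown in this diff.

```javascript
// Hypothetical values: the real ERROR_CODES map is not visible in this diff.
const ERROR_CODES = {
  NO_PROJECT_DIR: 'NO_PROJECT_DIR',
  VALIDATION_FAILED: 'VALIDATION_FAILED',
};

// Hypothetical caller-side helper: classify a result by its `code` field.
function describeFailure(result) {
  if (!result || !result.error) return 'ok';
  switch (result.code) {
    case ERROR_CODES.NO_PROJECT_DIR:
      return `project state unavailable: ${result.message}`;
    case ERROR_CODES.VALIDATION_FAILED:
      return `invalid input: ${result.message}`;
    default:
      // Results without a code (the pre-0.5.0 shape) fall through to here.
      return `unclassified failure: ${result.message}`;
  }
}

console.log(describeFailure({ error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: 'ENOENT' }));
console.log(describeFailure({ error: true, message: 'ENOENT' }));
```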
@@ -1026,6 +1071,13 @@ export async function storeResearch({ result, artifacts, decision_index, basePat
1026
1071
  state.workflow_mode = inferWorkflowModeAfterResearch(state);
1027
1072
  }
1028
1073
 
1074
+ // Recompute done after applyResearchRefresh may have invalidated tasks
1075
+ for (const phase of (state.phases || [])) {
1076
+ if (Array.isArray(phase.todo)) {
1077
+ phase.done = phase.todo.filter(t => t.lifecycle === 'accepted').length;
1078
+ }
1079
+ }
1080
+
1029
1081
  const validation = validateState(state);
1030
1082
  if (!validation.valid) {
1031
1083
  return { error: true, code: ERROR_CODES.VALIDATION_FAILED, message: `State validation failed: ${validation.errors.join('; ')}` };
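The recompute step added above derives `done` from the task list rather than trusting a stored counter. A minimal standalone sketch follows, assuming a simplified state shape; the `needs_revalidation` lifecycle value and the `recomputeDone` wrapper are illustrative names, not confirmed by this diff.

```javascript
// Sketch: recompute each phase's `done` counter from its tasks.
// Phases without a `todo` array are left untouched, as in the diff.
function recomputeDone(state) {
  for (const phase of (state.phases || [])) {
    if (Array.isArray(phase.todo)) {
      phase.done = phase.todo.filter(t => t.lifecycle === 'accepted').length;
    }
  }
  return state;
}

// Example: a stale counter of 2 after one task was invalidated.
// 'needs_revalidation' is a hypothetical lifecycle value for illustration.
const exampleState = {
  phases: [
    {
      done: 2,
      todo: [
        { id: '1.1', lifecycle: 'accepted' },
        { id: '1.2', lifecycle: 'needs_revalidation' },
      ],
    },
    { done: 0 },
  ],
};

recomputeDone(exampleState);
console.log(exampleState.phases.map(p => p.done));
```

Running the recompute before validation means a counter that went stale during the research refresh is corrected before `validateState` inspects it.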
@@ -0,0 +1,147 @@
1
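+ # Shared Execution Flow (STEP 5-12)
2
+
3
+ > Referenced by both start.md and prd.md. Editing this file updates both entry points.
4
+
5
+ ## STEP 5: Smart Research Decision
6
+
7
+ Decide whether research is needed:
8
+ ```
9
+ ├── New project → research required
10
+ ├── Involves a new tech stack → research required
11
+ ├── Simple bug fix / small feature → skip research
12
+ ├── .gsd/research/ exists and has not expired → skip research
13
+ ├── User explicitly requests it → research
14
+ └── Research exists but the requirements have changed direction → incremental research (new direction only)
15
+ ```
16
+
17
+ When research is needed:
18
+ 1. Dispatch the `researcher` sub-agent (fresh context)
19
+ 2. Write the research output to `.gsd/research/` (STACK.md, ARCHITECTURE.md, PITFALLS.md, SUMMARY.md)
20
+ 3. Show the user the key findings: tech stack recommendations + pitfall warnings + ⭐ recommended approach
21
+
22
+ When research is not needed: skip it and go straight to STEP 6.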
23
+
24
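+ ## STEP 6: Deep Thinking
25
+
26
+ If the `sequential-thinking` MCP is available → call it for in-depth thinking:
27
+ - Input: requirements summary + codebase analysis + research results (if any)
28
+ - Purpose: systematic architectural thinking before the plan is generated
29
+
30
+ If the `sequential-thinking` MCP is not available → fall back to inline thinking and continue.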
31
+
32
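+ ## STEP 7: Generate the Phased Plan
33
+
34
+ Generate plan.md + phases/*.md:
35
+ - A **phase** handles management and acceptance; a **task** handles execution
36
+ - Keep each phase to **5-8 tasks** (makes phase-level closeout easier)
37
+ - Each task = one atomic todo (with files, operations, and verification conditions)
38
+ - Add metadata to each task:
39
+   - `requires`: dependency list (including gate type)
40
+   - `review_required`: whether review is needed
41
+   - `research_basis`: the research decision id it references
42
+ - Set the review level according to impact:
43
+   - **L0**: no runtime semantic change (docs/config/style)
44
+   - **L1**: ordinary coding task (default)
45
+   - **L2**: high risk (auth/payment/public API/DB migration/core architecture)
46
+ - Mark parallelizable task groups with `[PARALLEL]` (currently only a marker for a future upgrade)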
47
+
48
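+ ## STEP 8: Plan Self-Review
49
+
50
+ Lightweight self-review (performed by the orchestrator itself; no sub-agent is dispatched):
51
+
52
+ ### Basic review (all projects)
53
+ - [ ] Are any requirements missing?
54
+ - [ ] Is the phase breakdown reasonable? (split any phase that is too large)
55
+ - [ ] Are the task dependencies correct?
56
+ - [ ] Are the verification conditions executable?
57
+
58
+ ### Enhanced review (high-risk projects)
59
+
60
+ Trigger: the project involves auth / payment / security / public API / DB migration / core architecture changes
61
+
62
+ Dimensions:
63
+ 1. **Requirements coverage:** Does every point of the original requirements map to at least one task?
64
+ 2. **Risk ordering:** Are high-risk tasks scheduled first? (fail-fast principle)
65
+ 3. **Dependency safety:** Do all downstream tasks of an L2 task use `gate:accepted`?
66
+ 4. **Verification sufficiency:** Does every task involving auth/payment have explicit security verification conditions?
67
+ 5. **Pitfall avoidance:** Does every pitfall in `research/PITFALLS.md` have a corresponding defensive task or verification condition?
68
+
69
+ Output: `pass` / `revise` (with concrete revision suggestions)
70
+ Rounds: at most 2 rounds of self-review revision; if problems remain after 2 rounds → flag the risks and show them to the user
71
+
72
+ → Show the plan to the user only after the self-review revisions.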
73
+
74
+ <HARD-GATE id="plan-confirmation">
75
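+ ## STEP 9: User Confirms the Plan
76
+
77
+ Show the plan to the user and wait for confirmation:
78
+ - User points out problems → adjust the plan → show it again
79
+ - User confirms → continue
80
+
81
+ ⛔ Do not execute STEP 10-12 before the user confirms. No confirmation = no files written, no code executed.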
82
+ </HARD-GATE>
83
+
84
+ <HARD-GATE id="docs-written">
85
+ ## STEP 10: Generate Documents
86
+
87
+ 1. Call the `state-init` MCP tool to initialize the project:
88
+ ```
89
+ state-init({
90
+   project: "<project name>",
91
+   phases: [
92
+     {
93
+       name: "<phase name>",
94
+       tasks: [
95
+         { name: "<task name>", level: "L1", review_required: true, requires: [] },
96
+         ...
97
+       ]
98
+     },
99
+     ...
100
+   ]
101
+ })
102
+ ```
103
+ ⚠️ You must use the `state-init` MCP tool; never hand-write state.json. The tool auto-generates id/lifecycle/phase_review/phase_handoff and has built-in schema validation and circular-dependency detection.
104
+ 2. Write `plan.md`: the project overview index (no task-level detail)
105
+ 3. Write `phases/*.md`: detailed task specs for each phase (source of truth)
106
+ 4. If research was done: confirm that `.gsd/research/` has been written
107
+
108
+ Rules:
109
+ - `plan.md` is a read-only index: do not modify it after generation (except on replan)
110
+ - `phases/*.md` is the single source of truth for task specs
111
+ - `plan.md` contains no task-level detail, to avoid duplicating `phases/*.md`
112
+
113
+ □ state-init call succeeded (returned success: true)
114
+ □ plan.md written
115
+ □ phases/*.md written (one file per phase)
116
+ □ Every task has lifecycle / level / requires / review_required
117
+ → Continue only when all of the above are satisfied
118
+ </HARD-GATE>
119
+
120
+ ## STEP 11: Automatic Execution Main Path
121
+
122
+ Enter the main execution loop. phase = management boundary, task = execution boundary.
123
+
124
+ <execution_loop>
125
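+ See `references/execution-loop.md` for the full 9-step execution loop spec (11.1-11.9) and the dependency gate semantics.
126
+
127
+ The orchestrator must follow the step order in that reference document strictly:
128
+ Load phase → select task → build context → dispatch executor → handle result → review → phase handoff → batch update → context check
129
+
130
+ **Automatic execution loop:** Once execution starts, keep looping until a termination condition is met:
131
+ 1. Call `orchestrator-resume` to get the action
132
+ 2. Dispatch the sub-agent that matches the action (executor/reviewer/researcher/debugger)
133
+ 3. After receiving the result, call the matching `orchestrator-handle-*-result`
134
+ 4. Go back to step 1
135
+ 5. Terminate: action ∈ {idle, awaiting_user, completed, failed, await_manual_intervention}
136
+
137
+ Do not stop in the middle of the loop to wait for user confirmation; let the orchestrator drive. `complete_phase` action → call the `phase-complete` MCP tool → automatically advance to the next phase.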
138
+ </execution_loop>
139
+
140
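+ ## STEP 12: Final Report
141
+
142
+ After all phases are complete, output the final report:
143
+ - Project summary
144
+ - Completion status of each phase
145
+ - Summary of key decisions
146
+ - Summary of verification evidence
147
+ - Open issues / follow-up suggestions (if any)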
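
The automatic execution loop described in STEP 11 can be pictured as a small driver. This is a sketch under stated assumptions, not the package's orchestrator: `resume`, `dispatch`, and `handleResult` are hypothetical callbacks standing in for `orchestrator-resume`, sub-agent dispatch, and `orchestrator-handle-*-result`, and the non-terminal action names in the demo are invented. Only the terminal action set is taken from the document.

```javascript
// Terminal actions as listed in STEP 11.
const TERMINAL_ACTIONS = new Set([
  'idle', 'awaiting_user', 'completed', 'failed', 'await_manual_intervention',
]);

// Hypothetical driver: resume → dispatch → handle result → repeat.
// maxIterations is an added safety bound, not part of the documented flow.
async function runLoop({ resume, dispatch, handleResult, maxIterations = 100 }) {
  for (let i = 0; i < maxIterations; i++) {
    const { action, payload } = await resume();
    if (TERMINAL_ACTIONS.has(action)) return action;
    const result = await dispatch(action, payload);
    await handleResult(action, result);
  }
  throw new Error('execution loop exceeded maxIterations');
}

// Demo with scripted actions; 'dispatch_executor' and 'dispatch_reviewer'
// are invented names used only for this illustration.
const scripted = ['dispatch_executor', 'dispatch_reviewer', 'completed'];
const handled = [];
runLoop({
  resume: async () => ({ action: scripted.shift(), payload: null }),
  dispatch: async (action) => ({ from: action }),
  handleResult: async (action) => { handled.push(action); },
}).then(finalAction => console.log(finalAction, handled));
```

The loop never pauses for user input on its own; it returns only when the resume callback reports one of the terminal actions, which mirrors the rule that the orchestrator drives execution.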