
gsd-lite 0.4.1 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

package/.claude-plugin/marketplace.json +1 -1
package/.claude-plugin/plugin.json +1 -1
package/agents/executor.md +4 -1
package/agents/reviewer.md +5 -2
package/commands/prd.md +2 -92
package/commands/resume.md +74 -0
package/commands/start.md +2 -127
package/hooks/gsd-statusline.cjs +2 -2
package/package.json +1 -1
package/src/schema.js +9 -15
package/src/tools/orchestrator.js +84 -64
package/src/tools/state.js +65 -13
package/workflows/execution-flow.md +147 -0

package/.claude-plugin/marketplace.json CHANGED

@@ -13,7 +13,7 @@
       "name": "gsd",
       "source": "./",
       "description": "AI orchestration tool — GSD management shell + Superpowers quality core. 5 commands, 4 agents, 5 workflows, MCP server, context monitoring.",
-      "version": "0.4.1",
+      "version": "0.5.0",
       "keywords": [
         "orchestration",
         "mcp",

package/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "gsd",
-  "version": "0.4.1",
+  "version": "0.5.0",
   "description": "AI orchestration tool for Claude Code — GSD management shell + Superpowers quality core",
   "author": {
     "name": "sdsrss",

package/agents/executor.md CHANGED

@@ -55,7 +55,10 @@ tools: Read, Write, Edit, Bash, Grep, Glob
   "decisions": ["[DECISION] use optimistic locking by version column"],
   "blockers": [],
   "contract_changed": true,
-  "evidence": ["ev:test:users-update", "ev:typecheck:phase-2"]
+  "evidence": [
+    {"id": "ev:test:users-update", "scope": "task:2.3"},
+    {"id": "ev:typecheck:phase-2", "scope": "task:2.3"}
+  ]
 }
 `contract_changed` decision guide:
 - Changed a function/method signature (parameters, return type) → true

package/agents/reviewer.md CHANGED

@@ -92,7 +92,7 @@ Minor = fix recommended (naming/style)
 ```json
 {
   "scope": "task | phase",
-  "scope_id": "2.3 | phase-2",
+  "scope_id": "2.3 (task scope: string ID) | 2 (phase scope: number ID)",
   "review_level": "L2 | L1-batch",
   "spec_passed": true,
   "quality_passed": false,
@@ -107,7 +107,10 @@ Minor = fix recommended (naming/style)
   "minor_issues": [],
   "accepted_tasks": [],
   "rework_tasks": ["2.3", "2.4"],
-  "evidence": ["ev:test:phase-2", "ev:lint:phase-2"]
+  "evidence": [
+    {"id": "ev:test:phase-2", "scope": "task:2.3"},
+    {"id": "ev:lint:phase-2", "scope": "task:2.3"}
+  ]
 }
 ```
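
Both agent docs now emit evidence as `{id, scope}` objects instead of bare strings. A minimal sketch of how a consumer might index such entries by id, mirroring the guard that appears in the `orchestrator.js` hunks of this diff; the helper name `collectEvidence` is hypothetical and not part of the package.

```javascript
// Hypothetical helper (not in gsd-lite): keep only structured evidence entries
// whose id and scope are strings, and index them by id.
function collectEvidence(evidence) {
  const byId = {};
  for (const ev of evidence || []) {
    if (ev && typeof ev === 'object' && typeof ev.id === 'string' && typeof ev.scope === 'string') {
      byId[ev.id] = ev;
    }
  }
  return byId;
}

const collected = collectEvidence([
  { id: 'ev:test:users-update', scope: 'task:2.3' },
  'ev:typecheck:phase-2', // 0.4.1-style string entry, skipped by the guard
]);
console.log(Object.keys(collected)); // [ 'ev:test:users-update' ]
```

Under this guard, old string-only entries would be dropped silently rather than rejected, which may matter for agents still emitting the 0.4.1 format.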

package/commands/prd.md CHANGED

@@ -51,99 +51,9 @@ argument-hint: File path to requirements doc, or inline description text
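 - Use the questioning techniques from references/questioning.md (if available)
 - After the user answers, ask follow-up questions as needed until the requirements are clear
-<!-- STEP 5-12 below are the same as in start.md -->
+<!-- STEP 5-12: shared execution flow — editing workflows/execution-flow.md keeps all entry points in sync -->
-## STEP 5: Decide intelligently whether research is needed
-- New project / involves a new tech stack → research required
-- Simple bug fix / existing research that has not expired → skip
-- User explicitly requests it → research
-- When needed → dispatch the researcher sub-agent → present key findings
-- Not needed → skip and move to the next step
-## STEP 6: Deep thinking
-- If the sequential-thinking MCP is available → invoke it for deep thinking
-- Otherwise skip; the flow is unaffected
-## STEP 7: Generate the phased plan
-- phases handle management and acceptance; tasks handle execution
-- Keep each phase to 5-8 tasks (makes phase-level wrap-up easier)
-- Each task = an atomic todo (with files, operations, verification conditions)
-- Add metadata to each task: `requires` / `review_required` / `research_basis`
-- Review level is determined by impact: L0 (no runtime semantic change) / L1 (normal) / L2 (high risk)
-- Mark parallelizable task groups with [PARALLEL] (currently only a marker for future upgrades)
-## STEP 8: Plan self-review
-Lightweight replacement for plan-checker:
-- Check: are any requirements missing?
-- Check: is the phase breakdown reasonable? (split phases that are too large)
-- Check: are the task dependencies correct?
-- Check: are the verification conditions executable?
-- For high-risk projects → escalate to the enhanced plan review:
-<enhanced_plan_review>
-Trigger: involves auth / payment / security / public API / DB migration / core architecture changes
-Review dimensions:
-1. Requirement coverage: does every point of the original requirements map to at least one task?
-2. Risk ordering: are high-risk tasks scheduled first? (fail-fast principle)
-3. Dependency safety: do all downstream tasks of an L2 task use gate:accepted?
-4. Verification sufficiency: do all tasks involving auth/payment have explicit security verification conditions?
-5. Pitfall avoidance: does every pitfall in research/PITFALLS.md have a corresponding defensive task or verification condition?
-Output: pass / revise (with concrete revision suggestions)
-Rounds: at most 2 rounds of self-review revision; if issues remain after 2 rounds → flag the risks and show them to the user
-</enhanced_plan_review>
-→ Present to the user only after self-review revisions
-<HARD-GATE id="plan-confirmation">
-## STEP 9: Present the plan and wait for user confirmation
-- Present the complete phased plan
-- User points out problems → adjust → present again
-- User confirms → continue
-⛔ Do not run STEP 10-12 before the user confirms. No confirmation = no file writes, no code execution.
-</HARD-GATE>
-## STEP 10: Generate documents
-- Create the .gsd/ directory
-- Write state.json + plan.md + phases/*.md
-- Initialize `workflow_mode` / `current_task` / `current_review` / phase status and handoff info
-- If research was done: write it to .gsd/research/
-<HARD-GATE id="docs-written">
-□ state.json is written and contains all canonical fields
-□ plan.md is written
-□ phases/*.md are written (one file per phase)
-□ Every task has lifecycle / level / requires / review_required
-→ Continue only when all are satisfied
-</HARD-GATE>
-## STEP 11 — Automatic execution main path
-Enter the main execution loop. phase = management boundary, task = execution boundary.
-<execution_loop>
-See `references/execution-loop.md` for the full 9-step execution loop specification (11.1-11.9) and the dependency gate semantics.
-The orchestrator must strictly follow the step order in that reference document:
-load phase → select task → build context → dispatch executor → handle result → review → phase handoff → batch update → context check
-</execution_loop>
-## STEP 12 — All done
-After all phases complete, output the final report:
-- Project summary
-- Completion status of each phase
-- Summary of key decisions
-- Summary of verification evidence
-- Open issues / follow-up suggestions (if any)
+Use the Read tool to read `workflows/execution-flow.md` and strictly follow its STEP 5-12.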
 </process>

package/commands/resume.md CHANGED

@@ -212,6 +212,79 @@ description: Resume project execution from saved state with workspace validation
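 All displayed data is derived in real time from canonical fields; derived fields are not used.
+## STEP 5: Automatic execution loop
+<HARD-GATE id="auto-execution-loop">
+After STEP 3 completes the initial recovery, enter the automatic execution loop. This is the core of the orchestrator: do not stop at any step to wait for the user unless a termination condition is hit.
+```
+Loop entry:
+  1. Call the MCP tool `orchestrator-resume` to get an action
+  2. Dispatch on the action:
+     dispatch_executor:
+       → Use the Agent tool to dispatch the executor sub-agent (subagent_type: gsd:executor)
+       → Pass the executor_context returned by the orchestrator
+       → On receiving the executor result → call the MCP tool `orchestrator-handle-executor-result`
+       → Go back to step 1
+     dispatch_reviewer:
+       → Use the Agent tool to dispatch the reviewer sub-agent (subagent_type: gsd:reviewer)
+       → Pass review_targets / current_review
+       → On receiving the reviewer result → call the MCP tool `orchestrator-handle-reviewer-result`
+       → Go back to step 1
+     dispatch_researcher:
+       → Use the Agent tool to dispatch the researcher sub-agent (subagent_type: gsd:researcher)
+       → Pass the expired research info
+       → On receiving the researcher result → call the MCP tool `orchestrator-handle-researcher-result`
+       → Go back to step 1
+     dispatch_debugger:
+       → Use the Agent tool to dispatch the debugger sub-agent (subagent_type: gsd:debugger)
+       → Pass debug_target
+       → On receiving the debugger result → call the MCP tool `orchestrator-handle-debugger-result`
+       → Go back to step 1
+     trigger_review:
+       → Dispatch the reviewer directly; take scope and targets from the action response
+       → Go back to step 1
+     complete_phase:
+       → Call the MCP tool `phase-complete` with phase_id + run_verify: true
+       → Go back to step 1 (the orchestrator advances to the next phase automatically)
+     retry_executor:
+       → Call orchestrator-resume again to get the updated executor context
+       → Go back to step 1
+     rollback_to_dirty_phase:
+       → The orchestrator has already rolled back current_phase automatically; output a rollback notice
+       → Go back to step 1
+     continue_execution:
+       → Go straight back to step 1
+  3. Termination conditions — exit the loop on any of the following actions:
+     idle              → output "no runnable tasks", stop
+     awaiting_user     → show blockers / drift info, wait for user input
+     await_manual_intervention → show the information that needs manual intervention, stop
+     noop (completed)  → show the completion report, stop
+     await_recovery_decision (failed) → show the failure info and recovery options, stop
+  4. Context safety valve:
+     Check context health before each loop iteration
+     remaining <= 35% → save state + output "please /clear, then /gsd:resume" → exit the loop
+```
+**Key principles:**
+- The loop is continuous: dispatch → handle result → resume → dispatch → ...
+- Do not pause at intermediate steps to wait for user confirmation (unless a termination condition applies)
+- Resume immediately after each handle result and let the orchestrator decide the next step
+- After a phase review passes → complete_phase → advance to the next phase automatically → continue execution
+</HARD-GATE>
 </process>
 <EXTREMELY-IMPORTANT>
@@ -221,4 +294,5 @@ description: Resume project execution from saved state with workspace validation
 - In awaiting_user / reconcile_workspace / replan_required mode, do not execute code automatically
 - Only the orchestrator writes state.json; sub-agents never write it directly
 - Context < 35% → save state + workflow_mode = awaiting_clear + stop execution
+- **Once inside the automatic execution loop, do not stop mid-loop to wait for the user — let the orchestrator drive**
 </EXTREMELY-IMPORTANT>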
 </EXTREMELY-IMPORTANT>
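
The STEP 5 loop added to `resume.md` is written for an LLM orchestrator, but its control flow can be sketched as ordinary code. The sketch below is an illustration under assumptions, not code from the package: `tools.call`, `agents.run`, `contextRemaining`, `maxIterations`, and the `phase_id` field on the action response are all hypothetical, and `trigger_review` is assumed to use the same result handler as `dispatch_reviewer`.

```javascript
// Actions that end the loop, as listed under "Termination conditions".
const TERMINAL_ACTIONS = new Set([
  'idle', 'awaiting_user', 'await_manual_intervention', 'noop', 'await_recovery_decision',
]);

// action → [sub-agent type, MCP tool that receives the sub-agent's result]
const DISPATCH = {
  dispatch_executor: ['gsd:executor', 'orchestrator-handle-executor-result'],
  dispatch_reviewer: ['gsd:reviewer', 'orchestrator-handle-reviewer-result'],
  dispatch_researcher: ['gsd:researcher', 'orchestrator-handle-researcher-result'],
  dispatch_debugger: ['gsd:debugger', 'orchestrator-handle-debugger-result'],
  trigger_review: ['gsd:reviewer', 'orchestrator-handle-reviewer-result'], // assumption
};

// tools.call(name, args) and agents.run(type, input) are hypothetical adapters.
async function runLoop({ tools, agents, contextRemaining, maxIterations = 100 }) {
  for (let i = 0; i < maxIterations; i += 1) {
    // Context safety valve: checked before every iteration.
    if (contextRemaining() <= 35) return { stopped: 'context_low' };

    const step = await tools.call('orchestrator-resume', {});
    if (TERMINAL_ACTIONS.has(step.action)) return { stopped: step.action };

    const route = DISPATCH[step.action];
    if (route) {
      const [agentType, handlerTool] = route;
      const result = await agents.run(agentType, step);
      await tools.call(handlerTool, { result });
    } else if (step.action === 'complete_phase') {
      await tools.call('phase-complete', { phase_id: step.phase_id, run_verify: true });
    }
    // retry_executor, rollback_to_dirty_phase, continue_execution: resume again.
  }
  return { stopped: 'max_iterations' };
}
```

The `maxIterations` guard is an addition of this sketch; the documented loop relies only on the terminal actions and the context check to stop.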

package/commands/start.md CHANGED

@@ -44,133 +44,8 @@ argument-hint: Optional feature or project description
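     └── No → ask follow-up questions
 ```
-## STEP 5 — Smart research decision
+<!-- STEP 5-12: shared execution flow — editing workflows/execution-flow.md keeps all entry points in sync -->
-Decide whether research is needed:
-```
-├── New project                                → research required
-├── Involves a new tech stack                  → research required
-├── Simple bug fix / small feature             → skip research
-├── .gsd/research/ exists and has not expired  → skip research
-├── User explicitly requests it                → research
-└── Research exists but requirements shifted   → incremental research (new direction only)
-```
-When research is needed:
-1. Dispatch the `researcher` sub-agent (fresh context)
-2. Write the research output to `.gsd/research/` (STACK.md, ARCHITECTURE.md, PITFALLS.md, SUMMARY.md)
-3. Present key findings to the user: tech stack recommendation + pitfall warnings + ⭐ recommended approach
-When not needed: skip and go straight to STEP 6.
-## STEP 6 — Deep thinking
-If the `sequential-thinking` MCP is available → invoke it for deep thinking:
-- Input: requirements summary + codebase analysis + research results (if any)
-- Purpose: systematic architectural thinking before generating the plan
-If the `sequential-thinking` MCP is not available → fall back to inline thinking and continue.
-## STEP 7 — Generate the phased plan
-Generate plan.md + phases/*.md:
-- **phases** handle management and acceptance; **tasks** handle execution
-- Keep each phase to **5-8 tasks** (makes phase-level wrap-up easier)
-- Each task = an atomic todo (with files, operations, verification conditions)
-- Add metadata to each task:
-  - `requires` — dependency list (including gate type)
-  - `review_required` — whether review is required
-  - `research_basis` — the research decision id it references
-- Review level is determined by impact:
-  - **L0** — no runtime semantic change (docs/config/style)
-  - **L1** — ordinary coding task (default)
-  - **L2** — high risk (auth/payment/public API/DB migration/core architecture)
-- Mark parallelizable task groups with `[PARALLEL]` (currently only a marker for future upgrades)
-## STEP 8 — Plan self-review
-Lightweight self-review (performed by the orchestrator itself, no sub-agent dispatched):
-### Basic review (all projects)
-- [ ] Are any requirements missing?
-- [ ] Is the phase breakdown reasonable? (split phases that are too large)
-- [ ] Are the task dependencies correct?
-- [ ] Are the verification conditions executable?
-### Enhanced review (high-risk projects)
-Trigger: the project involves auth / payment / security / public API / DB migration / core architecture changes
-Dimensions:
-1. **Requirement coverage:** does every point of the original requirements map to at least one task?
-2. **Risk ordering:** are high-risk tasks scheduled first? (fail-fast principle)
-3. **Dependency safety:** do all downstream tasks of an L2 task use `gate:accepted`?
-4. **Verification sufficiency:** do all tasks involving auth/payment have explicit security verification conditions?
-5. **Pitfall avoidance:** does every pitfall in `research/PITFALLS.md` have a corresponding defensive task or verification condition?
-Output: `pass` / `revise` (with concrete revision suggestions)
-Rounds: at most 2 rounds of self-review revision; if issues remain after 2 rounds → flag the risks and show them to the user
-→ Present to the user only after self-review revisions.
-<HARD-GATE id="plan-confirmation">
-## STEP 9 — User confirms the plan
-Present the plan to the user and wait for confirmation:
-- User points out problems → adjust the plan → present again
-- User confirms → continue
-⛔ Do not run STEP 10-12 before the user confirms. No confirmation = no file writes, no code execution.
-</HARD-GATE>
-<HARD-GATE id="docs-written">
-## STEP 10 — Generate documents
-1. Create the `.gsd/` directory
-2. Write `state.json`:
-   - Initialize `workflow_mode: "executing_task"`
-   - Initialize `current_phase: 1`
-   - Initialize `current_task: null` (filled in by the execution loop)
-   - Initialize `current_review: null`
-   - Initialize every phase lifecycle = `pending` (the first = `active`)
-   - Initialize every task lifecycle = `pending`
-   - Initialize phase_handoff info
-   - Initialize `decisions: []`
-   - Initialize `context.remaining_percentage`
-3. Write `plan.md` — project overview index (no task-level detail)
-4. Write `phases/*.md` — detailed task specs per phase (source of truth)
-5. If research was done: confirm `.gsd/research/` has been written
-Rules:
-- `plan.md` is a read-only index: not modified after generation (unless replanning)
-- `phases/*.md` is the single source of truth for task specs
-- `plan.md` contains no task-level detail, to avoid duplicating `phases/*.md`
-□ state.json is written and contains all canonical fields
-□ plan.md is written
-□ phases/*.md are written (one file per phase)
-□ Every task has lifecycle / level / requires / review_required
-→ Continue only when all are satisfied
-</HARD-GATE>
-## STEP 11 — Automatic execution main path
-Enter the main execution loop. phase = management boundary, task = execution boundary.
-<execution_loop>
-See `references/execution-loop.md` for the full 9-step execution loop specification (11.1-11.9) and the dependency gate semantics.
-The orchestrator must strictly follow the step order in that reference document:
-load phase → select task → build context → dispatch executor → handle result → review → phase handoff → batch update → context check
-</execution_loop>
-## STEP 12 — Final report
-After all phases complete, output the final report:
-- Project summary
-- Completion status of each phase
-- Summary of key decisions
-- Summary of verification evidence
-- Open issues / follow-up suggestions (if any)
+Use the Read tool to read `workflows/execution-flow.md` and strictly follow its STEP 5-12.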
 </process>

package/hooks/gsd-statusline.cjs CHANGED

@@ -56,8 +56,8 @@ process.stdin.on('end', () => {
     }
     // Context window display (USED percentage scaled to usable context)
-    // Claude Code reserves ~16.5% for autocompact buffer
-    const AUTO_COMPACT_BUFFER_PCT = 16.5;
+    // Claude Code reserves ~16.5% for autocompact buffer (configurable via env)
+    const AUTO_COMPACT_BUFFER_PCT = Number(process.env.GSD_AUTOCOMPACT_BUFFER) || 16.5;
     let ctx = '';
     if (remaining != null) {
       const usableRemaining = Math.max(0, ((remaining - AUTO_COMPACT_BUFFER_PCT) / (100 - AUTO_COMPACT_BUFFER_PCT)) * 100);
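
The scaling in this hunk maps the raw remaining percentage onto the part of the context window that is usable before autocompact starts. A small sketch of that arithmetic follows; wrapping it in a standalone `usableRemaining` function with an injectable `env` parameter is an assumption made here for illustration, since the hook computes the value inline.

```javascript
// Sketch of the buffer scaling shown in the hunk above.
function usableRemaining(remaining, env = process.env) {
  // Number(undefined) is NaN and Number('') is 0; both are falsy, so the
  // 16.5 default applies. A value of '0' also falls back to 16.5.
  const bufferPct = Number(env.GSD_AUTOCOMPACT_BUFFER) || 16.5;
  return Math.max(0, ((remaining - bufferPct) / (100 - bufferPct)) * 100);
}

console.log(usableRemaining(100, {}));  // 100
console.log(usableRemaining(16.5, {})); // 0
console.log(usableRemaining(60, { GSD_AUTOCOMPACT_BUFFER: '20' })); // 50
```

Because the override uses `||`, the buffer cannot be disabled by setting `GSD_AUTOCOMPACT_BUFFER=0`; any falsy result reverts to 16.5.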

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "gsd-lite",
-  "version": "0.4.1",
+  "version": "0.5.0",
   "description": "AI orchestration tool for Claude Code — GSD management shell + Superpowers quality core",
   "type": "module",
   "bin": {

package/src/schema.js CHANGED

@@ -19,7 +19,7 @@ export const WORKFLOW_MODES = [
 export const TASK_LIFECYCLE = {
   pending:              ['running', 'blocked'],
-  running:              ['checkpointed', 'blocked', 'failed'],
+  running:              ['checkpointed', 'blocked', 'failed', 'accepted'], // accepted: auto-accept for L0/review_required=false (atomic, skips checkpointed)
   checkpointed:         ['accepted', 'needs_revalidation'],
   accepted:             ['needs_revalidation'],
   blocked:              ['pending'],
@@ -106,7 +106,7 @@ export function validateResearchDecisionIndex(decisionIndex, requiredIds = []) {
   return { valid: errors.length === 0, errors };
 }
-export function validateResearchArtifacts(artifacts, { decisionIds = [], volatility, expiresAt } = {}) {
+export function validateResearchArtifacts(artifacts) {
   const errors = [];
   if (!isPlainObject(artifacts)) {
     return { valid: false, errors: ['artifacts must be an object'] };
@@ -119,19 +119,6 @@ export function validateResearchArtifacts(artifacts, { decisionIds = [], volatil
     }
   }
-  const summary = typeof artifacts['SUMMARY.md'] === 'string' ? artifacts['SUMMARY.md'] : '';
-  if (volatility && !summary.includes(volatility)) {
-    errors.push('artifacts.SUMMARY.md must mention volatility');
-  }
-  if (expiresAt && !summary.includes(expiresAt)) {
-    errors.push('artifacts.SUMMARY.md must mention expires_at');
-  }
-  for (const id of decisionIds) {
-    if (!summary.includes(id)) {
-      errors.push(`artifacts.SUMMARY.md must mention decision id ${id}`);
-    }
-  }
   return { valid: errors.length === 0, errors };
 }
@@ -452,6 +439,13 @@ export function validateState(state) {
       if (!Number.isFinite(phase.done)) {
         errors.push(`Phase ${phase.id}: done must be a finite number`);
       }
+      // Cross-validate done against actual accepted tasks
+      if (Number.isFinite(phase.done) && Array.isArray(phase.todo)) {
+        const acceptedCount = phase.todo.filter(t => t.lifecycle === 'accepted').length;
+        if (phase.done !== acceptedCount) {
+          errors.push(`Phase ${phase.id}: done (${phase.done}) does not match accepted task count (${acceptedCount})`);
+        }
+      }
       if (!Array.isArray(phase.todo)) {
         errors.push(`Phase ${phase.id}: todo must be an array`);
         continue;
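
Two of the schema changes above lend themselves to a short illustration: the new `running → accepted` transition and the cross-check of `done` against accepted tasks. The sketch below copies only the lifecycle entries visible in the hunk; `canTransition` and `checkDone` are hypothetical helper names, not exports of `schema.js`.

```javascript
// Subset of TASK_LIFECYCLE as shown in the 0.5.0 side of the hunk.
const TASK_LIFECYCLE = {
  pending: ['running', 'blocked'],
  running: ['checkpointed', 'blocked', 'failed', 'accepted'],
  checkpointed: ['accepted', 'needs_revalidation'],
  accepted: ['needs_revalidation'],
  blocked: ['pending'],
};

// Hypothetical helper: is `from → to` listed as a legal task transition?
function canTransition(from, to) {
  return (TASK_LIFECYCLE[from] || []).includes(to);
}

// Hypothetical helper mirroring the new done cross-check in validateState().
function checkDone(phase) {
  const accepted = phase.todo.filter((t) => t.lifecycle === 'accepted').length;
  return phase.done === accepted
    ? []
    : [`Phase ${phase.id}: done (${phase.done}) does not match accepted task count (${accepted})`];
}

console.log(canTransition('running', 'accepted')); // true
console.log(canTransition('pending', 'accepted')); // false
console.log(checkDone({ id: 1, done: 2, todo: [{ lifecycle: 'accepted' }] }).length); // 1
```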

package/src/tools/orchestrator.js CHANGED

@@ -4,7 +4,6 @@ import {
   read,
   storeResearch,
   update,
-  addEvidence,
   selectRunnableTask,
   buildExecutorContext,
   matchDecisionForBlocker,
@@ -278,6 +277,19 @@ function buildDecisionEntries(decisions, phaseId, taskId, existingCount = 0) {
     .filter(Boolean);
 }
+function buildErrorFingerprint(result) {
+  const parts = [];
+  if (result.blockers?.length > 0) {
+    const b = result.blockers[0];
+    parts.push(typeof b === 'string' ? b : (b.reason || b.type || ''));
+  }
+  if (result.files_changed?.length > 0) {
+    parts.push([...result.files_changed].sort().join(','));
+  }
+  const combined = parts.filter(Boolean).join('|');
+  return combined.length > 0 ? combined.slice(0, 120) : result.summary.slice(0, 80);
+}
 function getBlockedReasonFromResult(result) {
   const firstBlocker = (result.blockers || [])[0];
   if (!firstBlocker) return { blocked_reason: result.summary, unblock_condition: null };
@@ -705,15 +717,44 @@ export async function resumeWorkflow({ basePath = process.cwd(), _depth = 0 } =
         message: 'Project is paused. Confirm to resume execution.',
       };
     case 'planning':
-    case 'reconcile_workspace':
-    case 'replan_required':
-    case 'research_refresh_needed':
       return {
         success: true,
         action: 'await_manual_intervention',
         workflow_mode: state.workflow_mode,
-        message: `workflow_mode "${state.workflow_mode}" is recognized but not yet automated by the orchestrator`,
+        guidance: 'Complete planning and call state-init to initialize the project',
+        message: 'Project is in planning mode; complete the plan and initialize with state-init',
+      };
+    case 'reconcile_workspace': {
+      const reconGitHead = await getGitHead(basePath);
+      return {
+        success: true,
+        action: 'reconcile_workspace',
+        workflow_mode: state.workflow_mode,
+        expected_head: state.git_head,
+        actual_head: reconGitHead,
+        guidance: 'Workspace git HEAD has diverged. Verify changes and update git_head via state-update, then set workflow_mode to executing_task',
+        message: `Git HEAD mismatch: saved=${state.git_head}, current=${reconGitHead}`,
+      };
+    }
+    case 'replan_required':
+      return {
+        success: true,
+        action: 'replan_required',
+        workflow_mode: state.workflow_mode,
+        guidance: 'Plan files modified since last session. Review changes, update the plan if needed, then set workflow_mode to executing_task via state-update',
+        message: 'Plan artifacts modified since last session; review and re-align before resuming',
       };
+    case 'research_refresh_needed': {
+      const expiredResearch = collectExpiredResearch(state);
+      return {
+        success: true,
+        action: 'dispatch_researcher',
+        workflow_mode: state.workflow_mode,
+        expired_research: expiredResearch,
+        guidance: 'Research cache expired. Dispatch researcher sub-agent to refresh, then call orchestrator-handle-researcher-result',
+        message: 'Research has expired and must be refreshed before execution can resume',
+      };
+    }
     default:
       return {
         error: true,
@@ -746,55 +787,47 @@ export async function handleExecutorResult({ result, basePath = process.cwd() }
   if (result.outcome === 'checkpointed') {
     const reviewLevel = reclassifyReviewLevel(task, result);
     const isL0 = reviewLevel === 'L0';
+    const autoAccept = isL0 || task.review_required === false;
     const current_review = !isL0 && reviewLevel === 'L2' && task.review_required !== false
       ? { scope: 'task', scope_id: task.id, stage: 'spec' }
       : null;
     const workflow_mode = current_review ? 'reviewing_task' : 'executing_task';
-    // First persist: checkpoint the task (running → checkpointed)
+    // Single atomic persist: auto-accept goes directly running → accepted,
+    // otherwise running → checkpointed (awaiting review)
+    const taskPatch = {
+      id: task.id,
+      lifecycle: autoAccept ? 'accepted' : 'checkpointed',
+      checkpoint_commit: result.checkpoint_commit,
+      files_changed: result.files_changed || [],
+      evidence_refs: result.evidence || [],
+      level: reviewLevel,
+      blocked_reason: null,
+      unblock_condition: null,
+      debug_context: null,
+    };
+    const phasePatch = { id: phase.id, todo: [taskPatch] };
+    // done is auto-recomputed by update() — no manual increment needed
+    // Bundle evidence into the same atomic persist to prevent inconsistency
+    const evidenceUpdates = {};
+    for (const ev of (result.evidence || [])) {
+      if (ev && typeof ev === 'object' && typeof ev.id === 'string' && typeof ev.scope === 'string') {
+        evidenceUpdates[ev.id] = ev;
+      }
+    }
     const persistError = await persist(basePath, {
       workflow_mode,
       current_task: null,
       current_review,
       decisions,
-      phases: [{
-        id: phase.id,
-        todo: [{
-          id: task.id,
-          lifecycle: 'checkpointed',
-          checkpoint_commit: result.checkpoint_commit,
-          files_changed: result.files_changed || [],
-          evidence_refs: result.evidence || [],
-          level: reviewLevel,
-          blocked_reason: null,
-          unblock_condition: null,
-          debug_context: null,
-        }],
-      }],
+      phases: [phasePatch],
+      ...(Object.keys(evidenceUpdates).length > 0 ? { evidence: evidenceUpdates } : {}),
     });
     if (persistError) return persistError;
-    // Store structured evidence entries
-    for (const ev of (result.evidence || [])) {
-      if (ev && typeof ev === 'object' && typeof ev.id === 'string' && typeof ev.scope === 'string') {
-        await addEvidence({ id: ev.id, data: ev, basePath });
-      }
-    }
-    // Auto-accept: L0 tasks or tasks with review_required: false
-    const autoAccept = isL0 || task.review_required === false;
-    if (autoAccept) {
-      const acceptError = await persist(basePath, {
-        phases: [{
-          id: phase.id,
-          done: (phase.done || 0) + 1,
-          todo: [{ id: task.id, lifecycle: 'accepted' }],
-        }],
-      });
-      if (acceptError) return acceptError;
-    }
     return {
       success: true,
       action: current_review ? 'dispatch_reviewer' : 'continue_execution',
@@ -841,7 +874,7 @@ export async function handleExecutorResult({ result, basePath = process.cwd() }
   const retry_count = (task.retry_count || 0) + 1;
   const error_fingerprint = typeof result.error_fingerprint === 'string' && result.error_fingerprint.length > 0
     ? result.error_fingerprint
-    : result.summary.slice(0, 80);
+    : buildErrorFingerprint(result);
   const shouldDebug = retry_count >= MAX_DEBUG_RETRY;
   const current_review = shouldDebug
     ? {
@@ -989,8 +1022,6 @@ export async function handleReviewerResult({ result, basePath = process.cwd() }
   }
   const taskPatches = [];
-  let doneIncrement = 0;
-  let doneDecrement = 0;
   // Accept tasks
   for (const taskId of (result.accepted_tasks || [])) {
@@ -998,7 +1029,6 @@ export async function handleReviewerResult({ result, basePath = process.cwd() }
     if (!task) continue;
     if (task.lifecycle === 'checkpointed') {
       taskPatches.push({ id: taskId, lifecycle: 'accepted' });
-      doneIncrement += 1;
     }
   }
@@ -1007,19 +1037,10 @@ export async function handleReviewerResult({ result, basePath = process.cwd() }
     const task = getTaskById(phase, taskId);
     if (!task) continue;
     if (task.lifecycle === 'checkpointed' || task.lifecycle === 'accepted') {
-      if (task.lifecycle === 'accepted') doneDecrement += 1;
       taskPatches.push({ id: taskId, lifecycle: 'needs_revalidation', evidence_refs: [] });
     }
   }
-  // Snapshot accepted task IDs before propagation (for done counter adjustment).
-  // Note: rework_tasks patches above are NOT yet applied in-memory, so tasks demoted
-  // by the rework loop are still 'accepted' here. The guard below
-  // `!taskPatches.some(p => p.id === task.id)` prevents double-counting.
-  const acceptedBeforePropagation = new Set(
-    (phase.todo || []).filter(t => t.lifecycle === 'accepted').map(t => t.id),
-  );
   // Propagation for critical issues with invalidates_downstream
   for (const issue of (result.critical_issues || [])) {
     if (issue.invalidates_downstream && issue.task_id) {
@@ -1031,18 +1052,15 @@ export async function handleReviewerResult({ result, basePath = process.cwd() }
   for (const task of (phase.todo || [])) {
     if (task.lifecycle === 'needs_revalidation' && !taskPatches.some((p) => p.id === task.id)) {
       taskPatches.push({ id: task.id, lifecycle: 'needs_revalidation', evidence_refs: [] });
-      if (acceptedBeforePropagation.has(task.id)) {
-        doneDecrement += 1;
-      }
     }
   }
   const hasCritical = (result.critical_issues || []).length > 0;
   const reviewStatus = hasCritical ? 'rework_required' : 'accepted';
+  // done is auto-recomputed by update() — no manual tracking needed
   const phaseUpdates = {
     id: phase.id,
-    done: Math.max(0, (phase.done || 0) + doneIncrement - doneDecrement),
     phase_review: {
       status: reviewStatus,
       ...(hasCritical
@@ -1063,20 +1081,22 @@ export async function handleReviewerResult({ result, basePath = process.cwd() }
   const workflowMode = 'executing_task';
+  // Bundle evidence into the same atomic persist
+  const evidenceUpdates = {};
+  for (const ev of (result.evidence || [])) {
+    if (ev && typeof ev === 'object' && typeof ev.id === 'string' && typeof ev.scope === 'string') {
+      evidenceUpdates[ev.id] = ev;
+    }
+  }
   const persistError = await persist(basePath, {
     workflow_mode: workflowMode,
     current_review: null,
     phases: [phaseUpdates],
+    ...(Object.keys(evidenceUpdates).length > 0 ? { evidence: evidenceUpdates } : {}),
   });
   if (persistError) return persistError;
-  // Store evidence entries if provided
-  for (const ev of (result.evidence || [])) {
-    if (ev && typeof ev === 'object' && typeof ev.id === 'string' && typeof ev.scope === 'string') {
-      await addEvidence({ id: ev.id, data: ev, basePath });
-    }
-  }
   return {
     success: true,
     action: hasCritical ? 'rework_required' : 'review_accepted',
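
The new `buildErrorFingerprint` replaces the plain `summary.slice(0, 80)` fallback. The block below repeats the function as it appears in the hunk above and runs it on made-up sample results to show what the fingerprint looks like; the sample inputs are illustrative and do not come from the package.

```javascript
// Copied from the 0.5.0 hunk: first blocker reason plus sorted changed files,
// falling back to the truncated summary when neither is present.
function buildErrorFingerprint(result) {
  const parts = [];
  if (result.blockers?.length > 0) {
    const b = result.blockers[0];
    parts.push(typeof b === 'string' ? b : (b.reason || b.type || ''));
  }
  if (result.files_changed?.length > 0) {
    parts.push([...result.files_changed].sort().join(','));
  }
  const combined = parts.filter(Boolean).join('|');
  return combined.length > 0 ? combined.slice(0, 120) : result.summary.slice(0, 80);
}

// Illustrative inputs (hypothetical values).
console.log(buildErrorFingerprint({
  blockers: [{ reason: 'missing env var' }],
  files_changed: ['b.js', 'a.js'],
  summary: 'task failed',
})); // missing env var|a.js,b.js
console.log(buildErrorFingerprint({ summary: 'task failed' })); // task failed
```

Sorting a copy of `files_changed` makes the fingerprint independent of the order in which the executor reports files, so two attempts that fail the same way should produce the same string.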

package/src/tools/state.js CHANGED

@@ -42,6 +42,16 @@ export function setLockPath(lockPath) {
   _fileLockPath = lockPath;
 }
+/**
+ * Ensure _fileLockPath is set from a known state path.
+ * Must be called before withStateLock in all mutation paths.
+ */
+function ensureLockPathFromStatePath(statePath) {
+  if (!_fileLockPath && statePath) {
+    _fileLockPath = join(dirname(statePath), 'state.lock');
+  }
+}
 function withStateLock(fn) {
   const p = _mutationQueue.then(() => {
     if (_fileLockPath) {
@@ -80,6 +90,7 @@ export async function init({ project, phases, research, force = false, basePath
   }
   const gsdDir = join(basePath, '.gsd');
   const statePath = join(gsdDir, 'state.json');
+  ensureLockPathFromStatePath(statePath);
   return withStateLock(async () => {
     // Guard: reject re-initialization unless force is set
@@ -129,6 +140,9 @@ export async function init({ project, phases, research, force = false, basePath
       ...state.phases.map((phase) => join(phasesDir, `phase-${phase.id}.md`)),
     ];
     const mtimes = await Promise.all(trackedFiles.map(async (filePath) => (await stat(filePath)).mtimeMs));
+    // Math.ceil is required: mtimeMs has sub-millisecond precision (float), but
+    // Date.toISOString() truncates to milliseconds. Without ceil, the stored timestamp
+    // can be slightly less than the file's actual mtime, causing false plan-drift detection.
     state.context.last_session = new Date(Math.ceil(Math.max(...mtimes))).toISOString();
     await writeJson(statePath, state);
@@ -195,8 +209,7 @@ export async function update({ updates, basePath = process.cwd() } = {}) {
   if (!statePath) {
     return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: 'No .gsd directory found' };
   }
-  // C-2: Initialize cross-process lock path on first mutation
-  if (!_fileLockPath) _fileLockPath = join(dirname(statePath), 'state.lock');
+  ensureLockPathFromStatePath(statePath);
   return withStateLock(async () => {
     const result = await readJson(statePath);
@@ -242,6 +255,12 @@ export async function update({ updates, basePath = process.cwd() } = {}) {
     // Deep merge phases by ID instead of shallow replace [I-1]
     const merged = { ...state, ...updates };
+    // Deep merge evidence by key (preserves existing entries)
+    if (updates.evidence && isPlainObject(updates.evidence)) {
+      merged.evidence = { ...(state.evidence || {}), ...updates.evidence };
+    }
     if (updates.phases && Array.isArray(updates.phases)) {
       merged.phases = state.phases.map(oldPhase => {
         const newPhase = updates.phases.find(p => p.id === oldPhase.id);
@@ -276,6 +295,21 @@ export async function update({ updates, basePath = process.cwd() } = {}) {
       }
     }
+    // Recompute `done` from actual accepted tasks (prevents counter drift)
+    if (updates.phases && Array.isArray(updates.phases)) {
+      for (const phase of merged.phases) {
+        if (Array.isArray(phase.todo)) {
+          phase.done = phase.todo.filter(t => t.lifecycle === 'accepted').length;
+        }
+      }
+    }
+    // Auto-prune evidence when entries exceed limit
+    if (merged.evidence && Object.keys(merged.evidence).length > MAX_EVIDENCE_ENTRIES) {
+      const gsdDir = dirname(statePath);
+      await _pruneEvidenceFromState(merged, merged.current_phase, gsdDir);
+    }
     // Use incremental validation for simple updates (no phases changes)
     const validation = !updates.phases
       ? validateStateUpdate(state, updates)
@@ -336,11 +370,12 @@ export async function phaseComplete({
   if (!statePath) {
     return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: 'No .gsd directory found' };
   }
+  ensureLockPathFromStatePath(statePath);
   return withStateLock(async () => {
     const result = await readJson(statePath);
     if (!result.ok) {
-      return { error: true, message: result.error };
+      return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: result.error };
     }
     const state = result.data;
@@ -484,11 +519,12 @@ export async function addEvidence({ id, data, basePath = process.cwd() }) {
   if (!statePath) {
     return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: 'No .gsd directory found' };
   }
+  ensureLockPathFromStatePath(statePath);
   return withStateLock(async () => {
     const result = await readJson(statePath);
     if (!result.ok) {
-      return { error: true, message: result.error };
+      return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: result.error };
     }
     const state = result.data;
@@ -563,11 +599,12 @@ export async function pruneEvidence({ currentPhase, basePath = process.cwd() })
   if (!statePath) {
     return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: 'No .gsd directory found' };
   }
+  ensureLockPathFromStatePath(statePath);
   return withStateLock(async () => {
     const result = await readJson(statePath);
     if (!result.ok) {
-      return { error: true, message: result.error };
+      return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: result.error };
     }
     const state = result.data;
@@ -831,16 +868,27 @@ export function reclassifyReviewLevel(task, executorResult) {
 const MIN_TOKEN_LENGTH = 2;
 const MIN_OVERLAP = 2;
+// High-frequency words too generic for meaningful keyword matching
+const STOPWORDS = new Set([
+  'the', 'and', 'for', 'with', 'this', 'that', 'from', 'have', 'not',
+  'but', 'are', 'was', 'been', 'will', 'can', 'should', 'would', 'could',
+  'use', 'using', 'need', 'needs', 'into', 'also', 'when', 'then',
+  'than', 'more', 'some', 'does', 'did', 'its', 'has', 'all', 'any',
+  'error', 'data', 'type', 'value', 'file', 'code', 'function',
+  'return', 'null', 'true', 'false', 'undefined', 'object', 'string',
+  'number', 'array', 'list', 'map', 'set', 'key', 'name',
+]);
 /**
  * Tokenize a string into lowercase tokens, splitting on whitespace and punctuation.
- * Filters out short tokens (< MIN_TOKEN_LENGTH).
+ * Filters out short tokens (< MIN_TOKEN_LENGTH) and stopwords.
  */
 function tokenize(text) {
   if (!text) return [];
   return text
     .toLowerCase()
     .split(/[\s,.:;!?()[\]{}<>/\\|@#$%^&*+=~`'"，。：；！？（）【】、]+/)
-    .filter(t => t.length >= MIN_TOKEN_LENGTH);
+    .filter(t => t.length >= MIN_TOKEN_LENGTH && !STOPWORDS.has(t));
 }
 /**
@@ -949,11 +997,7 @@ export async function storeResearch({ result, artifacts, decision_index, basePat
     return { error: true, code: ERROR_CODES.VALIDATION_FAILED, message: `Invalid researcher result: ${resultValidation.errors.join('; ')}` };
   }
-  const artifactsValidation = validateResearchArtifacts(artifacts, {
-    decisionIds: result.decision_ids,
-    volatility: result.volatility,
-    expiresAt: result.expires_at,
-  });
+  const artifactsValidation = validateResearchArtifacts(artifacts);
   if (!artifactsValidation.valid) {
     return { error: true, code: ERROR_CODES.VALIDATION_FAILED, message: `Invalid research artifacts: ${artifactsValidation.errors.join('; ')}` };
   }
@@ -967,11 +1011,12 @@ export async function storeResearch({ result, artifacts, decision_index, basePat
   if (!statePath) {
     return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: 'No .gsd directory found' };
   }
+  ensureLockPathFromStatePath(statePath);
   return withStateLock(async () => {
     const current = await readJson(statePath);
     if (!current.ok) {
-      return { error: true, message: current.error };
+      return { error: true, code: ERROR_CODES.NO_PROJECT_DIR, message: current.error };
     }
     const state = current.data;
@@ -1026,6 +1071,13 @@ export async function storeResearch({ result, artifacts, decision_index, basePat
       state.workflow_mode = inferWorkflowModeAfterResearch(state);
     }
+    // Recompute done after applyResearchRefresh may have invalidated tasks
+    for (const phase of (state.phases || [])) {
+      if (Array.isArray(phase.todo)) {
+        phase.done = phase.todo.filter(t => t.lifecycle === 'accepted').length;
+      }
+    }
     const validation = validateState(state);
     if (!validation.valid) {
       return { error: true, code: ERROR_CODES.VALIDATION_FAILED, message: `State validation failed: ${validation.errors.join('; ')}` };
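The last hunk above recomputes each phase's `done` counter from its `todo` list once the research refresh may have invalidated tasks. A minimal standalone sketch of that recomputation follows. The `recomputeDone` wrapper, the sample state, and the `pending` lifecycle value are hypothetical and only for illustration; only the loop body mirrors the diff.

```javascript
// Hypothetical wrapper around the loop added in storeResearch.
function recomputeDone(state) {
  for (const phase of (state.phases || [])) {
    if (Array.isArray(phase.todo)) {
      // Only tasks whose lifecycle is 'accepted' count as done.
      phase.done = phase.todo.filter(t => t.lifecycle === 'accepted').length;
    }
  }
  return state;
}

// Sample state: phase 1 carries a stale counter (3) while only two tasks are accepted.
// 'pending' is an assumed lifecycle value, not taken from the package.
const sampleState = {
  phases: [
    {
      name: 'phase-1',
      done: 3,
      todo: [
        { lifecycle: 'accepted' },
        { lifecycle: 'pending' },
        { lifecycle: 'accepted' },
      ],
    },
    { name: 'phase-2', done: 0 }, // no todo array, so it is left untouched
  ],
};

recomputeDone(sampleState);
console.log(sampleState.phases.map(p => p.done)); // [ 2, 0 ]
```

Deriving the counter from the task list, rather than adjusting it incrementally, keeps `done` consistent with the tasks before `validateState` runs.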

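The stopword filter added to `tokenize` in the state.js diff above can be tried in isolation. The sketch below copies the stopword list and the split pattern from the diff. `MIN_TOKEN_LENGTH` is defined outside the visible hunk, so the value 3 used here is an assumption.

```javascript
// Assumption: the real MIN_TOKEN_LENGTH is not visible in this diff; 3 is a guess.
const MIN_TOKEN_LENGTH = 3;

// Stopword list as added in 0.5.0.
const STOPWORDS = new Set([
  'the', 'and', 'for', 'with', 'this', 'that', 'from', 'have', 'not',
  'but', 'are', 'was', 'been', 'will', 'can', 'should', 'would', 'could',
  'use', 'using', 'need', 'needs', 'into', 'also', 'when', 'then',
  'than', 'more', 'some', 'does', 'did', 'its', 'has', 'all', 'any',
  'error', 'data', 'type', 'value', 'file', 'code', 'function',
  'return', 'null', 'true', 'false', 'undefined', 'object', 'string',
  'number', 'array', 'list', 'map', 'set', 'key', 'name',
]);

// Lowercase, split on whitespace and ASCII/CJK punctuation,
// then drop short tokens and stopwords.
function tokenize(text) {
  if (!text) return [];
  return text
    .toLowerCase()
    .split(/[\s,.:;!?()[\]{}<>/\\|@#$%^&*+=~`'"，。：；！？（）【】、]+/)
    .filter(t => t.length >= MIN_TOKEN_LENGTH && !STOPWORDS.has(t));
}

const tokens = tokenize('Use the retry function when the upload fails, then return null');
console.log(tokens); // [ 'retry', 'upload', 'fails' ]
```

Because generic words such as `function`, `return`, and `null` are now dropped, two texts have to share more specific terms to reach `MIN_OVERLAP`.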
package/workflows/execution-flow.md ADDED Viewed

@@ -0,0 +1,147 @@
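+# Shared execution flow (STEP 5-12)
+> Referenced by both start.md and prd.md. Editing this file updates both entry points.
+## STEP 5: Smart research decision
+Decide whether research is needed: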
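+```
+├── New project                                 → research required
+├── Involves a new tech stack                   → research required
+├── Simple bug fix / small feature              → skip research
+├── .gsd/research/ exists and has not expired   → skip research
+├── User explicitly requests it                 → research
+└── Research exists but requirements changed    → incremental research (new direction only)
+```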
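+When research is needed:
+1. Dispatch the `researcher` subagent (fresh context)
+2. Write the research output to `.gsd/research/` (STACK.md, ARCHITECTURE.md, PITFALLS.md, SUMMARY.md)
+3. Show the user the key findings: tech stack recommendations + pitfall warnings + ⭐ recommended approach
+When it is not needed: skip it and go straight to STEP 6.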
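+## STEP 6: Deep thinking
+If the `sequential-thinking` MCP is available → call it to think the problem through:
+- Input: requirements summary + codebase analysis + research results (if any)
+- Purpose: systematic architectural thinking before the plan is generated
+If the `sequential-thinking` MCP is not available → fall back to inline thinking and continue.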
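+## STEP 7: Generate the phased plan
+Generate plan.md + phases/*.md:
+- A **phase** handles management and acceptance; a **task** handles execution
+- Keep each phase to **5-8 tasks** (makes phase-level wrap-up easier)
+- Each task = an atomic todo (with files, operations, and verification conditions)
+- Add metadata to each task:
+  - `requires`: dependency list (including gate type)
+  - `review_required`: whether review is needed
+  - `research_basis`: the research decision id it references
+- The review level is determined by scope of impact:
+  - **L0**: no change to runtime semantics (docs/config/style)
+  - **L1**: ordinary coding task (default)
+  - **L2**: high risk (auth/payment/public API/DB migration/core architecture)
+- Mark groups of parallelizable tasks with `[PARALLEL]` (currently only a marker for a future upgrade)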
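+## STEP 8: Plan self-review
+Lightweight self-review (performed by the orchestrator itself, no subagent dispatched):
+### Basic review (all projects)
+- [ ] Are any requirements missing?
+- [ ] Is the phase breakdown reasonable? (split any phase that is too large)
+- [ ] Are the task dependencies correct?
+- [ ] Are the verification conditions executable?
+### Enhanced review (high-risk projects)
+Trigger: the project involves auth / payment / security / public API / DB migration / core architecture changes
+Dimensions:
+1. **Requirement coverage:** does every point of the original requirements map to at least one task?
+2. **Risk ordering:** are high-risk tasks scheduled first? (fail-fast principle)
+3. **Dependency safety:** do all downstream tasks of an L2 task use `gate:accepted`?
+4. **Verification sufficiency:** does every task touching auth/payment have explicit security verification conditions?
+5. **Pitfall avoidance:** does every pitfall in `research/PITFALLS.md` have a corresponding defensive task or verification condition?
+Output: `pass` / `revise` (with specific suggested fixes)
+Rounds: at most 2 rounds of self-review fixes; if problems remain after 2 rounds → flag the risks and show them to the user
+→ Show the plan to the user only after the self-review fixes are applied.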
+<HARD-GATE id="plan-confirmation">
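+## STEP 9: User confirms the plan
+Show the plan to the user and wait for confirmation:
+- User points out a problem → adjust the plan → show it again
+- User confirms → continue
+⛔ Do not run STEP 10-12 before the user confirms. No confirmation = no files written, no code executed.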
+</HARD-GATE>
+<HARD-GATE id="docs-written">
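+## STEP 10: Generate documents
+1. Call the `state-init` MCP tool to initialize the project:
+   ```
+   state-init({
+     project: "<project name>",
+     phases: [
+       {
+         name: "<phase name>",
+         tasks: [
+           { name: "<task name>", level: "L1", review_required: true, requires: [] },
+           ...
+         ]
+       },
+       ...
+     ]
+   })
+   ```
+   ⚠️ You must use the `state-init` MCP tool; never hand-write state.json. The tool generates id/lifecycle/phase_review/phase_handoff automatically and has built-in schema validation and circular dependency detection.
+2. Write `plan.md`: the project overview index (no task-level detail)
+3. Write `phases/*.md`: detailed task specs for each phase (source of truth)
+4. If research was done: confirm `.gsd/research/` has been written
+Rules:
+- `plan.md` is a read-only index: it is not modified after generation (except on replan)
+- `phases/*.md` is the single source of truth for task specs
+- `plan.md` contains no task-level detail, to avoid duplicating `phases/*.md`
+□ state-init call succeeded (returned success: true)
+□ plan.md written
+□ phases/*.md written (one file per phase)
+□ Every task has lifecycle / level / requires / review_required
+→ Continue only when all of the above are satisfied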
+</HARD-GATE>
+## STEP 11: Automatic execution main path
+Enter the main execution loop. phase = management boundary, task = execution boundary.
+<execution_loop>
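+See `references/execution-loop.md` for the full 9-step execution loop specification (11.1-11.9) and the dependency gate semantics.
+The orchestrator must follow the step order in that reference document strictly:
+load phase → select task → build context → dispatch executor → handle result → review → phase handoff → batch update → context check
+**Automatic execution loop:** once execution starts, keep looping until a termination condition is hit:
+1. Call `orchestrator-resume` to get the action
+2. Dispatch the subagent that matches the action (executor/reviewer/researcher/debugger)
+3. When the result arrives, call the matching `orchestrator-handle-*-result`
+4. Go back to step 1
+5. Terminate: action ∈ {idle, awaiting_user, completed, failed, await_manual_intervention}
+Do not stop mid-loop to wait for user confirmation; let the orchestrator drive. `complete_phase` action → call the `phase-complete` MCP tool → advance to the next phase automatically.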
+</execution_loop>
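+## STEP 12: Final report
+Once all phases are complete, output the final report:
+- Project summary
+- Completion status of each phase
+- Summary of key decisions
+- Summary of verification evidence
+- Open issues / follow-up suggestions (if any)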
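
The automatic execution loop described in STEP 11 can be sketched as a small driver. This is an illustrative sketch under assumptions, not the package's implementation. The `tools` adapter and its method names, the `maxIterations` guard, and the non-terminal action names (`dispatch_executor`, `dispatch_reviewer`) are hypothetical. Only the terminal action set, `complete_phase`, and the MCP tool names in the comments come from the workflow file.

```javascript
// Terminal actions listed in STEP 11.
const TERMINAL_ACTIONS = new Set([
  'idle', 'awaiting_user', 'completed', 'failed', 'await_manual_intervention',
]);

// `tools` is a hypothetical adapter around the MCP calls.
// maxIterations is an added safety guard, not part of the documented flow.
async function runExecutionLoop(tools, maxIterations = 100) {
  const trace = [];
  for (let i = 0; i < maxIterations; i++) {
    const { action } = await tools.resume();       // orchestrator-resume
    trace.push(action);
    if (TERMINAL_ACTIONS.has(action)) return { action, trace };
    if (action === 'complete_phase') {
      await tools.completePhase();                 // phase-complete
      continue;
    }
    const result = await tools.dispatch(action);   // run the matching subagent
    await tools.handleResult(action, result);      // orchestrator-handle-*-result
  }
  throw new Error('execution loop exceeded maxIterations');
}

// Fake adapter that replays a scripted sequence of actions.
function makeFakeTools(actions) {
  const queue = [...actions];
  const calls = [];
  return {
    calls,
    resume: async () => ({ action: queue.shift() }),
    dispatch: async (action) => { calls.push(`dispatch:${action}`); return { ok: true }; },
    handleResult: async (action) => { calls.push(`handle:${action}`); },
    completePhase: async () => { calls.push('phase-complete'); },
  };
}

const fake = makeFakeTools(['dispatch_executor', 'dispatch_reviewer', 'complete_phase', 'completed']);
runExecutionLoop(fake).then(({ action, trace }) => {
  console.log(action, trace.length, fake.calls);
});
```

The driver never pauses for user input between iterations; it only returns when the orchestrator reports one of the terminal actions.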