npm - agent-project-sdlc - Versions diffs - 0.1.20 → 0.1.22 - Mend

agent-project-sdlc 0.1.20 → 0.1.22

Files changed (19)

package/README.md +5 -2
package/assets/docs/README.md +5 -2
package/assets/policies/allowed_paths.yaml +1 -0
package/assets/policies/phase_contracts.yaml +3 -0
package/assets/skills/pjsdlc_architect_design/SKILL.md +1 -0
package/assets/skills/pjsdlc_dev_sprint/SKILL.md +9 -3
package/assets/skills/pjsdlc_implementation_doc/SKILL.md +6 -3
package/assets/skills/pjsdlc_reviewer/SKILL.md +2 -2
package/assets/skills/pjsdlc_tester/SKILL.md +3 -2
package/assets/templates/EVIDENCE_INDEX_TEMPLATE.md +17 -0
package/assets/templates/EXPLORATION_APPENDIX_TEMPLATE.md +22 -0
package/assets/templates/IMPLEMENTATION_DOC_TEMPLATE.md +40 -8
package/assets/templates/PLAN_TEMPLATE.yaml +18 -1
package/assets/templates/RUNBOOK_TEMPLATE.md +47 -0
package/assets/tools/harness_utils.py +83 -0
package/assets/tools/validate_harness.py +1 -0
package/dist/lib/init.js +1 -0
package/dist/lib/validators.js +355 -12
package/package.json +1 -1

package/README.md CHANGED

@@ -31,6 +31,7 @@ npx sdlc-harness init --adopt
 | Stage task control | `plan.yaml`, `make validate-plan`, `npx sdlc-harness validate-plan` | Keeps each stage's agent work in small `TASK-*` tasks with `phase` metadata and scoped paths/gates. |
 | Natural-language control | `AGENTS.md` plus workflow skills | Lets users say things like "continue", "start development", "run tests" or "requirements changed"; agents map these to workflow actions. |
 | Default parallel scheduling contract | `plan.yaml#parallel_execution` | Stage tasks default to a safe-parallelism check; suitable work uses Codex native subagents first, with user-orchestrated and worktree fallbacks. |
+| Resume-first runtime handoff | `plan.yaml#resume_capsule`, `.docs/09_runbooks/**` | Keeps high-risk runtime/live/remote-operator tasks recoverable through a short resume card, runbook, evidence index and exploration appendix. |
 | Workflow skills | `<harnessRoot>/skills/pjsdlc_*/SKILL.md` | Provides role prompts for product, architecture, development, implementation docs, review, testing, release and RFC recalibration. |
 | Project-local skill overrides | `<harnessRoot>/pjsdlc_managed/override_skills/<skill_name>.md` + `npx sdlc-harness sync` | Appends project-specific role instructions to generated Skill output without editing managed Skill files. |
 | Local policy overrides | `<harnessRoot>/pjsdlc_managed/policies/*.local.yaml` | Preserves project-specific policy additions separately from package defaults. |
@@ -86,9 +87,11 @@ Before development starts, `ARCHITECTING` can return to `REQUIREMENT_GATHERING`
 `validate-design` treats semantic slicing as a hard gate. Generated `overview.md` files do not count as deliverables, development draft tasks in `plan.draft.yaml` must reference existing tech plan slices through `docs.tech_plan`, multiple development draft tasks need distinct primary tech plan slices, and explicit AI provider/copilot, external-system, or compliance/permission/audit themes require dedicated architecture slices. Draft tasks with runnable boundaries must also include `self_test_contract`, backed by a `Development Self-Test Contract` section in the tech plan; the contract must include `module_key_test_path` from local start or invocation to all self-test scenarios completion, covering every runnable entry promised by the current task/module and its internal key paths.
-SPRINTING Definition of Done includes module-level runnable delivery boundaries. API, CLI, server route, service, agent, runtime, adapter, worker, provider, config-contract and fixture/live boundaries promised by a technical plan or task must be implemented or marked `BLOCKED` during development. Runtime/app/provider/live tasks must declare `evidence_level.required`, `target_runtime_environment` and `self_test_contract` in `plan.yaml`; every gate in `self_test_contract.required_gates` must also appear in task `required_gates`, and `self_test_contract.module_key_test_path` must describe the path from local start or invocation to all self-test scenarios completion, covering every runnable entry promised by the current task/module and its internal key paths. `deployed_runtime` cannot be closed by `unit`, `local_runtime`, `external_provider_live`, provider smoke, fake adapters or localhost smoke alone, and `business_handoff_ready` requires a Testing Handoff Contract. The current task implementation doc must include `Development Evidence` and a completed `Development Self-Test Report` with contract source, scenario results, executed gates, Module Key Test Path, actual evidence, missing/blockers and Testing Handoff Readiness; Module Key Test Path records actual entries, internal key paths, boundaries, checkpoints and observable completion evidence. Provider smoke, fixture smoke, fake adapters and one-shot smoke prove only local links; they do not by themselves prove application readiness. REVIEWING treats missing entry/exit, initialization, config contract, target runtime, evidence level or development evidence as blocking, and TESTING only exercises entrypoints that Review has confirmed as `PASS`; it must not add product runtime, bootstrap, provider adapter, deploy code or package runtime scripts.
+SPRINTING Definition of Done includes module-level runnable delivery boundaries. API, CLI, server route, service, agent, runtime, adapter, worker, provider, config-contract and fixture/live boundaries promised by a technical plan or task must be implemented or marked `BLOCKED` during development. Runtime/app/provider/live tasks must declare `evidence_level.required`, `target_runtime_environment` and `self_test_contract` in `plan.yaml`; every gate in `self_test_contract.required_gates` must also appear in task `required_gates`, and `self_test_contract.module_key_test_path` must describe the path from local start or invocation to all self-test scenarios completion, covering every runnable entry promised by the current task/module and its internal key paths. `deployed_runtime` cannot be closed by `unit`, `local_runtime`, `external_provider_live`, provider smoke, fake adapters or localhost smoke alone, and `business_handoff_ready` requires a Testing Handoff Contract. The current task implementation doc must include `Development Evidence` and a completed `Development Self-Test Report` with `Report Status: PASS | BLOCKED | IN_PROGRESS | STALE`, contract source, scenario results, executed gates, Module Key Test Path, actual evidence, missing/blockers and Testing Handoff Readiness; only `Report Status: PASS` with every scenario `PASS` can close a development task. The report proves module entry, core path, exit and minimal evidence; it is not a debug log, operator log, runbook or exploration history. Fallback/diagnostic detail belongs in `.docs/09_runbooks/**` appendices or git history. Module Key Test Path records actual entries, internal key paths, boundaries, checkpoints and observable completion evidence. Provider smoke, fixture smoke, fake adapters and one-shot smoke prove only local links; they do not by themselves prove application readiness. REVIEWING treats missing entry/exit, initialization, config contract, target runtime, evidence level or development evidence as blocking, and TESTING only exercises entrypoints that Review has confirmed as `PASS`; it must not add product runtime, bootstrap, provider adapter, deploy code or package runtime scripts.
-`make validate-dev` and `npx sdlc-harness validate-dev` are in-development SPRINTING gates. They allow the current `current_task_id` open task to remain in `plan.yaml` while checking that it is a valid `phase: "SPRINTING"` task with `docs`, `allowed_paths`, `required_gates`, `acceptance_criteria`, `implementation_doc`, scoped dirty files, an empty `plan.draft.yaml` queue, runtime evidence task contract, `self_test_contract`, linked runnable-entry implementation docs, structured development evidence and a completed Development Self-Test Report. The report must include Module Key Test Path so later agents can reuse the debug path from local entry to all self-test scenarios completion; that path is scoped to entries and internal key paths promised by the current task/module, not the whole system. Page tasks need a dev server or page URL plus browser/Playwright/screenshot/equivalent interaction evidence; API/CLI/worker/service/agent/runtime tasks need a startup or invocation command, endpoint/health/status, and observable response/output/side effect. `make validate-current` and `/advance` are phase-exit gates; before moving to REVIEWING, the implementation commit and completion ledger must be done and no open task may remain.
+High-risk runtime/live/remote-operator tasks are resume-first. When the current SPRINTING task requires `external_provider_live`, `deployed_runtime` or `business_handoff_ready`, or its target runtime is `cloud_vm`, `managed_service`, `browser` or `worker`, `plan.yaml` must include top-level `resume_capsule` with the current state, canonical path, next step, blocker, last passed gate, do-not-retry list and recovery refs. Open task `working_notes` stays short, with a 5-8 item target and an 8 item validator limit. Long-term implementation facts stay in the implementation doc; operator paths, credential references and remote entrypoints live in `.docs/09_runbooks/**`; the implementation doc only keeps a short `Current Operator Path` with canonical operator path, runbook link, credential reference name, command/UI channel and do-not-retry summary. Evidence bodies live in an evidence index or external system; failed exploration stays in an exploration appendix. The Development Self-Test Report for these tasks must include a Gate Breakdown that separates local gate, cloud/service gate, executor/operator readiness and live smoke or handoff evidence.
+`make validate-dev` and `npx sdlc-harness validate-dev` are in-development SPRINTING gates. They allow the current `current_task_id` open task to remain in `plan.yaml` while checking that it is a valid `phase: "SPRINTING"` task with `docs`, `allowed_paths`, `required_gates`, `acceptance_criteria`, `implementation_doc`, scoped dirty files, an empty `plan.draft.yaml` queue, runtime evidence task contract, `self_test_contract`, linked runnable-entry implementation docs, structured development evidence and a completed Development Self-Test Report. The report must include legal `Report Status` and Module Key Test Path so later agents can reuse the debug path from local entry to all self-test scenarios completion; that path is scoped to entries and internal key paths promised by the current task/module, not the whole system. `validate-dev` only passes completion-oriented dev evidence when `Report Status: PASS` and every scenario is `PASS`; `BLOCKED`, `IN_PROGRESS` and `STALE` reports may exist as recovery facts but cannot close the current development task. Page tasks need a dev server or page URL plus browser/Playwright/screenshot/equivalent interaction evidence; API/CLI/worker/service/agent/runtime tasks need a startup or invocation command, endpoint/health/status, and observable response/output/side effect. `validate-dev` checks content consistency and completeness between the report and current `self_test_contract`; it does not prove commands really executed in the current run. Agents must execute the current task `required_gates` before filling the report, and writing `PASS` without running those gates is an Agent execution violation. `make validate-current` and `/advance` are phase-exit gates; before moving to REVIEWING, the implementation commit and completion ledger must be done and no open task may remain.
 `validate-test` keeps its command name as the TESTING phase gate. `.docs/07_test/TEST_STRATEGY.md` describes scope, environment, priority and execution strategy; `.docs/07_test/TEST_CASES.md` describes cases bound to real runnable entry/exit; `.docs/07_test/TEST_REPORT.md` only records executed TESTING evidence, test matrix, regression evidence, runnable entry/exit coverage, coverage gaps and final decision. `validate-test` only accepts `TEST_REPORT.md`; it no longer treats `TEST_PLAN.md` as a report fallback.
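
The closing rule stated in the README hunk above (only `Report Status: PASS` with every scenario `PASS` can close a development task) can be sketched in Python as follows. This is a minimal illustration, not the package's validator: the function name `can_close_dev_task` and the dict-of-results input shape are assumptions made for the example.

```python
from typing import Mapping

# Statuses named in the README text; only PASS can close a task.
LEGAL_REPORT_STATUSES = {"PASS", "BLOCKED", "IN_PROGRESS", "STALE"}


def can_close_dev_task(report_status: str, scenario_results: Mapping[str, str]) -> bool:
    """Hypothetical helper: True only for Report Status PASS with every scenario PASS."""
    if report_status not in LEGAL_REPORT_STATUSES:
        raise ValueError(f"illegal Report Status: {report_status!r}")
    if not scenario_results:
        # Assumption: a report without scenarios proves nothing, so it cannot close a task.
        return False
    return report_status == "PASS" and all(
        result == "PASS" for result in scenario_results.values()
    )
```

A `BLOCKED`, `IN_PROGRESS` or `STALE` report is still accepted as input here, mirroring the README's point that such reports may exist as recovery facts without closing the task.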

package/assets/docs/README.md CHANGED

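@@ -106,9 +106,11 @@ Agent reads `<harnessRoot>/state/lifecycle.yaml` and `<harnessRoot>/state/pl
 `validate-design` treats semantic slicing in the architecture phase as a hard gate: `overview.md` does not count toward deliverables, and every development draft task in `plan.draft.yaml` must point to an existing tech plan slice through `docs.tech_plan`; multiple development draft tasks need distinct primary tech plan slices by default. When the PRD, tech plan or a draft task explicitly raises cross-cutting themes such as AI provider / copilot, external-system boundaries, or compliance / permission / audit, a dedicated architecture slice is also required. Draft tasks with runnable boundaries must also carry `self_test_contract`, backed by a `Development Self-Test Contract` in the tech plan; the contract must record `module_key_test_path`, describing the module key test path from local start or invocation entry through completion of all self-test scenarios, covering every runnable entry and internal key path promised by the current task / module.
-The SPRINTING Definition of Done includes module-level runnable delivery boundaries: the API, CLI, server route, service, agent, runtime, adapter, worker, provider, config-contract and fixture/live boundaries promised by a technical plan or task must be implemented during development or explicitly marked `BLOCKED`. Runtime/app/provider/live tasks must declare `evidence_level.required`, `target_runtime_environment` and `self_test_contract` in `plan.yaml`; `self_test_contract.required_gates` must also appear in task `required_gates`, and `self_test_contract.module_key_test_path` must describe the module key test path from local start or invocation entry through completion of all self-test scenarios, covering every runnable entry and internal key path promised by the current task / module. `deployed_runtime` cannot be closed by `unit`, `local_runtime`, `external_provider_live`, provider smoke, fake adapter or localhost smoke alone, and `business_handoff_ready` additionally requires a Testing Handoff Contract. The current task's implementation doc must also include `Development Evidence` and `Development Self-Test Report`; the self-test report records contract source, scenario results, executed gates, Module Key Test Path, actual evidence, missing/blockers and Testing Handoff Readiness; `Module Key Test Path` must record actual entries, internal key paths, key boundaries, checkpoints and observable completion evidence. Provider smoke, fixture smoke, fake adapter or one-shot smoke can only prove local links and cannot by themselves prove application readiness. REVIEWING treats missing entry/exit, initialization, config contract, target runtime environment, evidence level or development self-test evidence as blocking; TESTING only calls existing entrypoints that Review has confirmed as `PASS` for input/output verification, and must not add product runtime, bootstrap, provider adapter, deploy or package runtime script.
+The SPRINTING Definition of Done includes module-level runnable delivery boundaries: the API, CLI, server route, service, agent, runtime, adapter, worker, provider, config-contract and fixture/live boundaries promised by a technical plan or task must be implemented during development or explicitly marked `BLOCKED`. Runtime/app/provider/live tasks must declare `evidence_level.required`, `target_runtime_environment` and `self_test_contract` in `plan.yaml`; `self_test_contract.required_gates` must also appear in task `required_gates`, and `self_test_contract.module_key_test_path` must describe the module key test path from local start or invocation entry through completion of all self-test scenarios, covering every runnable entry and internal key path promised by the current task / module. `deployed_runtime` cannot be closed by `unit`, `local_runtime`, `external_provider_live`, provider smoke, fake adapter or localhost smoke alone, and `business_handoff_ready` additionally requires a Testing Handoff Contract. The current task's implementation doc must also include `Development Evidence` and `Development Self-Test Report`; the self-test report records `Report Status: PASS | BLOCKED | IN_PROGRESS | STALE`, contract source, scenario results, executed gates, Module Key Test Path, actual evidence, missing/blockers and Testing Handoff Readiness; only `Report Status: PASS` with every scenario `PASS` can close a development task. `Development Self-Test Report` only proves module entry, core path, exit and minimal evidence, and does not serve as a debug log, operator log, runbook or exploration history; fallback / diagnostic detail gets at most one summary sentence, with details going into a `.docs/09_runbooks/**` appendix or git history. `Module Key Test Path` must record actual entries, internal key paths, key boundaries, checkpoints and observable completion evidence. Provider smoke, fixture smoke, fake adapter or one-shot smoke can only prove local links and cannot by themselves prove application readiness. REVIEWING treats missing entry/exit, initialization, config contract, target runtime environment, evidence level or development self-test evidence as blocking; TESTING only calls existing entrypoints that Review has confirmed as `PASS` for input/output verification, and must not add product runtime, bootstrap, provider adapter, deploy or package runtime script.
-`make validate-dev` / `npx sdlc-harness validate-dev` is the in-development SPRINTING gate: the open task pointed to by the current `current_task_id` may remain in `plan.yaml`, and the validator checks that it is a valid `phase: "SPRINTING"` task with `docs`, `allowed_paths`, `required_gates`, `acceptance_criteria` and `implementation_doc`, and validates dirty files, `plan.draft.yaml`, the runtime evidence task contract, `self_test_contract`, the implementation doc, structured `Development Evidence` and the `Development Self-Test Report`. The self-test report must record `Module Key Test Path` so later Agents can reuse the debug path from local entry to completion of all self-test cases; the path only needs to cover runnable entries and internal key paths within the scope promised by the current task / module, not every module in the whole system. Page evidence needs a dev server/page URL and a browser check; API/CLI/worker/service/agent/runtime evidence needs a startup/invocation command, endpoint/health/status and response/output/side effect. `make validate-current` / `/advance` is the phase-exit gate; before entering REVIEWING, the implementation commit and completion ledger must still be completed first, removing the open task from `plan.yaml`.
+Complex runtime/live/remote-operator tasks use resume-first layering: when the current SPRINTING task requires `external_provider_live`, `deployed_runtime` or `business_handoff_ready`, or its target environment is `cloud_vm`, `managed_service`, `browser` or `worker`, `plan.yaml` must maintain a top-level `resume_capsule` that keeps only the current state, canonical path, next step, blocker, last passed gate, do-not-retry and recovery refs; the open task's `working_notes` keeps only short recovery notes, with a target of 5-8 items and a validator limit of 8. Long-lived implementation facts go in the implementation doc; operator paths, credential references and remote entrypoints go in a runbook under `.docs/09_runbooks/**`; the implementation doc holds only a short `Current Operator Path` recording the canonical operator path, runbook link, credential reference name, command/UI channel and do-not-retry summary; evidence bodies go only into the evidence index or an external evidence system; failed exploration is isolated in the exploration appendix. The `Development Self-Test Report` of a high-risk task must also have a `Gate Breakdown` that records local gate, cloud/service gate, executor/operator readiness and live smoke / handoff separately; a single `validate-dev PASS` cannot stand in for all progress.
+`make validate-dev` / `npx sdlc-harness validate-dev` is the in-development SPRINTING gate: the open task pointed to by the current `current_task_id` may remain in `plan.yaml`, and the validator checks that it is a valid `phase: "SPRINTING"` task with `docs`, `allowed_paths`, `required_gates`, `acceptance_criteria` and `implementation_doc`, and validates dirty files, `plan.draft.yaml`, the runtime evidence task contract, `self_test_contract`, the implementation doc, structured `Development Evidence` and the `Development Self-Test Report`. The self-test report must record a legal `Report Status` and `Module Key Test Path` so later Agents can reuse the debug path from local entry to completion of all self-test cases; the path only needs to cover runnable entries and internal key paths within the scope promised by the current task / module, not every module in the whole system. `validate-dev` only accepts the completed state of `Report Status: PASS` with every scenario `PASS`; `BLOCKED`, `IN_PROGRESS` and `STALE` may record recovery facts but cannot close the current development task. Page evidence needs a dev server/page URL and a browser check; API/CLI/worker/service/agent/runtime evidence needs a startup/invocation command, endpoint/health/status and response/output/side effect. `validate-dev` only checks the consistency and completeness of the self-test report content against the current `self_test_contract`; it does not prove that commands were really executed in this run. The Agent must actually run the current task `required_gates` before filling in the `Development Self-Test Report`, and writing `PASS` without executing the required gates is an Agent execution violation. `make validate-current` / `/advance` is the phase-exit gate; before entering REVIEWING, the implementation commit and completion ledger must still be completed first, removing the open task from `plan.yaml`.
 `validate-test` remains the name of the TESTING phase gate. `.docs/07_test/TEST_STRATEGY.md` describes test scope, environment, priority and execution strategy; `.docs/07_test/TEST_CASES.md` describes test cases bound to real runnable entry/exit; `.docs/07_test/TEST_REPORT.md` only records the test matrix, regression evidence, runnable entry/exit coverage, coverage gaps and final decision after actual execution in the TESTING phase. `validate-test` only accepts `TEST_REPORT.md` and does not treat `TEST_PLAN.md` as a report fallback.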
@@ -243,6 +245,7 @@ make docs-overview
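 | `.docs/06_review/` | Review reports |
 | `.docs/07_test/` | Test strategy, test cases, post-execution test report, regression evidence and coverage gaps |
 | `.docs/08_release/` | Current release status, smoke evidence, rollback plan and known limitations |
+| `.docs/09_runbooks/` | Recovery paths, evidence index and exploration appendix for runtime/live/remote-operator tasks |
 | `.docs/rfc/` | Requirement changes and impact analysis |
 `overview.md` is a generated artifact used for browsing and phase handoff; the Markdown slices and `.docs/INDEX.md` are the source of truth.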

package/assets/policies/allowed_paths.yaml CHANGED

@@ -55,5 +55,6 @@ phases:
       - ".docs/01_product/**"
       - ".docs/03_tech_plan/**"
       - ".docs/07_test/**"
+      - ".docs/09_runbooks/**"
       - "<harnessRoot>/state/plan.yaml"
       - ".docs/INDEX.md"
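
To illustrate what adding `.docs/09_runbooks/**` to an allow-list means in practice, here is a small Python sketch that checks changed paths against the patterns visible in this hunk. It is only an approximation under stated assumptions: the helper `is_allowed` is hypothetical, the harness's real matcher may treat `**` differently from Python's `fnmatch`, and the `<harnessRoot>` entry is left out because its value is not shown in the diff.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# Patterns copied from the hunk above; the phase they belong to is not visible
# in this excerpt, and the "<harnessRoot>/state/plan.yaml" entry is omitted.
ALLOWED_PATTERNS = [
    ".docs/01_product/**",
    ".docs/03_tech_plan/**",
    ".docs/07_test/**",
    ".docs/09_runbooks/**",
    ".docs/INDEX.md",
]


def is_allowed(path: str, patterns: Iterable[str] = ALLOWED_PATTERNS) -> bool:
    """Hypothetical check: True if the path matches any allow-list pattern.

    fnmatch lets "*" match "/" as well, so "dir/**" covers nested files here.
    """
    return any(fnmatchcase(path, pattern) for pattern in patterns)
```

With this sketch, a runbook such as `.docs/09_runbooks/deploy/RUNBOOK.md` is accepted, while a source file such as `src/app.py` is not.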

package/assets/policies/phase_contracts.yaml CHANGED

@@ -58,6 +58,7 @@ phases:
       - "src/"
       - "tests/"
       - ".docs/04_implementation/"
+      - ".docs/09_runbooks/"
       - "<harnessRoot>/state/plan.draft.yaml"
     gates:
       - "make validate-dev"
@@ -89,6 +90,7 @@ phases:
       - ".docs/01_product/"
       - ".docs/03_tech_plan/"
       - ".docs/04_implementation/"
+      - ".docs/09_runbooks/"
       - ".docs/06_review/"
     outputs:
       - "<harnessRoot>/state/plan.yaml"
@@ -138,6 +140,7 @@ phases:
     outputs:
       - ".docs/rfc/"
       - ".docs/07_test/"
+      - ".docs/09_runbooks/"
       - "<harnessRoot>/state/plan.yaml"
       - ".docs/INDEX.md"
     gates:

package/assets/skills/pjsdlc_architect_design/SKILL.md CHANGED

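@@ -57,6 +57,7 @@ ADR is used to solve the problem of "people who come later see only the result, not the trade-offs made back then"
 - If the implementation plan changes existing module boundaries, update the relevant architecture slice rather than just adding a sentence to the task description.
 - Whenever the technical plan or a draft task involves a service, agent, runtime, worker, frontend app, provider/live integration or an external runnable boundary, the task breakdown must include last-mile runtime initialization and testing handoff delivery: target runtime environment, start/deploy or preview method, health/readiness, smoke input/output, log/error evidence, and test-callable entry and exit.
 - Such development draft tasks must include `evidence_level.required`, `target_runtime_environment` and `self_test_contract`. `evidence_level.required` may only be `unit`, `local_runtime`, `external_provider_live`, `deployed_runtime` or `business_handoff_ready`; `target_runtime_environment.kind` may only be `local`, `ci`, `staging`, `cloud_vm`, `managed_service`, `browser`, `worker` or `not_applicable`. The `source` of `self_test_contract` must reference the current tech plan slice, `required_gates` must be synced into task `required_gates`, `scenarios[]` must cover at least one runnable entry and one observable exit, and `module_key_test_path` must describe the module key test path from local start or invocation entry through completion of all self-test scenarios, covering every runnable entry and internal key path promised by the current task / module.
+- If a draft task is high-risk runtime/live/remote-operator work (`external_provider_live`, `deployed_runtime`, `business_handoff_ready`, or a target environment of `cloud_vm`, `managed_service`, `browser` or `worker`), it must also reserve the recovery layers: `docs.runbook` points to the runbook / evidence index / exploration appendix under `.docs/09_runbooks/**`, `allowed_paths` covers those files, and the acceptance criteria require maintaining `plan.yaml#resume_capsule` after promotion. The runbook records the canonical operator path, the evidence index holds only evidence pointers, and the exploration appendix isolates failed attempts; do not put this content into the main line of the implementation doc.
 - If the user explicitly asks to split an existing complete technical plan file into multiple `.docs/03_tech_plan/` slices, first confirm that the replacement slices cover the still-valid interface contracts, data models, module designs, task groups and gates from the original file; once slicing is done, the `plan.draft.yaml` references and `.docs/INDEX.md` are updated and `overview.md` is refreshed, delete the superseded complete tech plan file so that the same fact is not kept in both the complete file and the slices.
 - After every slice addition, split, merge or deprecation, update `.docs/INDEX.md`.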
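
The enumerations and the high-risk condition described in this hunk can be expressed as a short Python sketch. The allowed values come from the skill text; the function name `is_high_risk` and the plain-dict task shape (as if loaded from `plan.draft.yaml`) are assumptions for illustration only, not the harness's own code.

```python
# Allowed values as listed in the skill text above.
EVIDENCE_LEVELS = {
    "unit", "local_runtime", "external_provider_live",
    "deployed_runtime", "business_handoff_ready",
}
RUNTIME_KINDS = {
    "local", "ci", "staging", "cloud_vm",
    "managed_service", "browser", "worker", "not_applicable",
}
# Subsets that the text classifies as high-risk runtime/live/remote-operator work.
HIGH_RISK_LEVELS = {"external_provider_live", "deployed_runtime", "business_handoff_ready"}
HIGH_RISK_KINDS = {"cloud_vm", "managed_service", "browser", "worker"}


def is_high_risk(task: dict) -> bool:
    """Hypothetical classifier for a task dict loaded from YAML."""
    level = (task.get("evidence_level") or {}).get("required")
    kind = (task.get("target_runtime_environment") or {}).get("kind")
    if level not in EVIDENCE_LEVELS:
        raise ValueError(f"unknown evidence_level.required: {level!r}")
    if kind not in RUNTIME_KINDS:
        raise ValueError(f"unknown target_runtime_environment.kind: {kind!r}")
    return level in HIGH_RISK_LEVELS or kind in HIGH_RISK_KINDS
```

Either condition alone is enough: a `local_runtime` task that targets a `worker` environment is still classified as high-risk in this sketch.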

package/assets/skills/pjsdlc_dev_sprint/SKILL.md CHANGED

@@ -17,7 +17,11 @@ description: Use during SPRINTING to execute one task from plan.yaml, respecting
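 The Definition of Done for the development phase includes runnable system entries/exits. Whenever a technical plan or task promises an API, CLI, server route, service, agent, runtime, adapter, worker, provider, external send/write executor, config contract or live/fixture dual-mode boundary, the current implementation must provide the corresponding entry, invocation method, initialization method, output/side-effect boundary and verification method; if the real entry/exit is not yet runnable, the task cannot be treated as complete, and the gap cannot be left for TESTING to fill in runtime. Runtime/app/provider/live tasks must declare `evidence_level.required`, `target_runtime_environment` and `self_test_contract` in `plan.yaml` and deliver against that contract: `deployed_runtime` cannot be closed by `unit`, `local_runtime`, `external_provider_live`, provider smoke, fake adapter or localhost smoke alone; `business_handoff_ready` must provide a Testing Handoff Contract. The implementation doc must state `Runnable Entry/Exit` and record, in `Development Evidence`, `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries` and `Basic Self-test Evidence`; `Basic Self-test Evidence` should point to the executed `Development Self-Test Report`. When an item genuinely does not apply, explicitly write `Not applicable` with the specific reason. Provider smoke, fixture smoke, fake adapter or one-shot smoke can only prove local links and cannot by themselves prove `Application readiness`. In that case, keep or create a `BLOCKED`/follow-up dev task, or handle the boundary change through RFC/ARCHITECTING.
-`self_test_contract` is the development-phase self-test contract: it is defined first by ARCHITECTING or RFC_RECALIBRATION, and SPRINTING is responsible for executing it and filling in the `Development Self-Test Report` in the implementation doc. Developers must not reverse-engineer the self-test contract from the existing implementation after development ends; if the contract is missing, entries do not match, required gates are not synced or scenarios cannot be executed, go back to ARCHITECTING/RFC first or keep the task `BLOCKED`. The self-test report is not a TESTING-phase artifact; it only proves that the current module-level runnable delivery boundary can already be consumed by Review/Testing. The report must also record `Module Key Test Path`: the module key test path that starts from local start or invocation entry and executes and completes all self-test cases in `self_test_contract`. The path should cover every runnable entry promised by the current task / module, plus the internal key paths, key boundaries, checkpoints and observable completion evidence that the self-test cases actually traverse, for later Agents to reuse and debug.
+`self_test_contract` is the development-phase self-test contract: it is defined first by ARCHITECTING or RFC_RECALIBRATION, and SPRINTING is responsible for executing it and filling in the `Development Self-Test Report` in the implementation doc. Developers must not reverse-engineer the self-test contract from the existing implementation after development ends; if the contract is missing, entries do not match, required gates are not synced or scenarios cannot be executed, go back to ARCHITECTING/RFC first or keep the task `BLOCKED`. The self-test report is not a TESTING-phase artifact, nor is it a debug log, operator log, runbook or running history; it only proves that the current module-level runnable delivery boundary can already be consumed by Review/Testing. The report must state `Report Status: PASS | BLOCKED | IN_PROGRESS | STALE`, and only `PASS` with every scenario `PASS` can close the current development task; `BLOCKED`, `IN_PROGRESS` and `STALE` may exist as recovery facts but do not count as a passed handoff. The report must also record `Module Key Test Path`: the module key test path that starts from local start or invocation entry and executes and completes all self-test cases in `self_test_contract`. The path should cover every runnable entry promised by the current task / module, plus the internal key paths, key boundaries, checkpoints and observable completion evidence that the self-test cases actually traverse, for later Agents to reuse and debug.
+Development-phase delivery includes two kinds of artifacts: implementation artifacts (code, config, scripts, tests and so on) and development self-test artifacts. `Development Self-Test Report` is a development-phase artifact, not a plan, template or historical record. If the current task or the associated technical plan declares `self_test_contract.status: "required"`, first execute `self_test_contract.scenarios[]` and `self_test_contract.required_gates` one by one, then fill in or update the `Development Self-Test Report`. Without evidence from this run, such as an executed runnable entry, internal key path, or observable exit / artifact / screenshot / response / log, do not write `PASS` and do not complete the task. Fallback / diagnostic detail gets at most one summary sentence in the main report; detailed commands, screenshot steps, UI operation details and failed paths go into the `.docs/09_runbooks/**` exploration appendix or git history.
+High-risk runtime/live/remote-operator tasks must keep recovery as the priority. If `evidence_level.required` is `external_provider_live`, `deployed_runtime` or `business_handoff_ready`, or `target_runtime_environment.kind` is `cloud_vm`, `managed_service`, `browser` or `worker`, `plan.yaml` must have a top-level `resume_capsule`, updated immediately whenever the path-selection conclusion changes: `state`, `canonical_path`, `next_step`, `blocker`, `last_passed_gate`, `do_not_retry` and `recovery_refs`. `working_notes` keeps only short recovery notes, with a target of 5-8 items and no more than 8; the canonical operator path is written to a runbook under `.docs/09_runbooks/**`, and the implementation doc carries a short `Current Operator Path` that links the runbook, credential reference name, command/UI channel and do-not-retry summary. Evidence bodies live only in evidence files or external systems, and failed exploration goes into the exploration appendix. Do not mix the A/B/C path exploration history into the main line of the implementation doc or into scenario evidence.
 Page tasks must start a dev server or an equivalent preview entry during development and use a browser, Playwright, screenshots or an equivalent method to verify that the page loads, the main entry is reachable, core buttons/forms/navigation work, and there are no obvious errors or blank pages. API/CLI/worker/RPA/service/agent/runtime tasks must record the actual start or invocation command, endpoint, worker command, dry-run/live preflight, health/status or server action, plus the observable response, queue item, audit log, file artifact, send result, error code or PASS/BLOCKED result.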
@@ -41,6 +45,7 @@ description: Use during SPRINTING to execute one task from plan.yaml, respecting
 - Test changes within the current task's `allowed_paths`
 - Implementation docs for the relevant modules, subsystems or core data flows under `.docs/04_implementation/`
 - Gate evidence in the current task's `working_notes` or the implementation doc's `Verification`
+- For high-risk runtime/live tasks, `plan.yaml#resume_capsule` and the runbook / evidence index / exploration appendix under `.docs/09_runbooks/**`
 - Runnable entry/exit, observable exit, Development Self-Test Report, Module Key Test Path, config contract and fixture/live boundary facts in the implementation doc
 - The updated `<harnessRoot>/state/plan.yaml`
 - If a draft was promoted in this run, the updated `<harnessRoot>/state/plan.draft.yaml`
@@ -72,7 +77,7 @@ description: Use during SPRINTING to execute one task from plan.yaml, respecting
 1. `current_task_id` points to the open task being executed.
 2. The open task directly declares `phase: "SPRINTING"`, `docs`, `allowed_paths`, `required_gates`, `acceptance_criteria` and `implementation_doc`; runtime/app/provider/live tasks must also declare `evidence_level`, `target_runtime_environment` and `self_test_contract`. `self_test_contract.required_gates` must also appear in task `required_gates`, and `self_test_contract.module_key_test_path` must describe the module key test path from local start or invocation entry through completion of all self-test scenarios, covering every runnable entry and internal key path promised by the current task / module.
 3. If the open task was promoted from `plan.draft.yaml.tasks[]`, creating the formal `TASK-*` and deleting the source draft must happen in the same state update; the formal task's recovery context is stored only in `plan.yaml`.
-4. During task execution, keep only the short `working_notes` needed for recovery.
+4. During task execution, keep only the short `working_notes` needed for recovery, with a target of 5-8 items and no more than 8; high-risk runtime/live tasks use `resume_capsule` to hold the resume card and link the runbook / evidence index / exploration appendix.
 5. After the gates, implementation doc, `.docs/INDEX.md` and `overview.md` are done, create the task implementation commit while the current task is still in `plan.yaml`.
 6. After the implementation commit, remove the task from the `tasks` list in `plan.yaml`, and keep/increment `next_task_sequence`.
 7. Commit the `plan.yaml` with the current task removed as the task completion ledger commit, and `git push` both commits to the current upstream branch.
@@ -108,7 +113,8 @@ The execution history of done tasks is not kept long-term in the current `plan.yaml`, and is not by default
 - [ ] The current task is still a single, clear unit of execution.
 - [ ] The API/CLI/adapter/worker/provider, config contract, output/side effects and fixture/live boundaries promised by the technical plan or task are runnable and recorded in the implementation doc, or are explicitly `BLOCKED`/a follow-up dev task.
 - [ ] The implementation doc `Development Evidence` records `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries` and `Basic Self-test Evidence`, or states `Not applicable` with a reason.
-- [ ] If the current task has `self_test_contract.status: "required"`, the implementation doc has a completed `Development Self-Test Report` containing contract source, scenario results, executed gates, Module Key Test Path, actual evidence, missing/blockers and Testing Handoff Readiness, with no `BLOCKED` scenario.
+- [ ] If the current task has `self_test_contract.status: "required"`, the current `self_test_contract.scenarios[]` and `self_test_contract.required_gates` have been executed one by one, and the implementation doc `Development Self-Test Report` records `Report Status`, this run's contract source, scenario results, executed gates, runnable entry, internal key paths, observable exit/evidence, Module Key Test Path, missing/blockers and Testing Handoff Readiness, with `Report Status: PASS` and every scenario `PASS`.
+- [ ] If the current task is high-risk runtime/live/remote-operator work, `resume_capsule` has been updated with 5-8 recovery facts, `recovery_refs` links the implementation doc and the `.docs/09_runbooks/**` runbook/evidence, and the implementation doc has a short `Current Operator Path` and a layered `Gate Breakdown`.
 - [ ] If the task requires `business_handoff_ready`, the implementation doc contains a Testing Handoff Contract covering entry, config, initialization/health, input samples, expected exit, cleanup method and evidence level.
 - [ ] If the current task came from `plan.draft.yaml.tasks[]`, the source draft was removed from the draft list at promotion.
 - [ ] The implementation doc has been created or updated and reflects the real code of the relevant modules.
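
The `resume_capsule` fields and the `working_notes` limit described in this skill can be checked with a few lines of Python. The sketch below is illustrative only: `check_resume_layer` is a hypothetical name, the inputs are assumed to be dicts loaded from `plan.yaml`, `working_notes` is assumed to be a list, and the actual checks in `validators.js` are not shown in this excerpt.

```python
from typing import List

# Field names as listed in the skill text above.
RESUME_CAPSULE_KEYS = (
    "state", "canonical_path", "next_step", "blocker",
    "last_passed_gate", "do_not_retry", "recovery_refs",
)
MAX_WORKING_NOTES = 8  # validator limit stated in the text (target is 5-8 items)


def check_resume_layer(plan: dict, task: dict) -> List[str]:
    """Hypothetical check for a high-risk task; returns a list of error messages."""
    errors: List[str] = []
    capsule = plan.get("resume_capsule")
    if not isinstance(capsule, dict):
        errors.append("plan.yaml is missing top-level resume_capsule")
    else:
        for key in RESUME_CAPSULE_KEYS:
            if key not in capsule:
                errors.append(f"resume_capsule.{key} is missing")
    # Assumption: working_notes is a YAML list of short notes.
    notes = task.get("working_notes") or []
    if len(notes) > MAX_WORKING_NOTES:
        errors.append(
            f"working_notes has {len(notes)} items; limit is {MAX_WORKING_NOTES}"
        )
    return errors
```

An empty result means the capsule carries every listed field and the notes stay within the limit; it says nothing about whether the field contents are accurate.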

package/assets/skills/pjsdlc_implementation_doc/SKILL.md CHANGED

@@ -17,9 +17,11 @@ description: Use after development gates pass to update module-level implementat
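 The doc should help later readers quickly understand what the current implementation of a module or core data flow is, what the key objects/functions are responsible for, how behavior flows from input to output, what the tests cover, and what remains uncovered. The task id serves only as provenance, not as the default slicing granularity.
+The implementation doc records only long-lived implementation facts, not a full operations diary. For high-risk runtime/live/remote-operator tasks, the main line keeps only the current canonical path, the current implementation boundary, a short `Current Operator Path`, and links to `plan.yaml#resume_capsule`, the `.docs/09_runbooks/**` runbook, the evidence index and the exploration appendix; failed paths and exploration details go into the exploration appendix, and evidence bodies go into the evidence index or an external system. The recovery entry must be more prominent than the exploration history.
 If the module contains or promises a runnable system boundary, the implementation doc must record runnable entry/exit: the invocation method, initialization method, config contract, input source, output or side effects, and fixture/live mode boundary of the API/CLI/server route/service/agent/runtime/adapter/worker/provider, and which real external executors are not yet implemented. It must also record in `Development Evidence` the `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries` and `Basic Self-test Evidence` actually verified during development; `Basic Self-test Evidence` should point to the executed `Development Self-Test Report`. When there is genuinely no application entry, `Not applicable` must state the reason. Do not write entries that will only be implemented in the future as current fact, and do not present provider smoke, fixture smoke, fake adapter or one-shot smoke alone as application readiness. If the task requires `business_handoff_ready`, a Testing Handoff Contract is also required, covering entry, config, initialization/health, input samples, expected exit, cleanup/reset/idempotency notes and evidence level.
-If the current task has `self_test_contract.status: "required"`, the implementation doc must fill in the `Development Self-Test Report`, carrying out the self-test contract defined in the design/RFC phase: record contract source, the `PASS` / `BLOCKED` result of each scenario, the actual entry executed, the actual exit, evidence location or command output, executed gates, Module Key Test Path, missing/blockers and Testing Handoff Readiness. `Module Key Test Path` must describe the module key test path that starts from local start or invocation entry and executes and completes all self-test cases in `self_test_contract`. The path should cover every runnable entry promised by the current task / module, plus the internal key paths, key boundaries, checkpoints and observable completion evidence that the self-test cases actually traverse, for later Agents to reuse and debug. When any scenario is `BLOCKED`, the development task must not be written as complete.
+If the current task has `self_test_contract.status: "required"`, the implementation doc must fill in the `Development Self-Test Report`, carrying out the self-test contract defined in the design/RFC phase: record `Report Status: PASS | BLOCKED | IN_PROGRESS | STALE`, contract source, the result of each scenario, the actual entry executed, the actual exit, evidence location or command output, executed gates, Module Key Test Path, missing/blockers and Testing Handoff Readiness. `Development Self-Test Report` is not a debug log, operator log, runbook or running history; fallback / diagnostic detail gets at most one summary sentence in the main report, and detailed commands, screenshot steps, UI operation details and failed paths go into the exploration appendix or git history. High-risk runtime/live tasks must also write `Current Operator Path` and `Gate Breakdown`, recording the canonical operator path, runbook link, credential reference name, command/UI channel and do-not-retry summary, and separately the local gate, cloud/service gate, executor/operator readiness and live smoke / handoff layers; a single large `validate-dev PASS` is not enough. `Development Self-Test Report` may only record results actually obtained in this run of the current task; historical reports, template fields, code reading or unrelated generic gates must not substitute for executing this run's self-test scenarios. `Module Key Test Path` must describe the module key test path that starts from local start or invocation entry and executes and completes all self-test cases in `self_test_contract`. The path should cover every runnable entry promised by the current task / module, plus the internal key paths, key boundaries, checkpoints and observable completion evidence that the self-test cases actually traverse, for later Agents to reuse and debug. When any scenario is not `PASS`, or `Report Status` is `BLOCKED`, `IN_PROGRESS` or `STALE`, the development task must not be written as complete.
 ## Inputs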
@@ -51,7 +53,7 @@ description: Use after development gates pass to update module-level implementat
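 3. Deviations from the technical plan must be recorded explicitly, even when the deviation is justified.
 4. Runnable entry/exit, config contract, and fixture/live boundary must record current facts; missing items go under `未覆盖（Not covered）` or plan deviations.
 5. `Development Evidence` must include the evidence level, target runtime environment, actual callable entry, observable exit, initialization method, config contract, testing handoff status, missing runtime boundaries, and development self-test evidence required by the task contract; page-type tasks record the dev server/page URL and browser check, while API/CLI/worker/RPA/service/agent/runtime tasks record the startup/invocation command, endpoint/health/status, and response/output/side effect.
-6. `Development Self-Test Report` must execute every scenario and required gate in `self_test_contract`, and record the `Module Key Test Path` from local startup or the invocation entry through completion of all self-test cases; the path must cover every runnable entry promised by this task / module, the internal key paths, key boundaries, observation points, and completion evidence, and must not be reduced to a one-line smoke result.
+6. `Development Self-Test Report` must record `Report Status` and the results of executing every scenario and required gate in `self_test_contract` for the current task in this round, and record the `Module Key Test Path` from local startup or the invocation entry through completion of all self-test cases; the path must cover every runnable entry promised by this task / module, the internal key paths, key boundaries, observation points, and completion evidence. It must not be reduced to a one-line smoke result, nor may a historical PASS, template fields, code reading, or unrelated generic gates be reused as this round's evidence.
 7. Test coverage must list concrete tests, or explicitly record coverage gaps.
 8. Keep document granularity at the module, subsystem, or core data flow level; do not create one document per task by default, and do not write a project-wide encyclopedia.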
@@ -63,7 +65,8 @@ description: Use after development gates pass to update module-level implementat
 - [ ] The core data flow has been described.
 - [ ] Runnable entry/exit, config contract, and fixture/live boundary are recorded, or missing items are explicitly marked.
 - [ ] `Development Evidence` records `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries`, `Basic Self-test Evidence`, or `Not applicable` with a reason.
-- [ ] If the current task has `self_test_contract.status: "required"`, `Development Self-Test Report` records contract source, scenario results, executed gates, Module Key Test Path, actual evidence, missing/blockers, and Testing Handoff Readiness.
+- [ ] If the current task has `self_test_contract.status: "required"`, `Development Self-Test Report` records `Report Status`, contract source, scenario results, executed gates, Module Key Test Path, actual evidence, missing/blockers, and Testing Handoff Readiness.
+- [ ] If the current task is high-risk runtime/live/remote-operator work, the implementation doc main line keeps only implementation facts, `Current Operator Path`, and recovery links; `Gate Breakdown` is recorded by layer; and this round's failed exploration is isolated in the exploration appendix.
 - [ ] `business_handoff_ready` tasks have a recorded Testing Handoff Contract.
 - [ ] The semantic slice boundary of the implementation doc has been determined.
 - [ ] Plan deviations and test coverage are recorded.
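
The completion rule in this skill (every scenario `PASS` and `Report Status` equal to `PASS`) can be sketched as a small check. This is an illustration only, under stated assumptions: the helper names `parse_report_status` and `can_mark_task_complete` and the sample inputs are hypothetical and are not the package's validator.

```python
import re
from typing import Optional

# Allowed values, as listed for the `Report Status` field in this diff.
REPORT_STATUSES = {"PASS", "BLOCKED", "IN_PROGRESS", "STALE"}


def parse_report_status(report_text: str) -> Optional[str]:
    """Return the single status written after `Report Status:`, or None.

    Hypothetical helper: an unfilled template line such as
    `PASS | BLOCKED | IN_PROGRESS | STALE` is treated as missing.
    """
    match = re.search(r"^\s*-?\s*Report Status:\s*(.+)$", report_text, re.MULTILINE)
    if not match:
        return None
    value = match.group(1).strip().strip("`").upper()
    return value if value in REPORT_STATUSES else None


def can_mark_task_complete(report_text: str, scenario_results: dict) -> bool:
    """A task may be recorded as complete only when the report status is
    PASS and every scenario result is PASS."""
    status = parse_report_status(report_text)
    all_pass = bool(scenario_results) and all(
        result == "PASS" for result in scenario_results.values()
    )
    return status == "PASS" and all_pass
```

Treating the unfilled template value as missing is a design choice of this sketch, so that a copied template line cannot pass as an executed report.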

package/assets/skills/pjsdlc_reviewer/SKILL.md CHANGED

@@ -17,7 +17,7 @@ Review 时先建立证据链：PRD 说什么、技术方案承诺什么、implem
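 Do not dress up personal preference as a blocker. Distinguish blocking issues, follow-up improvements, and open questions. If no problems are found, say so explicitly and also list remaining test gaps or residual risks.
-Review must treat "the current module has no runnable entry/exit" as a blocker, not an ordinary test gap. Whenever the PRD, technical plan, or implementation doc promises an API, CLI, server route, service, agent, runtime, adapter, worker, provider, external send/write executor, config contract, or live/fixture dual-mode boundary, Review must read the technical plan's `Development Deliverable Contract`, `Development Self-Test Contract`, or equivalent delivery boundary, and check whether the real code and implementation doc provide a callable entry, initialization method, output/side-effect boundary, and verification method; if the task declares `evidence_level.required`, `target_runtime_environment`, or `self_test_contract`, Review must also check that the actual evidence level, execution location, target runtime environment, self-test scenario results, `module_key_test_path`, and required gates match. The implementation doc must also contain a structured `Development Evidence` section stating `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries`, and `Basic Self-test Evidence`, or `Not applicable` with a reason; if the task has `self_test_contract.status: "required"`, it must also contain an executed `Development Self-Test Report` that records the `Module Key Test Path` from local startup or the invocation entry through completion of all self-test cases. That path should cover every runnable entry promised by this task / module, the internal key paths, key boundaries, observation points, and observable completion evidence. If the task requires `deployed_runtime` or `business_handoff_ready` but the evidence is only dev-machine `localhost`, a provider live smoke, a fixture smoke, a fake adapter, or a documentation description, the verdict should be `BLOCKED`. When these are missing, the gate decision should be `BLOCKED` and the work must return to SPRINTING/RFC rather than entering TESTING and adding the runtime afterwards. The Review report must write out a `PASS`/`BLOCKED` checklist for `Runnable Entry`, `Observable Exit`, `Initialization`, `Config Contract`, and `Testing Handoff Readiness`; if any item is `BLOCKED`, the task must not enter TESTING. Review does not create formal test artifacts under `.docs/07_test/**`; if the existing test fact source still links evidence from an old route superseded by an RFC, list that as a blocker before entering TESTING and require the RFC to clean up or update the index.
+Review must treat "the current module has no runnable entry/exit" as a blocker, not an ordinary test gap. Whenever the PRD, technical plan, or implementation doc promises an API, CLI, server route, service, agent, runtime, adapter, worker, provider, external send/write executor, config contract, or live/fixture dual-mode boundary, Review must read the technical plan's `Development Deliverable Contract`, `Development Self-Test Contract`, or equivalent delivery boundary, and check whether the real code and implementation doc provide a callable entry, initialization method, output/side-effect boundary, and verification method; if the task declares `evidence_level.required`, `target_runtime_environment`, or `self_test_contract`, Review must also check that the actual evidence level, execution location, target runtime environment, self-test scenario results, `module_key_test_path`, and required gates match. For high-risk runtime/live/remote-operator work, first read `plan.yaml#resume_capsule` and the runbook under `.docs/09_runbooks/**`, and confirm that the canonical path, do-not-retry list, evidence index, and exploration appendix keep the recovery main line separate from failed exploration; Review should not re-select a main path from the failed exploration. The implementation doc must also contain a structured `Development Evidence` section stating `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries`, and `Basic Self-test Evidence`, or `Not applicable` with a reason; if the task has `self_test_contract.status: "required"`, it must also contain an executed `Development Self-Test Report` that records `Report Status` and the `Module Key Test Path` from local startup or the invocation entry through completion of all self-test cases. That report is not a debug log, operator log, runbook, or running history; when such material is mixed into scenario evidence, treat it as a blocker. `Report Status` must be `PASS` and every scenario must be `PASS` before the task can enter TESTING. The path should cover every runnable entry promised by this task / module, the internal key paths, key boundaries, observation points, and observable completion evidence; a high-risk task must also include a short `Current Operator Path` and a layered `Gate Breakdown`. If the task requires `deployed_runtime` or `business_handoff_ready` but the evidence is only dev-machine `localhost`, a provider live smoke, a fixture smoke, a fake adapter, or a documentation description, the verdict should be `BLOCKED`. When these are missing, the gate decision should be `BLOCKED` and the work must return to SPRINTING/RFC rather than entering TESTING and adding the runtime afterwards. The Review report must write out a `PASS`/`BLOCKED` checklist for `Runnable Entry`, `Observable Exit`, `Initialization`, `Config Contract`, and `Testing Handoff Readiness`; if any item is `BLOCKED`, the task must not enter TESTING. Review does not create formal test artifacts under `.docs/07_test/**`; if the existing test fact source still links evidence from an old route superseded by an RFC, list that as a blocker before entering TESTING and require the RFC to clean up or update the index.
 A review output is itself a workflow task. Before starting a review, create or select a sufficiently small `TASK-*` open task in `<harnessRoot>/state/plan.yaml` and set `phase: "REVIEWING"`; the current round produces only one review batch, one risk-theme slice, or one PR review conclusion. Do not cover multiple unrelated review themes in a single task.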
@@ -81,7 +81,7 @@ Review 阶段受 `plan.yaml` 管控：
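 - [ ] Architecture and maintainability risks have been assessed.
 - [ ] Assessed whether runnable entry/exit, config contract, and fixture/live boundary are sufficient to enter TESTING.
 - [ ] Assessed whether the implementation doc contains Evidence Level, Target Runtime Environment, Runnable Entry, Observable Exit, Client / Server Initialization, Config Contract, Testing Handoff Readiness, Known Missing Runtime Boundaries, and Basic Self-test Evidence.
-- [ ] Assessed whether the Development Self-Test Report for `self_test_contract` executes every scenario and required gate, and records a reusable Module Key Test Path.
+- [ ] Assessed whether the Development Self-Test Report for `self_test_contract` has `Report Status: PASS`, executes every scenario and required gate, and records a reusable Module Key Test Path.
 - [ ] Checked that the evidence level and execution location match the target runtime environment promised by the task / technical plan.
 - [ ] The scope and risk-theme boundary of the review slice have been determined.
 - [ ] Test gaps have been listed.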

package/assets/skills/pjsdlc_tester/SKILL.md CHANGED

@@ -17,7 +17,7 @@ description: Use during TESTING to produce a test matrix, run regression, and do
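 When running regression, prefer gates that can prove the phase exit. If tests cannot run, the environment is missing, or data is unavailable, do not declare a pass; if TESTING has already been entered, record `BLOCKED`, the checks completed, and the recovery conditions in `TEST_REPORT.md`.
-TESTING may only call entries that SPRINTING/REVIEWING have confirmed as `PASS`, for input/output verification. It may add tests, fixtures, mocks, assertion helpers, and test documentation, but it must not add or maintain long-term a product runtime, server/API/CLI/adapter, direct poller, cloud bootstrap, systemd unit, real provider adapter, package runtime script, or deployment script. If TESTING finds that the real entry/exit does not exist, the implementation doc lacks `Development Evidence` or `Development Self-Test Report`, the self-test report lacks a `Module Key Test Path` from local startup or the invocation entry through completion of all self-test cases, that path does not cover the entries, internal key paths, key boundaries, observation points, and completion evidence promised by this task / module, live mode is not callable, the config contract is missing, the Review readiness checklist is not all `PASS`, or `Evidence Level` / `Target Runtime Environment` / `self_test_contract` is inconsistent with what the task or technical plan promised, it should record `BLOCKED`, produce an RFC or a suggestion for a follow-up dev task, and stop the testing phase from expanding into development/integration setup. While development has not yet delivered a testable entry/exit, target runtime environment, Development Self-Test Report, or Testing Handoff Contract, do not generate formal test cases or formal reports under `.docs/07_test/**` in advance; acceptance thinking belongs in the PRD acceptance criteria, the tech plan verification strategy, or draft notes outside `.docs/07_test/**`. `TEST_REPORT.md` must not give `PASS` while describing a missing entry/exit, missing Development Evidence, a missing Development Self-Test Report, a mismatched evidence level, or an undelivered application entry.
+TESTING may only call entries that SPRINTING/REVIEWING have confirmed as `PASS`, for input/output verification. It may add tests, fixtures, mocks, assertion helpers, and test documentation, but it must not add or maintain long-term a product runtime, server/API/CLI/adapter, direct poller, cloud bootstrap, systemd unit, real provider adapter, package runtime script, or deployment script. If TESTING finds that the real entry/exit does not exist, the implementation doc lacks `Development Evidence` or `Development Self-Test Report`, the self-test report lacks `Report Status: PASS`, lacks a `Module Key Test Path` from local startup or the invocation entry through completion of all self-test cases, that path does not cover the entries, internal key paths, key boundaries, observation points, and completion evidence promised by this task / module, live mode is not callable, the config contract is missing, the Review readiness checklist is not all `PASS`, or `Evidence Level` / `Target Runtime Environment` / `self_test_contract` is inconsistent with what the task or technical plan promised, it should record `BLOCKED`, produce an RFC or a suggestion for a follow-up dev task, and stop the testing phase from expanding into development/integration setup. The `Development Self-Test Report` is not a debug log, operator log, runbook, or exploration log; testing consumes only its module entry, core path, exit, and minimal evidence. High-risk runtime/live/remote-operator verification reads `plan.yaml#resume_capsule` first, then the runbook under `.docs/09_runbooks/**` and the evidence index, and only last the exploration appendix; testing verifies along the canonical path only and does not re-attempt the failed paths listed in `do_not_retry`. While development has not yet delivered a testable entry/exit, target runtime environment, Development Self-Test Report, or Testing Handoff Contract, do not generate formal test cases or formal reports under `.docs/07_test/**` in advance; acceptance thinking belongs in the PRD acceptance criteria, the tech plan verification strategy, or draft notes outside `.docs/07_test/**`. `TEST_REPORT.md` must not give `PASS` while describing a missing entry/exit, missing Development Evidence, a missing Development Self-Test Report, a mismatched evidence level, or an undelivered application entry.
 Test design and regression evidence output is itself a workflow task. Before starting testing, create or select a sufficiently small `TASK-*` open task in `<harnessRoot>/state/plan.yaml` and set `phase: "TESTING"`; the current round produces only one test strategy slice, test case slice, regression batch, risk verification area, or one set of scoped test changes. `plan.yaml` remains the single source of truth for the execution plan; `.docs/07_test/**` records only the current plan's test strategy, test cases, executed regression evidence, coverage gaps, and final decision, does not express "how to develop next", and does not retain old test results superseded by an RFC.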
@@ -87,7 +87,8 @@ TESTING 只能调用 SPRINTING/REVIEWING 已确认 `PASS` 的入口做输入/输
 - [ ] The regression checklist is complete.
 - [ ] Tests call only existing runnable entry/exit; no product runtime, bootstrap, provider adapter, deploy, or package runtime script was added during TESTING.
 - [ ] Checked Development Evidence, Evidence Level, Target Runtime Environment, and Testing Handoff Contract in the implementation doc, and designed tests only against delivered entries.
-- [ ] Checked scenario results, executed gates, Module Key Test Path, and actual evidence in the Development Self-Test Report.
+- [ ] Checked Report Status, scenario results, executed gates, Module Key Test Path, and actual evidence in the Development Self-Test Report.
+- [ ] High-risk runtime/live verification used `resume_capsule` and the runbook/evidence index first, and did not repeat failed paths from the exploration appendix.
 - [ ] The semantic slice boundary of the test report / test matrix has been determined.
 - [ ] Test plans, test cases, or to-be-filled content were not written up as `TEST_REPORT.md`.
 - [ ] Confirmed that `.docs/07_test/**` contains only test facts still valid for the current plan.

package/assets/templates/EVIDENCE_INDEX_TEMPLATE.md ADDED

@@ -0,0 +1,17 @@
+# [Runtime / Live Smoke] Evidence Index
+This file stores only evidence pointers and gaps; do not push evidence bodies back into the main line of the implementation doc.
+| Scenario | Status | Evidence File / System | Gap / Next Action |
+|---|---|---|---|
+|  | PASS / BLOCKED / GAP |  |  |
+## Evidence Retention
+- Temporary evidence:
+- Stable artifact / CI / release record:
+- Evidence that must not be copied into main docs:
+## Missing Evidence
+-

package/assets/templates/EXPLORATION_APPENDIX_TEMPLATE.md ADDED

@@ -0,0 +1,22 @@
+# [Runtime / Operator Path] Exploration Appendix
+## Purpose
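+Records failed attempts, diagnostic paths, and conclusions about pitfalls not to repeat. This file is a memory appendix, not the recovery main line.
+Do not copy this file's long logs into the `Development Self-Test Report`; the main report keeps at most a one-sentence fallback / diagnostic summary.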
+## Failed / Diagnostic Attempts
+| Attempt | Result | Why It Failed Or Stayed Diagnostic | Do Not Retry Rule |
+|---|---|---|---|
+|  |  |  |  |
+## Useful Observations
+-
+## Promoted Decisions
+| Decision | Promoted To |
+|---|---|
+|  | `plan.yaml#resume_capsule` / runbook / implementation doc |

package/assets/templates/IMPLEMENTATION_DOC_TEMPLATE.md CHANGED

@@ -51,10 +51,22 @@ Input
 - Testing Handoff Readiness:
 - Known Missing Runtime Boundaries:
 - Basic Self-test Evidence: See `Development Self-Test Report`.
+- Resume Capsule / Runbook:
 - Not applicable:
-## 7. Development Self-Test Report (executed)
+## 7. Current Operator Path (runtime/live/remote-operator tasks only)
+- Canonical path:
+- Operator runbook: `.docs/09_runbooks/...`
+- Credential reference: Keychain item name or secret reference name only; do not record plaintext secrets.
+- Command/UI channel:
+- Do-not-retry summary: write a one-sentence conclusion for fallback / diagnostic work; details go into the exploration appendix or git history.
+## 8. Development Self-Test Report
+This section only proves the module entry, core path, exit, and minimal evidence; it is not a debug log, operator log, runbook, or exploration log.
+- Report Status: PASS | BLOCKED | IN_PROGRESS | STALE
 - Contract Source:
 - Scenario Results:
 - Executed Gates:
@@ -63,11 +75,31 @@ Input
 - Missing / Blockers:
 - Testing Handoff Readiness:
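+Keep:
+- Runnable Entry / Module Key Test Path / Observable Exit
+- Scenario Results / Executed Gates / Actual Evidence
+- Missing / Blockers / Testing Handoff Readiness
+Do not keep:
+- The full log of every tool exploration
+- Debug logs, operator logs, historical operation diaries, or runbook bodies
+- Lengthy commands, screenshot procedures, or UI details for fallback / diagnostic work
+- Old failed channels unrelated to the current recovery path; keep them only in the appendix or git history
+### Gate Breakdown (gate layers)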
+| Gate Layer | Status | Evidence | Gap / Next Action |
+|---|---|---|---|
+| Local gate |  |  |  |
+| Cloud/service gate |  |  |  |
+| Executor/operator readiness |  |  |  |
+| Live smoke / handoff |  |  |  |
 | Scenario ID | Result | Executed Entry | Actual Exit | Evidence |
 |---|---|---|---|---|
-| ST-001 | PASS / BLOCKED |  |  |  |
+|  |  |  |  |  |
-## 8. Testing Handoff Contract
+## 9. Testing Handoff Contract
 - Entry:
 - Config:
@@ -77,7 +109,7 @@ Input
 - Cleanup / reset / idempotency:
 - Evidence Level:
-## 9. Key Implementation Logic
+## 10. Key Implementation Logic
 - Input validation:
 - Core branches:
@@ -85,22 +117,22 @@ Input
 - Boundary fallback:
 - Performance or concurrency notes:
-## 10. Deviations from the Technical Plan
+## 11. Deviations from the Technical Plan
 -
-## 11. Test Coverage
+## 12. Test Coverage
 | Test | Coverage | Result |
 |---|---|---|
 |  |  |  |
-## 12. Change Log
+## 13. Change Log
 | Date | Task ID | Commit | Summary |
 |---|---|---|---|
 |  |  |  |  |
-## 13. Follow-up Maintenance Notes
+## 14. Follow-up Maintenance Notes
 -

package/assets/templates/PLAN_TEMPLATE.yaml CHANGED

@@ -1,5 +1,21 @@
 current_task_id: "TASK-001"
 next_task_sequence: 2
+# Required while the current SPRINTING task is high-risk runtime/live work
+# (`external_provider_live`, `deployed_runtime`, `business_handoff_ready`, or
+# target runtime `cloud_vm` / `managed_service` / `browser` / `worker`). Keep this
+# short: 5-8 recovery facts, not a full attempt log.
+# resume_capsule:
+#   task_id: "TASK-001"
+#   state: "in_progress | blocked | ready_for_gate"
+#   canonical_path: "operator/runtime path to continue from"
+#   next_step: "one concrete next action"
+#   blocker: "current blocker, or none with context"
+#   last_passed_gate: "last concrete PASS gate or checkpoint"
+#   do_not_retry:
+#     - "known failed path or repeated trap to avoid"
+#   recovery_refs:
+#     - ".docs/04_implementation/example.md"
+#     - ".docs/09_runbooks/example_live_smoke_runbook.md"
 # Optional top-level execution contract. Omit this block when the current task
 # stays serial after the default parallel eligibility check. Use
 # trigger: "workflow_default" when the workflow safely splits the task by
@@ -46,6 +62,7 @@ tasks:
       architecture: []
       tech_plan: []
       rfc: []
+      runbook: []
     allowed_paths:
       - ".docs/00_raw/**"
       - ".docs/01_product/**"
@@ -85,6 +102,6 @@ tasks:
           evidence: "command/browser/API/log/screenshot/etc"
       not_applicable_reason: ""
     working_notes:
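-      - "Working notes from execution are kept only on the open task."
+      - "Keep working notes to the short notes needed for recovery; target 5-8 items, validator cap is 8; promote path-selection conclusions to resume_capsule."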
     result_docs:
       - ".docs/01_product/example.md"
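
As a rough illustration of the commented `resume_capsule` block in this template, the sketch below shows a filled-in capsule as a Python mapping and checks it against the field list that the validator changes in this diff require. The capsule values (paths, gate names, blocker text) are hypothetical, and `missing_capsule_fields` is an illustrative helper, not part of the package.

```python
# Field names copied from RESUME_CAPSULE_FIELDS in this diff.
RESUME_CAPSULE_FIELDS = [
    "task_id",
    "state",
    "canonical_path",
    "next_step",
    "blocker",
    "last_passed_gate",
    "do_not_retry",
    "recovery_refs",
]


def missing_capsule_fields(capsule: dict) -> list:
    """Return the required field names that the capsule does not define."""
    return [field for field in RESUME_CAPSULE_FIELDS if field not in capsule]


# Hypothetical capsule: short recovery facts, not an attempt log.
example_capsule = {
    "task_id": "TASK-001",
    "state": "blocked",
    "canonical_path": "ssh operator channel to the staging worker",
    "next_step": "re-run the live smoke once the credential is rotated",
    "blocker": "expired provider credential",
    "last_passed_gate": "local unit tests",
    "do_not_retry": ["browser console upload path"],
    "recovery_refs": [
        ".docs/04_implementation/example.md",
        ".docs/09_runbooks/example_live_smoke_runbook.md",
    ],
}

print(missing_capsule_fields(example_capsule))
```

The check only covers field presence; the validators in this diff additionally reject placeholder values and verify that the referenced files exist.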

package/assets/templates/RUNBOOK_TEMPLATE.md ADDED

@@ -0,0 +1,47 @@
+# [Runtime / Operator Path] Runbook
+This file records the operator/provisioning recovery path; it is not the `Development Self-Test Report` or scenario evidence.
+## 1. Recovery Summary
+- Canonical path:
+- Current state:
+- Next command channel:
+- Last known good checkpoint:
+- Primary blocker:
+## 2. Operator Path
+```txt
+canonical:
+credentials: Keychain item name or secret reference only
+remote host:
+command channel:
+UI channel:
+do not prefer:
+```
+## 3. Preconditions
+- Required access:
+- Required local tools:
+- Required remote services:
+- Safety / cleanup notes:
+## 4. Resume Steps
+1.
+2.
+3.
+## 5. Fallbacks And Diagnostics
+- Preferred fallback:
+- Diagnostic-only paths:
+- Do not retry:
+## 6. Linked Evidence
+- Evidence index:
+- Exploration appendix:
+- Implementation doc:
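
A runbook created from this template is one of the documents that `resume_capsule.recovery_refs` is expected to link. The sketch below restates the path rules from the validator changes in this diff in a simplified form: it omits the file-existence check, and `recovery_ref_errors` and its messages are illustrative names, not the package's API.

```python
# Prefixes as they appear in the validator changes in this diff.
RUNBOOK_DOC_PREFIX = ".docs/09_runbooks/"
IMPLEMENTATION_DOC_PREFIX = ".docs/04_implementation/"


def recovery_ref_errors(refs: list, implementation_doc: str = "") -> list:
    """Simplified path checks for recovery_refs (no filesystem access)."""
    if not refs:
        return ["recovery_refs must not be empty"]
    errors = []
    # The current implementation doc, when set, must be linked.
    if implementation_doc and implementation_doc not in refs:
        errors.append(f"missing implementation_doc: {implementation_doc}")
    # At least one ref must live under the runbook directory.
    if not any(ref.startswith(RUNBOOK_DOC_PREFIX) for ref in refs):
        errors.append(f"no ref under {RUNBOOK_DOC_PREFIX}")
    # Refs may point only at implementation docs or runbook/evidence docs.
    for ref in refs:
        if not ref.startswith((IMPLEMENTATION_DOC_PREFIX, RUNBOOK_DOC_PREFIX)):
            errors.append(f"ref outside allowed directories: {ref}")
    return errors
```

Under these rules an evidence index or exploration appendix would also need to live under `.docs/09_runbooks/` (or `.docs/04_implementation/`) to be accepted as a recovery ref.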

package/assets/tools/harness_utils.py CHANGED

@@ -375,6 +375,20 @@ CALLABLE_TASK_TERMS = [
     "队列",
 ]
 SELF_TEST_CONTRACT_STATUSES = {"required", "not_applicable"}
+RESUME_CAPSULE_REQUIRED_EVIDENCE_LEVELS = {"external_provider_live", "deployed_runtime", "business_handoff_ready"}
+RESUME_CAPSULE_REQUIRED_TARGET_KINDS = {"cloud_vm", "managed_service", "browser", "worker"}
+RESUME_CAPSULE_FIELDS = [
+    "task_id",
+    "state",
+    "canonical_path",
+    "next_step",
+    "blocker",
+    "last_passed_gate",
+    "do_not_retry",
+    "recovery_refs",
+]
+RUNBOOK_DOC_PREFIX = ".docs/09_runbooks/"
+MAX_WORKING_NOTES = 8
 def as_string_list(value: Any) -> list[str]:
@@ -417,6 +431,16 @@ def needs_runnable_task_contract(task: dict[str, Any]) -> bool:
     return contains_any(context, APPLICATION_READINESS_TASK_TERMS + PAGE_TASK_TERMS + CALLABLE_TASK_TERMS)
+def requires_resume_capsule(task: dict[str, Any]) -> bool:
+    if task.get("phase") != "SPRINTING":
+        return False
+    evidence_level = task.get("evidence_level")
+    target_runtime = task.get("target_runtime_environment")
+    required = str(evidence_level.get("required") or "") if isinstance(evidence_level, dict) else ""
+    kind = str(target_runtime.get("kind") or "") if isinstance(target_runtime, dict) else ""
+    return required in RESUME_CAPSULE_REQUIRED_EVIDENCE_LEVELS or kind in RESUME_CAPSULE_REQUIRED_TARGET_KINDS
 def self_test_contract_errors_for_task(task: dict[str, Any]) -> list[str]:
     task_id = str(task.get("id") or "Task")
     required_for_runnable = needs_runnable_task_contract(task)
@@ -658,6 +682,16 @@ def validate_task_shape(task: dict[str, Any], index: int) -> None:
         require(isinstance(task["allowed_paths"], list) and task["allowed_paths"], f"{task['id']} must define allowed_paths")
         require(isinstance(task["required_gates"], list) and task["required_gates"], f"{task['id']} must define required_gates")
         require(isinstance(task["acceptance_criteria"], list) and task["acceptance_criteria"], f"{task['id']} must define acceptance_criteria")
+        if "working_notes" in task:
+            require(
+                isinstance(task["working_notes"], (list, str)),
+                f"{task['id']} working_notes must be a short string or list with at most {MAX_WORKING_NOTES} items",
+            )
+            note_count = len(task["working_notes"]) if isinstance(task["working_notes"], list) else (1 if str(task["working_notes"]).strip() else 0)
+            require(
+                note_count <= MAX_WORKING_NOTES,
+                f"{task['id']} working_notes must stay resume-first and contain at most {MAX_WORKING_NOTES} items; found {note_count}",
+            )
         for error in self_test_contract_errors_for_task(task):
             require(False, error)
         for error in testing_boundary_errors_for_allowed_paths(task):
@@ -672,6 +706,54 @@ def task_sequence_number(task_id: str) -> int:
     return int(match.group(1)) if match else 0
+def validate_resume_capsule_contract(data: dict[str, Any]) -> None:
+    current_task_id = str(data.get("current_task_id") or "")
+    current_task = task_by_id(data, current_task_id) if current_task_id else None
+    if not current_task or current_task.get("status") not in OPEN_TASK_STATUSES or current_task.get("phase") != "SPRINTING":
+        require("resume_capsule" not in data, "plan.yaml resume_capsule must only be present for the current open SPRINTING task")
+        return
+    capsule = data.get("resume_capsule")
+    if not requires_resume_capsule(current_task):
+        if capsule is not None:
+            require(isinstance(capsule, dict), f"{current_task_id} resume_capsule must be a mapping when present")
+        return
+    require(isinstance(capsule, dict), f"{current_task_id} high-risk runtime task must define top-level resume_capsule")
+    for field in RESUME_CAPSULE_FIELDS:
+        require(field in capsule, f"{current_task_id} resume_capsule missing field: {field}")
+    require(str(capsule.get("task_id") or "").strip() == current_task_id, f"{current_task_id} resume_capsule.task_id must match current_task_id")
+    for field in ["state", "canonical_path", "next_step", "blocker", "last_passed_gate"]:
+        value = str(capsule.get(field) or "").strip()
+        require(value and not is_placeholder_evidence(value), f"{current_task_id} resume_capsule.{field} must contain concrete recovery information")
+    do_not_retry = as_string_list(capsule.get("do_not_retry"))
+    require(
+        do_not_retry and not any(is_placeholder_evidence(item) for item in do_not_retry),
+        f"{current_task_id} resume_capsule.do_not_retry must list concrete paths or attempts not to repeat",
+    )
+    refs = as_string_list(capsule.get("recovery_refs"))
+    require(refs, f"{current_task_id} resume_capsule.recovery_refs must link implementation doc and runbook/evidence documents")
+    implementation_doc = str(current_task.get("implementation_doc") or "").strip()
+    if implementation_doc:
+        require(
+            implementation_doc in refs,
+            f"{current_task_id} resume_capsule.recovery_refs must include current implementation_doc {implementation_doc}",
+        )
+    require(
+        any(ref.startswith(RUNBOOK_DOC_PREFIX) for ref in refs),
+        f"{current_task_id} resume_capsule.recovery_refs must include a runbook/evidence document under {RUNBOOK_DOC_PREFIX}",
+    )
+    for ref in refs:
+        require(
+            ref.startswith(".docs/04_implementation/") or ref.startswith(RUNBOOK_DOC_PREFIX),
+            f"{current_task_id} resume_capsule.recovery_refs may only point to implementation docs or runbook/evidence docs: {ref}",
+        )
+        require(repo_path(ref).exists(), f"{current_task_id} resume_capsule recovery_ref does not exist: {ref}")
 def validate_plan_contract(data: dict[str, Any], allow_open: bool) -> None:
     lifecycle = load_lifecycle()
     current_phase = str(lifecycle.get("current_phase") or "")
@@ -695,6 +777,7 @@ def validate_plan_contract(data: dict[str, Any], allow_open: bool) -> None:
     current_task_id = data.get("current_task_id") or ""
     if current_task_id:
         require(task_by_id(data, current_task_id), f"current_task_id does not match a task: {current_task_id}")
+    validate_resume_capsule_contract(data)
     open_tasks = [task.get("id") for task in tasks if task.get("status") in OPEN_TASK_STATUSES]
     if not allow_open:
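
To see which tasks trigger the new capsule requirement, the standalone sketch below copies `requires_resume_capsule` and the two constant sets from the hunks above and applies them to a few sample tasks. It also restates the `working_notes` counting rule as a hypothetical helper, `working_note_count`. The sample task values `local_smoke` and `local` are assumptions for illustration and may not be valid values in the real harness.

```python
from typing import Any

RESUME_CAPSULE_REQUIRED_EVIDENCE_LEVELS = {"external_provider_live", "deployed_runtime", "business_handoff_ready"}
RESUME_CAPSULE_REQUIRED_TARGET_KINDS = {"cloud_vm", "managed_service", "browser", "worker"}
MAX_WORKING_NOTES = 8


def requires_resume_capsule(task: dict[str, Any]) -> bool:
    # Only SPRINTING tasks can require a capsule.
    if task.get("phase") != "SPRINTING":
        return False
    evidence_level = task.get("evidence_level")
    target_runtime = task.get("target_runtime_environment")
    required = str(evidence_level.get("required") or "") if isinstance(evidence_level, dict) else ""
    kind = str(target_runtime.get("kind") or "") if isinstance(target_runtime, dict) else ""
    return required in RESUME_CAPSULE_REQUIRED_EVIDENCE_LEVELS or kind in RESUME_CAPSULE_REQUIRED_TARGET_KINDS


def working_note_count(notes: Any) -> int:
    """Hypothetical helper: a list counts its items, a non-blank string counts as one."""
    if isinstance(notes, list):
        return len(notes)
    return 1 if str(notes).strip() else 0


# Sample tasks (values are illustrative assumptions).
live_task = {"phase": "SPRINTING", "evidence_level": {"required": "deployed_runtime"}}
worker_task = {"phase": "SPRINTING", "target_runtime_environment": {"kind": "worker"}}
local_task = {
    "phase": "SPRINTING",
    "evidence_level": {"required": "local_smoke"},
    "target_runtime_environment": {"kind": "local"},
}
testing_task = {"phase": "TESTING", "evidence_level": {"required": "deployed_runtime"}}

for name, task in [("live", live_task), ("worker", worker_task), ("local", local_task), ("testing", testing_task)]:
    print(name, requires_resume_capsule(task))
```

Either condition alone is enough: a high-risk evidence level or a high-risk target runtime kind makes the capsule mandatory, but only while the task is in `SPRINTING`.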

package/assets/tools/validate_harness.py CHANGED

@@ -29,6 +29,7 @@ def main() -> None:
         ".docs/06_review",
         ".docs/07_test",
         ".docs/08_release",
+        ".docs/09_runbooks",
         ".docs/rfc",
         ".codex/skills",
         "tools",

package/dist/lib/init.js CHANGED

@@ -14,6 +14,7 @@ const DOC_DIRS = [
     ".docs/06_review",
     ".docs/07_test",
     ".docs/08_release",
+    ".docs/09_runbooks",
     ".docs/rfc"
 ];
 export async function runInit(projectRoot, options) {

package/dist/lib/validators.js CHANGED

@@ -89,6 +89,7 @@ const TEST_REPORT_PLACEHOLDER_TERMS = ["pending", "tbd", "todo", "待填", "待
 const TEST_FACT_SOURCE_PHASES = new Set(["TESTING", "RFC_RECALIBRATION"]);
 const TEST_FACT_SOURCE_PATTERNS = [".docs/07_test/**", ".docs/07_test/"];
 const TEST_FACT_SOURCE_REF = /\.docs\/07_test\/[^\s`,)]+/g;
+const RUNBOOK_DOC_PREFIX = ".docs/09_runbooks/";
 const RUNNABLE_ENTRY_EXIT_TERMS = [
     "runnable entry/exit",
     "entry/exit",
@@ -103,6 +104,8 @@ const DEVELOPMENT_SELF_TEST_CONTRACT_TERMS = ["development self-test contract",
 const DEVELOPMENT_SELF_TEST_REPORT_TERMS = ["development self-test report", "开发自测报告"];
 const DEVELOPMENT_SELF_TEST_IMPACT_TERMS = ["development self-test impact", "开发自测影响"];
 const MODULE_KEY_TEST_PATH_TERMS = ["module key test path", "模块关键测试路径"];
+const GATE_BREAKDOWN_TERMS = ["gate breakdown", "gate 分层", "gate breakdown（gate 分层）"];
+const CURRENT_OPERATOR_PATH_TERMS = ["current operator path", "operator path", "当前操作路径", "当前 operator path"];
 const TESTING_HANDOFF_TERMS = ["testing handoff contract", "测试交接合同"];
 const EVIDENCE_PLACEHOLDER_TERMS = [
     "pending",
@@ -113,6 +116,74 @@ const EVIDENCE_PLACEHOLDER_TERMS = [
     "待补",
     "待确认"
 ];
+const SELF_TEST_REPORT_PLACEHOLDER_TERMS = [
+    "pass / blocked",
+    "pass or blocked",
+    "pass/block",
+    "pass/blocker",
+    "local start / invocation",
+    "all self-test scenarios",
+    "all task/module promised runnable entries",
+    "actual internal key paths",
+    "observable completion evidence"
+];
+const SELF_TEST_REPORT_STATUSES = new Set(["PASS", "BLOCKED", "IN_PROGRESS", "STALE"]);
+const SELF_TEST_REPORT_DISALLOWED_SECTION_TERMS = [
+    "debug log",
+    "operator log",
+    "operation log",
+    "runbook",
+    "exploration",
+    "diagnostic attempts",
+    "fallback attempts",
+    "history log",
+    "remote operation log",
+    "调试日志",
+    "操作日志",
+    "远端操作日志",
+    "探索流水",
+    "失败探索",
+    "诊断尝试",
+    "历史流水"
+];
+const SELF_TEST_OBSERVABLE_EVIDENCE_TERMS = [
+    "pass output",
+    "response",
+    "output",
+    "side effect",
+    "log",
+    "artifact",
+    "health",
+    "status",
+    "audit",
+    "rendered",
+    "page state",
+    "screenshot",
+    "browser check",
+    "playwright",
+    "command output",
+    "queue",
+    "file"
+];
+const RESUME_CAPSULE_REQUIRED_EVIDENCE_LEVELS = new Set(["external_provider_live", "deployed_runtime", "business_handoff_ready"]);
+const RESUME_CAPSULE_REQUIRED_TARGET_KINDS = new Set(["cloud_vm", "managed_service", "browser", "worker"]);
+const RESUME_CAPSULE_FIELDS = [
+    "task_id",
+    "state",
+    "canonical_path",
+    "next_step",
+    "blocker",
+    "last_passed_gate",
+    "do_not_retry",
+    "recovery_refs"
+];
+const MAX_WORKING_NOTES = 8;
+const GATE_BREAKDOWN_LAYER_GROUPS = [
+    ["local gate", ["local", "unit", "lint", "test", "本地"]],
+    ["cloud/service gate", ["cloud", "service", "runtime", "server", "managed_service", "cloud_vm", "服务", "云端"]],
+    ["executor/operator readiness", ["executor", "operator", "worker", "browser", "provider", "adapter", "readiness", "执行器", "操控", "就绪"]],
+    ["live smoke or handoff", ["live", "smoke", "handoff", "external_provider_live", "deployed_runtime", "business_handoff_ready", "冒烟", "交接"]]
+];
 const PAGE_TASK_TERMS = ["frontend", "front-end", "browser", "page", "页面", "前端", "按钮", "表单", "跳转"];
 const PAGE_ENTRY_TERMS = ["http://", "https://", "localhost", "127.0.0.1", "page url", "页面 url", "dev server"];
 const PAGE_BROWSER_CHECK_TERMS = ["browser check", "playwright", "screenshot", "click", "button", "form", "页面可加载", "浏览器验证"];
@@ -289,6 +360,7 @@ async function validateHarness(projectRoot) {
     for (const required of [
         "AGENTS.md",
         ".docs/INDEX.md",
+        ".docs/09_runbooks",
         harnessPath(root, "config.yaml"),
         harnessPath(root, "state", "lifecycle.yaml"),
         harnessPath(root, "state", "plan.yaml"),
@@ -756,6 +828,7 @@ async function validatePlanState(projectRoot, allowOpen) {
             if (!Array.isArray(task.acceptance_criteria) || task.acceptance_criteria.length === 0) {
                 errors.push(`Open task ${task.id} must define acceptance_criteria`);
             }
+            errors.push(...validateWorkingNotesLimit(task));
             errors.push(...validateRuntimeEvidenceContract(task));
             errors.push(...testingBoundaryErrorsForAllowedPaths(task));
         }
@@ -774,8 +847,90 @@ async function validatePlanState(projectRoot, allowOpen) {
     if (currentTaskId && !tasks.some((task) => isRecord(task) && task.id === currentTaskId)) {
         errors.push(`current_task_id does not match a task: ${currentTaskId}`);
     }
+    errors.push(...(await validateResumeCapsule(projectRoot, tasksData)));
     return { taskCount: tasks.length, errors, plan: tasksData };
 }
+function validateWorkingNotesLimit(task) {
+    if (!("working_notes" in task))
+        return [];
+    const taskId = String(task.id ?? "Open task");
+    const notes = task.working_notes;
+    if (typeof notes !== "string" && !Array.isArray(notes)) {
+        return [`Open task ${taskId} working_notes must be a short string or list with at most ${MAX_WORKING_NOTES} items`];
+    }
+    const count = Array.isArray(notes) ? notes.length : notes.trim() ? 1 : 0;
+    if (count > MAX_WORKING_NOTES) {
+        return [`Open task ${taskId} working_notes must stay resume-first and contain at most ${MAX_WORKING_NOTES} items; found ${count}`];
+    }
+    return [];
+}
+async function validateResumeCapsule(projectRoot, plan) {
+    const errors = [];
+    const currentTask = currentOpenSprintTask(plan);
+    const capsule = plan.resume_capsule;
+    if (!currentTask) {
+        if (capsule !== undefined) {
+            errors.push("plan.yaml resume_capsule must only be present for the current open SPRINTING task");
+        }
+        return errors;
+    }
+    const taskId = String(currentTask.id ?? "current task");
+    const required = requiresResumeCapsule(currentTask);
+    if (!required && capsule === undefined)
+        return errors;
+    if (!isRecord(capsule)) {
+        errors.push(`${taskId} high-risk runtime task must define top-level resume_capsule`);
+        return errors;
+    }
+    for (const field of RESUME_CAPSULE_FIELDS) {
+        if (!(field in capsule)) {
+            errors.push(`${taskId} resume_capsule missing field: ${field}`);
+        }
+    }
+    const capsuleTaskId = String(capsule.task_id ?? "").trim();
+    if (capsuleTaskId !== taskId) {
+        errors.push(`${taskId} resume_capsule.task_id must match current_task_id`);
+    }
+    for (const field of ["state", "canonical_path", "next_step", "blocker", "last_passed_gate"]) {
+        const value = String(capsule[field] ?? "").trim();
+        if (!value || isPlaceholderEvidence(value)) {
+            errors.push(`${taskId} resume_capsule.${field} must contain concrete recovery information`);
+        }
+    }
+    const doNotRetry = asStringList(capsule.do_not_retry);
+    if (doNotRetry.length === 0 || doNotRetry.some((item) => isPlaceholderEvidence(item))) {
+        errors.push(`${taskId} resume_capsule.do_not_retry must list concrete paths or attempts not to repeat`);
+    }
+    const refs = asStringList(capsule.recovery_refs);
+    if (refs.length === 0) {
+        errors.push(`${taskId} resume_capsule.recovery_refs must link implementation doc and runbook/evidence documents`);
+        return errors;
+    }
+    const implementationDoc = String(currentTask.implementation_doc ?? "").trim();
+    if (implementationDoc && !refs.includes(implementationDoc)) {
+        errors.push(`${taskId} resume_capsule.recovery_refs must include current implementation_doc ${implementationDoc}`);
+    }
+    if (!refs.some((ref) => ref.startsWith(RUNBOOK_DOC_PREFIX))) {
+        errors.push(`${taskId} resume_capsule.recovery_refs must include a runbook/evidence document under ${RUNBOOK_DOC_PREFIX}`);
+    }
+    for (const ref of refs) {
+        if (!ref.startsWith(".docs/04_implementation/") && !ref.startsWith(RUNBOOK_DOC_PREFIX)) {
+            errors.push(`${taskId} resume_capsule.recovery_refs may only point to implementation docs or runbook/evidence docs: ${ref}`);
+            continue;
+        }
+        if (!(await pathExists(path.join(projectRoot, ref)))) {
+            errors.push(`${taskId} resume_capsule recovery_ref does not exist: ${ref}`);
+        }
+    }
+    return errors;
+}
+function requiresResumeCapsule(task) {
+    if (String(task.phase ?? "") !== "SPRINTING")
+        return false;
+    const evidenceLevel = isRecord(task.evidence_level) ? String(task.evidence_level.required ?? "") : "";
+    const targetKind = isRecord(task.target_runtime_environment) ? String(task.target_runtime_environment.kind ?? "") : "";
+    return RESUME_CAPSULE_REQUIRED_EVIDENCE_LEVELS.has(evidenceLevel) || RESUME_CAPSULE_REQUIRED_TARGET_KINDS.has(targetKind);
+}
 function validateRuntimeEvidenceContract(task) {
     const errors = [];
     const taskId = String(task.id ?? "Task");
@@ -1297,6 +1452,14 @@ function validateDevelopmentSelfTestReport(fullText, developmentEvidenceSection,
     if (!report) {
         return [`${taskId} implementation_doc must include Development Self-Test Report for self_test_contract: ${implementationDoc}`];
     }
+    const reportStatus = normalizeSelfTestReportStatus(evidenceFieldValue(report, "Report Status"));
+    if (!reportStatus) {
+        errors.push(`${taskId} Development Self-Test Report must include Report Status: PASS | BLOCKED | IN_PROGRESS | STALE in ${implementationDoc}`);
+    }
+    else if (reportStatus !== "PASS") {
+        errors.push(`${taskId} Development Self-Test Report Report Status is ${reportStatus}; validate-dev cannot handoff until the report status is PASS`);
+    }
+    errors.push(...validateSelfTestReportBoundary(report, taskId, implementationDoc));
     const basicSelfTest = evidenceFieldValue(developmentEvidenceSection, "Basic Self-test Evidence") ?? "";
     if (!containsAny(basicSelfTest, ["Development Self-Test Report", "开发自测报告", "self-test report"])) {
         errors.push(`${taskId} Basic Self-test Evidence must reference the Development Self-Test Report in ${implementationDoc}`);
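
Reviewer note: the hunk above gates handoff on a `Report Status` field, which is normalized by `normalizeSelfTestReportStatus` (added later in this diff). The sketch below replays that normalization in isolation. The contents of `SELF_TEST_REPORT_STATUSES` are not visible in this section, so the set here is an assumption inferred from the error message text, and `reportStatusOutcome` is a hypothetical helper added only for illustration.

```javascript
// Assumption: the real SELF_TEST_REPORT_STATUSES constant is defined outside this
// diff section; these four values are inferred from the error message wording.
const SELF_TEST_REPORT_STATUSES = new Set(["PASS", "BLOCKED", "IN_PROGRESS", "STALE"]);

// Copied from the diff: strip backticks, trim, upper-case, and fold runs of
// whitespace or hyphens into a single underscore before the set lookup.
function normalizeSelfTestReportStatus(value) {
    if (!value)
        return undefined;
    const normalized = value.replace(/`/g, "").trim().toUpperCase().replace(/[\s-]+/g, "_");
    return SELF_TEST_REPORT_STATUSES.has(normalized) ? normalized : undefined;
}

// Hypothetical helper (not in the package): mirrors the two branches of the
// new check without building the full error strings.
function reportStatusOutcome(rawValue) {
    const status = normalizeSelfTestReportStatus(rawValue);
    if (!status)
        return "missing-or-unknown";
    return status === "PASS" ? "ok" : `not-pass:${status}`;
}

console.log(reportStatusOutcome("`in progress`"));   // not-pass:IN_PROGRESS
console.log(reportStatusOutcome("PASS | BLOCKED"));  // missing-or-unknown
console.log(reportStatusOutcome(" pass "));          // ok
```

Under this assumed set, an option list left unfilled from a template (such as `PASS | BLOCKED`) normalizes to a string that is not a known status, so it is reported the same way as a missing field.
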
@@ -1318,11 +1481,26 @@ function validateDevelopmentSelfTestReport(fullText, developmentEvidenceSection,
         }
     }
     const moduleKeyTestPath = evidenceFieldValue(report, "Module Key Test Path") ?? "";
+    if (isPlaceholderSelfTestReportValue(moduleKeyTestPath) || isTemplateModuleKeyTestPath(moduleKeyTestPath)) {
+        errors.push(`${taskId} Development Self-Test Report Module Key Test Path must replace template placeholders with actual executed path evidence in ${implementationDoc}`);
+    }
     const runnableEntry = String(contract.runnable_entry ?? "").trim();
     if (runnableEntry && !moduleKeyTestPath.includes(runnableEntry)) {
         errors.push(`${taskId} Development Self-Test Report Module Key Test Path must include runnable entry ${runnableEntry} in ${implementationDoc}`);
     }
     const scenarios = Array.isArray(contract.scenarios) ? contract.scenarios.filter(isRecord) : [];
+    const exitEvidenceTerms = [
+        String(contract.observable_exit ?? "").trim(),
+        ...scenarios.flatMap((scenario) => [
+            String(scenario.expected_exit ?? "").trim(),
+            String(scenario.evidence ?? "").trim()
+        ])
+    ].filter(Boolean);
+    if (exitEvidenceTerms.length > 0
+        && !exitEvidenceTerms.some((term) => normalizedIncludes(moduleKeyTestPath, term))
+        && !containsAny(moduleKeyTestPath, SELF_TEST_OBSERVABLE_EVIDENCE_TERMS)) {
+        errors.push(`${taskId} Development Self-Test Report Module Key Test Path must include observable exit or evidence from self_test_contract in ${implementationDoc}`);
+    }
     for (const scenario of scenarios) {
         const scenarioId = String(scenario.id ?? "").trim();
         if (!scenarioId)
@@ -1332,26 +1510,191 @@ function validateDevelopmentSelfTestReport(fullText, developmentEvidenceSection,
         }
         const status = scenarioStatus(report, scenarioId);
         if (!status) {
-            errors.push(`${taskId} Development Self-Test Report must record scenario ${scenarioId} as PASS or BLOCKED in ${implementationDoc}`);
+            errors.push(`${taskId} Development Self-Test Report must record scenario ${scenarioId} as PASS, BLOCKED, IN_PROGRESS, or STALE in ${implementationDoc}`);
         }
-        else if (status === "BLOCKED") {
-            errors.push(`${taskId} Development Self-Test Report scenario ${scenarioId} is BLOCKED; keep task open or record a blocker`);
+        else if (status === "AMBIGUOUS") {
+            errors.push(`${taskId} Development Self-Test Report scenario ${scenarioId} must choose exactly one status in ${implementationDoc}`);
         }
+        else if (status !== "PASS") {
+            errors.push(`${taskId} Development Self-Test Report scenario ${scenarioId} is ${status}; validate-dev cannot handoff until every scenario is PASS`);
+        }
+        errors.push(...validateScenarioTableEvidence(report, scenarioId, taskId, implementationDoc));
+    }
+    const targetRuntime = isRecord(task.target_runtime_environment) ? task.target_runtime_environment : undefined;
+    const reportContext = `${taskText(task)}\n${report}\n${Object.values(contract).map((value) => String(value ?? "")).join("\n")}`;
+    if (String(targetRuntime?.kind ?? "") === "browser" || containsAny(reportContext, PAGE_TASK_TERMS)) {
+        const loweredReport = report.toLowerCase();
+        if (!containsAny(loweredReport, PAGE_ENTRY_TERMS)) {
+            errors.push(`${taskId} page Development Self-Test Report must include a dev server or page URL in ${implementationDoc}`);
+        }
+        if (!containsAny(loweredReport, PAGE_BROWSER_CHECK_TERMS)) {
+            errors.push(`${taskId} page Development Self-Test Report must include browser, Playwright, screenshot, or equivalent interaction evidence in ${implementationDoc}`);
+        }
+    }
+    if (requiresResumeCapsule(task)) {
+        errors.push(...validateCurrentOperatorPath(fullText, taskId, implementationDoc));
+        errors.push(...validateGateBreakdown(fullText, taskId, implementationDoc));
+    }
+    return errors;
+}
+function normalizeSelfTestReportStatus(value) {
+    if (!value)
+        return undefined;
+    const normalized = value.replace(/`/g, "").trim().toUpperCase().replace(/[\s-]+/g, "_");
+    return SELF_TEST_REPORT_STATUSES.has(normalized) ? normalized : undefined;
+}
+function validateSelfTestReportBoundary(report, taskId, implementationDoc) {
+    const errors = [];
+    for (const line of report.split(/\r?\n/)) {
+        const match = line.match(/^(#{1,6})\s+(.+)$/);
+        if (!match)
+            continue;
+        const title = match[2].trim().toLowerCase();
+        const blockedTerm = SELF_TEST_REPORT_DISALLOWED_SECTION_TERMS.find((term) => title.includes(term));
+        if (blockedTerm) {
+            errors.push(`${taskId} Development Self-Test Report must not include debug/operator/runbook/exploration log section "${match[2].trim()}" in ${implementationDoc}; link a runbook or exploration appendix instead`);
+        }
+    }
+    return errors;
+}
+function validateCurrentOperatorPath(fullText, taskId, implementationDoc) {
+    const section = markdownSection(fullText, CURRENT_OPERATOR_PATH_TERMS);
+    if (!section) {
+        return [`${taskId} high-risk runtime task must include a short Current Operator Path section in ${implementationDoc}`];
+    }
+    const errors = [];
+    const requiredFields = [
+        ["canonical operator path", ["Canonical operator path", "Canonical path"]],
+        ["runbook link", ["Operator runbook", "Runbook"]],
+        ["credential reference name", ["Credential reference", "Credential reference name"]],
+        ["command/UI channel", ["Command/UI channel", "Command channel", "UI channel"]],
+        ["do-not-retry summary", ["Do-not-retry summary", "Do not retry summary"]]
+    ];
+    for (const [label, fields] of requiredFields) {
+        const value = fields.map((field) => evidenceFieldValue(section, field)).find((candidate) => candidate && candidate.trim());
+        if (!value) {
+            errors.push(`${taskId} Current Operator Path must record ${label} in ${implementationDoc}`);
+        }
+        else if (label !== "credential reference name" && isPlaceholderEvidence(value)) {
+            errors.push(`${taskId} Current Operator Path ${label} must be concrete in ${implementationDoc}`);
+        }
+    }
+    const runbookValue = evidenceFieldValue(section, "Operator runbook") ?? evidenceFieldValue(section, "Runbook") ?? section;
+    if (!runbookValue.includes(RUNBOOK_DOC_PREFIX)) {
+        errors.push(`${taskId} Current Operator Path must link a runbook/evidence document under ${RUNBOOK_DOC_PREFIX} in ${implementationDoc}`);
+    }
+    return errors;
+}
+function validateGateBreakdown(fullText, taskId, implementationDoc) {
+    const section = markdownSection(fullText, GATE_BREAKDOWN_TERMS);
+    if (!section) {
+        return [`${taskId} high-risk runtime task Development Self-Test Report must include Gate Breakdown in ${implementationDoc}`];
+    }
+    const errors = [];
+    const lowered = section.toLowerCase();
+    for (const [label, terms] of GATE_BREAKDOWN_LAYER_GROUPS) {
+        if (!containsAny(lowered, terms)) {
+            errors.push(`${taskId} Gate Breakdown must include ${label} status/evidence in ${implementationDoc}`);
+        }
+    }
+    const rows = markdownTableRows(section).filter((cells) => !cells.some((cell) => /gate layer|layer|层级/i.test(cell)));
+    const concreteRows = rows.filter((cells) => cells.some((cell) => !isPlaceholderSelfTestReportValue(cell)));
+    if (concreteRows.length < 2) {
+        errors.push(`${taskId} Gate Breakdown must split evidence into multiple concrete gate layers in ${implementationDoc}`);
+    }
+    if (concreteRows.length <= 1 && lowered.includes("validate-dev")) {
+        errors.push(`${taskId} Gate Breakdown cannot collapse high-risk runtime progress into only validate-dev in ${implementationDoc}`);
     }
     return errors;
 }
 function scenarioStatus(text, scenarioId) {
     const escaped = scenarioId.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
-    const patterns = [
-        new RegExp("^.*" + escaped + ".*\\b(PASS|BLOCKED)\\b.*$", "im"),
-        new RegExp("\\|[^\\n|]*" + escaped + "[^\\n|]*\\|[^\\n|]*\\b(PASS|BLOCKED)\\b[^\\n|]*\\|", "i")
-    ];
-    for (const pattern of patterns) {
-        const match = text.match(pattern);
-        if (match)
-            return match[1].toUpperCase();
+    const pattern = new RegExp("^.*" + escaped + ".*$", "gim");
+    const seen = new Set();
+    for (const match of text.matchAll(pattern)) {
+        const status = selfTestLineStatus(match[0]);
+        if (status === "AMBIGUOUS")
+            return status;
+        if (status)
+            seen.add(status);
+    }
+    if (seen.size > 1)
+        return "AMBIGUOUS";
+    return [...seen][0];
+}
+function selfTestLineStatus(line) {
+    const normalized = line.toUpperCase().replace(/\bIN[\s-]+PROGRESS\b/g, "IN_PROGRESS");
+    const matches = [];
+    if (hasStatusToken(normalized, "PASS"))
+        matches.push("PASS");
+    if (hasStatusToken(normalized, "BLOCKED"))
+        matches.push("BLOCKED");
+    if (hasStatusToken(normalized, "IN_PROGRESS"))
+        matches.push("IN_PROGRESS");
+    if (hasStatusToken(normalized, "STALE"))
+        matches.push("STALE");
+    if (matches.length > 1)
+        return "AMBIGUOUS";
+    return matches[0];
+}
+function hasStatusToken(line, status) {
+    const escaped = status.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
+    return new RegExp(`(^|[^A-Z0-9_])${escaped}([^A-Z0-9_]|$)`).test(line);
+}
+function validateScenarioTableEvidence(report, scenarioId, taskId, implementationDoc) {
+    const errors = [];
+    const rows = markdownTableRows(report).filter((cells) => cells.some((cell) => normalizeCell(cell) === scenarioId));
+    for (const cells of rows) {
+        const [id, result, executedEntry, actualExit, evidence] = cells;
+        if (!id || normalizeCell(id) !== scenarioId)
+            continue;
+        const requiredCells = [
+            ["Result", result],
+            ["Executed Entry", executedEntry],
+            ["Actual Exit", actualExit],
+            ["Evidence", evidence]
+        ];
+        for (const [label, value] of requiredCells) {
+            if (!value || isPlaceholderSelfTestReportValue(value)) {
+                errors.push(`${taskId} Development Self-Test Report scenario ${scenarioId} table ${label} must contain concrete evidence in ${implementationDoc}`);
+            }
+        }
+        const tableStatus = result ? scenarioStatus(`| ${cells.join(" | ")} |`, scenarioId) : undefined;
+        if (tableStatus === "AMBIGUOUS") {
+            errors.push(`${taskId} Development Self-Test Report scenario ${scenarioId} table Result must choose exactly one status in ${implementationDoc}`);
+        }
+        else if (tableStatus && tableStatus !== "PASS") {
+            errors.push(`${taskId} Development Self-Test Report scenario ${scenarioId} table Result is ${tableStatus}; validate-dev cannot handoff until every scenario is PASS`);
+        }
     }
-    return undefined;
+    return errors;
+}
+function markdownTableRows(section) {
+    return section
+        .split(/\r?\n/)
+        .map((line) => line.trim())
+        .filter((line) => line.startsWith("|") && line.endsWith("|") && !/^\|\s*:?-{3,}:?\s*(\|\s*:?-{3,}:?\s*)+\|$/.test(line))
+        .map((line) => line.slice(1, -1).split("|").map((cell) => cell.trim()));
+}
+function normalizeCell(value) {
+    return value.replace(/`/g, "").trim();
+}
+function isTemplateModuleKeyTestPath(value) {
+    const lowered = value.toLowerCase();
+    return [
+        "local start / invocation",
+        "all self-test scenarios",
+        "all task/module promised runnable entries",
+        "actual internal key paths",
+        "observable completion evidence"
+    ].some((term) => lowered.includes(term));
+}
+function isPlaceholderSelfTestReportValue(value) {
+    const normalized = value.trim().toLowerCase();
+    return isPlaceholderEvidence(value) || SELF_TEST_REPORT_PLACEHOLDER_TERMS.some((term) => normalized.includes(term));
+}
+function normalizedIncludes(text, needle) {
+    return text.toLowerCase().includes(needle.toLowerCase());
 }
 function hasConcreteDevelopmentEvidenceFields(section) {
     return ["Evidence Level", "Target Runtime Environment", "Runnable Entry", "Observable Exit", "Client / Server Initialization", "Config Contract"].some((field) => {
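
Reviewer note: the rewritten `scenarioStatus` no longer returns the first `PASS`/`BLOCKED` match. It now scans every line that mentions the scenario ID and reports `AMBIGUOUS` when statuses conflict. The three functions below are copied from the hunk above so the behavior can be tried standalone; the sample report lines and the scenario ID `S1` are made up for illustration.

```javascript
// Copied from the diff: a status counts only when it is not glued to other
// identifier characters (so "BYPASSED" or "PASS_THROUGH" do not count as PASS).
function hasStatusToken(line, status) {
    const escaped = status.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    return new RegExp(`(^|[^A-Z0-9_])${escaped}([^A-Z0-9_]|$)`).test(line);
}

// Copied from the diff: returns the single status found on a line,
// "AMBIGUOUS" when more than one is present, or undefined when none is.
function selfTestLineStatus(line) {
    const normalized = line.toUpperCase().replace(/\bIN[\s-]+PROGRESS\b/g, "IN_PROGRESS");
    const matches = [];
    for (const status of ["PASS", "BLOCKED", "IN_PROGRESS", "STALE"]) {
        if (hasStatusToken(normalized, status))
            matches.push(status);
    }
    if (matches.length > 1)
        return "AMBIGUOUS";
    return matches[0];
}

// Copied from the diff: collect statuses across every line mentioning the
// scenario ID; conflicting lines also yield "AMBIGUOUS".
function scenarioStatus(text, scenarioId) {
    const escaped = scenarioId.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    const pattern = new RegExp("^.*" + escaped + ".*$", "gim");
    const seen = new Set();
    for (const match of text.matchAll(pattern)) {
        const status = selfTestLineStatus(match[0]);
        if (status === "AMBIGUOUS")
            return status;
        if (status)
            seen.add(status);
    }
    if (seen.size > 1)
        return "AMBIGUOUS";
    return [...seen][0];
}

// Illustrative inputs (hypothetical report lines, not taken from the package).
console.log(scenarioStatus("| S1 | PASS | npm test | exit 0 | log.txt |", "S1")); // PASS
console.log(scenarioStatus("| S1 | PASS / BLOCKED | | | |", "S1"));               // AMBIGUOUS
console.log(scenarioStatus("S1: PASS\nS1 retry: BLOCKED", "S1"));                 // AMBIGUOUS
console.log(scenarioStatus("S1 bypassed the cache", "S1"));                       // undefined
console.log(scenarioStatus("S1 is in progress", "S1"));                           // IN_PROGRESS
```

The only change from the diff is that the four `if (hasStatusToken(...))` checks are folded into a loop, which keeps the same order and result. Note that the scenario ID is matched as a plain substring, so a line mentioning a longer ID that contains `S1` would also be scanned.
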

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "agent-project-sdlc",
-  "version": "0.1.20",
+  "version": "0.1.22",
   "description": "CLI and canonical assets for the AI SDLC Harness workflow.",
   "type": "module",
   "bin": {