agent-project-sdlc 0.1.17 → 0.1.18

package/README.md CHANGED
@@ -84,9 +84,9 @@ Before development starts, `ARCHITECTING` can return to `REQUIREMENT_GATHERING`
 
  `validate-design` treats semantic slicing as a hard gate. Generated `overview.md` files do not count as deliverables, development draft tasks in `plan.draft.yaml` must reference existing tech plan slices through `docs.tech_plan`, multiple development draft tasks need distinct primary tech plan slices, and explicit AI provider/copilot, external-system, or compliance/permission/audit themes require dedicated architecture slices.
 
- SPRINTING Definition of Done includes runnable entry/exit boundaries. API, CLI, server route, service, agent, runtime, adapter, worker, provider, config-contract and fixture/live boundaries promised by a technical plan or task must be implemented or marked `BLOCKED` during development. The current task implementation doc must also include `Development Evidence` with `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Basic Self-test Evidence`, or a justified `Not applicable`. Provider smoke, fixture smoke, fake adapters and one-shot smoke prove only local links; they do not by themselves prove application readiness. REVIEWING treats missing entry/exit, initialization, config contract or development evidence as blocking, and TESTING only exercises entrypoints that Review has confirmed as `PASS`; it must not add product runtime, bootstrap, provider adapter, deploy code or package runtime scripts.
+ SPRINTING Definition of Done includes runnable entry/exit boundaries. API, CLI, server route, service, agent, runtime, adapter, worker, provider, config-contract and fixture/live boundaries promised by a technical plan or task must be implemented or marked `BLOCKED` during development. Runtime/app/provider/live tasks must declare `evidence_level.required` and `target_runtime_environment` in `plan.yaml`; `deployed_runtime` cannot be closed by `unit`, `local_runtime`, `external_provider_live`, provider smoke, fake adapters or localhost smoke alone, and `business_handoff_ready` requires a Testing Handoff Contract. The current task implementation doc must also include `Development Evidence` with `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries`, `Basic Self-test Evidence`, or a justified `Not applicable`. Provider smoke, fixture smoke, fake adapters and one-shot smoke prove only local links; they do not by themselves prove application readiness. REVIEWING treats missing entry/exit, initialization, config contract, target runtime, evidence level or development evidence as blocking, and TESTING only exercises entrypoints that Review has confirmed as `PASS`; it must not add product runtime, bootstrap, provider adapter, deploy code or package runtime scripts.
 
- `make validate-dev` and `npx sdlc-harness validate-dev` are in-development SPRINTING gates. They allow the current `current_task_id` open task to remain in `plan.yaml` while checking that it is a valid `phase: "SPRINTING"` task with `docs`, `allowed_paths`, `required_gates`, `acceptance_criteria`, `implementation_doc`, scoped dirty files, an empty `plan.draft.yaml` queue, linked runnable-entry implementation docs and structured development evidence. Page tasks need a dev server or page URL plus browser/Playwright/screenshot/equivalent interaction evidence; API/CLI/worker/service/agent/runtime tasks need a startup or invocation command, endpoint/health/status, and observable response/output/side effect. `make validate-current` and `/advance` are phase-exit gates; before moving to REVIEWING, the implementation commit and completion ledger must be done and no open task may remain.
+ `make validate-dev` and `npx sdlc-harness validate-dev` are in-development SPRINTING gates. They allow the current `current_task_id` open task to remain in `plan.yaml` while checking that it is a valid `phase: "SPRINTING"` task with `docs`, `allowed_paths`, `required_gates`, `acceptance_criteria`, `implementation_doc`, scoped dirty files, an empty `plan.draft.yaml` queue, runtime evidence task contract, linked runnable-entry implementation docs and structured development evidence. Page tasks need a dev server or page URL plus browser/Playwright/screenshot/equivalent interaction evidence; API/CLI/worker/service/agent/runtime tasks need a startup or invocation command, endpoint/health/status, and observable response/output/side effect. `make validate-current` and `/advance` are phase-exit gates; before moving to REVIEWING, the implementation commit and completion ledger must be done and no open task may remain.
 
  `validate-test` keeps its command name as the TESTING phase gate. `.docs/07_test/TEST_STRATEGY.md` describes scope, environment, priority and execution strategy; `.docs/07_test/TEST_CASES.md` describes cases bound to real runnable entry/exit; `.docs/07_test/TEST_REPORT.md` only records executed TESTING evidence, test matrix, regression evidence, runnable entry/exit coverage, coverage gaps and final decision. `validate-test` only accepts `TEST_REPORT.md`; it no longer treats `TEST_PLAN.md` as a report fallback.
 
@@ -105,9 +105,9 @@ The Agent reads `<harnessRoot>/state/lifecycle.yaml` and `<harnessRoot>/state/pl
 
  `validate-design` treats semantic slicing in the architecture phase as a hard gate: `overview.md` does not count toward deliverables, and every development draft task in `plan.draft.yaml` must point to an existing tech plan slice through `docs.tech_plan`; multiple development draft tasks need different primary tech plan slices by default. When a PRD, tech plan or draft task explicitly raises cross-cutting themes such as AI provider / copilot, external system boundaries, or compliance / permission / audit, a corresponding dedicated architecture slice is also required.
 
- SPRINTING's Definition of Done includes runnable entry/exit: the API, CLI, server route, service, agent, runtime, adapter, worker, provider, config contract and fixture/live boundaries promised by the technical plan or task must be implemented during development or explicitly marked `BLOCKED`. The current task's implementation doc must also include `Development Evidence`, containing `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Basic Self-test Evidence`, or a `Not applicable` with a reason. Provider smoke, fixture smoke, fake adapter or one-shot smoke can only prove local links and cannot by themselves prove application readiness. REVIEWING treats missing entry/exit, initialization, config contract or development self-test evidence as blocking; TESTING only calls existing entrypoints that Review has confirmed as `PASS` to verify input and output, and must not add product runtime, bootstrap, provider adapter, deploy or package runtime script.
+ SPRINTING's Definition of Done includes runnable entry/exit: the API, CLI, server route, service, agent, runtime, adapter, worker, provider, config contract and fixture/live boundaries promised by the technical plan or task must be implemented during development or explicitly marked `BLOCKED`. Runtime/app/provider/live tasks must declare `evidence_level.required` and `target_runtime_environment` in `plan.yaml`; `deployed_runtime` cannot be closed by `unit`, `local_runtime`, `external_provider_live`, provider smoke, fake adapter or localhost smoke alone, and `business_handoff_ready` additionally requires a Testing Handoff Contract. The current task's implementation doc must also include `Development Evidence`, containing `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries`, `Basic Self-test Evidence`, or a `Not applicable` with a reason. Provider smoke, fixture smoke, fake adapter or one-shot smoke can only prove local links and cannot by themselves prove application readiness. REVIEWING treats missing entry/exit, initialization, config contract, target runtime environment, evidence level or development self-test evidence as blocking; TESTING only calls existing entrypoints that Review has confirmed as `PASS` to verify input and output, and must not add product runtime, bootstrap, provider adapter, deploy or package runtime script.
 
- `make validate-dev` / `npx sdlc-harness validate-dev` is the in-development SPRINTING gate: the open task that `current_task_id` points to may remain in `plan.yaml`, and the validator checks that it is a valid `phase: "SPRINTING"` task, that it has `docs`, `allowed_paths`, `required_gates`, `acceptance_criteria` and `implementation_doc`, and validates dirty files, `plan.draft.yaml`, the implementation doc and structured `Development Evidence`. Page evidence needs a dev server/page URL and a browser check; API/CLI/worker/service/agent/runtime evidence needs a startup/invocation command, endpoint/health/status and response/output/side effect. `make validate-current` / `/advance` is the phase-exit gate; before entering REVIEWING, the implementation commit and completion ledger must still be completed first and the open task removed from `plan.yaml`.
+ `make validate-dev` / `npx sdlc-harness validate-dev` is the in-development SPRINTING gate: the open task that `current_task_id` points to may remain in `plan.yaml`, and the validator checks that it is a valid `phase: "SPRINTING"` task, that it has `docs`, `allowed_paths`, `required_gates`, `acceptance_criteria` and `implementation_doc`, and validates dirty files, `plan.draft.yaml`, the runtime evidence task contract, the implementation doc and structured `Development Evidence`. Page evidence needs a dev server/page URL and a browser check; API/CLI/worker/service/agent/runtime evidence needs a startup/invocation command, endpoint/health/status and response/output/side effect. `make validate-current` / `/advance` is the phase-exit gate; before entering REVIEWING, the implementation commit and completion ledger must still be completed first and the open task removed from `plan.yaml`.
 
  `validate-test` is still the name of the TESTING phase gate. `.docs/07_test/TEST_STRATEGY.md` describes test scope, environment, priority and execution strategy; `.docs/07_test/TEST_CASES.md` describes test cases bound to real runnable entry/exit; `.docs/07_test/TEST_REPORT.md` records only the test matrix, regression evidence, runnable entry/exit coverage, coverage gaps and final decision from what was actually executed in the TESTING phase. `validate-test` accepts only `TEST_REPORT.md` and does not treat `TEST_PLAN.md` as a report fallback.
 
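@@ -53,6 +53,8 @@ ADRs address the problem of “later readers see only the result, not the trade-offs made at the time”
  - `overview.md` is a generated artifact; it does not count as an architecture / tech plan deliverable and cannot be referenced as `docs.tech_plan`.
  - If a PRD, tech plan or draft task explicitly raises cross-cutting themes such as AI provider / AI copilot, external system boundaries, or compliance / permission / audit, each should have its own dedicated architecture slice; do not cram multiple cross-cutting architecture concerns into a single overview document.
  - If the implementation plan changes existing module boundaries, update the relevant architecture slice instead of only adding a sentence to the task description.
+ - Whenever the technical plan or a draft task involves a service, agent, runtime, worker, frontend app, provider/live integration or an externally runnable boundary, the task breakdown must include last-mile runtime initialization and testing handoff deliverables: target runtime environment, startup/deployment or preview method, health/readiness, smoke input and output, log/error evidence, and a test-callable entry and exit.
+ - Such development draft tasks must include `evidence_level.required` and `target_runtime_environment`. `evidence_level.required` may only be `unit`, `local_runtime`, `external_provider_live`, `deployed_runtime` or `business_handoff_ready`; `target_runtime_environment.kind` may only be `local`, `ci`, `staging`, `cloud_vm`, `managed_service`, `browser`, `worker` or `not_applicable`.
  - If the user explicitly asks to split an existing complete technical plan file into multiple `.docs/03_tech_plan/` slices, first confirm that the replacement slices cover the interface contracts, data models, module designs, task groups and gates in the original file that are still valid; once slicing is complete, the `plan.draft.yaml` references and `.docs/INDEX.md` are updated, and `overview.md` is refreshed, delete the superseded complete tech plan file so that the same fact is not kept in both the complete file and the slices.
  - After every addition, split, merge or retirement of a slice, update `.docs/INDEX.md`.
 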
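@@ -88,7 +90,7 @@ ADRs address the problem of “later readers see only the result, not the trade-offs made at the time”
  - [ ] The semantic slice boundaries of the architecture / tech plan / ADR have been determined.
  - [ ] Every development draft task in `plan.draft.yaml` is bound to its tech plan slice through `docs.tech_plan`.
  - [ ] If the user asked to split a complete technical plan into tech plan slices, the superseded complete tech plan file has been deleted and the `plan.draft.yaml` references have been synced.
- - [ ] Task draft fields are complete and the scope is clear.
+ - [ ] Task draft fields are complete and the scope is clear; runtime/app/provider/live tasks declare `evidence_level` and `target_runtime_environment`.
  - [ ] `.docs/INDEX.md` links the new artifacts.
  - [ ] `make docs-overview` has been run to refresh `.docs/<stage>/overview.md`.
  - [ ] `make validate-design` is ready to pass.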
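The rule earlier in this file restricts `evidence_level.required` and `target_runtime_environment.kind` to fixed value sets. A minimal sketch of such a check follows; the helper name `checkRuntimeContractValues` and both sample tasks are hypothetical, and the package's own validator may report problems differently.

```javascript
// Allowed values, copied from the rule text in this diff.
const ALLOWED_EVIDENCE_LEVELS = new Set([
    "unit", "local_runtime", "external_provider_live", "deployed_runtime", "business_handoff_ready"
]);
const ALLOWED_RUNTIME_KINDS = new Set([
    "local", "ci", "staging", "cloud_vm", "managed_service", "browser", "worker", "not_applicable"
]);

// Hypothetical helper: returns human-readable problems for one draft task.
function checkRuntimeContractValues(task) {
    const problems = [];
    const required = task.evidence_level?.required;
    const kind = task.target_runtime_environment?.kind;
    if (!ALLOWED_EVIDENCE_LEVELS.has(required)) {
        problems.push(`invalid evidence_level.required: ${String(required)}`);
    }
    if (!ALLOWED_RUNTIME_KINDS.has(kind)) {
        problems.push(`invalid target_runtime_environment.kind: ${String(kind)}`);
    }
    return problems;
}

// Illustrative draft tasks (not taken from the package).
const goodDraft = {
    evidence_level: { required: "deployed_runtime" },
    target_runtime_environment: { kind: "staging" }
};
const badDraft = {
    evidence_level: { required: "integration" }, // not in the allowed set
    target_runtime_environment: { kind: "staging" }
};

console.log(checkRuntimeContractValues(goodDraft));
console.log(checkRuntimeContractValues(badDraft));
```

Using `Set` membership keeps the check independent of ordering; a missing field is reported the same way as an unknown value.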
@@ -15,7 +15,7 @@ description: Use during SPRINTING to execute one task from plan.yaml, respecting
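 
  Before coding, first confirm that the current open task is complete, that the allowed modification scope covers the necessary files, and that the acceptance criteria can be verified by tests or gates. If the task boundary, product behavior or technical plan is unclear, stop and describe the blocker, offer possible interpretations and a recommended next step, rather than widening the scope and continuing to write.
 
- The development-phase Definition of Done includes runnable system entry/exit. Wherever the technical plan or task promises an API, CLI, server route, service, agent, runtime, adapter, worker, provider, external send/write executor, config contract or live/fixture dual-mode boundary, the current implementation must provide the corresponding entry, invocation method, initialization method, output/side-effect boundary and verification method; if the real entry/exit is not yet runnable, the task cannot be treated as done, and the gap cannot be left for TESTING to fill in the runtime. The implementation doc must state `Runnable Entry/Exit` and record `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract` and `Basic Self-test Evidence` in `Development Evidence`; when something truly does not apply, explicitly write `Not applicable` with the specific reason. Provider smoke, fixture smoke, fake adapter or one-shot smoke can only prove local links and cannot by themselves prove `Application readiness`. In that case, keep or create a `BLOCKED`/follow-up dev task, or handle the boundary change through RFC/ARCHITECTING.
+ The development-phase Definition of Done includes runnable system entry/exit. Wherever the technical plan or task promises an API, CLI, server route, service, agent, runtime, adapter, worker, provider, external send/write executor, config contract or live/fixture dual-mode boundary, the current implementation must provide the corresponding entry, invocation method, initialization method, output/side-effect boundary and verification method; if the real entry/exit is not yet runnable, the task cannot be treated as done, and the gap cannot be left for TESTING to fill in the runtime. Runtime/app/provider/live tasks must declare `evidence_level.required` and `target_runtime_environment` in `plan.yaml` and deliver according to that contract: `deployed_runtime` cannot be closed by `unit`, `local_runtime`, `external_provider_live`, provider smoke, fake adapter or localhost smoke alone; `business_handoff_ready` must provide a Testing Handoff Contract. The implementation doc must state `Runnable Entry/Exit` and record `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries` and `Basic Self-test Evidence` in `Development Evidence`; when something truly does not apply, explicitly write `Not applicable` with the specific reason. Provider smoke, fixture smoke, fake adapter or one-shot smoke can only prove local links and cannot by themselves prove `Application readiness`. In that case, keep or create a `BLOCKED`/follow-up dev task, or handle the boundary change through RFC/ARCHITECTING.
 
  Page tasks must, during development, start a dev server or an equivalent preview entry and use a browser, Playwright, screenshots or an equivalent method to verify that the page loads, the main entry is reachable, core buttons/forms/navigation work, and there are no obvious errors or blank pages. API/CLI/worker/RPA/service/agent/runtime tasks must record the actual startup or invocation command, endpoint, worker command, dry-run/live preflight, health/status or server action, as well as the observable response, queue item, audit log, file artifact, send result, error code or PASS/BLOCKED result.
 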
@@ -68,7 +68,7 @@ description: Use during SPRINTING to execute one task from plan.yaml, respecting
  Every open task must contain a complete execution contract in `plan.yaml`:
 
  1. `current_task_id` points to the open task being executed.
- 2. The open task directly declares `phase: "SPRINTING"`, `docs`, `allowed_paths`, `required_gates`, `acceptance_criteria` and `implementation_doc`.
+ 2. The open task directly declares `phase: "SPRINTING"`, `docs`, `allowed_paths`, `required_gates`, `acceptance_criteria` and `implementation_doc`; runtime/app/provider/live tasks must also declare `evidence_level` and `target_runtime_environment`.
  3. If the open task was promoted from `plan.draft.yaml.tasks[]`, creating the formal `TASK-*` and deleting the source draft must happen in the same state update; the recovery context of the formal task is kept only in `plan.yaml`.
  4. During task execution, keep only the brief `working_notes` needed for recovery.
  5. After the gates, implementation doc, `.docs/INDEX.md` and `overview.md` are complete, create the task implementation commit while the current task is still in `plan.yaml`.
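To make the execution contract in item 2 concrete, the sketch below reports which of the listed fields an open task is missing. The helper name `missingOpenTaskFields`, the sample task and its path values are hypothetical; this is not the package's `validate-dev` implementation.

```javascript
// Fields that item 2 says an open SPRINTING task declares directly.
const REQUIRED_OPEN_TASK_FIELDS = [
    "docs", "allowed_paths", "required_gates", "acceptance_criteria", "implementation_doc"
];

// Hypothetical helper: lists contract problems for one open task.
function missingOpenTaskFields(task) {
    const problems = [];
    if (task.phase !== "SPRINTING") {
        problems.push('phase must be "SPRINTING"');
    }
    for (const field of REQUIRED_OPEN_TASK_FIELDS) {
        const value = task[field];
        const isEmpty = value === undefined || value === null || value === "" ||
            (Array.isArray(value) && value.length === 0);
        if (isEmpty) {
            problems.push(`missing ${field}`);
        }
    }
    return problems;
}

// Illustrative task with two gaps (all values are made up).
const sampleTask = {
    id: "TASK-001",
    phase: "SPRINTING",
    docs: { tech_plan: ".docs/03_tech_plan/example-slice.md" },
    allowed_paths: ["src/"],
    required_gates: ["make validate-dev"],
    acceptance_criteria: [],
    implementation_doc: ""
};

console.log(missingOpenTaskFields(sampleTask));
```

Empty arrays and empty strings count as missing here, on the assumption that a declared but empty field does not satisfy the contract.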
@@ -105,7 +105,8 @@ The execution log of a done task is not kept long-term in the current `plan.yaml`, nor is it the default
  - [ ] The open task contains a complete execution contract in `plan.yaml`.
  - [ ] The current task is still a single, clear unit of execution.
  - [ ] The API/CLI/adapter/worker/provider, config contract, output/side effects and fixture/live boundaries promised by the technical plan or task are runnable and written into the implementation doc, or have been explicitly marked `BLOCKED`/deferred to a follow-up dev task.
- - [ ] The implementation doc `Development Evidence` records `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Basic Self-test Evidence`, or states `Not applicable` with a reason.
+ - [ ] The implementation doc `Development Evidence` records `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries`, `Basic Self-test Evidence`, or states `Not applicable` with a reason.
+ - [ ] If the task requires `business_handoff_ready`, the implementation doc includes a Testing Handoff Contract covering entry, config, initialization/health, input sample, expected exit, cleanup method and evidence level.
  - [ ] If the current task came from `plan.draft.yaml.tasks[]`, the source draft was deleted from the draft list at promotion.
  - [ ] The implementation doc has been created or updated and reflects the real code of the relevant modules.
  - [ ] If `parallel_execution` is enabled, worker owned paths, forbidden paths, required gates and the main Agent's integration result have been recorded.
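The Testing Handoff Contract item in the checklist above names seven things the contract covers. One possible completeness check is sketched below; the field labels, the placeholder list, the helper name `incompleteHandoffFields` and every sample value are assumptions for illustration, not the package's behavior.

```javascript
// Labels assumed to correspond to the seven contract items in the checklist.
const HANDOFF_FIELDS = [
    "Entry",
    "Config",
    "Initialization / health",
    "Input sample",
    "Expected exit / observable side effect",
    "Cleanup / reset / idempotency",
    "Evidence Level"
];
// Values treated as "not filled in yet" (an assumption for this sketch).
const PLACEHOLDER_VALUES = ["pending", "tbd"];

// Hypothetical helper: returns the labels that are empty or still placeholders.
function incompleteHandoffFields(contract) {
    return HANDOFF_FIELDS.filter((field) => {
        const value = String(contract[field] ?? "").trim().toLowerCase();
        return value === "" || PLACEHOLDER_VALUES.includes(value);
    });
}

// Illustrative contract (all values are made up).
const sampleContract = {
    "Entry": "npm run worker",
    "Config": "WORKER_QUEUE_URL in .env",
    "Initialization / health": "TBD",
    "Input sample": "fixtures/sample-message.json",
    "Expected exit / observable side effect": "one audit log line per message",
    "Evidence Level": "business_handoff_ready"
};

console.log(incompleteHandoffFields(sampleContract));
```

A task that requires `business_handoff_ready` would stay open until this list is empty.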
@@ -17,7 +17,7 @@ description: Use after development gates pass to update module-level implementat
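 
  The document should help later readers quickly understand: what the current implementation of a module or core data flow is, what the key objects/functions are responsible for, how behavior flows from input to output, what the tests cover, and what remains uncovered. The task id serves only as provenance, not as the default slicing granularity.
 
- If a module contains or promises a runnable system boundary, the implementation doc must record runnable entry/exit: the invocation method, initialization method, config contract, input source, output or side effects and fixture/live mode boundary of the API/CLI/server route/service/agent/runtime/adapter/worker/provider, as well as which real external executors are not yet implemented. It must also record in `Development Evidence` the `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract` and `Basic Self-test Evidence` actually verified during development; when there truly is no application entry, `Not applicable` must state the reason clearly. Do not describe an entry that will only be implemented in the future as a current fact, and do not present provider smoke, fixture smoke, fake adapter or one-shot smoke alone as application readiness
+ If a module contains or promises a runnable system boundary, the implementation doc must record runnable entry/exit: the invocation method, initialization method, config contract, input source, output or side effects and fixture/live mode boundary of the API/CLI/server route/service/agent/runtime/adapter/worker/provider, as well as which real external executors are not yet implemented. It must also record in `Development Evidence` the `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries` and `Basic Self-test Evidence` actually verified during development; when there truly is no application entry, `Not applicable` must state the reason clearly. Do not describe an entry that will only be implemented in the future as a current fact, and do not present provider smoke, fixture smoke, fake adapter or one-shot smoke alone as application readiness. If the task requires `business_handoff_ready`, a Testing Handoff Contract must also be written, covering entry, config, initialization/health, input sample, expected exit, cleanup/reset/idempotency notes and evidence level.
 
  ## Input
 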
@@ -48,7 +48,7 @@ description: Use after development gates pass to update module-level implementat
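  2. Every recorded file should explain its role in the module or data flow and its key functions/objects.
  3. Deviations from the technical plan must be recorded explicitly, even when the deviation is reasonable.
  4. Runnable entry/exit, config contract and fixture/live boundaries must record current facts; missing items go into `Not covered` or the plan deviations.
- 5. `Development Evidence` must include the actually callable entry, observable exit, initialization method, config contract and development self-test evidence; page tasks record the dev server/page URL and browser check, and API/CLI/worker/RPA/service/agent/runtime tasks record the startup/invocation command, endpoint/health/status and response/output/side effect.
+ 5. `Development Evidence` must include the evidence level and target runtime environment required by the task contract, the actually callable entry, observable exit, initialization method, config contract, testing handoff status, missing runtime boundaries and development self-test evidence; page tasks record the dev server/page URL and browser check, and API/CLI/worker/RPA/service/agent/runtime tasks record the startup/invocation command, endpoint/health/status and response/output/side effect.
  6. Test coverage must list specific tests or explicitly record coverage gaps.
  7. Keep document granularity at the module, subsystem or core data flow level; do not create a document per task by default, and do not write a giant project-wide encyclopedia.
 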
@@ -59,7 +59,8 @@ description: Use after development gates pass to update module-level implementat
  - [ ] The real code structure table has been filled in.
  - [ ] The core data flow has been explained.
  - [ ] Runnable entry/exit, config contract and fixture/live boundaries have been recorded, or missing items have been clearly marked.
- - [ ] `Development Evidence` records `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Basic Self-test Evidence`, or `Not applicable` with a reason.
+ - [ ] `Development Evidence` records `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries`, `Basic Self-test Evidence`, or `Not applicable` with a reason.
+ - [ ] `business_handoff_ready` tasks have a recorded Testing Handoff Contract.
  - [ ] The semantic slice boundaries of the implementation doc have been determined.
  - [ ] Plan deviations and test coverage have been recorded.
  - [ ] `.docs/INDEX.md` links the implementation doc.
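@@ -17,7 +17,7 @@ During review, first build the evidence chain: what the PRD says, what the technical plan promises, implem
 
  Do not dress up personal preference as a blocker. Distinguish blocking issues, follow-up improvements and open questions. If no problems are found, say so explicitly and list the remaining test gaps or residual risks.
 
- Review must treat “the current module has no runnable entry/exit” as a blocking item, not an ordinary test gap. Wherever the PRD, technical plan or implementation doc promises an API, CLI, server route, service, agent, runtime, adapter, worker, provider, external send/write executor, config contract or live/fixture dual-mode boundary, Review must read the technical plan's `Development Deliverable Contract` or equivalent delivery boundary and check whether the real code and implementation docs provide a callable entry, initialization method, output/side-effect boundary and verification method; the implementation doc must also contain structured `Development Evidence` describing `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract` and `Basic Self-test Evidence`, or `Not applicable` with a reason. When these are missing, the gate decision should be `BLOCKED`, requiring a return to SPRINTING/RFC rather than allowing the runtime to be filled in after entering TESTING. The Review report must write out a `PASS`/`BLOCKED` checklist for `Runnable Entry`, `Observable Exit`, `Initialization`, `Config Contract` and `Testing Handoff Readiness`; any `BLOCKED` item prevents entry into TESTING. Review does not create formal test artifacts under `.docs/07_test/**`; if the existing test sources of truth still link to old-route evidence superseded by an RFC, list that as a blocker before entering TESTING and require the RFC to clean up or update the index.
+ Review must treat “the current module has no runnable entry/exit” as a blocking item, not an ordinary test gap. Wherever the PRD, technical plan or implementation doc promises an API, CLI, server route, service, agent, runtime, adapter, worker, provider, external send/write executor, config contract or live/fixture dual-mode boundary, Review must read the technical plan's `Development Deliverable Contract` or equivalent delivery boundary and check whether the real code and implementation docs provide a callable entry, initialization method, output/side-effect boundary and verification method; if the task declares `evidence_level.required` and `target_runtime_environment`, Review must also check whether the actual evidence level, execution location and target runtime environment match. The implementation doc must also contain structured `Development Evidence` describing `Evidence Level`, `Target Runtime Environment`, `Runnable Entry`, `Observable Exit`, `Client / Server Initialization`, `Config Contract`, `Testing Handoff Readiness`, `Known Missing Runtime Boundaries` and `Basic Self-test Evidence`, or `Not applicable` with a reason. If the task requires `deployed_runtime` or `business_handoff_ready` but the evidence is only the development machine's `localhost`, provider live smoke, fixture smoke, fake adapter or a documentation description, the decision should be `BLOCKED`. When these are missing, the gate decision should be `BLOCKED`, requiring a return to SPRINTING/RFC rather than allowing the runtime to be filled in after entering TESTING. The Review report must write out a `PASS`/`BLOCKED` checklist for `Runnable Entry`, `Observable Exit`, `Initialization`, `Config Contract` and `Testing Handoff Readiness`; any `BLOCKED` item prevents entry into TESTING. Review does not create formal test artifacts under `.docs/07_test/**`; if the existing test sources of truth still link to old-route evidence superseded by an RFC, list that as a blocker before entering TESTING and require the RFC to clean up or update the index.
 
  Review output is itself a workflow task. Before starting a review, first create or select a sufficiently small `TASK-*` open task in `<harnessRoot>/state/plan.yaml` and set `phase: "REVIEWING"`; the current round produces only one review batch, one risk-theme slice or one PR review conclusion. Do not cover multiple unrelated review themes in one task.
 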
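@@ -78,7 +78,8 @@ The review phase is governed by `plan.yaml`:
  - [ ] Requirement consistency has been assessed.
  - [ ] Architecture and maintainability risks have been assessed.
  - [ ] Whether runnable entry/exit, config contract and fixture/live boundaries are sufficient to enter TESTING has been assessed.
- - [ ] Whether the implementation doc contains Runnable Entry, Observable Exit, Client / Server Initialization, Config Contract and Basic Self-test Evidence has been assessed.
+ - [ ] Whether the implementation doc contains Evidence Level, Target Runtime Environment, Runnable Entry, Observable Exit, Client / Server Initialization, Config Contract, Testing Handoff Readiness, Known Missing Runtime Boundaries and Basic Self-test Evidence has been assessed.
+ - [ ] Whether the evidence level and execution location match the target runtime environment promised by the task / technical plan has been checked.
  - [ ] The scope and risk-theme boundaries of the review slice have been determined.
  - [ ] Test gaps have been listed.
  - [ ] `make docs-overview` has been run to refresh `.docs/<stage>/overview.md`.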
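The review rule in this file asks for a `PASS`/`BLOCKED` checklist and says any `BLOCKED` item prevents entry into TESTING. A minimal sketch of that decision is shown below. The line format `- <Field>: PASS`, the helper name `readinessDecision` and the sample report are assumptions; the package's own report parser is not shown in this diff.

```javascript
// Checklist fields named by the review rule.
const READINESS_FIELDS = [
    "Runnable Entry",
    "Observable Exit",
    "Initialization",
    "Config Contract",
    "Testing Handoff Readiness"
];

// Hypothetical helper: a field counts as passed only when a line such as
// "- Runnable Entry: PASS" is present; missing or BLOCKED fields block.
function readinessDecision(reportText) {
    const blocking = [];
    for (const field of READINESS_FIELDS) {
        const pattern = new RegExp(`^\\s*-?\\s*${field}\\s*:\\s*(PASS|BLOCKED)\\b`, "mi");
        const match = reportText.match(pattern);
        if (!match || match[1].toUpperCase() !== "PASS") {
            blocking.push(field);
        }
    }
    return { canEnterTesting: blocking.length === 0, blocking };
}

// Illustrative report excerpt (made up for this example).
const sampleReport = [
    "- Runnable Entry: PASS",
    "- Observable Exit: PASS",
    "- Initialization: PASS",
    "- Config Contract: BLOCKED",
    "- Testing Handoff Readiness: PASS"
].join("\n");

console.log(readinessDecision(sampleReport));
```

Treating a missing field the same as `BLOCKED` follows the rule that missing evidence leads to a `BLOCKED` gate decision.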
@@ -17,7 +17,7 @@ description: Use during TESTING to produce a test matrix, run regression, and do
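 
  When running regression, prefer gates that can prove the phase exit. When tests cannot run, the environment is missing or data is unavailable, do not declare a pass; if TESTING has already been entered, record `BLOCKED`, the checks completed and the recovery conditions in `TEST_REPORT.md`.
 
- TESTING may only call entrypoints that SPRINTING/REVIEWING has confirmed as `PASS` to verify input/output. Tests, fixtures, mocks, assertion helpers and test documentation may be added, but TESTING must not add or maintain long-term any product runtime, server/API/CLI/adapter, direct poller, cloud bootstrap, systemd unit, real provider adapter, package runtime script or deployment script. If the real entry/exit does not exist, the implementation doc lacks `Development Evidence`, live mode is not callable, the config contract is missing, the Review readiness checklist is not all `PASS`, or the user's goal does not match the implemented channel, record `BLOCKED`, produce an RFC or follow-up dev task suggestion, and stop expanding the testing phase into development/integration setup. When development has not yet delivered a testable entry/exit, do not generate formal test cases or formal reports in `.docs/07_test/**` ahead of time; acceptance ideas should stay in the PRD acceptance criteria, the tech plan verification strategy or draft notes outside `.docs/07_test/**`. `TEST_REPORT.md` must not give `PASS` while describing missing entry/exit, missing Development Evidence or an undelivered application entry.
+ TESTING may only call entrypoints that SPRINTING/REVIEWING has confirmed as `PASS` to verify input/output. Tests, fixtures, mocks, assertion helpers and test documentation may be added, but TESTING must not add or maintain long-term any product runtime, server/API/CLI/adapter, direct poller, cloud bootstrap, systemd unit, real provider adapter, package runtime script or deployment script. If the real entry/exit does not exist, the implementation doc lacks `Development Evidence`, live mode is not callable, the config contract is missing, the Review readiness checklist is not all `PASS`, or `Evidence Level` / `Target Runtime Environment` does not match what the task or technical plan promises, record `BLOCKED`, produce an RFC or follow-up dev task suggestion, and stop expanding the testing phase into development/integration setup. When development has not yet delivered a testable entry/exit, target runtime environment or Testing Handoff Contract, do not generate formal test cases or formal reports in `.docs/07_test/**` ahead of time; acceptance ideas should stay in the PRD acceptance criteria, the tech plan verification strategy or draft notes outside `.docs/07_test/**`. `TEST_REPORT.md` must not give `PASS` while describing missing entry/exit, missing Development Evidence, a mismatched evidence level or an undelivered application entry.
 
  Test design and regression evidence output are themselves workflow tasks. Before testing, first create or select a sufficiently small `TASK-*` open task in `<harnessRoot>/state/plan.yaml` and set `phase: "TESTING"`; the current round produces only one test strategy slice, test case slice, regression batch, risk verification area or one set of scoped test changes. `plan.yaml` remains the single source of truth for the execution plan; `.docs/07_test/**` records only the current plan's test strategy, test cases, executed regression evidence, coverage gaps and final decision, does not express “how to develop next”, and does not keep old test results superseded by an RFC.
 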
@@ -67,7 +67,7 @@ TESTING may only call entrypoints confirmed `PASS` by SPRINTING/REVIEWING for input/out
 
  ## Rules
 
- 1. Test cases must trace back to PRD acceptance criteria or Review findings and bind to the runnable entry/exit and Development Evidence confirmed by SPRINTING/REVIEWING
+ 1. Test cases must trace back to PRD acceptance criteria or Review findings and bind to the runnable entry/exit, Development Evidence, Evidence Level, Target Runtime Environment and Testing Handoff Contract confirmed by SPRINTING/REVIEWING
  2. Add boundary, negative, regression and integration tests according to risk.
  3. If coverage is intentionally deferred, the risk and follow-up must be recorded.
  4. Do not add product runtime, server/API/CLI/adapter, poller, cloud bootstrap, systemd unit, real provider adapter, package runtime script or deployment script; these belong to SPRINTING/RFC.
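@@ -86,7 +86,7 @@ TESTING may only call entrypoints confirmed `PASS` by SPRINTING/REVIEWING for input/out
  - [ ] The current task has been removed from `plan.yaml`, or kept as a recoverable open task because of an interruption/blocker.
  - [ ] The regression checklist has been completed.
  - [ ] Tests call only existing runnable entry/exit; no product runtime, bootstrap, provider adapter, deploy or package runtime script was added in TESTING.
- - [ ] The Development Evidence in the implementation doc has been checked, and tests are designed only against delivered entrypoints.
+ - [ ] The Development Evidence, Evidence Level, Target Runtime Environment and Testing Handoff Contract in the implementation doc have been checked, and tests are designed only against delivered entrypoints.
  - [ ] The semantic slice boundaries of the test report / test matrix have been determined.
  - [ ] Test plans, test cases or to-be-filled content have not been written as `TEST_REPORT.md`.
  - [ ] `.docs/07_test/**` has been confirmed to contain only test facts still valid for the current plan.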
@@ -42,14 +42,28 @@ Input
 
  ## 6. Development Evidence (development self-test evidence)
 
+ - Evidence Level:
+ - Target Runtime Environment:
  - Runnable Entry:
  - Observable Exit:
  - Client / Server Initialization:
  - Config Contract:
+ - Testing Handoff Readiness:
+ - Known Missing Runtime Boundaries:
  - Basic Self-test Evidence:
  - Not applicable:
 
- ## 7. Key Implementation Logic
+ ## 7. Testing Handoff Contract
+
+ - Entry:
+ - Config:
+ - Initialization / health:
+ - Input sample:
+ - Expected exit / observable side effect:
+ - Cleanup / reset / idempotency:
+ - Evidence Level:
+
+ ## 8. Key Implementation Logic
 
  - Input validation:
  - Core branches:
@@ -57,22 +71,22 @@ Input
  - Boundary fallback:
  - Performance or concurrency notes:
 
- ## 8. Deviations from the Technical Plan
+ ## 9. Deviations from the Technical Plan
 
  -
 
- ## 9. Test Coverage
+ ## 10. Test Coverage
 
  | Test | Coverage | Result |
  |---|---|---|
  | | | |
 
- ## 10. Change Log
+ ## 11. Change Log
 
  | Date | Task ID | Commit | Summary |
  |---|---|---|---|
  | | | | |
 
- ## 11. Notes for Future Maintenance
+ ## 12. Notes for Future Maintenance
 
  -
@@ -53,6 +53,16 @@ tasks:
        - "make docs-overview"
      acceptance_criteria:
        - "Acceptance criteria are written inside the open task and deleted from the current plan on completion."
+     # Required for SPRINTING tasks that deliver a service, agent, runtime,
+     # worker, frontend app, provider/live integration, or other runnable boundary.
+     evidence_level:
+       required: "unit | local_runtime | external_provider_live | deployed_runtime | business_handoff_ready"
+       supporting:
+         - "unit"
+     target_runtime_environment:
+       kind: "local | ci | staging | cloud_vm | managed_service | browser | worker | not_applicable"
+       required_for_done: true
+       handoff_entrypoint: "URL / CLI / worker command / server action"
      working_notes:
        - "Execution-context notes are kept only in the open task."
      result_docs:
@@ -32,6 +32,9 @@
  - Config contract:
  - Fixture/live boundary:
  - Development Evidence:
+ - Evidence Level:
+ - Target Runtime Environment:
+ - Testing Handoff Contract:
  - Blocking gaps before TESTING:
 
  ## 7. Application Readiness Checklist
@@ -26,6 +26,9 @@
  - Expected exits / side effects:
  - Development Evidence used:
  - Application readiness decision source:
+ - Evidence Level used:
+ - Target runtime under test:
+ - Testing Handoff Contract used:
  - Config contract used:
  - Fixture/live boundary:
  - Missing entry/exit blocker:
@@ -10,6 +10,9 @@ const TASK_PHASES = new Set(["REQUIREMENT_GATHERING", "ARCHITECTING", "SPRINTING
  const PARALLEL_ALLOWED_PHASES = new Set(["REQUIREMENT_GATHERING", "SPRINTING", "TESTING"]);
  const TASK_STATUSES = new Set(["pending", "in_progress", "done", "blocked", "pending_revision", "cancelled"]);
  const OPEN_TASK_STATUSES = new Set(["pending", "in_progress", "blocked", "pending_revision"]);
+ const EVIDENCE_LEVELS = new Set(["unit", "local_runtime", "external_provider_live", "deployed_runtime", "business_handoff_ready"]);
+ const EVIDENCE_LEVEL_ORDER = ["unit", "local_runtime", "external_provider_live", "deployed_runtime", "business_handoff_ready"];
+ const TARGET_RUNTIME_KINDS = new Set(["local", "ci", "staging", "cloud_vm", "managed_service", "browser", "worker", "not_applicable"]);
  const DESIGN_CATEGORIES = [
      {
          label: "AI copilot/provider",
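The hunk above adds `EVIDENCE_LEVEL_ORDER` next to the `EVIDENCE_LEVELS` set, which suggests the levels are compared by position. How the package actually uses the array is not visible here, so the sketch below is only one plausible reading; the helper names `levelRank` and `satisfiesRequired` are hypothetical.

```javascript
// Same ordering as the constant added in the hunk above.
const LEVEL_ORDER = ["unit", "local_runtime", "external_provider_live", "deployed_runtime", "business_handoff_ready"];

// Position of a level in the ordering, or -1 when the level is unknown.
function levelRank(level) {
    return LEVEL_ORDER.indexOf(level);
}

// Assumed rule: evidence satisfies a requirement only when its rank is at
// least the required rank; unknown levels never satisfy anything.
function satisfiesRequired(achieved, required) {
    const achievedRank = levelRank(achieved);
    const requiredRank = levelRank(required);
    if (achievedRank === -1 || requiredRank === -1) {
        return false;
    }
    return achievedRank >= requiredRank;
}

console.log(satisfiesRequired("local_runtime", "deployed_runtime"));
console.log(satisfiesRequired("business_handoff_ready", "deployed_runtime"));
```

Under this reading, `local_runtime` evidence cannot close a `deployed_runtime` task, which is consistent with the README rule that lower-level evidence alone is not enough.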
@@ -84,6 +87,7 @@ const RUNNABLE_ENTRY_EXIT_TERMS = [
      "not applicable"
  ];
  const DEVELOPMENT_EVIDENCE_TERMS = ["development evidence", "开发自测证据"];
+ const TESTING_HANDOFF_TERMS = ["testing handoff contract", "测试交接合同"];
  const EVIDENCE_PLACEHOLDER_TERMS = [
      "pending",
      "tbd",
@@ -95,7 +99,7 @@ const EVIDENCE_PLACEHOLDER_TERMS = [
  ];
  const PAGE_TASK_TERMS = ["frontend", "front-end", "browser", "page", "页面", "前端", "按钮", "表单", "跳转"];
  const PAGE_ENTRY_TERMS = ["http://", "https://", "localhost", "127.0.0.1", "page url", "页面 url", "dev server"];
- const PAGE_BROWSER_CHECK_TERMS = ["browser check", "playwright", "screenshot", "click", "button", "form", "页面可加载", "浏览器"];
+ const PAGE_BROWSER_CHECK_TERMS = ["browser check", "playwright", "screenshot", "click", "button", "form", "页面可加载", "浏览器验证"];
  const CALLABLE_TASK_TERMS = [
      "api",
      "endpoint",
@@ -163,6 +167,16 @@ const INSUFFICIENT_APPLICATION_SMOKE_TERMS = [
      "domain smoke",
      "受控 smoke"
  ];
+ const LOWER_LEVEL_EVIDENCE_TERMS = [
+     ...INSUFFICIENT_APPLICATION_SMOKE_TERMS,
+     "unit",
+     "unit test",
+     "local_runtime",
+     "local runtime",
+     "localhost",
+     "external_provider_live",
+     "external provider live"
+ ];
  const MISSING_READINESS_TERMS = [
      "missing entry",
      "missing exit",
@@ -180,6 +194,22 @@ const MISSING_READINESS_TERMS = [
      "未交付",
      "不存在"
  ];
+ const RUNTIME_MISMATCH_TERMS = [
+     ...MISSING_READINESS_TERMS,
+     "not deployed",
+     "not initialized",
+     "not connected",
+     "local only",
+     "localhost only",
+     "fake adapter",
+     "fake send",
+     "未部署",
+     "未初始化",
+     "未接入",
+     "只在本地",
+     "仅本地",
+     "本地跑通"
+ ];
  const REVIEW_READINESS_FIELDS = [
      "Runnable Entry",
      "Observable Exit",
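Term lists such as `LOWER_LEVEL_EVIDENCE_TERMS` and `RUNTIME_MISMATCH_TERMS` are passed to a `containsAny` helper whose body does not appear in this diff. Assuming it is a plain substring match on lowercased text, short terms like `"unit"` can match inside unrelated words. The sketch below illustrates that assumption and a word-boundary alternative; both functions are hypothetical stand-ins, not the package's code.

```javascript
// Assumed behavior of a containsAny-style helper: plain substring match
// against text that the caller has already lowercased.
function containsAnyTerm(text, terms) {
    return terms.some((term) => text.includes(term));
}

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(term) {
    return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Stricter variant using \b word boundaries. This only suits ASCII terms;
// \b does not behave usefully around CJK text such as "未部署".
function containsWholeWord(text, terms) {
    return terms.some((term) => new RegExp(`\\b${escapeRegExp(term)}\\b`).test(text));
}

const sentence = "posted in the community channel";
console.log(containsAnyTerm(sentence, ["unit"]));   // substring hit inside "community"
console.log(containsWholeWord(sentence, ["unit"])); // no whole-word hit
```

Whether such false positives matter depends on how the validator combines these lists with other checks, which this diff does not show.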
@@ -395,6 +425,7 @@ function validateDraftTaskShape(task, index, errors) {
          if (!Array.isArray(task.acceptance_criteria) || task.acceptance_criteria.length === 0) {
              errors.push(`${String(task.id ?? prefix)} must define acceptance_criteria`);
          }
+         errors.push(...validateRuntimeEvidenceContract(task));
      }
      else {
          for (const field of ["docs", "allowed_paths", "required_gates", "acceptance_criteria", "working_notes", "gate_result", "result_docs"]) {
@@ -500,6 +531,7 @@ async function validateReview(projectRoot) {
          errors.push("Review report must assess runnable entry/exit readiness before TESTING");
      }
      errors.push(...validateReviewReadinessChecklist(rawText));
+     errors.push(...validateRuntimeHandoffReport(rawText, "Review report"));
      if (!containsAny(text, ["pass", "blocked", "通过", "阻塞"]))
          errors.push("Review report must include PASS/BLOCKED decision");
      return { info: ["validate-review checked review report"], errors };
@@ -528,6 +560,7 @@ async function validateTest(projectRoot) {
  if (!containsAny(text, ["pass", "blocked", "通过", "阻塞"]))
  errors.push("Test report must include PASS/BLOCKED decision");
  errors.push(...validateTestReadinessDecision(text));
+ errors.push(...validateRuntimeHandoffReport(report?.text ?? "", "Test report"));
  if (lifecycle.current_phase === "TESTING") {
  errors.push(...testingBoundaryErrorsForChangedFiles(await changedFiles(projectRoot)));
  }
@@ -657,6 +690,7 @@ async function validatePlanState(projectRoot, allowOpen) {
  if (!Array.isArray(task.acceptance_criteria) || task.acceptance_criteria.length === 0) {
  errors.push(`Open task ${task.id} must define acceptance_criteria`);
  }
+ errors.push(...validateRuntimeEvidenceContract(task));
  errors.push(...testingBoundaryErrorsForAllowedPaths(task));
  }
  else {
@@ -676,6 +710,67 @@ async function validatePlanState(projectRoot, allowOpen) {
  }
  return { taskCount: tasks.length, errors, plan: tasksData };
  }
+ function validateRuntimeEvidenceContract(task) {
+ const errors = [];
+ const taskId = String(task.id ?? "Task");
+ if (String(task.phase ?? "") !== "SPRINTING")
+ return errors;
+ const context = taskText(task).toLowerCase();
+ const needsRuntimeContract = containsAny(context, [...APPLICATION_READINESS_TASK_TERMS, ...PAGE_TASK_TERMS]);
+ const evidenceLevel = task.evidence_level;
+ const targetRuntime = task.target_runtime_environment;
+ if (needsRuntimeContract && !isRecord(evidenceLevel)) {
+ errors.push(`${taskId} runtime/app task must define evidence_level.required`);
+ }
+ if (needsRuntimeContract && !isRecord(targetRuntime)) {
+ errors.push(`${taskId} runtime/app task must define target_runtime_environment`);
+ }
+ if (evidenceLevel !== undefined) {
+ if (!isRecord(evidenceLevel)) {
+ errors.push(`${taskId} evidence_level must be a mapping`);
+ }
+ else {
+ const required = String(evidenceLevel.required ?? "");
+ if (!EVIDENCE_LEVELS.has(required)) {
+ errors.push(`${taskId} evidence_level.required must be one of ${[...EVIDENCE_LEVELS].join(", ")}`);
+ }
+ if ("supporting" in evidenceLevel && !Array.isArray(evidenceLevel.supporting)) {
+ errors.push(`${taskId} evidence_level.supporting must be a list when present`);
+ }
+ if (Array.isArray(evidenceLevel.supporting)) {
+ for (const level of evidenceLevel.supporting) {
+ if (!EVIDENCE_LEVELS.has(String(level))) {
+ errors.push(`${taskId} evidence_level.supporting contains invalid level: ${String(level)}`);
+ }
+ }
+ }
+ }
+ }
+ if (targetRuntime !== undefined) {
+ if (!isRecord(targetRuntime)) {
+ errors.push(`${taskId} target_runtime_environment must be a mapping`);
+ }
+ else {
+ const kind = String(targetRuntime.kind ?? "");
+ if (!TARGET_RUNTIME_KINDS.has(kind)) {
+ errors.push(`${taskId} target_runtime_environment.kind must be one of ${[...TARGET_RUNTIME_KINDS].join(", ")}`);
+ }
+ if (typeof targetRuntime.required_for_done !== "boolean") {
+ errors.push(`${taskId} target_runtime_environment.required_for_done must be a boolean`);
+ }
+ if (targetRuntime.required_for_done === true) {
+ const entrypoint = String(targetRuntime.handoff_entrypoint ?? "").trim();
+ if (!entrypoint) {
+ errors.push(`${taskId} target_runtime_environment.handoff_entrypoint is required when required_for_done is true`);
+ }
+ if (["cloud_vm", "staging", "managed_service"].includes(kind) && /localhost|127\.0\.0\.1/.test(entrypoint.toLowerCase())) {
+ errors.push(`${taskId} target runtime ${kind} cannot use localhost as final handoff_entrypoint`);
+ }
+ }
+ }
+ }
+ return errors;
+ }
  function validateParallelExecutionContract(plan, currentPhase, errors) {
  const contract = plan.parallel_execution;
  if (contract === undefined || contract === null)
@@ -874,15 +969,25 @@ function validateDevelopmentEvidenceText(text, task, implementationDoc) {
  if (!section) {
  return [`${taskId} implementation_doc must include Development Evidence with Runnable Entry, Observable Exit, and Basic Self-test Evidence: ${implementationDoc}`];
  }
- if (hasJustifiedNotApplicableEvidence(section))
+ if (hasJustifiedNotApplicableEvidence(section) && !hasConcreteDevelopmentEvidenceFields(section)) {
+ const required = isRecord(task.evidence_level) ? String(task.evidence_level.required ?? "") : "";
+ const targetRuntime = isRecord(task.target_runtime_environment) ? task.target_runtime_environment : undefined;
+ if (required && required !== "unit") {
+ return [`${taskId} Development Evidence cannot be Not applicable when evidence_level.required is ${required} in ${implementationDoc}`];
+ }
+ if (targetRuntime && targetRuntime.required_for_done === true && String(targetRuntime.kind ?? "") !== "not_applicable") {
+ return [`${taskId} Development Evidence cannot be Not applicable when target_runtime_environment.required_for_done is true in ${implementationDoc}`];
+ }
  return [];
- for (const field of ["Runnable Entry", "Observable Exit", "Client / Server Initialization", "Config Contract", "Basic Self-test Evidence"]) {
+ }
+ for (const field of ["Evidence Level", "Target Runtime Environment", "Runnable Entry", "Observable Exit", "Client / Server Initialization", "Config Contract", "Testing Handoff Readiness", "Known Missing Runtime Boundaries", "Basic Self-test Evidence"]) {
  const value = evidenceFieldValue(section, field);
  if (!value || isPlaceholderEvidence(value)) {
  errors.push(`${taskId} Development Evidence ${field} must contain concrete, executed evidence in ${implementationDoc}`);
  }
  }
- const context = `${taskText(task)}\n${text}`.toLowerCase();
+ const runnableSection = markdownSection(text, RUNNABLE_ENTRY_EXIT_TERMS) ?? "";
+ const context = `${taskText(task)}\n${section}\n${runnableSection}`.toLowerCase();
  const loweredSection = section.toLowerCase();
  if (containsAny(context, PAGE_TASK_TERMS)) {
  if (!containsAny(loweredSection, PAGE_ENTRY_TERMS)) {
@@ -911,8 +1016,92 @@ function validateDevelopmentEvidenceText(text, task, implementationDoc) {
  errors.push(`${taskId} provider, fixture, fake-adapter, or one-shot smoke is not enough for application readiness; record application readiness evidence or BLOCKED in ${implementationDoc}`);
  }
  }
+ errors.push(...validateEvidenceLevelAgainstContract(section, text, task, implementationDoc));
+ return errors;
+ }
+ function hasConcreteDevelopmentEvidenceFields(section) {
+ return ["Evidence Level", "Target Runtime Environment", "Runnable Entry", "Observable Exit", "Client / Server Initialization", "Config Contract"].some((field) => {
+ const value = evidenceFieldValue(section, field);
+ return Boolean(value && !isPlaceholderEvidence(value));
+ });
+ }
+ function validateEvidenceLevelAgainstContract(section, fullText, task, implementationDoc) {
+ const errors = [];
+ const taskId = String(task.id ?? "current task");
+ const evidenceLevel = isRecord(task.evidence_level) ? task.evidence_level : undefined;
+ const targetRuntime = isRecord(task.target_runtime_environment) ? task.target_runtime_environment : undefined;
+ const required = String(evidenceLevel?.required ?? "");
+ const actualLevelText = evidenceFieldValue(section, "Evidence Level") ?? section;
+ const actualLevel = evidenceLevelFromText(actualLevelText);
+ const loweredText = `${section}\n${fullText}`.toLowerCase();
+ if (required && EVIDENCE_LEVELS.has(required)) {
+ if (!actualLevel) {
+ errors.push(`${taskId} Development Evidence Evidence Level must state the actual evidence level for required ${required} in ${implementationDoc}`);
+ }
+ else if (evidenceLevelRank(actualLevel) < evidenceLevelRank(required)) {
+ errors.push(`${taskId} Development Evidence level ${actualLevel} is lower than required ${required} in ${implementationDoc}`);
+ }
+ if (["deployed_runtime", "business_handoff_ready"].includes(required)) {
+ const supportOnly = containsAny(loweredText, LOWER_LEVEL_EVIDENCE_TERMS) && !loweredText.includes(required);
+ if (supportOnly) {
+ errors.push(`${taskId} lower-level smoke cannot close required ${required}; record target runtime handoff evidence or BLOCKED in ${implementationDoc}`);
+ }
+ }
+ }
+ if (targetRuntime) {
+ const kind = String(targetRuntime.kind ?? "");
+ const targetText = evidenceFieldValue(section, "Target Runtime Environment") ?? section;
+ if (kind && TARGET_RUNTIME_KINDS.has(kind) && !targetText.toLowerCase().includes(kind.replace("_", " ")) && !targetText.toLowerCase().includes(kind)) {
+ errors.push(`${taskId} Development Evidence Target Runtime Environment must match task contract kind ${kind} in ${implementationDoc}`);
+ }
+ if (targetRuntime.required_for_done === true && String(targetRuntime.handoff_entrypoint ?? "").trim()) {
+ const entrypoint = String(targetRuntime.handoff_entrypoint).trim();
+ if (!fullText.includes(entrypoint)) {
+ errors.push(`${taskId} implementation_doc must record handoff_entrypoint ${entrypoint} from target_runtime_environment in ${implementationDoc}`);
+ }
+ }
+ }
+ if (required === "business_handoff_ready") {
+ errors.push(...validateTestingHandoffContract(fullText, task, implementationDoc));
+ }
+ return errors;
+ }
+ function validateTestingHandoffContract(text, task, implementationDoc) {
+ const taskId = String(task.id ?? "current task");
+ const section = markdownSection(text, TESTING_HANDOFF_TERMS);
+ if (!section) {
+ return [`${taskId} required business_handoff_ready but implementation_doc is missing Testing Handoff Contract: ${implementationDoc}`];
+ }
+ const lowered = section.toLowerCase();
+ const requiredGroups = [
+ ["entrypoint", ["entry", "entrypoint", "url", "command", "入口"]],
+ ["config", ["config", "env", "secret", "配置", "环境变量"]],
+ ["initialization", ["initialization", "startup", "start", "health", "初始化", "启动"]],
+ ["input sample", ["input sample", "request body", "fixture", "message", "输入样例", "请求"]],
+ ["observable exit", ["observable exit", "expected exit", "response", "queue", "audit", "file", "发送", "出口"]],
+ ["cleanup", ["cleanup", "shutdown", "reset", "idempotent", "清理", "关闭", "重置", "幂等"]],
+ ["evidence level", ["business_handoff_ready", "evidence level", "证据等级"]]
+ ];
+ const errors = [];
+ for (const [label, terms] of requiredGroups) {
+ if (!containsAny(lowered, terms)) {
+ errors.push(`${taskId} Testing Handoff Contract must include ${label} in ${implementationDoc}`);
+ }
+ }
+ const targetRuntime = isRecord(task.target_runtime_environment) ? task.target_runtime_environment : undefined;
+ const entrypoint = String(targetRuntime?.handoff_entrypoint ?? "").trim();
+ if (entrypoint && !section.includes(entrypoint)) {
+ errors.push(`${taskId} Testing Handoff Contract must include handoff_entrypoint ${entrypoint} in ${implementationDoc}`);
+ }
  return errors;
  }
+ function evidenceLevelFromText(text) {
+ const lowered = text.toLowerCase().replace(/-/g, "_").replace(/\s+/g, "_");
+ return EVIDENCE_LEVEL_ORDER.find((level) => lowered.includes(level));
+ }
+ function evidenceLevelRank(level) {
+ return EVIDENCE_LEVEL_ORDER.indexOf(level);
+ }
  function validateReviewReadinessChecklist(text) {
  const errors = [];
  for (const field of REVIEW_READINESS_FIELDS) {
@@ -948,6 +1137,25 @@ function validateTestReadinessDecision(text) {
  }
  return [];
  }
+ function validateRuntimeHandoffReport(text, label) {
+ const decision = finalDecision(text);
+ if (decision !== "PASS")
+ return [];
+ const lowered = text.toLowerCase();
+ const errors = [];
+ if (containsAny(lowered, RUNTIME_MISMATCH_TERMS)) {
+ errors.push(`${label} cannot PASS while target runtime or handoff evidence is missing or lower-level only`);
+ }
+ if (containsAny(lowered, ["deployed_runtime", "business_handoff_ready", "target runtime", "target_runtime_environment", "evidence level"])) {
+ if (!containsAny(lowered, ["evidence level", "evidence_level", "证据等级"])) {
+ errors.push(`${label} PASS must state evidence level when runtime handoff is in scope`);
+ }
+ if (!containsAny(lowered, ["target runtime", "target_runtime_environment", "运行环境"])) {
+ errors.push(`${label} PASS must state target runtime environment when runtime handoff is in scope`);
+ }
+ }
+ return errors;
+ }
  function finalDecision(text) {
  const match = text.match(/(?:decision|final decision|结论)\s*:?\s*`?(PASS|BLOCKED)`?/i);
  return match ? match[1].toUpperCase() : undefined;
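The evidence-level comparison added in this file can be tried in isolation. The sketch below copies the bodies of `evidenceLevelFromText` and `evidenceLevelRank` from the hunk above; the contents and ordering of `EVIDENCE_LEVEL_ORDER` are an assumption (the constant is defined outside the lines shown, and the order here simply follows the sequence in which the README names the levels). Under that assumption, a document that mentions several levels resolves to the lowest-ranked one it contains, because `find` walks the list from the start.

```javascript
// ASSUMPTION: the real EVIDENCE_LEVEL_ORDER is not visible in this diff.
// The order below (lowest to highest) is a guess based on the README wording.
const EVIDENCE_LEVEL_ORDER = [
    "unit",
    "local_runtime",
    "external_provider_live",
    "deployed_runtime",
    "business_handoff_ready"
];

// Same body as the added helper: hyphens and whitespace become underscores,
// then the first level in rank order that occurs as a substring is returned.
function evidenceLevelFromText(text) {
    const lowered = text.toLowerCase().replace(/-/g, "_").replace(/\s+/g, "_");
    return EVIDENCE_LEVEL_ORDER.find((level) => lowered.includes(level));
}

// Same body as the added helper: rank is the position in the ordered list.
function evidenceLevelRank(level) {
    return EVIDENCE_LEVEL_ORDER.indexOf(level);
}

const stated = evidenceLevelFromText("Deployed Runtime");
console.log(stated); // deployed_runtime

// A stated level below the required one is what triggers the "is lower than required" error.
console.log(evidenceLevelRank(stated) < evidenceLevelRank("business_handoff_ready")); // true

// Mentioning a lower level in the same field makes the lower level win.
console.log(evidenceLevelFromText("deployed-runtime, with unit tests as support")); // unit
```

If the assumed ordering holds, an `Evidence Level` field is probably best kept to a single level name, since any lower-level term in the same value would be picked up first.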
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "agent-project-sdlc",
- "version": "0.1.17",
+ "version": "0.1.18",
  "description": "CLI and canonical assets for the AI SDLC Harness workflow.",
  "type": "module",
  "bin": {