cc-devflow 4.5.1 → 4.5.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (36)

package/.claude/skills/cc-check/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: cc-check
-version: 1.8.2
+version: 1.8.4
 description: Use when a planned or investigated change needs fresh verification evidence, layered gate proof, review truth, and an honest pass fail blocked verdict before entering cc-act.
 triggers:
   - 验收这个需求
@@ -25,7 +25,7 @@ entry_gate:
   - Re-run fresh commands instead of inheriting cc-do narration.
   - If evidence is stale or missing, reset context and rebuild the verdict from canonical artifacts.
 exit_criteria:
-  - review/report-card.json records pass, fail, or blocked using fresh evidence, plus spec alignment and sync readiness.
+  - review/report-card.json records pass, fail, or blocked using fresh evidence, review freshness, claim evidence, QA coverage and browser evidence, failure ownership, plus spec alignment and sync readiness.
   - Task-level review and requirement-level diff review are separated clearly.
   - 'The next step is unambiguous: cc-act, cc-do, cc-investigate, or cc-plan.'
 reroutes:
@@ -119,6 +119,8 @@ NO PASS WITHOUT FRESH EVIDENCE
    - runtime gate
    - task-level review proof
    - requirement-level diff review
+   - claim evidence matrix
+   - QA regression / test-quality proof
    - spec alignment / sync readiness
 4. **Freeze Verdict**
    - Only `pass` / `fail` / `blocked` are allowed
@@ -131,12 +133,12 @@ NO PASS WITHOUT FRESH EVIDENCE
 - Allowed actions: rerun gates, inspect review proof, record a verdict, and route the requirement honestly.
 - Forbidden actions: continuing development, inheriting old execution claims without fresh proof, or masking blocked work as pass.
-- Required evidence: every passing statement must cite fresh command output, exit status, and key observation.
+- Required evidence: every passing statement must cite fresh command output, exit status, key observation, and the claim it proves.
 - Reroute rule: code and review fixes return to `cc-do`; root-cause drift returns to `cc-investigate`; scope or design invalidation returns to `cc-plan`.
 ## Verification Layers
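-`cc-check` does not just ask "are the tests green?"; it checks at least 4 layers:
+`cc-check` does not just ask "are the tests green?"; it checks at least 9 layers:
 1. **Runtime Layer**
    - Tests, lint, typecheck, build, script gates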
@@ -147,9 +149,61 @@ NO PASS WITHOUT FRESH EVIDENCE
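    - Whether the current change actually delivers the requirement, rather than merely making local tests pass
 4. **Spec Sync Layer**
    - Whether capability truth, expected spec delta, and handoff readiness still agree
+5. **Claim Evidence Layer**
+   - Whether each claim (tests pass, build succeeds, bug fixed, requirement complete, agent finished) has its own matching evidence
+6. **QA Test Layer**
+   - Whether regression tests have red/green evidence
+   - Whether tests verify real behavior rather than mocks or test-only production APIs
+7. **Review Freshness Layer**
+   - Whether the review is bound to the current `headSha`
+   - Whether new commits have landed between the review and the current HEAD
+   - Whether the quality score, confidence, and finding de-noising can be replayed
+8. **QA Coverage / Browser Layer**
+   - Whether behavior chains, error states, and boundary conditions are covered by the test mapping
+   - Whether UI / user-path changes have browser evidence, screenshots, console results, or an explicit skip reason
+9. **Failure Ownership Layer**
+   - Whether the failure was introduced by this branch, already existed on the baseline, is an environment blocker, or has unclear ownership
+   - Unclear ownership cannot support `pass` by default
 If any layer does not hold up, `pass` must not be recorded.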
+## Claim Evidence Matrix
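+Do not report every green result as "tests passed". `cc-check` must break each claim down into evidence:
+| Claim | Required proof | Not enough |
+| --- | --- | --- |
+| Tests pass | This run's test command, exit 0, 0 failures | Old output, partial logs, "it should pass" |
+| Lint clean | This run's lint command, 0 errors | Running only the formatter; checking only touched files while claiming the whole repo is clean |
+| Build succeeds | build command exit 0 | test / lint passing |
+| Bug fixed | Original symptom check or regression test passes | Code was changed; fix is assumed |
+| Regression test works | red -> green evidence | Test was only seen green once |
+| Agent completed | VCS diff / artifact proving an actual change | Agent self-reports success |
+| Requirements met | Item-by-item plan / manifest checklist | Tests passing |
+These facts are written to `claimEvidence[]`. When evidence for a key claim is missing, the verdict is at least `blocked`.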
+## QA Test Review
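+`cc-check` must distinguish "tests exist" from "tests prove correct behavior":
+1. Regression tests must record red/green evidence; the red run must fail because the target behavior is missing, not because of a syntax, fixture, or mock mistake.
+2. Tests should verify real behavior; if they rely on mocks, state the mock boundary and why the test is not testing the mock itself.
+3. A new test-only API in production code is a code smell by default and must be blocking, unless there is a clear production-lifecycle reason.
+4. When complex mock setup outweighs the test body, ask first for an integration / contract test, or an explanation of why one is not used.
+These facts are written to `qa.regressionProof` and `qa.testQuality`. If this requirement leaves no room for behavior tests, record a `tddException` or an alternative verification command.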
+## QA Coverage And Browser Evidence
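+Testing is not a numbers game. `cc-check` must determine which real paths the tests cover:
+1. `qa.coverageAudit` records `coveragePct`, `pathMap`, `gaps`, `testsAdded`, `e2eRequired`, `evalRequired`, and `qualityStars`.
+2. When UI, routes, end-to-end user paths, visible states, or interaction states change, `qa.browserEvidence` must be recorded.
+3. `qa.browserEvidence` must state at least `mode`, `affectedRoutes`, `screenshots`, `consoleErrors`, `healthScore`, `issues`, and `skipReason`.
+4. A frontend change with neither browser evidence nor a skip reason cannot be recorded as `pass`.
+5. Non-frontend or purely internal changes may set `browserEvidence.status` to `skipped`, but must state why browser QA is not needed.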
 ## Diff Review Pipeline
 `cc-check`'s requirement-level review cannot just say "diff reviewed". It must establish at least these facts:
@@ -161,9 +215,56 @@ NO PASS WITHOUT FRESH EVIDENCE
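 5. Outside-diff lookup: when adding enum values, states, routes, or artifact types, search sibling references; do not read only the files inside the diff.
 6. Documentation staleness: when code behavior, entry points, commands, or structure change, check whether README / CLAUDE / architecture docs have drifted.
 7. Adversarial synthesis: if there is a codex review, subagent review, or human review, deduplicate findings across perspectives and mark high-confidence overlaps.
+8. Specialist facets: record review facets such as `testing`, `security`, `performance`, `api-contract`, `data-migration`, and `design` according to actual risk; even when none is dispatched, write a skip reason so reviewers do not assume the facet was covered.
+9. Confidence calibration: every finding must carry a comparable confidence and a fingerprint; low-confidence findings must not masquerade as blockers.
 These facts are written to `review.diffReview.details` or `review.findings`. `pass` holds only when scope, completion, critical pass, and doc staleness all have no blocking finding.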
+## Review Packet And Triage
+Every task-level or requirement-level review must be replayable without the chat history:
+1. `reviewPacket.baseSha`
+2. `reviewPacket.headSha`
+3. `reviewPacket.requirements`
+4. `reviewPacket.implemented`
+5. `reviewPacket.reviewerContext`
+Every review must also record freshness:
+1. `review.freshness.status`: `fresh` / `stale` / `unknown` / `not-applicable`
+2. `review.freshness.reviewedCommit`
+3. `review.freshness.currentCommit`
+4. `review.freshness.commitsSinceReview`
+5. `review.freshness.staleReason`
+6. `review.qualityScore`: 0-10; when missing, the review cannot be treated as high-confidence
+Every finding must have a triage status:
+- `accepted-fixed`: fixed and verified
+- `rejected-with-evidence`: shown not applicable by code / tests
+- `deferred-minor`: non-blocking, recorded as a follow-up
+- `clarification-needed`: unclear; the current verdict cannot be `pass`
+A `critical` / `important` finding that is untriaged or unresolved cannot enter `cc-act`.
+Every finding must also carry de-noising fields:
+- `confidenceScore`: 1-10; a finding below 7 can only be a warning or a gap pending verification
+- `fingerprint`: stable dedup key, so multiple review paths do not report the same issue repeatedly
+- `displayTier`: `blocking` / `warning` / `info` / `suppressed`
+- `suppressionReason`: may be non-empty only when `displayTier=suppressed`
+## Failure Ownership
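+A failure cannot just be recorded as "tests went red". `cc-check` must write failure ownership to `runtime.failureOwnership[]`:
+1. `classification` must be one of `in-branch`, `pre-existing`, `environment`, `ambiguous`.
+2. `ambiguous` is treated as `in-branch` by default, unless there is re-verification evidence from the base branch.
+3. `pre-existing` requires base-branch or historical evidence; guessing is not allowed.
+4. `environment` must record the missing dependency, permission, service, secret, or platform constraint.
+5. `pass` cannot carry unexplained `in-branch` or `ambiguous` failures.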
 ## Entry Gate
 1. Read `planning/design.md` or `planning/analysis.md` first, then `planning/tasks.md` and `planning/task-manifest.json`.
@@ -181,6 +282,7 @@ NO PASS WITHOUT FRESH EVIDENCE
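    - Run the real commands
    - Record the exit status
    - Determine whether the result is a failure or blocked
+   - Record failure ownership instead of lumping every red light into a single failure summary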
 3. **Compare against the contract**
    - Compare against `planning/design.md` or `planning/analysis.md`
    - Compare against `planning/tasks.md` and `planning/task-manifest.json`
@@ -265,9 +367,12 @@ NO PASS WITHOUT FRESH EVIDENCE
 1. severity: `critical` / `important` / `info`
 2. confidence: `high` / `medium` / `low`; do not pass off low confidence as a blocker
+   - Also write `confidenceScore`, a 1-10 number expressing comparable confidence
 3. source: `runtime` / `task-review` / `diff-review` / `adversarial` / `docs`
 4. evidence: file, command, exit code, manifest path, or a specific observation
 5. action: `fix-now` / `reroute-cc-do` / `reroute-cc-plan` / `reroute-cc-investigate` / `document-follow-up`
+6. fingerprint: stable dedup key
+7. displayTier: `blocking` / `warning` / `info` / `suppressed`
 Do not write "there might be a problem" and leave the next owner guessing. Either prove it or mark it as a gap pending verification.
@@ -281,15 +386,31 @@ NO PASS WITHOUT FRESH EVIDENCE
   "verdict": "pass",
   "overall": "pass",
   "summary": "verdict=pass quick=3/3 strict=0/0 review=pass",
+  "claimEvidence": [
+    { "claim": "tests-pass", "requiredProof": "fresh test command", "commandOrArtifact": "npm test", "exitStatus": 0, "keyObservation": "0 failures", "status": "pass" },
+    { "claim": "requirements-met", "requiredProof": "plan checklist", "commandOrArtifact": "planning/tasks.md", "exitStatus": null, "keyObservation": "all tasks complete", "status": "pass" }
+  ],
+  "runtime": { "status": "pass", "failureOwnership": [] },
+  "qa": {
+    "status": "pass",
+    "regressionProof": [],
+    "testQuality": [],
+    "coverageAudit": { "status": "pass", "coveragePct": 80, "pathMap": [], "gaps": [], "testsAdded": [], "e2eRequired": false, "evalRequired": false, "qualityStars": "★★" },
+    "browserEvidence": { "status": "skipped", "mode": "not-applicable", "affectedRoutes": [], "screenshots": [], "consoleErrors": [], "healthScore": null, "issues": [], "skipReason": "not a UI or user-path change" },
+    "tddException": null
+  },
   "quickGates": [],
   "strictGates": [],
   "review": {
     "status": "pass",
     "summary": "Task review and diff review both passed",
     "details": "",
-      "taskReviews": { "status": "pass", "required": true, "summary": "all completed tasks carry spec/code proof", "reviewers": [], "findings": [] },
-      "diffReview": { "status": "pass", "required": true, "summary": "plan completion clean, no scope drift, no critical diff findings", "reviewers": [], "findings": [] },
-      "findings": []
+    "freshness": { "status": "fresh", "reviewedCommit": "example-head", "currentCommit": "example-head", "commitsSinceReview": 0, "staleReason": "" },
+    "qualityScore": 9,
+    "specialistReviews": [],
+    "taskReviews": { "status": "pass", "required": true, "summary": "all completed tasks carry spec/code proof", "reviewPacket": {}, "reviewers": [], "findings": [] },
+    "diffReview": { "status": "pass", "required": true, "summary": "plan completion clean, no scope drift, no critical diff findings", "reviewPacket": {}, "reviewers": [], "findings": [] },
+    "findings": []
   },
   "blockingFindings": [],
   "reroute": "none",
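
The claim-evidence and failure-ownership rules in this file can be read as a small decision procedure. The following is a minimal Python sketch under stated assumptions, not code shipped in cc-devflow: the function name `strictest_verdict` is hypothetical, and mapping an open `in-branch` or `ambiguous` failure to `blocked` (rather than `fail`) is an assumption taken from how the template uses `blocked`.

```python
from typing import Any

# Hypothetical helper, not part of cc-devflow. It derives the strictest
# verdict that recorded claim evidence and failure ownership can support.
UNEXPLAINED = {"in-branch", "ambiguous"}


def strictest_verdict(report: dict[str, Any]) -> str:
    """Return "pass", "fail", or "blocked" from a report-card dict."""
    claims = report.get("claimEvidence", [])
    failures = report.get("runtime", {}).get("failureOwnership", [])

    # No claims, or a claim without a command/artifact, means missing evidence.
    if not claims or any(not c.get("commandOrArtifact") for c in claims):
        return "blocked"
    if any(c.get("status") == "fail" for c in claims):
        return "fail"
    if any(c.get("status") != "pass" for c in claims):
        return "blocked"

    # Assumption: an open in-branch or ambiguous failure blocks the verdict.
    for failure in failures:
        if (failure.get("classification") in UNEXPLAINED
                and failure.get("status") == "open"):
            return "blocked"
    return "pass"
```

Called on the example report card above (two passing claims and an empty `failureOwnership` list), this sketch would return `pass`; an empty report returns `blocked` because no claim is backed by evidence.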

package/.claude/skills/cc-check/assets/REPORT_CARD_TEMPLATE.json CHANGED

@@ -9,6 +9,80 @@
   "specAlignment": "blocked",
   "specDeltaVerified": false,
   "specSyncReady": false,
+  "runtime": {
+    "status": "blocked",
+    "failureOwnership": [
+      {
+        "failure": "missing spec review proof",
+        "classification": "in-branch",
+        "touchedByDiff": true,
+        "evidence": "planning/task-manifest.json tasks[T002].reviews.spec is empty",
+        "action": "reroute-cc-do",
+        "status": "open"
+      }
+    ]
+  },
+  "claimEvidence": [
+    {
+      "claim": "tests-pass",
+      "requiredProof": "fresh test command with exit 0 and 0 failures",
+      "commandOrArtifact": "npm test -- src/feature/feature.test.ts",
+      "exitStatus": 0,
+      "keyObservation": "targeted tests passed in this run",
+      "status": "pass"
+    },
+    {
+      "claim": "requirements-met",
+      "requiredProof": "line-by-line planning/tasks.md and task-manifest.json checklist",
+      "commandOrArtifact": "planning/tasks.md + planning/task-manifest.json",
+      "exitStatus": null,
+      "keyObservation": "T002 still lacks spec review proof",
+      "status": "blocked"
+    }
+  ],
+  "qa": {
+    "status": "blocked",
+    "regressionProof": [
+      {
+        "behavior": "original symptom",
+        "redCommand": "",
+        "redFailure": "",
+        "greenCommand": "",
+        "greenObservation": "",
+        "restoredState": false
+      }
+    ],
+    "testQuality": [
+      {
+        "area": "targeted-tests",
+        "checksRealBehavior": true,
+        "mockBoundary": "none",
+        "testOnlyProductionApi": false,
+        "status": "pass"
+      }
+    ],
+    "coverageAudit": {
+      "status": "blocked",
+      "coveragePct": null,
+      "pathMap": ["planning/tasks.md#T002"],
+      "gaps": ["T002 has no spec review proof, so the requirement cannot be marked covered"],
+      "testsAdded": [],
+      "e2eRequired": false,
+      "evalRequired": false,
+      "qualityStars": "★"
+    },
+    "browserEvidence": {
+      "status": "skipped",
+      "mode": "not-applicable",
+      "affectedRoutes": [],
+      "screenshots": [],
+      "consoleErrors": [],
+      "healthScore": null,
+      "issues": [],
+      "skipReason": "template example is not a UI browser QA scenario"
+    },
+    "tddException": null
+  },
   "quickGates": [
     {
       "name": "targeted-tests",
@@ -28,17 +102,63 @@
     "status": "blocked",
     "summary": "Task review evidence is incomplete",
     "details": "T002 is implemented, but the requirement still lacks spec review proof required by the gate.",
+    "freshness": {
+      "status": "unknown",
+      "reviewedCommit": "",
+      "currentCommit": "",
+      "commitsSinceReview": null,
+      "staleReason": "review range is not recorded yet"
+    },
+    "qualityScore": null,
+    "specialistReviews": [
+      {
+        "name": "testing",
+        "status": "blocked",
+        "required": true,
+        "summary": "testing facet cannot pass while task review proof is missing",
+        "skipReason": "",
+        "findings": []
+      }
+    ],
     "taskReviews": {
       "status": "blocked",
       "required": true,
       "summary": "T002 has no spec review record yet",
+      "reviewPacket": {
+        "baseSha": "",
+        "headSha": "",
+        "requirements": "planning/tasks.md#T002",
+        "implemented": "implementation report for T002",
+        "reviewerContext": "task spec and changed files"
+      },
       "reviewers": [],
-      "findings": []
+      "findings": [
+        {
+          "severity": "important",
+          "confidence": "high",
+          "source": "task-review",
+          "summary": "T002 spec review proof is missing",
+          "evidence": "planning/task-manifest.json tasks[T002].reviews.spec is empty",
+          "action": "reroute-cc-do",
+          "triageStatus": "clarification-needed",
+          "confidenceScore": 9,
+          "fingerprint": "task-review:T002:missing-spec-review",
+          "displayTier": "blocking",
+          "suppressionReason": null
+        }
+      ]
     },
     "diffReview": {
       "status": "skipped",
       "required": false,
       "summary": "",
+      "reviewPacket": {
+        "baseSha": "",
+        "headSha": "",
+        "requirements": "planning/design.md",
+        "implemented": "",
+        "reviewerContext": ""
+      },
       "reviewers": [],
       "findings": []
     },
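
The finding fields added to this template (`triageStatus`, `confidenceScore`, `fingerprint`, `displayTier`, `suppressionReason`) come with constraints stated in SKILL.md and review-contract.md. A minimal Python sketch of those constraints follows; `finding_problems` is a hypothetical name, and treating a missing `fingerprint` as a problem is an assumption drawn from the "every finding must carry" wording.

```python
from typing import Any

TRIAGE = {"accepted-fixed", "rejected-with-evidence",
          "deferred-minor", "clarification-needed"}
TIERS = {"blocking", "warning", "info", "suppressed"}


def finding_problems(finding: dict[str, Any]) -> list[str]:
    """Hypothetical lint for one finding; returns rule violations found."""
    problems: list[str] = []
    severity = finding.get("severity")
    triage = finding.get("triageStatus")
    score = finding.get("confidenceScore")
    tier = finding.get("displayTier")

    if triage not in TRIAGE:
        problems.append("missing or unknown triageStatus")
    if severity in {"critical", "important"} and triage == "deferred-minor":
        problems.append("critical/important finding cannot be deferred-minor")
    if not isinstance(score, int) or not 1 <= score <= 10:
        problems.append("confidenceScore must be an integer from 1 to 10")
    elif score < 7 and tier == "blocking":
        problems.append("confidenceScore below 7 cannot be blocking")
    if tier not in TIERS:
        problems.append("missing or unknown displayTier")
    if finding.get("suppressionReason") and tier != "suppressed":
        problems.append("suppressionReason requires displayTier=suppressed")
    if not finding.get("fingerprint"):
        problems.append("missing fingerprint")
    return problems
```

The template's T002 finding satisfies every check in this sketch. Its `clarification-needed` triage status is well-formed, but it still blocks `pass` under the separate triage rule.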

package/.claude/skills/cc-check/references/review-contract.md CHANGED

@@ -21,6 +21,7 @@
 Every reviewer result must state at least:
+- reviewPacket
 - status
 - summary
 - evidence
@@ -30,16 +31,103 @@
 - severity
 - confidence
+- confidenceScore
 - source
 - summary
 - evidence
 - action
+- triageStatus
+- fingerprint
+- displayTier
+- suppressionReason
+## Review Packet
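+A review cannot rely on chat memory. Every task-level review and requirement-level diff review records at least:
+- `baseSha`: start of the reviewed range
+- `headSha`: end of the reviewed range
+- `requirements`: the corresponding plan, task, analysis, or spec path
+- `implemented`: what the implementer claims to have completed
+- `reviewerContext`: summary of the context the reviewer actually received
+When `baseSha` / `headSha` is missing, the review counts only as `blocked` or `skipped` and cannot support `pass`.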
+## Review Freshness
+A review must be bound to the commit currently being delivered, not to chat memory.
+Every requirement-level review records at least:
+- `review.freshness.status`: `fresh` / `stale` / `unknown` / `not-applicable`
+- `review.freshness.reviewedCommit`
+- `review.freshness.currentCommit`
+- `review.freshness.commitsSinceReview`
+- `review.freshness.staleReason`
+- `review.qualityScore`: 0-10; nullable, but a null value cannot support a high-confidence pass
+`status=stale`, `status=unknown` without an explanation, or `commitsSinceReview > 0` without a re-review all block `pass`.
+## Specialist Facets
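+`review.specialistReviews[]` records the review facets covered according to risk. It does not require dispatching an independent reviewer every time, but it does require the boundaries to be stated clearly:
+- `testing`: negative paths, boundary conditions, isolation, flaky risk, regression test quality
+- `security`: trust boundary, shell / SQL / secret / auth risks
+- `performance`: hot paths, batching, caching, N+1, resource leaks
+- `api-contract`: inputs and outputs, state enums, compatibility surface, error semantics
+- `data-migration`: schema, rollback, idempotency, historical data
+- `design`: UI / UX / visual consistency and usability
+When there is no relevant risk, write `status=skipped` and a `skipReason`; when a risk exists but its facet is missing, that is at least a review gap.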
+## Finding Triage
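+A review finding is not just "noticed"; it must have a disposition:
+| triageStatus | When to use |
+| --- | --- |
+| `accepted-fixed` | The finding is correct, has been fixed, and has verification evidence |
+| `rejected-with-evidence` | The finding does not apply, supported by existing code / test / contract evidence |
+| `deferred-minor` | Minor, does not block this delivery, recorded as a follow-up |
+| `clarification-needed` | The finding is unclear and needs clarification from the user or reviewer |
+`critical` / `important` findings cannot use `deferred-minor`. Any `clarification-needed` blocks `pass`.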
+## QA Test Review Facts
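+The review must judge whether the tests prove behavior:
+- Whether regression tests have red/green evidence
+- Whether red failed because the target behavior was missing
+- Whether green includes the targeted test and any necessary broader gate
+- Whether mocks are necessary and the mock itself is not being asserted
+- Whether production code gained a test-only API
+- Whether an integration / contract test would be more direct than a complex mock
+- Whether the coverage audit maps real codepaths / user flows / error states / edge cases
+- Whether UI or user-path changes have browser evidence, screenshots, console results, or an explicit skip reason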
+## Failure Ownership
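+Failure ownership must be written in structured form to `runtime.failureOwnership[]`:
+- `classification=in-branch`: introduced by the current branch
+- `classification=pre-existing`: also reproduces on the base branch; evidence required
+- `classification=environment`: missing dependency, permission, service, secret, or platform condition
+- `classification=ambiguous`: ownership unclear; cannot support `pass` by default
+Do not treat a pre-existing failure as a current-branch failure, and do not treat an ambiguous failure as an environment problem.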
 ## Gate Rules
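 - Task-level review missing evidence: no green light
 - Requirement-level diff review missing in strict mode: at least `blocked`
 - `important` / `critical` findings not yet handled: not passing
+- `important` / `critical` finding missing triageStatus: not passing
+- QA test quality missing while this change involves behavior changes: at least `blocked`
+- Review freshness missing, stale, or inconsistent with the current head: no green light
+- UI / user-path change lacking browser evidence with no skip reason: no green light
+- `runtime.failureOwnership` still has unexplained `in-branch` or `ambiguous` failures: no green light
 - When a plan item is `PARTIAL` / `NOT_DONE` and affects the success criteria, it cannot pass
 - When scope drift is not clearly explained, it cannot pass
 - If doc drift affects reviewer / maintainer handoff, it must block until `cc-act` doc sync or a reroute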
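
The freshness and review-packet rules in this contract can be checked mechanically. A minimal Python sketch follows, assuming the field names shown in this diff; `freshness_blockers` and `packet_supports_pass` are hypothetical names, and comparing `reviewedCommit` with `currentCommit` is one reading of the "inconsistent with the current head" gate rule.

```python
from typing import Any


def freshness_blockers(review: dict[str, Any]) -> list[str]:
    """List reasons the recorded review freshness cannot support `pass`."""
    fresh = review.get("freshness") or {}
    status = fresh.get("status")
    blockers: list[str] = []

    if status is None:
        blockers.append("freshness is not recorded")
    elif status == "stale":
        blockers.append("review is stale")
    elif status == "unknown" and not fresh.get("staleReason"):
        blockers.append("freshness is unknown with no explanation")

    # Commits that landed after the review require a re-review.
    commits = fresh.get("commitsSinceReview")
    if isinstance(commits, int) and commits > 0:
        blockers.append(f"{commits} commit(s) landed since the review")

    # Assumption: a review bound to a different commit is inconsistent.
    if fresh.get("reviewedCommit") != fresh.get("currentCommit"):
        blockers.append("reviewedCommit differs from currentCommit")
    return blockers


def packet_supports_pass(packet: dict[str, Any]) -> bool:
    """A review packet needs both baseSha and headSha to back a `pass`."""
    return bool(packet.get("baseSha")) and bool(packet.get("headSha"))
```

In this sketch an empty `reviewPacket` never supports `pass`, which matches the rule that a review without `baseSha` / `headSha` counts only as `blocked` or `skipped`.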