npm - cc-devflow - Versions diffs - 4.5.1 → 4.5.3 - Mend

cc-devflow 4.5.1 → 4.5.3

Files changed (81)

package/.claude/skills/cc-do/scripts/write-task-checkpoint.sh CHANGED

@@ -9,7 +9,7 @@ set -euo pipefail
 usage() {
   cat <<'EOF'
 Usage:
-  write-task-checkpoint.sh --dir path/to/change --task T001 --status pending|running|passed|failed|skipped --summary "..." [--event context_ready] [--attempt 0] [--session session-id] [--next-action "..."]
+  write-task-checkpoint.sh --dir path/to/change --task T001 --status pending|running|passed|failed|skipped --summary "..." [--event context_ready] [--attempt 0] [--session session-id] [--next-action "..."] [--tdd-json '{"red":...}']
 EOF
 }
@@ -23,6 +23,7 @@ EVENT_TYPE=""
 ATTEMPT="0"
 SESSION_ID=""
 NEXT_ACTION=""
+TDD_JSON=""
 while [[ $# -gt 0 ]]; do
   case "$1" in
@@ -34,6 +35,7 @@ while [[ $# -gt 0 ]]; do
     --attempt) ATTEMPT="$2"; shift 2 ;;
     --session) SESSION_ID="$2"; shift 2 ;;
     --next-action) NEXT_ACTION="$2"; shift 2 ;;
+    --tdd-json) TDD_JSON="$2"; shift 2 ;;
     -h|--help) usage; exit 0 ;;
     *) echo "Unknown arg: $1" >&2; usage; exit 1 ;;
   esac
@@ -57,6 +59,15 @@ if [[ -z "$SESSION_ID" ]]; then
   SESSION_ID="${TASK_ID}-$(date -u +%s)"
 fi
+tdd_payload="null"
+if [[ -n "$TDD_JSON" ]]; then
+  if [[ -f "$TDD_JSON" ]]; then
+    tdd_payload="$(jq -c . "$TDD_JSON")"
+  else
+    tdd_payload="$(printf '%s' "$TDD_JSON" | jq -c .)"
+  fi
+fi
 jq -nc \
   --arg changeId "$change_id" \
   --arg taskId "$TASK_ID" \
@@ -66,6 +77,7 @@ jq -nc \
   --arg summary "$SUMMARY" \
   --arg timestamp "$timestamp" \
   --arg attempt "$ATTEMPT" \
+  --argjson tdd "$tdd_payload" \
   '{
     changeId: $changeId,
     taskId: $taskId,
@@ -75,7 +87,7 @@ jq -nc \
     summary: $summary,
     timestamp: $timestamp,
     attempt: ($attempt | tonumber)
-  }' > "$runtime_task_dir/checkpoint.json"
+  } + (if $tdd == null then {} else {tdd: $tdd} end)' > "$runtime_task_dir/checkpoint.json"
 if [[ -n "$EVENT_TYPE" || "$STATUS" == "failed" ]]; then
   jq -nc \
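
The hunks above add an optional `--tdd-json` argument that accepts either a file path or inline JSON and attaches a `tdd` key only when a payload is present. A minimal sketch of that merge logic in isolation follows. The helper names `normalize_tdd` and `build_checkpoint`, the reduced field set, and the `{"phase":"demo"}` payload are hypothetical and chosen for illustration; they are not part of the package.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: turn the --tdd-json value into compact JSON, or "null" when empty.
normalize_tdd() {
  local input="$1"
  if [[ -z "$input" ]]; then
    printf 'null'
  elif [[ -f "$input" ]]; then
    jq -c . "$input"                  # value is a path to a JSON file
  else
    printf '%s' "$input" | jq -c .    # value is inline JSON
  fi
}

# Hypothetical helper: build a reduced checkpoint and add "tdd" only when it is not null.
build_checkpoint() {
  local task_id="$1" tdd_payload="$2"
  jq -nc --arg taskId "$task_id" --argjson tdd "$tdd_payload" \
    '{taskId: $taskId} + (if $tdd == null then {} else {tdd: $tdd} end)'
}

with_tdd=""
without_tdd=""
if command -v jq >/dev/null 2>&1; then   # skip the demo when jq is not installed
  with_tdd="$(build_checkpoint T001 "$(normalize_tdd '{"phase":"demo"}')")"
  without_tdd="$(build_checkpoint T001 "$(normalize_tdd '')")"
  echo "$with_tdd"
  echo "$without_tdd"
fi
```

Because `set -euo pipefail` is active, malformed JSON makes `jq` fail inside the command substitution and the script stops before `checkpoint.json` would be written, which appears to be the behavior the diff relies on as well.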

package/.claude/skills/cc-investigate/CHANGELOG.md CHANGED

@@ -1,5 +1,36 @@
 # CC-Investigate Skill Changelog
+## v1.2.0 - 2026-04-28
+- treat feedback loops as investigation products that must be made faster, sharper, and more deterministic before root cause freeze
+- require flaky investigations to raise reproduction rate with stress, repetition, timing-window narrowing, or differential loops instead of guessing from weak signals
+- add prevention handoff so confirmed root causes produce either a regression task, architecture finding, or explicit non-recorded reason
+## v1.1.6 - 2026-04-28
+- clarify that investigation domain language and durable decisions come from cc-devflow native sources: `devflow/specs/`, roadmap/backlog handoff, historical design/analysis, and change metadata
+- remove external context/architecture-decision files from the standard investigation contract so they are not implied as generated artifacts
+- route conflicts through capability specs, roadmap decisions, or historical design decisions instead of external decision-document language
+## v1.1.5 - 2026-04-28
+- add a feedback-loop contract so investigations record loop type, command, symptom match, runtime, determinism, failure rate, signal specificity, and sharpening plan before freezing root cause
+- require ranked candidate hypotheses before narrowing to active falsification targets, plus probe tags for cleanup-safe diagnostic instrumentation
+- add performance-regression, native domain/decision context, correct-test-seam, and evidence-request fields across the analysis, task, manifest, playbook, and investigation contract templates
+## v1.1.4 - 2026-04-28
+- add boundary-probe, backward-trace, reference-comparison, diagnostic-instrumentation, and condition-wait investigation modes for multi-component, deep-stack, similar-path, and flaky failures
+- require analysis templates to record boundary matrices, caller chains, working-vs-broken comparisons, probe cleanup, root-cause class, and no-code-root-cause evidence
+- update tasks and manifest templates so repair handoffs preserve the probe/trace facts that prove the confirmed root cause
+## v1.1.3 - 2026-04-28
+- add the explicit `NO REPAIR WITHOUT A FROZEN ROOT-CAUSE CONTRACT` iron law to keep investigation separate from implementation
+- require prior investigation history, pattern analysis, falsification methods, sanitized external research, and escalation decisions before freezing a root cause
+- add repair-boundary scope lock fields for affected module, allowed files, forbidden files, blast radius, and split-or-reroute decisions
+- update analysis, tasks, and task-manifest templates with root-cause hypothesis and investigation metadata
 ## v1.1.2 - 2026-04-27
 - require investigation outputs to resolve the runtime output policy before writing analysis, task, or change metadata artifacts

package/.claude/skills/cc-investigate/PLAYBOOK.md CHANGED

@@ -12,10 +12,22 @@
 ## Core Rules
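 1. Reproduce first, then guess at the cause.
-2. Check recent changes first, then decide whether it is a regression.
-3. Falsify hypotheses first, then freeze the root cause.
-4. `planning/analysis.md` and `planning/tasks.md` must be sufficient for `cc-do` to continue working without the current session.
-5. After three failed investigation attempts, rebuild the entry point first; do not keep patching at random.
+2. First turn the reproduction into a fast, accurate, rerunnable feedback loop.
+3. First confirm that the loop reproduces the same failure the user reported.
+4. Check recent changes first, then decide whether it is a regression.
+5. Falsify hypotheses first, then freeze the root cause.
+6. `planning/analysis.md` and `planning/tasks.md` must be sufficient for `cc-do` to continue working without the current session.
+7. After three failed investigation attempts, rebuild the entry point first; do not keep patching at random.
+8. Without a frozen root-cause contract, do not enter a repair task.
+9. Multi-component, deep-call-stack, and flaky problems must first gather boundary-probe, backward-trace, or condition-wait evidence.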
+## Iron Law
+```text
+NO REPAIR WITHOUT A FROZEN ROOT-CAUSE CONTRACT
+```
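+A root-cause contract contains at least: a stable reproduction or a narrowed, verifiable symptom; the violated code / config / data / dependency contract; the falsified hypotheses; and the minimal repair boundary.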
 ## Required Outputs
@@ -26,10 +38,114 @@
 ## Investigation Standard
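 1. First collect symptom, expected, actual, and repro.
-2. Trace the code path first to locate touchpoints and recent changes.
-3. Every hypothesis must record supporting evidence and counter-evidence.
-4. Only a root cause pinned down by evidence may enter the repair contract.
-5. The repair contract states only the minimal repair boundary; it does not invent new scope along the way.
+2. First build a feedback loop: failing test, HTTP script, CLI fixture, browser script, trace replay, throwaway harness, fuzz, bisect, differential, and only as a last resort HITL.
+3. Record the loop's runtime, determinism, failure rate, symptom-match evidence, and sharpening plan.
+4. First check prior investigations, TODOS/backlog, report-card findings, and recent changes.
+5. Trace the code path first to locate touchpoints and recent changes.
+6. Do pattern analysis first, then list 3-5 candidate hypotheses and converge to 1-3 active hypotheses.
+7. Every hypothesis must record supporting evidence, counter-evidence, falsification method, expected observation, and actual observation.
+8. Only a root cause pinned down by evidence may enter the repair contract.
+9. The repair contract states only the minimal repair boundary; it does not invent new scope along the way.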
+## Investigation Modes
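+| Mode | When to use | First action |
+| --- | --- | --- |
+| `reproduce-first` | The symptom is real but unstable | Narrow the repro command / manual path |
+| `feedback-loop` | A repro exists but the signal is slow, loose, or incidental, or it is unclear whether it is the same bug | Record loop type, command, runtime, determinism, failure rate, and symptom match |
+| `diff-trace` | Worked yesterday, broken today | `git log --oneline -20 -- <affected-files>` |
+| `boundary-probe` | A chain such as API -> service -> DB or CI -> build -> deploy is broken | Record each layer's input, output, config, and state |
+| `backward-trace` | The error appears deep in the stack or the bad value's origin is unknown | Trace back from the immediate failure site to the original trigger |
+| `reference-compare` | A similar working path exists in the same repo | List working / broken differences and accept or rule out each one |
+| `condition-wait` | Flaky, sleep, timeout, disappears on retry | Find the real wait condition; do not increase the delay first |
+| `history-trace` | The same area keeps breaking | Check historical `analysis.md`, TODO, and report-card findings |
+| `pattern-research` | Unfamiliar framework / dependency / platform error | Sanitize, then search for the generic error type |
+| `contract-check` | The repair boundary may expand | Determine implementation drift / missing spec truth / roadmap mismatch |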
+## Pattern Analysis
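+Check against at least these patterns; do not guess directly:
+- race condition: intermittent, timing-related, shared state
+- null propagation: TypeError / undefined / missing guard
+- state corruption: inconsistent data, partial updates, hook / transaction ordering
+- integration failure: timeout, protocol mismatch, external service boundary
+- configuration drift: behavior differs across local / CI / production
+- stale cache: recovers after clearing the cache, or stale state reappears
+- resource leak: OOM, handle growth, lifecycle resources not released
+- performance regression: slower, higher CPU / IO / query time, lower throughput
+- trust boundary drift: external input, LLM output, or user input treated as trusted
+- timing guess / flaky wait: arbitrary sleep / timeout / setTimeout masking the real condition
+For performance regressions, first establish a baseline, profiler run, query plan, or bisect; do not treat ordinary logs as performance evidence.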
+## Boundary And Trace Evidence
+Complex chains must be spelled out in `analysis.md`:
+- Boundary Probe Matrix: component boundary, input observed, output observed, config/env observed, state observed, verdict
+- Backward Trace Chain: immediate failure site, caller chain, bad value origin, original trigger, why symptom-site fix is rejected
+- Reference Comparison: similar working example, broken path, differences accepted / ruled out
+- Diagnostic Instrumentation Plan: probe tag, probe location, question answered, command, expected signal, cleanup requirement
+- Feedback Loop Contract: loop type, command, expected / actual signal, symptom match, runtime, determinism, failure rate, sharpening plan
+- Correct Test Seam: test seam, public interface exercised, why it reaches the real trigger chain, why shallow tests are rejected
+These fields are not decoration. Their purpose is to prove that the root cause lies at the source, not at the error site.
+## Prior Investigation History
+Before forming a root cause, check at least:
+1. `git log --oneline -20 -- <affected-files>`
+2. `rg -n "<error-keyword>|<module>|<capability>" devflow/changes`
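+3. Related items in `TODOS.md`, backlog, and roadmap
+4. Existing `planning/analysis.md` and `review/report-card.json`
+When history matches, record it under `Prior Investigations` in `analysis.md`, stating whether this is a recurrence, the same kind of structural smell, or unrelated history.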
+## Domain And Decision Context
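+Prefer cc-devflow native context: `devflow/specs/INDEX.md`, relevant capability specs, roadmap/backlog handoff, historical `planning/design.md` / `planning/analysis.md`, and `change-meta.json`. Domain names, hypothesis names, and test names in investigation output should follow the project's vocabulary; if the investigation's conclusion violates a capability spec, roadmap decision, or historical design decision, write that explicitly into the evidence chain instead of silently overriding the existing design decision.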
+## External Research Hygiene
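+Do external research only when local evidence is insufficient, the error type is unfamiliar, or it looks like a dependency / framework / platform problem.
+- First remove hosts, IPs, tokens, customer ids, internal paths, SQL, and private repo names.
+- Search only for error category, framework / library name, version, and generic component names.
+- Research results may only enter as candidate hypotheses; they must be verified back in this repo through reproduction or code evidence.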
+## No Code Root Cause
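+If the conclusion is environment, external service, or a timing window:
+- Record `rootCauseClass`: `code` / `config` / `environment` / `external` / `timing`
+- State why it is not a code root cause
+- State what monitoring / future evidence is needed
+- State the operator handling; do not dress up an unknown external cause as a code fix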
+## Scope Lock
+Once a root-cause hypothesis is formed, spell out:
+- affected module
+- allowed files
+- forbidden files
+- blast radius estimate
+- if touches >5 files: split / justify / reroute
+More than 5 files is by default an investigation signal, not normal bug-fix scale.
+## Escalation
+After three failed hypotheses, write an `Escalation Decision`:
+- failed hypothesis count
+- attempted evidence
+- why current entry is suspect
+- next option: continue / instrument-and-wait / human-review / reroute-cc-plan
+- evidence request: a reproducible environment, HAR, log dump, core dump, timestamped screen recording, or temporary production probe access
+- recommendation
 ## Local Kit
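
The Scope Lock rule above ("if touches >5 files: split / justify / reroute") can be checked mechanically before a repair task is written. The following is a minimal sketch, assuming the candidate file list arrives one path per line on stdin. The function name `scope_verdict`, the `MAX_FILES` variable, and the verdict strings are hypothetical; the playbook itself does not define any such script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: threshold taken from the playbook's "touches >5 files" rule.
MAX_FILES=5

# scope_verdict reads one file path per line on stdin and prints a verdict.
scope_verdict() {
  local count
  # grep -c . counts non-empty lines; "|| true" covers its exit status 1 when the count is 0.
  count="$(grep -c . || true)"
  if (( count > MAX_FILES )); then
    echo "split-or-reroute ($count files)"
  else
    echo "within-scope ($count files)"
  fi
}

small="$(printf '%s\n' src/a.ts src/b.ts | scope_verdict)"
large="$(printf '%s\n' a b c d e f | scope_verdict)"
echo "$small"
echo "$large"
```

In a working tree the same function could be fed from `git diff --name-only | scope_verdict`. The verdict is only a signal: the playbook still asks for a written justification when a wide fix really is the root-cause span.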

package/.claude/skills/cc-investigate/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: cc-investigate
-version: 1.1.2
+version: 1.2.0
 description: "Use when a bug, regression, broken task, or unexpected behavior needs root-cause investigation, reproducible evidence, and a frozen repair handoff before cc-do resumes coding."
 triggers:
   - "Help me investigate this bug"
@@ -33,10 +33,13 @@ writes:
 entry_gate:
   - "Read the current bug report, existing requirement artifacts, relevant code, tests, and recent history before forming any hypothesis."
   - "Use a FIX-<number>-<description> change key for new bug-fix investigations."
-  - "Reproduce or narrow the symptom first, then freeze the evidence chain before proposing repair tasks."
+  - "Build a runnable feedback loop, confirm it matches the reported symptom, then freeze the evidence chain before proposing repair tasks."
+  - "Search prior investigations, TODO/backlog signals, and recent fixes in the affected area before declaring the bug novel."
+  - "For multi-component, deep-stack, or flaky symptoms, record boundary probes, backward trace, or condition-wait evidence before declaring the root cause."
+  - "For performance regressions, collect a baseline or profile signal before treating logs as evidence."
   - "Do not write production code here; this stage ends with planning/analysis.md plus executable repair tasks for cc-do."
 exit_criteria:
-  - "planning/analysis.md records symptom, reproduction, evidence chain, confirmed root cause, and repair boundary."
+  - "planning/analysis.md records symptom, reproduction, evidence chain, boundary probes or backward trace when applicable, pattern analysis, tested hypotheses, confirmed root cause, and repair boundary."
   - "planning/tasks.md and planning/task-manifest.json are explicit enough that cc-do can repair the bug without chat memory."
   - "The honest next step is cc-do, cc-plan, or roadmap."
 reroutes:
@@ -106,6 +109,21 @@ tool_budget:
 If the question is really asking "what feature should be built" or "should the scope change", do not force debugging; go back to `cc-plan`.
+## Iron Law
+```text
+NO REPAIR WITHOUT A FROZEN ROOT-CAUSE CONTRACT
+```
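+`cc-investigate` may run reproductions, read code, check logs, add temporary probes, and falsify hypotheses, but it must not turn a "might be" into a repair task.
+A root-cause contract must answer all of the following:
+1. How the symptom is stably reproduced or narrowed to a verifiable range.
+2. Which code / config / data / dependency path violated which contract.
+3. Which hypotheses were falsified, and why they are not the cause.
+4. Where the minimal repair boundary is, and which files explicitly must not be touched.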
 ## Quick Start
 First determine which kind of investigation reality you are facing:
@@ -113,8 +131,15 @@ tool_budget:
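 | Reality | Path to take first |
 | --- | --- |
 | The symptom is real but not yet stably reproduced | `reproduce-first`: pin the behavior down first |
+| A repro exists but the signal is slow / loose / incidental | `feedback-loop`: first make the pass/fail loop fast, accurate, and rerunnable |
 | Clearly a regression | `diff-trace`: check recent changes first |
+| A multi-component chain is broken | `boundary-probe`: first record each boundary's input, output, config, and state |
+| The error site is deep or the bad value's origin is unknown | `backward-trace`: trace from the symptom site all the way to the original trigger |
+| A similar working path exists in the same repo | `reference-compare`: first list working vs broken differences |
+| Flaky / sleep / timeout problems | `condition-wait`: find the real wait condition first; do not increase the delay first |
 | The symptom is known, but the repair boundary may widen the scope | `contract-check`: first decide whether it still belongs to the current requirement |
+| The error type is unfamiliar and looks like a framework / dependency / platform problem | `pattern-research`: do sanitized external research first |
+| The same area keeps breaking | `history-trace`: check prior investigations and recent fixes first |
 | Looks like a bug but is really undefined behavior or a new requirement | Reroute directly to `cc-plan` |
 First say "what class of problem this is", then say "how I will fix it". Debugging without classification always ends up as random patching.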
@@ -163,30 +188,237 @@ tool_budget:
    - Record what the user saw
    - Record the difference between expected and actual
    - Record the repro command or manual path
-2. **Trace reality**
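+   - Confirm the reproduction is the same failure the user described, not some neighboring red light
+   - If context is missing, ask only the single most critical question; do not dump a list of questions at once
+2. **Build feedback loop**
+   - Prefer building a pass/fail signal the agent can run: failing test, curl / HTTP script, CLI fixture, browser script, trace replay, throwaway harness, property / fuzz loop, bisect harness, differential loop, and only as a last resort a HITL script
+   - Record the loop type, command, runtime, determinism, failure rate, symptom-match evidence, and the next sharpening plan
+   - When the loop is too slow, too broad, or too flaky, optimize the loop itself first; without a trustworthy loop, do not proceed to a confirmed root cause
+   - If a loop truly cannot be built, state what was tried and request a HAR, log dump, core dump, timestamped screen recording, access to a reproducible environment, or temporary production probe access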
+3. **Trace reality**
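    - Follow the code path to find touchpoints
+   - For multi-component systems, write the `Boundary Probe Matrix` first: each boundary's input, output, config / env, state, and pass/fail
+   - For deep errors, write the `Backward Trace Chain` first: immediate failure site, caller chain, bad value origin, original trigger
    - Check recent commits and similar changes
+   - Check existing `devflow/changes/*/planning/analysis.md`, `TODOS.md`, backlog, and report-card findings
+   - If the repo has `devflow/specs/`, roadmap/backlog handoff, historical `planning/design.md` / `planning/analysis.md`, or `change-meta.json`, treat the domain vocabulary and frozen decisions as contract evidence
    - Find existing tests and break-point evidence
    - Determine whether the deviation is from a capability boundary, an invariant, or merely an implementation detail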
-3. **Form hypotheses**
-   - Keep only 1-3 falsifiable hypotheses
-   - Record supporting evidence and counter-evidence for every hypothesis
-4. **Test hypotheses**
+4. **Classify pattern**
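+   - Determine whether it is a race condition, null propagation, state corruption, integration failure, configuration drift, stale cache, resource leak, performance regression, trust boundary drift, or timing guess / flaky wait
+   - If the same repo has a working example, write the `Reference Comparison` first, listing the working path, broken path, differences, and accepted / ruled-out hypotheses
+   - If the error type is unfamiliar, do sanitized external research first; search only for the generic error type, framework / library name, and version, never for raw secrets / paths / customer data
+5. **Form hypotheses**
+   - First list and rank 3-5 candidate hypotheses to avoid anchoring on the first intuition
+   - Then converge to 1-3 active hypotheses for verification
+   - Record supporting evidence, counter-evidence, and priority rationale for every hypothesis
+   - Record `falsification method`, `expected observation`, and `actual observation` for every hypothesis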
+6. **Test hypotheses**
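    - Verify with reproduction, logs, assertions, and minimal probes
-   - If three hypotheses all fail, stop and rebuild context
-5. **Freeze repair contract**
+   - Temporary probes must have a `Diagnostic Instrumentation Plan`: probe tag, probe location, question answered, command, expected signal, cleanup requirement
+   - Each probe answers only one hypothesis prediction; change only one variable at a time
+   - Debug logs must carry a unique prefix, e.g. `[DEBUG-FIX123-a4f2]`; before entering `cc-do`, grep for the prefix to clean them up or promote them to permanent logging
+   - If three hypotheses all fail, stop and move to an escalation decision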
+7. **Freeze repair contract**
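    - Once the root cause is confirmed, write it into `planning/analysis.md`
    - Keep only the minimal repair boundary
+   - Spell out the correct test seam: whether the test covers the real trigger chain; if no correct seam exists, that is itself an architectural fact to record
+   - State the affected module, allowed files, forbidden files, and blast radius estimate; more than 5 files defaults to splitting or rerouting
    - Output `planning/tasks.md` + `planning/task-manifest.json` + `change-meta.json`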
-6. **Hand off**
+8. **Hand off**
    - State the next step explicitly as `cc-do`
    - If the repair contract goes beyond the current requirement, reroute to `cc-plan`
+## Pattern Analysis
+Do not start from blank guessing. First place the bug into a checkable pattern:
+| Pattern | Signature | First place to inspect |
+| --- | --- | --- |
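+| race condition | Intermittent, timing-related, disappears on retry | Concurrent writes, locks, queues, shared state |
+| null propagation | TypeError / NoMethod / undefined access | Nullable inputs, defaults, boundary guards |
+| state corruption | Inconsistent data, partial updates, out-of-order execution | Transactions, callbacks, hooks, side effects |
+| integration failure | Timeout, unexpected response, protocol mismatch | API boundary, service config, schema |
+| configuration drift | Works locally, fails in CI/production | Env, feature flags, versions, paths, permissions |
+| stale cache | Recovers after clearing the cache, stale state reappears | browser / CDN / Redis / build cache |
+| resource leak | OOM, handle growth, slow-onset crashes | Lifecycle, subscription, retention, cleanup |
+| performance regression | Slower, higher CPU / IO / query time, lower throughput | Baseline, profiler, query plan, bisect |
+| trust boundary drift | LLM output / user input / external responses treated as trusted | Validation, escaping, policy gate |
+| timing guess / flaky wait | Occasionally passes after increasing sleep / setTimeout / timeout | The real completion condition, event, file, state, or queue count |
+Pattern analysis is not a conclusion; it is only an index for locating the first batch of evidence.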
+## Boundary Probe Matrix
+Do not guess first on multi-component chains. First record the facts at each boundary:
+- component boundary
+- input observed
+- output observed
+- config / env observed
+- state observed
+- verdict: `pass` / `fail` / `unknown`
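+Only when a single boundary fails first does the investigation narrow to that component. When multiple boundaries are abnormal, look for the common upstream first; do not pile patches downstream.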
+## Backward Trace Chain
+When the error site is deep, do not just add a guard at the symptom site. `analysis.md` must trace to:
+- immediate failure site
+- direct caller
+- caller chain
+- bad value origin
+- original trigger
+- why symptom-site fix is rejected
+If the original trigger cannot be found, the root cause is not yet frozen; the only options are to keep investigating or to escalate.
+## Reference Comparison
+When the same repo or a reference implementation has a similar working path, compare before hypothesizing:
+- similar working example
+- broken path
+- differences found
+- differences accepted as hypothesis
+- differences ruled out
+Do not skip differences with "looks about the same". A small difference can also be the root cause.
+## Diagnostic Instrumentation
+Temporary logs, assertions, and probes may only be used to answer one explicit question:
+- probe location
+- question answered
+- command to run
+- expected signal
+- actual signal
+- cleanup requirement
+A probe must not become the fix. Before entering `cc-do`, either delete it or explicitly record in the repair task how it will be cleaned up or promoted.
+## Feedback Loop Contract
+Root-cause investigation depends first on a trustworthy loop:
+- loop type: failing test / HTTP script / CLI fixture / browser script / trace replay / throwaway harness / property-fuzz / bisect / differential / HITL
+- command or manual driver
+- expected failing signal
+- actual failing signal
+- symptom match: why it reproduces the same problem the user reported
+- runtime and determinism
+- failure rate for flaky bugs
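+- sharpening plan: how to make it faster, more accurate, and more stable
+Iterate on the loop as an investigation product. When a loop exists but its signal is poor, improve it first:
+1. faster: cache setup, narrow the test scope, skip unrelated startup.
+2. sharper: assert the specific symptom the user saw; do not pass off "it did not crash" as a match.
+3. more deterministic: pin time, random seed, filesystem, network, locale, and feature flags.
+For a flaky bug, the goal is not immediate 100% reproduction but raising the reproduction rate until it is debuggable. You can loop 100 times, trigger in parallel, add stress, narrow the timing window, or run a differential loop; if the failure rate is still too low to falsify anything, write an Evidence Request first instead of continuing to guess.
+Without a loop, code reading cannot be treated as a root cause. The only option is to write an `Evidence Request`: a reproducible environment, HAR, log dump, core dump, timestamped screen recording, or temporary production probe access.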
+## Correct Test Seam
+Before entering repair handoff, state whether the regression test seam is correct:
+- test seam
+- public interface exercised
+- why this seam reaches the real trigger chain
+- why a shallower test would be false confidence
+- if no correct seam exists, record it as an architecture finding and keep repair verification tied to the original feedback loop
+## Timing And Flaky Bugs
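+When facing flaky, sleep, timeout, or disappears-on-retry symptoms:
+- First find the real wait condition: event, state, file, count, queue empty, network response
+- Any timeout must explain why that duration is a business / protocol fact, not a guess
+- If it reproduces only under concurrency or load, record the corresponding command and environment
+- Do not write "increase the sleep" as the repair contract unless it is itself a verified protocol wait window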
+## No Code Root Cause
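+If the investigation proves the cause is environment, external service, or a timing window, do not pretend there is a code root cause: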
+- `rootCauseClass`: `code` / `config` / `environment` / `external` / `timing`
+- `why not code root cause`
+- `monitoring or future evidence needed`
+- `operator handling after fix`
+Such conclusions still need local evidence; "no root cause" usually just means the investigation is incomplete.
+## Prior Investigation History
+The same area breaking repeatedly is an architecture smell, not a coincidence. Before forming a root cause, check at least:
+1. `git log --oneline -20 -- <affected-files>`
+2. `rg -n "<error-keyword>|<module>|<capability>" devflow/changes`
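+3. Related unresolved items in `TODOS.md`, backlog, and roadmap
+4. Existing `report-card.json` findings and `planning/analysis.md`
+5. When available, query project memory or historical investigation summaries
+If a past investigation matches, record it under `Prior Investigations` in `analysis.md`, including whether it recurred, whether the root cause is of the same class, and whether this occurrence indicates a structural problem.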
+## External Research Hygiene
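+Use external research only when local evidence is insufficient, the error type is unfamiliar, or it looks like a dependency / framework / platform bug.
+Sanitize before researching:
+- Remove hosts, IPs, tokens, customer ids, internal paths, SQL fragments, and private repo names
+- Search for error category, framework / library name, version, and generic component names
+- If it cannot be sanitized safely, skip the external search and state the reason in `researchEvidence`
+Research conclusions may only serve as candidate hypotheses; they cannot directly become a confirmed root cause. They must be verified back in this repo through reproduction or code evidence.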
+## Scope Lock And Blast Radius
+After forming a root-cause hypothesis, first lock the minimal investigation / repair boundary:
+- `affectedModule`
+- `allowedFiles`
+- `forbiddenFiles`
+- `blastRadiusFiles`
+- `blastRadiusRisk`: `low` / `medium` / `high`
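+If the fix is expected to touch more than 5 files, assume by default that this may not be a single-point bug:
+1. If it can be split into critical path + follow-up, split it.
+2. If it cannot be split but is still within the root cause's span, state why.
+3. If it has become design / architecture scope, reroute to `cc-plan`.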
+## Prevention Handoff
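+After the root cause is frozen, write one sentence of hindsight judgment: what structure, test seam, capability invariant, operator guard, or documentation would have exposed this bug earlier or prevented it entirely.
+- If the answer is a small-scope regression test, write it into the current repair task.
+- If the answer is a seam / module / capability boundary problem, write it as an architecture finding and explicitly hand it to `cc-plan` or a later backlog.
+- If the answer is merely a process reminder or human memory, it does not count as prevention; either turn it into an executable guard or explicitly leave it unrecorded.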
+## Escalation Decision
+After three failed hypotheses, do not keep guessing. `analysis.md` must record:
+- failedHypothesisCount
+- what was attempted
+- why current entry is suspect
+- next option: `continue-with-new-hypothesis` / `instrument-and-wait` / `human-review` / `reroute-cc-plan`
+- evidence request if the loop cannot be built or the environment is missing
+- recommendation
 ## Good Output
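 - After the first screen, the reader knows what the bug is, how to reproduce it, and why it breaks
 - The root cause is not a feeling but a specific break point pinned down by evidence
+- Hypotheses are not list decoration; they carry falsification methods and actual observations
+- Prior investigations, recent changes, and pattern analysis were not skipped
 - The repair boundary is clear enough that `cc-do` needs no second investigation
+- The correct test seam is spelled out; shallow tests are not used to manufacture false safety
 - `planning/tasks.md` contains only repair tasks and smuggles in no new requirements
 - If the work should go back to `cc-plan`, the reason is spelled out, without pretending that patching can continue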
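
The Feedback Loop Contract in this hunk asks for a recorded failure rate and suggests looping a flaky repro many times. A minimal sketch of such a loop follows. `measure_failure_rate` and the stand-in `flaky_repro` command (which fails on every 4th call) are hypothetical names used for illustration only; in a real investigation the repro command would replace the stand-in.

```shell
#!/usr/bin/env bash
set -uo pipefail

# measure_failure_rate RUNS CMD [ARGS...]
# Runs CMD RUNS times and prints "<failures>/<runs>".
measure_failure_rate() {
  local runs="$1"; shift
  local failures=0 i
  for ((i = 1; i <= runs; i++)); do
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
  done
  printf '%d/%d\n' "$failures" "$runs"
}

# Hypothetical stand-in for a flaky repro: it fails on every 4th invocation.
counter_file="$(mktemp)"
echo 0 > "$counter_file"
flaky_repro() {
  local n
  n=$(( $(cat "$counter_file") + 1 ))
  echo "$n" > "$counter_file"
  (( n % 4 != 0 ))   # exit status 1 (failure) when n is a multiple of 4
}

rate="$(measure_failure_rate 100 flaky_repro)"
echo "failure rate: $rate"
rm -f "$counter_file"
```

Recording the ratio before and after each sharpening step gives a concrete number for the contract's `failure rate for flaky bugs` field, and it shows whether stress or a narrower timing window actually raised the reproduction rate.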
@@ -202,12 +434,15 @@ tool_budget:
 ## Working Rules
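 1. Without a reproduction, do not claim to have found the root cause.
-2. Without evidence, do not write a guess as a conclusion.
-3. Root cause first, then repair; investigate first, then code.
-4. `planning/tasks.md` must be sufficient for `cc-do` to keep going after leaving the current conversation.
-5. If the fix has turned into new feature design, stop debugging and go back to `cc-plan`.
-6. After three failed hypotheses, assume by default that your investigation entry point is wrong; do not keep guessing.
-7. A good investigation is not "finding many possibilities" but shrinking the world of errors into one trustworthy repair contract.
+2. Without a trustworthy feedback loop, do not dress up code reading as a confirmed root cause.
+3. Without evidence, do not write a guess as a conclusion.
+4. Root cause first, then repair; investigate first, then code.
+5. `planning/tasks.md` must be sufficient for `cc-do` to keep going after leaving the current conversation.
+6. If the fix has turned into new feature design, stop debugging and go back to `cc-plan`.
+7. After three failed hypotheses, assume by default that your investigation entry point is wrong; do not keep guessing.
+8. External research must be sanitized first, and research conclusions must be verified against evidence in this repo.
+9. When the fix touches more than 5 files, default to splitting or rerouting first; do not disguise a large refactor as a bug fix.
+10. A good investigation is not "finding many possibilities" but shrinking the world of errors into one trustworthy repair contract.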
 ## Exit Criteria
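
The `condition-wait` mode and the Timing And Flaky Bugs section in the hunks above both say to find the real wait condition instead of enlarging a sleep. A minimal sketch of that idea follows: poll an explicit condition up to a deadline. The helper name `wait_for`, the marker-file condition, and the 0.1 s poll interval are assumptions for illustration, and fractional `sleep` assumes a GNU or BSD `sleep`.

```shell
#!/usr/bin/env bash
set -uo pipefail

# wait_for TIMEOUT_SECONDS CMD [ARGS...]
# Polls CMD until it succeeds; returns 1 if the deadline passes first.
wait_for() {
  local timeout="$1"; shift
  local deadline=$((SECONDS + timeout))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep 0.1   # poll interval only; the exit condition is CMD, not elapsed time
  done
  return 0
}

workdir="$(mktemp -d)"
marker="$workdir/done"              # hypothetical "job finished" marker file
( sleep 0.3; touch "$marker" ) &    # stand-in for the asynchronous work

if wait_for 5 test -f "$marker"; then ready="yes"; else ready="no"; fi
if wait_for 1 test -f "$workdir/never"; then missing="found"; else missing="timeout"; fi
echo "ready=$ready missing=$missing"

wait
rm -rf "$workdir"
```

The timeout here is an upper bound for giving up, not the mechanism that makes the test pass, which is the distinction the skill draws between a verified wait condition and a timing guess.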