cc-devflow 4.5.1 → 4.5.3

Files changed (81)
  1. package/.claude/skills/cc-act/CHANGELOG.md +27 -0
  2. package/.claude/skills/cc-act/PLAYBOOK.md +32 -1
  3. package/.claude/skills/cc-act/SKILL.md +53 -7
  4. package/.claude/skills/cc-act/assets/PR_BRIEF_TEMPLATE.md +35 -1
  5. package/.claude/skills/cc-act/assets/RELEASE_NOTE_TEMPLATE.md +10 -1
  6. package/.claude/skills/cc-act/references/closure-contract.md +11 -0
  7. package/.claude/skills/cc-act/scripts/cc-act-common.sh +32 -1
  8. package/.claude/skills/cc-act/scripts/render-pr-brief.sh +130 -0
  9. package/.claude/skills/cc-act/scripts/verify-act-gate.sh +23 -1
  10. package/.claude/skills/cc-check/CHANGELOG.md +26 -0
  11. package/.claude/skills/cc-check/PLAYBOOK.md +128 -1
  12. package/.claude/skills/cc-check/SKILL.md +147 -7
  13. package/.claude/skills/cc-check/assets/REPORT_CARD_TEMPLATE.json +164 -1
  14. package/.claude/skills/cc-check/references/gate-contract.md +11 -0
  15. package/.claude/skills/cc-check/references/review-contract.md +104 -0
  16. package/.claude/skills/cc-check/scripts/render-report-card.js +209 -5
  17. package/.claude/skills/cc-check/scripts/verify-gate.sh +28 -0
  18. package/.claude/skills/cc-do/CHANGELOG.md +12 -0
  19. package/.claude/skills/cc-do/PLAYBOOK.md +14 -9
  20. package/.claude/skills/cc-do/SKILL.md +24 -13
  21. package/.claude/skills/cc-do/references/execution-recovery.md +16 -5
  22. package/.claude/skills/cc-do/scripts/verify-task-gates.sh +19 -6
  23. package/.claude/skills/cc-do/scripts/write-task-checkpoint.sh +14 -2
  24. package/.claude/skills/cc-investigate/CHANGELOG.md +31 -0
  25. package/.claude/skills/cc-investigate/PLAYBOOK.md +124 -8
  26. package/.claude/skills/cc-investigate/SKILL.md +252 -17
  27. package/.claude/skills/cc-investigate/assets/ANALYSIS_TEMPLATE.md +112 -3
  28. package/.claude/skills/cc-investigate/assets/TASKS_TEMPLATE.md +17 -5
  29. package/.claude/skills/cc-investigate/assets/TASK_MANIFEST_TEMPLATE.json +141 -1
  30. package/.claude/skills/cc-investigate/references/investigation-contract.md +192 -0
  31. package/.claude/skills/cc-plan/CHANGELOG.md +26 -0
  32. package/.claude/skills/cc-plan/PLAYBOOK.md +18 -6
  33. package/.claude/skills/cc-plan/SKILL.md +72 -34
  34. package/.claude/skills/cc-plan/assets/DESIGN_TEMPLATE.md +30 -3
  35. package/.claude/skills/cc-plan/assets/TASKS_TEMPLATE.md +28 -0
  36. package/.claude/skills/cc-plan/assets/TASK_MANIFEST_TEMPLATE.json +46 -1
  37. package/.claude/skills/cc-plan/assets/TINY_DESIGN_TEMPLATE.md +24 -0
  38. package/.claude/skills/cc-plan/references/planning-contract.md +18 -4
  39. package/.claude/skills/cc-roadmap/CHANGELOG.md +14 -0
  40. package/.claude/skills/cc-roadmap/PLAYBOOK.md +10 -7
  41. package/.claude/skills/cc-roadmap/SKILL.md +43 -23
  42. package/.claude/skills/cc-roadmap/assets/BACKLOG_TEMPLATE.md +10 -0
  43. package/.claude/skills/cc-roadmap/assets/ROADMAP_TEMPLATE.md +15 -0
  44. package/.claude/skills/cc-roadmap/assets/TRACKING_TEMPLATE.json +1 -1
  45. package/.claude/skills/cc-roadmap/references/roadmap-dialogue.md +11 -7
  46. package/.claude/skills/cc-simplify/CHANGELOG.md +21 -0
  47. package/.claude/skills/cc-simplify/SKILL.md +264 -35
  48. package/.claude/skills/cc-spec-init/CHANGELOG.md +6 -0
  49. package/.claude/skills/cc-spec-init/SKILL.md +14 -1
  50. package/CHANGELOG.md +37 -0
  51. package/README.md +10 -2
  52. package/README.zh-CN.md +10 -2
  53. package/docs/examples/example-bindings.json +7 -7
  54. package/docs/examples/full-design-blocked/BACKLOG.md +1 -1
  55. package/docs/examples/full-design-blocked/README.md +1 -1
  56. package/docs/examples/full-design-blocked/ROADMAP.md +1 -1
  57. package/docs/examples/full-design-blocked/changes/REQ-002-bulk-invite-import/planning/design.md +1 -1
  58. package/docs/examples/full-design-blocked/changes/REQ-002-bulk-invite-import/planning/tasks.md +1 -1
  59. package/docs/examples/full-design-blocked/changes/REQ-002-bulk-invite-import/review/report-card.json +140 -3
  60. package/docs/examples/full-design-blocked/roadmap-tracking.json +1 -1
  61. package/docs/examples/local-handoff/BACKLOG.md +1 -1
  62. package/docs/examples/local-handoff/README.md +1 -1
  63. package/docs/examples/local-handoff/ROADMAP.md +1 -1
  64. package/docs/examples/local-handoff/changes/REQ-003-audit-log-export/planning/design.md +1 -1
  65. package/docs/examples/local-handoff/changes/REQ-003-audit-log-export/planning/tasks.md +1 -1
  66. package/docs/examples/local-handoff/changes/REQ-003-audit-log-export/review/report-card.json +92 -0
  67. package/docs/examples/local-handoff/roadmap-tracking.json +1 -1
  68. package/docs/examples/pdca-loop/BACKLOG.md +1 -1
  69. package/docs/examples/pdca-loop/README.md +1 -1
  70. package/docs/examples/pdca-loop/ROADMAP.md +1 -1
  71. package/docs/examples/pdca-loop/changes/REQ-001-copy-invite-link/handoff/pr-brief.md +20 -0
  72. package/docs/examples/pdca-loop/changes/REQ-001-copy-invite-link/planning/design.md +1 -1
  73. package/docs/examples/pdca-loop/changes/REQ-001-copy-invite-link/planning/task-manifest.json +2 -2
  74. package/docs/examples/pdca-loop/changes/REQ-001-copy-invite-link/planning/tasks.md +1 -1
  75. package/docs/examples/pdca-loop/changes/REQ-001-copy-invite-link/review/report-card.json +92 -0
  76. package/docs/examples/pdca-loop/roadmap-tracking.json +1 -1
  77. package/docs/skill-strategy-audit.md +48 -0
  78. package/lib/skill-runtime/__tests__/runtime.integration.test.js +19 -1
  79. package/lib/skill-runtime/review.js +64 -1
  80. package/lib/skill-runtime/schemas.js +161 -4
  81. package/package.json +1 -1
package/.claude/skills/cc-investigate/assets/ANALYSIS_TEMPLATE.md
@@ -16,8 +16,25 @@
 
  - What the user saw:
  - Reproduction command / path:
+ - Repro stability: `stable` | `intermittent` | `not-yet-reproduced` | `narrowed-only`
+ - Matches reported symptom: `yes` | `no` | `partial` | `unknown`
+ - Symptom match evidence:
  - Expected:
  - Actual:
+ - Impact / blast radius:
+
+ ## Feedback Loop Contract
+
+ - Loop type: `failing-test` | `http-script` | `cli-fixture` | `browser-script` | `trace-replay` | `throwaway-harness` | `property-fuzz` | `bisect` | `differential` | `hitl`
+ - Command or manual driver:
+ - Expected failing signal:
+ - Actual failing signal:
+ - Runtime:
+ - Determinism: `deterministic` | `high-rate-flaky` | `low-rate-flaky` | `unknown`
+ - Failure rate:
+ - Signal specificity:
+ - Sharpening plan:
+ - If no loop, evidence request:
 
  ## Evidence Chain
 
@@ -25,25 +42,114 @@
  - Code path:
  - Recent changes:
  - Existing tests:
+ - Prior investigations:
+ - TODO / backlog / report-card signals:
+ - Native domain / decision context:
 
- ## Hypothesis Table
+ ## Boundary Probe Matrix
+
+ | Component boundary | Input observed | Output observed | Config / env observed | State observed | Verdict |
+ | --- | --- | --- | --- | --- | --- |
+ | | | | | | unknown |
+
+ ## Backward Trace Chain
+
+ - Immediate failure site:
+ - Direct caller:
+ - Caller chain:
+ - Bad value origin:
+ - Original trigger:
+ - Why symptom-site fix is rejected:
+
+ ## Reference Comparison
+
+ - Similar working example:
+ - Broken path:
+ - Differences found:
+ - Differences accepted as hypothesis:
+ - Differences ruled out:
+
+ ## Diagnostic Instrumentation Plan
+
+ | Probe tag | Probe location | Question answered | Command to run | Expected signal | Actual signal | Cleanup requirement |
+ | --- | --- | --- | --- | --- | --- | --- |
+ | | | | | | | |
+
+ ## Pattern Analysis
 
- | Hypothesis | Evidence for | Evidence against | Status |
+ | Pattern | Evidence checked | Status | Notes |
  | --- | --- | --- | --- |
- | | | | pending |
+ | race condition | | ruled-out | |
+ | null propagation | | ruled-out | |
+ | state corruption | | ruled-out | |
+ | integration failure | | ruled-out | |
+ | configuration drift | | ruled-out | |
+ | stale cache | | ruled-out | |
+ | resource leak | | ruled-out | |
+ | performance regression | | ruled-out | |
+ | trust boundary drift | | ruled-out | |
+ | timing guess / flaky wait | | ruled-out | |
+
+ ## Candidate Hypotheses
+
+ | Rank | Hypothesis | Why plausible | Prediction | Status |
+ | --- | --- | --- | --- | --- |
+ | 1 | | | | pending |
+
+ ## Research Evidence
+
+ - External research used: `yes` | `no`
+ - Sanitized query:
+ - Source / result:
+ - Applicability:
+ - Accepted into hypothesis: `yes` | `no`
+ - If skipped, reason:
+
+ ## Hypothesis Table
+
+ | Hypothesis | Evidence for | Evidence against | Falsification method | Expected observation | Actual observation | Status |
+ | --- | --- | --- | --- | --- | --- | --- |
+ | | | | | | | pending |
+
+ ## Escalation Decision
+
+ - Failed hypothesis count:
+ - Attempted evidence:
+ - Why current entry is suspect:
+ - Next option: `continue-with-new-hypothesis` | `instrument-and-wait` | `human-review` | `reroute-cc-plan`
+ - Evidence request:
+ - Recommendation:
 
  ## Root Cause
 
  - Confirmed root cause:
+ - Root cause class: `code` | `config` | `environment` | `external` | `timing`
  - Broken contract:
  - Spec diagnosis: `implementation drift` | `missing spec truth` | `roadmap mismatch`
  - Why it escaped:
+ - Why not code root cause:
+ - Monitoring or future evidence needed:
+ - Operator handling after fix:
+ - Prior history relationship: `new` | `recurring` | `same-root-cause` | `architectural-smell-candidate`
+
+ ## Correct Test Seam
+
+ - Test seam:
+ - Public interface exercised:
+ - Why this seam reaches the real trigger chain:
+ - Why a shallower test would be false confidence:
+ - If no correct seam exists:
 
  ## Repair Boundary
 
  - Fix strategy:
+ - Affected module:
+ - Allowed files:
  - Files likely touched:
  - Do not change:
+ - Blast radius file count:
+ - Blast radius risk: `low` | `medium` | `high`
+ - Split / reroute decision if >5 files:
  - Expected spec delta:
  - Verification after fix:
  - Why this can enter `cc-do`:
@@ -51,6 +157,9 @@
  ## Review Gate
 
  - Repro stable:
+ - Feedback loop trustworthy:
+ - Symptom match confirmed:
  - Root cause confirmed:
+ - Correct test seam identified:
  - Repair scope still belongs to this requirement:
  - If not, reroute:
package/.claude/skills/cc-investigate/assets/TASKS_TEMPLATE.md
@@ -15,22 +15,34 @@
  - Canonical change meta: `change-meta.json`
  - Execution mode: `single-path` | `parallel-ready`
  - Confirmed root cause:
+ - Root-cause hypothesis:
+ - Feedback loop:
+ - Symptom match evidence:
  - Frozen repair boundary:
+ - Boundary probes:
+ - Backward trace:
+ - Reference comparison:
+ - Allowed files:
+ - Forbidden files:
+ - Blast radius:
  - Capability specs:
  - Read first:
  - Commands to trust:
  - Do not re-decide:
  - Parallel boundaries:
+ - Correct test seam:
+ - Evidence request if blocked:
 
- ## Phase 1: Reproduce Guard
+ ## Phase 1: Reproduce And Probe Guard
 
  - [ ] T001 [TEST] Capture the failing behavior as a stable reproduction (dependsOn:none) `path/to/test`
- Goal: Turn the bug into a rerunnable failing fact first.
+ Goal: Turn the bug into a failing fact first: fast, precise, rerunnable, and matching the user-reported symptom.
  Files: `path/to/test`
  Read first: `analysis.md`, `tasks.md`
  Verification: `npm test -- path/to/test`
- Evidence: failing output or reproducible log
- Ready when: the reproduction path is stable
+ Evidence: failing output or reproducible log + symptom match evidence
+ Correct seam: test must exercise the real trigger chain through a public interface
+ Ready when: the feedback loop is stable and the analysis records the required boundary / trace / comparison evidence
 
  ## Phase 2: Repair
 
@@ -40,7 +52,7 @@
  Read first: `analysis.md`, `path/to/test`
  Verification: `npm test -- path/to/test`
  Evidence: passing output + checkpoint
- Ready when: T001 has proven the problem exists
+ Ready when: T001 has proven the same user-reported symptom exists and the analysis has proven where the root cause originates
 
  ## Phase 3: Verify
 
package/.claude/skills/cc-investigate/assets/TASK_MANIFEST_TEMPLATE.json
@@ -20,12 +20,149 @@
    ]
  },
  "planningMeta": {
-   "ccInvestigateSkillVersion": "1.0.0",
+   "ccInvestigateSkillVersion": "1.1.6",
    "analysisVersion": "analysis.v1",
    "approvedAt": "2026-04-17T12:00:00.000Z",
    "approvedBy": "user",
    "basedOnRootCause": "Root cause sentence"
  },
+ "investigationMeta": {
+   "symptomStatus": "stable",
+   "reproductionPath": "npm test -- src/feature/feature.test.ts",
+   "feedbackLoop": {
+     "loopType": "failing-test",
+     "commandOrDriver": "npm test -- src/feature/feature.test.ts",
+     "expectedFailingSignal": "The test fails with the user-reported behavior",
+     "actualFailingSignal": "Observed failure output from the current repo",
+     "symptomMatchEvidence": "Failure output matches the reported symptom, not a nearby unrelated failure",
+     "runtime": "under 10s",
+     "determinism": "deterministic",
+     "failureRate": "100%",
+     "signalSpecificity": "asserts the exact broken behavior",
+     "sharpeningPlan": "Narrow setup or assertions if the loop becomes slow or broad",
+     "evidenceRequest": ""
+   },
+   "patternAnalysis": {
+     "selectedPattern": "null propagation",
+     "ruledOutPatterns": [
+       "race condition",
+       "performance regression",
+       "configuration drift",
+       "timing guess / flaky wait"
+     ],
+     "notes": "Pattern evidence belongs in planning/analysis.md"
+   },
+   "boundaryProbes": [
+     {
+       "componentBoundary": "api -> service",
+       "inputObserved": "Request payload matches the reproduced failure",
+       "outputObserved": "Service receives invalid state",
+       "configEnvObserved": "Relevant env/config values recorded in analysis.md",
+       "stateObserved": "State snapshot or log pointer",
+       "verdict": "fail"
+     }
+   ],
+   "backwardTrace": {
+     "immediateFailureSite": "file:line or operation where the symptom appears",
+     "directCaller": "caller that passed the bad value or state",
+     "callerChain": [
+       "entrypoint",
+       "intermediate caller",
+       "failure site"
+     ],
+     "badValueOrigin": "where the invalid data/state first appears",
+     "originalTrigger": "user action, command, event, config, or dependency response that starts the chain",
+     "symptomSiteFixRejectedBecause": "Guarding only the failure site would leave the bad upstream contract intact"
+   },
+   "referenceComparison": {
+     "similarWorkingExample": "path/to/working/example",
+     "brokenPath": "path/to/broken/path",
+     "differencesFound": [
+       "Working path validates input before persistence"
+     ],
+     "differencesAcceptedAsHypothesis": [
+       "Missing validation before persistence"
+     ],
+     "differencesRuledOut": []
+   },
+   "diagnosticInstrumentation": [
+     {
+       "probeTag": "[DEBUG-FIXXXX-a4f2]",
+       "probeLocation": "file:line or component boundary",
+       "questionAnswered": "Which boundary first emits the invalid value?",
+       "commandToRun": "npm test -- src/feature/feature.test.ts",
+       "expectedSignal": "Probe records invalid value before the failure site",
+       "actualSignal": "Observed evidence from the current repo",
+       "cleanupRequirement": "Remove temporary probe or convert it into a durable assertion/log"
+     }
+   ],
+   "candidateHypotheses": [
+     {
+       "rank": 1,
+       "statement": "Specific, testable root-cause claim",
+       "whyPlausible": "Reproduction output points to the affected contract",
+       "prediction": "The failing signal disappears when that contract is restored",
+       "status": "accepted-for-testing"
+     }
+   ],
+   "priorInvestigations": [],
+   "researchEvidence": [],
+   "domainDecisionContext": {
+     "contextFilesRead": [],
+     "adrFilesRead": [],
+     "vocabularyNotes": [],
+     "adrConflicts": []
+   },
+   "rootCauseHypothesis": {
+     "statement": "Specific, testable root-cause claim",
+     "falsificationMethod": "Command, log probe, assertion, or code-path check",
+     "expectedObservation": "What should be observed if the hypothesis is true",
+     "actualObservation": "Observed evidence from the current repo",
+     "status": "confirmed"
+   },
+   "rootCauseClass": "code",
+   "noCodeRootCause": {
+     "whyNotCodeRootCause": "",
+     "monitoringOrFutureEvidenceNeeded": "",
+     "operatorHandlingAfterFix": ""
+   },
+   "hypothesisAttempts": [
+     {
+       "statement": "Specific, testable root-cause claim",
+       "status": "confirmed",
+       "evidenceFor": [
+         "Reproduction output points to the affected code path"
+       ],
+       "evidenceAgainst": [],
+       "falsificationMethod": "Run the reproduction command"
+     }
+   ],
+   "escalationDecision": {
+     "failedHypothesisCount": 0,
+     "nextOption": "cc-do",
+     "recommendation": "Repair the confirmed root cause"
+   },
+   "correctTestSeam": {
+     "testSeam": "public interface or end-to-end path that reaches the real trigger chain",
+     "publicInterfaceExercised": "CLI/API/UI behavior observed by callers",
+     "realTriggerChainCoverage": "The test enters through the same trigger path as the bug",
+     "whyShallowTestRejected": "A lower-level unit test would not prove the upstream contract",
+     "ifNoCorrectSeam": ""
+   },
+   "repairBoundary": {
+     "affectedModule": "src/feature",
+     "allowedFiles": [
+       "src/feature/feature.ts",
+       "src/feature/feature.test.ts"
+     ],
+     "forbiddenFiles": [
+       "unrelated modules"
+     ],
+     "blastRadiusFileCount": 2,
+     "blastRadiusRisk": "low",
+     "splitOrRerouteDecision": "single focused repair"
+   }
+ },
  "status": "planned",
  "designMode": "cc-investigate",
  "approvedOption": "confirmed-root-cause",
@@ -52,6 +189,7 @@
  "activePhase": 1,
  "frozenDecisions": [
    "Fix only the confirmed root cause",
+   "Use planning/analysis.md as the canonical root-cause contract",
    "Do not widen scope without rerouting to cc-plan"
  ],
  "tasks": [
@@ -71,6 +209,8 @@
  ],
  "acceptance": [
    "The target bug is reproduced as a stable failure",
+   "The failing loop matches the user-reported symptom",
+   "The regression test uses the correct seam for the real trigger chain",
    "The failure output points to the confirmed root-cause path"
  ],
  "verification": [
package/.claude/skills/cc-investigate/references/investigation-contract.md
@@ -3,6 +3,7 @@
  ## Iron Law
 
  - No root cause, no bug fix.
+ - No frozen root-cause contract, no repair tasks generated.
 
  ## Minimum Evidence
 
@@ -10,10 +11,25 @@
 
  - symptom
  - reproduction path
+ - feedback loop contract
+ - symptom match evidence
  - expected vs actual
  - code path
  - recent change signal
+ - prior investigation signal
+ - boundary probe matrix, when the failure crosses components
+ - backward trace chain, when the error appears below the original trigger
+ - reference comparison, when a similar working path exists
+ - diagnostic instrumentation plan, when probes are needed
+ - pattern analysis
+ - ranked candidate hypotheses
+ - root-cause hypothesis
+ - falsification method
  - confirmed root cause
+ - correct test seam
+ - root cause class
+ - repair boundary
+ - blast radius
 
  ## Output Shape
 
@@ -21,6 +37,182 @@
  - `planning/tasks.md` is the repair handoff
  - `planning/task-manifest.json` is the execution source of truth
 
+ ## Root-Cause Hypothesis
+
+ Every hypothesis must be falsifiable:
+
+ - `candidateRank`: ranks the candidate hypotheses to avoid anchoring on the first intuition
+ - `hypothesis`: states specifically what is broken and why it produces the symptom
+ - `evidenceFor`
+ - `evidenceAgainst`
+ - `falsificationMethod`
+ - `expectedObservation`
+ - `actualObservation`
+ - `status`: `pending` / `confirmed` / `rejected` / `needs-more-evidence`
+
+ Only `confirmed` hypotheses may enter Root Cause.
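The status rule above can be sketched as a small filter. This is a minimal illustration, not code from cc-devflow: `selectRootCauseCandidates` is a hypothetical helper, the field names are taken from the list above, and treating a confirmed entry without a `falsificationMethod` as ineligible is an assumption drawn from the falsifiability requirement.

```javascript
// Illustrative sketch only: selectRootCauseCandidates is a hypothetical helper,
// not a function shipped by cc-devflow.
const HYPOTHESIS_STATUSES = ["pending", "confirmed", "rejected", "needs-more-evidence"];

function selectRootCauseCandidates(hypotheses) {
  for (const entry of hypotheses) {
    if (!HYPOTHESIS_STATUSES.includes(entry.status)) {
      throw new Error(`unknown hypothesis status: ${entry.status}`);
    }
  }
  // Assumption: a confirmed entry with no falsification method is not yet
  // eligible, because the contract requires every hypothesis to be falsifiable.
  return hypotheses.filter(
    (entry) => entry.status === "confirmed" && Boolean(entry.falsificationMethod)
  );
}

// Example data is invented for illustration.
const candidates = selectRootCauseCandidates([
  {
    candidateRank: 1,
    hypothesis: "Validation is skipped before persistence",
    status: "confirmed",
    falsificationMethod: "Run the reproduction command",
  },
  {
    candidateRank: 2,
    hypothesis: "Cache returns a stale record",
    status: "rejected",
    falsificationMethod: "Disable the cache and rerun",
  },
]);
console.log(candidates.map((entry) => entry.hypothesis));
```

Rejecting unknown status values early keeps a typo such as `comfirmed` from silently dropping a hypothesis out of the Root Cause section.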
+
+ ## Feedback Loop Contract
+
+ An investigation must first build a trustworthy pass/fail loop:
+
+ - `loopType`: failing-test / http-script / cli-fixture / browser-script / trace-replay / throwaway-harness / property-fuzz / bisect / differential / hitl
+ - `commandOrDriver`
+ - `expectedFailingSignal`
+ - `actualFailingSignal`
+ - `symptomMatchEvidence`
+ - `runtime`
+ - `determinism`
+ - `failureRate`
+ - `sharpeningPlan`
+
+ The loop must reproduce the same failure the user reported. If no loop can be built, the only allowed next step is `Evidence Request`; the root cause cannot be frozen.
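As a rough sketch of how the feedback loop contract above might be checked mechanically, the function below validates a `feedbackLoop` object shaped like the one in the manifest template. `checkFeedbackLoop` and its return shape are hypothetical; only the field names and `loopType` values come from the contract.

```javascript
// Illustrative sketch only: checkFeedbackLoop is a hypothetical helper.
const LOOP_TYPES = [
  "failing-test", "http-script", "cli-fixture", "browser-script", "trace-replay",
  "throwaway-harness", "property-fuzz", "bisect", "differential", "hitl",
];
const REQUIRED_LOOP_FIELDS = [
  "commandOrDriver", "expectedFailingSignal", "actualFailingSignal",
  "symptomMatchEvidence", "runtime", "determinism", "failureRate", "sharpeningPlan",
];

function checkFeedbackLoop(loop) {
  // No loop at all: the contract only allows an evidence request.
  if (!loop) {
    return { ok: false, next: "evidence-request", problems: ["no feedback loop"] };
  }
  const problems = [];
  if (!LOOP_TYPES.includes(loop.loopType)) {
    problems.push(`invalid loopType: ${loop.loopType}`);
  }
  for (const field of REQUIRED_LOOP_FIELDS) {
    if (typeof loop[field] !== "string" || loop[field].trim() === "") {
      problems.push(`missing ${field}`);
    }
  }
  return problems.length === 0
    ? { ok: true, next: "continue-investigation", problems }
    : { ok: false, next: "evidence-request", problems };
}

// Values copied from the manifest template in this diff.
console.log(checkFeedbackLoop({
  loopType: "failing-test",
  commandOrDriver: "npm test -- src/feature/feature.test.ts",
  expectedFailingSignal: "The test fails with the user-reported behavior",
  actualFailingSignal: "Observed failure output from the current repo",
  symptomMatchEvidence: "Failure output matches the reported symptom, not a nearby unrelated failure",
  runtime: "under 10s",
  determinism: "deterministic",
  failureRate: "100%",
  sharpeningPlan: "Narrow setup or assertions if the loop becomes slow or broad",
}));
```

A structural check like this cannot prove that the loop reproduces the reported symptom; it only catches a contract that was left blank.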
+
+ ## Pattern Analysis
+
+ An investigation must explicitly select or rule out the common patterns:
+
+ - race condition
+ - null propagation
+ - state corruption
+ - integration failure
+ - configuration drift
+ - stale cache
+ - resource leak
+ - performance regression
+ - trust boundary drift
+ - timing guess / flaky wait
+
+ Pattern analysis is only a retrieval index, not a root cause.
+
+ ## Boundary Probe Matrix
+
+ Multi-component chains must record the facts observed at each boundary:
+
+ - `componentBoundary`
+ - `inputObserved`
+ - `outputObserved`
+ - `configEnvObserved`
+ - `stateObserved`
+ - `verdict`: `pass` / `fail` / `unknown`
+
+ The first failing boundary determines where the next round of investigation narrows; when several boundaries fail at once, trace their shared upstream first.
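The narrowing rule above could look like the following sketch. `nextNarrowingPoint` is a hypothetical helper, and it assumes the probes are listed in order from the entrypoint toward the failure site, which the contract does not state.

```javascript
// Illustrative sketch only: nextNarrowingPoint is a hypothetical helper.
// Assumption: probes are ordered from the entrypoint toward the failure site.
function nextNarrowingPoint(probes) {
  const failing = probes.filter((probe) => probe.verdict === "fail");
  if (failing.length === 0) {
    // Nothing has failed yet, so more boundaries need to be observed.
    return { action: "probe-more", boundary: null };
  }
  if (failing.length === 1) {
    return { action: "narrow-here", boundary: failing[0].componentBoundary };
  }
  // Several boundaries fail: start from the earliest one and look upstream
  // for a cause they share.
  return { action: "trace-shared-upstream", boundary: failing[0].componentBoundary };
}

// Example boundaries are invented for illustration.
console.log(nextNarrowingPoint([
  { componentBoundary: "ui -> api", verdict: "pass" },
  { componentBoundary: "api -> service", verdict: "fail" },
  { componentBoundary: "service -> store", verdict: "unknown" },
]));
```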
+
+ ## Backward Trace Chain
+
+ When the stack is deep or the origin of a bad value is unclear, trace back to the source:
+
+ - immediate failure site
+ - direct caller
+ - caller chain
+ - bad value origin
+ - original trigger
+ - why symptom-site fix is rejected
+
+ If the original trigger cannot be found, the root cause cannot be frozen.
+
+ ## Reference Comparison
+
+ When a similar working implementation exists, record:
+
+ - similar working example
+ - broken path
+ - differences found
+ - differences accepted as hypothesis
+ - differences ruled out
+
+ "Close enough" is not a reason to skip a difference.
+
+ ## Diagnostic Instrumentation
+
+ A temporary probe must answer one explicit question:
+
+ - probe tag
+ - probe location
+ - question answered
+ - command to run
+ - expected signal
+ - actual signal
+ - cleanup requirement
+
+ A probe is not a fix. The handoff must state whether each probe is removed, kept as a permanent log, or converted into a test assertion.
+
+ Debug logs must carry a unique prefix, for example `[DEBUG-FIX123-a4f2]`, so that cleanup can be verified with grep.
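The cleanup check described above is a plain text search, so it can be sketched without touching the filesystem. `findLeftoverProbes` is a hypothetical helper that scans in-memory file contents, and the tag pattern is an assumption inferred from the single example `[DEBUG-FIX123-a4f2]`; the real tag format may be looser.

```javascript
// Illustrative sketch only: findLeftoverProbes is a hypothetical helper.
// Assumption: tags look like [DEBUG-<change id>-<4 hex chars>], inferred from
// the example tag in the contract.
const PROBE_TAG = /\[DEBUG-[A-Za-z0-9]+-[0-9a-f]{4}\]/;

function findLeftoverProbes(files) {
  const hits = [];
  for (const [path, text] of Object.entries(files)) {
    text.split("\n").forEach((line, index) => {
      if (PROBE_TAG.test(line)) {
        // Line numbers are 1-based, matching grep -n output.
        hits.push({ path, line: index + 1 });
      }
    });
  }
  return hits;
}

// File names and contents are invented for illustration.
const leftovers = findLeftoverProbes({
  "src/feature/feature.js": "const x = 1;\nconsole.log('[DEBUG-FIX123-a4f2] x =', x);",
  "src/feature/other.js": "module.exports = {};",
});
console.log(leftovers);
```

An empty result is the signal that every temporary probe was removed or converted before handoff.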
+
+ ## Correct Test Seam
+
+ The repair handoff must record whether the regression test covers the real trigger chain:
+
+ - `testSeam`
+ - `publicInterfaceExercised`
+ - `realTriggerChainCoverage`
+ - `whyShallowTestRejected`
+ - `ifNoCorrectSeam`
+
+ When no correct seam exists, record that as an architectural fact and keep the original feedback loop as the repair verification.
+
+ ## Domain And Decision Context
+
+ Before investigating, read the cc-devflow native context: `devflow/specs/INDEX.md`, the relevant capability specs, the roadmap/backlog handoff, historical `planning/design.md` / `planning/analysis.md`, and `change-meta.json`.
+
+ - Use the project's existing vocabulary for domain concepts, hypothesis names, and test names in the output
+ - If the root cause or repair direction violates a capability spec, roadmap decision, or historical design decision, record the conflict and the reasoning explicitly
+ - Missing domain vocabulary is an investigation signal; do not invent ad hoc synonyms that hide a contract gap
+
+ ## Prior History
+
+ The investigation must record whether it checked:
+
+ - `git log --oneline -20 -- <affected-files>`
+ - historical `planning/analysis.md`
+ - `TODOS.md` / backlog / roadmap
+ - previous `report-card.json` findings
+
+ If bugs recur in the same area, mark it as an architectural smell candidate.
+
+ ## External Research
+
+ External research must be sanitized:
+
+ - Do not search for hosts, IPs, tokens, customer ids, internal paths, SQL, or private repo names
+ - Search only for generic error categories, framework / library names, versions, and component names
+ - A research finding can only serve as a candidate hypothesis and must be verified against this repository
+
+ ## No Code Root Cause
+
+ If the conclusion is not a code root cause, state clearly:
+
+ - `rootCauseClass`: `code` / `config` / `environment` / `external` / `timing`
+ - why not code root cause
+ - monitoring or future evidence needed
+ - operator handling after fix
+
+ Environment, external-service, and timing-window causes still require evidence; insufficient investigation must not be written up as an external cause.
+
+ ## Repair Boundary
+
+ The repair boundary records at least:
+
+ - affected module
+ - allowed files
+ - forbidden files
+ - expected spec delta
+ - verification after fix
+ - blast radius file count
+ - blast radius risk
+
+ When the repair is expected to touch more than 5 files, it must split / justify / reroute.
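The file-count threshold above can be expressed as a small guard. This sketch assumes the `repairBoundary` object from the manifest template; `checkRepairBoundary` is a hypothetical name, and reading an empty `splitOrRerouteDecision` as "no decision recorded" is an assumption.

```javascript
// Illustrative sketch only: checkRepairBoundary is a hypothetical helper.
const MAX_FILES_WITHOUT_DECISION = 5;

function checkRepairBoundary(boundary) {
  const decision = (boundary.splitOrRerouteDecision || "").trim();
  // Assumption: an empty decision string means nothing was recorded.
  if (boundary.blastRadiusFileCount > MAX_FILES_WITHOUT_DECISION && decision === "") {
    return {
      ok: false,
      reason: `repair touches ${boundary.blastRadiusFileCount} files and records no split / justify / reroute decision`,
    };
  }
  return { ok: true, reason: "" };
}

// Values copied from the manifest template in this diff.
console.log(checkRepairBoundary({
  affectedModule: "src/feature",
  blastRadiusFileCount: 2,
  blastRadiusRisk: "low",
  splitOrRerouteDecision: "single focused repair",
}));
```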
+
+ ## Escalation
+
+ After three failed hypotheses, stop guessing. Record:
+
+ - failed hypothesis count
+ - attempted evidence
+ - why current entry is suspect
+ - recommended next option: continue / instrument-and-wait / human-review / reroute-cc-plan
+ - evidence request: repro env / HAR / log dump / core dump / timestamped recording / temporary production instrumentation
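A minimal sketch of the three-strikes rule above, using the `hypothesisAttempts` shape from the manifest template. `escalationStatus` is a hypothetical helper, and counting only attempts whose status is `rejected` as failed is an assumption; the contract does not define "failed" in terms of status values.

```javascript
// Illustrative sketch only: escalationStatus is a hypothetical helper.
const MAX_FAILED_HYPOTHESES = 3;

function escalationStatus(hypothesisAttempts) {
  // Assumption: a "failed" hypothesis is one whose status is "rejected".
  const failedHypothesisCount = hypothesisAttempts.filter(
    (attempt) => attempt.status === "rejected"
  ).length;
  return {
    failedHypothesisCount,
    mustEscalate: failedHypothesisCount >= MAX_FAILED_HYPOTHESES,
  };
}

// Example attempts are invented for illustration.
console.log(escalationStatus([
  { statement: "Stale cache entry", status: "rejected" },
  { statement: "Race between save and reload", status: "rejected" },
  { statement: "Missing validation before persistence", status: "pending" },
]));
```

When `mustEscalate` is true, the remaining fields in the list above (attempted evidence, why the current entry is suspect, next option, evidence request) still have to be written by the investigator; the counter only decides when guessing stops.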
+
  ## Reroute
 
  - Root cause is clear and the repair boundary is well defined -> `cc-do`
package/.claude/skills/cc-plan/CHANGELOG.md
@@ -1,5 +1,31 @@
  # CC-Plan Skill Changelog
 
+ ## v3.7.0 - 2026-04-28
+
+ - add glossary delta capture for canonical terms, aliases to avoid, ambiguities, and relationship constraints during context sweep
+ - require non-trivial public interfaces to compare deliberately different shapes before freezing the final seam
+ - mark vertical slices as `AFK` or `HITL` and require durable design / issue handoffs to describe behavior contracts instead of stale file paths
+
+ ## v3.6.2 - 2026-04-28
+
+ - clarify that canonical language and durable decisions come from cc-devflow native sources: `devflow/specs/`, roadmap/backlog handoff, planning design/analysis, and change metadata
+ - remove external context/architecture-decision files from the standard planning contract so they are not implied as generated artifacts
+ - route long-lived decisions into capability spec deltas, roadmap/backlog decision notes, or the current design decision log
+
+ ## v3.6.1 - 2026-04-28
+
+ - require plans to freeze public test seams, behavior assertions, mock boundaries, and feedback loop types before handing Red tasks to `cc-do`
+ - strengthen TDD planning so Red tasks reject implementation-detail tests, internal collaborator mocks, and fake seams
+ - update design, tiny-design, tasks, and manifest templates with test quality fields inherited from the TDD workflow review
+
+ ## v3.6.0 - 2026-04-28
+
+ - absorb grilling-session discipline into native planning: one decision branch at a time, recommended answer with evidence, and no user questions when repo evidence can answer
+ - require domain language and durable decision scans before naming modules, interfaces, tests, or tasks
+ - add interface/deep-module checks so new public surfaces identify callers, hidden complexity, misuse risk, and alternative shapes before task split
+ - strengthen test-first planning around vertical tracer bullets so tasks do not become horizontal "all tests first, all implementation later" slices
+ - update design, tiny-design, tasks, and manifest templates with language handoff, interface shape, and vertical slice fields
+
  ## v3.5.6 - 2026-04-28
 
  - require non-trivial plans to compare named option roles, including minimal viable and ideal architecture, before freezing a recommendation
package/.claude/skills/cc-plan/PLAYBOOK.md
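@@ -18,14 +18,16 @@
  5. Versions, sources, and frozen decisions must be traceable.
  6. Mechanical decisions are recorded automatically; taste decisions and user challenges must be handed explicitly to the user to decide.
  7. Finish complete boundaries inside the same blast radius first; defer only cross-system expansion or expansion without evidence.
- 8. Concrete execution plans are test-first by default; a plan that lacks a Red/Green/Refactor chain or a TDD exception must not be handed to `cc-do`.
+ 8. Concrete execution plans are test-first by default; a plan that lacks a Red/Green/Refactor chain, a public test seam, behavior assertions, mock boundaries, or a TDD exception must not be handed to `cc-do`.
  9. New change directories must use `REQ-<number>-<description>` or `FIX-<number>-<description>`; old lowercase directories remain read-only compatible and are no longer used for new output.
  10. When the original requirement spans several independent subsystems, split it back into the roadmap / multiple REQ/FIX items first; do not squeeze a grab bag into a single plan.
  11. `tiny-design` still requires approval; it is a short design, not a way to skip design.
  12. Non-trivial solutions must compare at least the `minimal viable` and `ideal architecture` roles; the small option has no built-in priority.
  13. `full-design` must freeze the implementation decision horizon and the error/rescue map so that `cc-do` does not improvise design on the spot.
- 14. Test framework source, coverage quality, and regression tests must be spelled out at planning time, not guessed during execution.
+ 14. Test framework source, coverage quality, test seams, mock boundaries, and regression tests must be spelled out at planning time, not guessed during execution.
  15. UI and developer/operator-facing scopes trigger their gates only when applicable; do not turn every plan into one large review checklist.
+ 16. Align on project language and durable decisions first, then name capabilities, modules, interfaces, tests, and tasks; terminology conflicts must be surfaced explicitly.
+ 17. Behavior changes advance as vertical tracer-bullet slices; tasks must not be cut horizontally into "test layer first, then service layer, then UI layer".
 
  ## Required Outputs
 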
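@@ -63,10 +65,14 @@
  12. `full-design` must include the implementation decision horizon and the error/rescue map; when not applicable, state the N/A reason.
  13. New artifacts, CLIs, packages, containers, and documentation entry points must have distribution and discoverability spelled out at planning time, rather than discovering at `cc-act` that nobody can use them.
  14. Behavior-change tasks must be split into `[TEST] -> [IMPL] -> [REFACTOR]` or declare a TDD exception; "implement and test" must not be merged into a single task.
- 15. Regression tests cannot be deferred. When existing behavior changes and coverage is missing, plan the regression test first.
- 16. UI scope must state a design completeness score and the loading / empty / error / success / partial states.
- 17. Developer/operator-facing scope must state the target persona, time to first value, magic moment, and install / run / debug / upgrade risks.
- 18. The review gate blocks only problems that would cause implementation errors, stalled execution, scope overrun, or missing verification; wording preferences and nice-to-haves are advisory only.
+ 15. Behavior-change tasks must be organized as one tracer-bullet chain per observable behavior; do not write a batch of red tests first and then implement in bulk.
+ 16. Regression tests cannot be deferred. When existing behavior changes and coverage is missing, plan the regression test first.
+ 17. Red tasks must verify behavior at the public interface, not private functions, internal call counts, or temporary data structures.
+ 18. Mocks belong only at system boundaries; if a test has to mock a module you control, the seam or interface design has not been worked out yet.
+ 19. When no correct seam can be found, plan an exploratory spike or a design correction first; a fake red test must not pass for TDD.
+ 17. UI scope must state a design completeness score and the loading / empty / error / success / partial states.
+ 18. Developer/operator-facing scope must state the target persona, time to first value, magic moment, and install / run / debug / upgrade risks.
+ 19. The review gate blocks only problems that would cause implementation errors, stalled execution, scope overrun, or missing verification; wording preferences and nice-to-haves are advisory only.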
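The seam and mock rules in the list above (behavior checked at the public interface, mocks only at system boundaries) might look like the sketch below. Everything in it is hypothetical: `createInviteService`, `createFakeMailer`, and the invite behavior are invented for illustration and are not taken from cc-devflow or its examples.

```javascript
// Illustrative sketch only: every name below is hypothetical.

// Module under test. The public seam is createInvite(); the outgoing mail
// dependency is the system boundary and is injected.
function createInviteService({ mailer }) {
  return {
    createInvite(email) {
      if (!email.includes("@")) {
        return { ok: false, error: "invalid-email" };
      }
      mailer.send({ to: email, subject: "You are invited" });
      return { ok: true };
    },
  };
}

// Fake for the boundary only: it records messages instead of sending them.
function createFakeMailer() {
  const sent = [];
  return {
    sent,
    send(message) {
      sent.push(message);
    },
  };
}

// Behavior check that enters through the public interface and observes only
// caller-visible results plus what crossed the boundary.
function runInviteBehaviorCheck() {
  const mailer = createFakeMailer();
  const service = createInviteService({ mailer });
  const accepted = service.createInvite("dev@example.com");
  const rejected = service.createInvite("not-an-email");
  return { accepted, rejected, deliveredTo: mailer.sent.map((message) => message.to) };
}

console.log(runInviteBehaviorCheck());
```

The check never reaches into private helpers or counts internal calls, so the implementation behind `createInvite` can be refactored without rewriting the test.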
 
  ## Approval Flow
 
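@@ -86,9 +92,15 @@
  - What is the responsibility of each file that will be touched, and why does the change belong in that file rather than in a parallel location?
  - Why does the recommended option beat the other end of the `minimal viable` / `ideal architecture` range?
  - In the foundation / core / integration / polish phases, which decisions are already frozen and which remain blocked questions?
+ - Does the core language follow `devflow/specs/`, the roadmap handoff, or historical design/analysis, and is there a language conflict?
+ - Is each new interface a small interface over a deep module, with complexity hidden behind the right boundary?
  - For each failure path, what are the rescue action, the user-visible result, and the test evidence?
  - For each new code path / user flow / error path, what is the first failing test?
+ - Through which public seam does the first failing test enter the system, and which observable behavior does it assert?
+ - Which dependencies may be mocked, and which internal collaborators must not be mocked?
+ - Is the feedback loop an automated test, HTTP, CLI, browser, trace replay, harness, property/fuzz, differential, or HITL, and why is it the shortest trustworthy loop available right now?
  - What is the test framework source, and is existing coverage strong, happy-path-only, smoke-only, or missing?
+ - Are tasks organized as end-to-end tracer bullets rather than split horizontally by layer?
  - Which production failure modes are already handled, and which are deferred to the backlog?
 
  ## Design Mode Switch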