cc-devflow 4.5.13 → 4.5.15

Files changed (52)
  1. package/.claude/skills/cc-act/SKILL.md +2 -2
  2. package/.claude/skills/cc-check/CHANGELOG.md +6 -0
  3. package/.claude/skills/cc-check/PLAYBOOK.md +18 -1
  4. package/.claude/skills/cc-check/SKILL.md +59 -3
  5. package/.claude/skills/cc-check/references/gate-contract.md +34 -0
  6. package/.claude/skills/cc-check/references/review-contract.md +11 -0
  7. package/.claude/skills/cc-dev/SKILL.md +1 -1
  8. package/.claude/skills/cc-dev/scripts/resolve-cc-devflow.sh +8 -26
  9. package/.claude/skills/cc-do/CHANGELOG.md +6 -0
  10. package/.claude/skills/cc-do/PLAYBOOK.md +23 -6
  11. package/.claude/skills/cc-do/SKILL.md +20 -4
  12. package/.claude/skills/cc-do/references/execution-recovery.md +15 -3
  13. package/.claude/skills/cc-investigate/CHANGELOG.md +6 -0
  14. package/.claude/skills/cc-investigate/PLAYBOOK.md +24 -0
  15. package/.claude/skills/cc-investigate/SKILL.md +39 -3
  16. package/.claude/skills/cc-investigate/assets/TASKS_TEMPLATE.md +68 -1
  17. package/.claude/skills/cc-investigate/references/investigation-contract.md +32 -0
  18. package/.claude/skills/cc-plan/CHANGELOG.md +14 -0
  19. package/.claude/skills/cc-plan/PLAYBOOK.md +21 -1
  20. package/.claude/skills/cc-plan/SKILL.md +77 -10
  21. package/.claude/skills/cc-plan/assets/TASKS_TEMPLATE.md +61 -3
  22. package/.claude/skills/cc-plan/references/planning-contract.md +28 -4
  23. package/.claude/skills/cc-review/CHANGELOG.md +6 -0
  24. package/.claude/skills/cc-review/PLAYBOOK.md +9 -3
  25. package/.claude/skills/cc-review/SKILL.md +52 -1
  26. package/.claude/skills/cc-review/references/implementation-review-branch.md +106 -6
  27. package/.claude/skills/cc-review/references/plan-review-branch.md +109 -4
  28. package/.claude/skills/cc-review/references/review-methods.md +162 -9
  29. package/.claude/skills/cc-spec-init/PLAYBOOK.md +1 -1
  30. package/.claude/skills/cc-spec-init/SKILL.md +1 -1
  31. package/CHANGELOG.md +22 -0
  32. package/README.md +16 -18
  33. package/README.zh-CN.md +15 -17
  34. package/bin/cc-devflow-cli.js +8 -94
  35. package/docs/examples/example-bindings.json +5 -5
  36. package/docs/examples/full-design-blocked/README.md +1 -1
  37. package/docs/examples/full-design-blocked/changes/REQ-002-bulk-invite-import/task.md +17 -5
  38. package/docs/examples/local-handoff/README.md +1 -1
  39. package/docs/examples/local-handoff/changes/REQ-003-audit-log-export/task.md +17 -5
  40. package/docs/examples/pdca-loop/README.md +1 -1
  41. package/docs/examples/pdca-loop/changes/REQ-001-copy-invite-link/task.md +17 -5
  42. package/docs/guides/artifact-contract.md +1 -1
  43. package/docs/guides/getting-started.md +3 -3
  44. package/docs/guides/getting-started.zh-CN.md +3 -3
  45. package/docs/guides/minimize-artifacts.md +1 -7
  46. package/lib/skill-runtime/CLAUDE.md +1 -1
  47. package/lib/skill-runtime/index.js +1 -9
  48. package/package.json +1 -1
  49. package/lib/skill-runtime/errors.js +0 -39
  50. package/lib/skill-runtime/query-registry.js +0 -101
  51. package/lib/skill-runtime/query.js +0 -126
  52. package/lib/skill-runtime/trace.js +0 -22
package/.claude/skills/cc-review/references/implementation-review-branch.md CHANGED
@@ -1,12 +1,112 @@
1
1
  # Implementation Review Branch
2
2
 
3
- Read:
3
+ Use this reference when the review target is code, tests, docs, UI behavior, or a current branch diff.
4
4
 
5
- 1. current Git diff
6
- 2. `task.md`
7
- 3. changed code and tests
8
- 4. fresh command output when available
5
+ ## Intake
9
6
 
10
- Review behavior, regression risk, security, reliability, test quality, and code smells inside the current blast radius.
7
+ Read, in order:
8
+
9
+ 1. current branch and base branch
10
+ 2. `git diff <base>...HEAD --stat`
11
+ 3. full diff for changed files
12
+ 4. `task.md`
13
+ 5. changed code plus direct importers/callers for enum, state, API, and behavior changes
14
+ 6. fresh command output when available
15
+
16
+ If no plan exists, infer intent from user request, commits, TODOs, and PR body if present. Mark intent confidence.
17
+
18
+ ## Scope Check
19
+
20
+ Produce this in scratch reasoning before findings:
21
+
22
+ ```text
23
+ Scope Check: CLEAN | DRIFT DETECTED | REQUIREMENTS MISSING
24
+ Intent: ...
25
+ Delivered: ...
26
+ Diff surface: ...
27
+ ```
28
+
29
+ Out-of-scope files are findings only when they change behavior or expand blast radius.
30
+
31
+ ## Diff Review Passes
32
+
33
+ Turn these passes into review nodes before reporting findings. Every changed file, public behavior, test surface, documentation surface, and UI/runtime flow belongs to a node or has a skip reason.
34
+
35
+ For broad or PR-landing diffs, use the risk-lane profile from `review-methods.md` before final findings:
36
+
37
+ 1. Intent and regression
38
+ 2. Security and privacy
39
+ 3. Performance and reliability
40
+ 4. Contracts and coverage
41
+
42
+ ### Contract Fidelity
43
+
44
+ Check whether implementation matches `task.md` or investigation:
45
+
46
+ - required tasks done
47
+ - rejected scope not implemented
48
+ - root cause still true
49
+ - expected spec delta honored
50
+ - behavior visible at public seam
51
+
52
+ ### Code Smell Scan
53
+
54
+ Use `review-methods.md` smell taxonomy.
55
+
56
+ Look for:
57
+
58
+ - copy-paste helper logic
59
+ - broad catch-all errors
60
+ - parameter clumps
61
+ - shallow pass-through modules
62
+ - internal mocks driving production design
63
+ - new branch forests where a data shape would collapse cases
64
+ - hidden state or multiple truth sources
65
+ - cycles between modules
66
+
67
+ ### Structural Risk
68
+
69
+ Check:
70
+
71
+ - security and trust boundaries
72
+ - enum/value completeness outside the diff
73
+ - migrations and rollback
74
+ - concurrency and double-submit
75
+ - external service failures
76
+ - logs/metrics for new paths
77
+
78
+ ### Test Quality
79
+
80
+ Build a coverage map:
81
+
82
+ ```text
83
+ CODE PATHS                  USER/RUNTIME FLOWS
84
+ file.ts                     feature flow
85
+ ├── [tested] happy          ├── [tested] main path
86
+ ├── [gap] empty             ├── [gap] double action
87
+ └── [gap] upstream error    └── [gap] navigate away / timeout
88
+ ```
89
+
90
+ Flag:
91
+
92
+ - no regression test for changed behavior
93
+ - tests only assert implementation shape
94
+ - tests mock internal modules instead of public seam
95
+ - fixture lies with missing fields or type casts
96
+ - no UI/E2E proof for user-visible change
97
+
98
+ ### Documentation and DX
99
+
100
+ If changed behavior affects README, guides, CLI help, package install, public API, agent skill usage, or examples, check whether docs changed too.
101
+
102
+ ## Fix Policy
11
103
 
12
104
  Findings stay in the response. Ask which repair option to apply before editing code. Do not write process files.
105
+
106
+ Return:
107
+
108
+ - findings ordered by severity
109
+ - smallest safe repair option
110
+ - broader cleanup option when the smell is real
111
+ - defer option with explicit risk
112
+ - recommendation and route
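Reviewer note: the Scope Check block added in this hunk could be produced mechanically along the following lines. This is a minimal sketch and not part of cc-devflow. The `scope_check` helper, its inputs (`intent`, `planned_paths`, `changed_paths`), and the rule that any unplanned path counts as drift are assumptions made for illustration; the reference itself treats out-of-scope files as findings only when they change behavior or expand blast radius, which still needs human judgment.

```python
from typing import Iterable, Optional


def scope_check(intent: Optional[str],
                planned_paths: Iterable[str],
                changed_paths: Iterable[str]) -> str:
    """Render the three-state Scope Check block (hypothetical helper)."""
    planned = set(planned_paths)
    changed = sorted(set(changed_paths))
    # Paths touched by the diff that the plan never mentioned.
    unplanned = [path for path in changed if path not in planned]

    if not intent or not planned:
        # No stated intent or no planned surface: nothing to compare against.
        status = "REQUIREMENTS MISSING"
    elif unplanned:
        # Assumption: every unplanned path is flagged for a closer look.
        status = "DRIFT DETECTED"
    else:
        status = "CLEAN"

    return "\n".join([
        f"Scope Check: {status}",
        f"Intent: {intent or 'unknown'}",
        f"Delivered: {len(changed)} changed file(s)",
        f"Diff surface: {', '.join(changed) or 'none'}",
    ])


if __name__ == "__main__":
    print(scope_check("copy invite link",
                      ["src/invite.ts"],
                      ["src/invite.ts", "src/auth.ts"]))
```

The changed paths could be collected with `git diff <base>...HEAD --name-only`; the planned paths would have to be read from `task.md` by hand or by a parser of your own.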
package/.claude/skills/cc-review/references/plan-review-branch.md CHANGED
@@ -1,9 +1,114 @@
1
1
  # Plan Review Branch
2
2
 
3
- Read:
3
+ Use this reference when the review target is a plan, investigation handoff, or mixed branch whose plan contract may be wrong.
4
+
5
+ ## Intake
6
+
7
+ Read, in order:
4
8
 
5
9
  1. `task.md`
6
- 2. relevant roadmap or issue text
7
- 3. affected code/tests/docs
10
+ 2. relevant roadmap, issue, PR text, or user request
11
+ 3. affected code/tests/docs referenced by the plan
12
+ 4. existing command output only when it proves or disproves a planning assumption
13
+
14
+ If no `task.md` exists, review the user-provided plan text and make missing durable task contract a finding.
15
+
16
+ ## Review Facets
17
+
18
+ Select only applicable facets, but do not skip a selected facet to keep the answer short.
19
+
20
+ ### Strategy
21
+
22
+ Check:
23
+
24
+ - Is this the right problem?
25
+ - Is the stated user/business outcome direct or only a proxy?
26
+ - What happens if we do nothing?
27
+ - What does the 12-month ideal look like?
28
+ - What existing code or workflow already solves part of this?
29
+
30
+ Useful shape:
31
+
32
+ ```text
33
+ CURRENT -> THIS PLAN -> 12-MONTH IDEAL
34
+ ```
35
+
36
+ ### Engineering
37
+
38
+ Check:
39
+
40
+ - component boundaries
41
+ - data flow and shadow paths
42
+ - state transitions
43
+ - security boundaries
44
+ - rollback shape
45
+ - testability seam
46
+ - parallelization risk
47
+
48
+ For non-trivial plans, reason through:
49
+
50
+ ```text
51
+ Entry -> validate -> transform -> persist -> output
52
+ |        |           |            |          |
53
+ nil      invalid     exception    conflict   stale
54
+ empty    wrong type  timeout      duplicate  partial
55
+ ```
56
+
57
+ ### Design
58
+
59
+ Run only for user-facing UI or interaction flows.
60
+
61
+ Check:
62
+
63
+ - first, second, third thing the user sees
64
+ - loading / empty / error / success / partial states
65
+ - responsive and accessibility intent
66
+ - generic UI or AI slop risk
67
+ - whether live design review will be needed after implementation
68
+
69
+ ### DX / Operator
70
+
71
+ Run only for API, CLI, SDK, package, docs, agent skill, MCP, or developer/operator surfaces.
72
+
73
+ Check:
74
+
75
+ - target developer/operator persona
76
+ - time to first value
77
+ - install/run/debug/upgrade path
78
+ - actionable errors: problem + cause + fix
79
+ - copy-paste examples and escape hatches
80
+
81
+ ### TOC Root Cause
82
+
83
+ For complex bugs:
84
+
85
+ 1. Current reality tree: symptoms, causes, enabling conditions.
86
+ 2. Conflict diagram: why the obvious fix conflicts with a real need.
87
+ 3. Future reality tree: what the proposed fix changes and what it may break.
88
+
89
+ If the root cause is not proven, reroute to `cc-investigate`, not `cc-do`.
90
+
91
+ ## Planning Smells
92
+
93
+ Plans can contain smells before code exists:
94
+
95
+ - repeated implementation steps with slight variations
96
+ - parallel data sources
97
+ - task split by technical layer instead of behavior
98
+ - fake abstraction or one-adapter seam
99
+ - missing owner for shared state
100
+ - hand-wavy "handle edge cases" or "add validation"
101
+
102
+ Each planning smell becomes a finding in `task.md` and routes to `cc-plan`.
103
+
104
+ ## Output
105
+
106
+ Write plan review findings directly into `task.md`:
107
+
108
+ - scope or architecture finding
109
+ - evidence and impact
110
+ - required task or contract change
111
+ - decision options when user judgment is needed
112
+ - reroute recommendation
8
113
 
9
- Find scope, architecture, test-strategy, and ambiguity problems. Write findings into `task.md`; final response only summarizes the changed sections. Do not write separate files.
114
+ Final response only summarizes changed `task.md` sections and next route. Do not write separate files.
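Reviewer note: the shadow-path diagram in the Engineering facet reads naturally as a checklist of stage and failure-mode pairs. The sketch below shows one way to list the pairs a plan has not addressed yet. It assumes each column of the diagram belongs to the stage above it (for example `nil` and `empty` under `Entry`); the `shadow_path_gaps` helper and the `handled` set are hypothetical names, not anything shipped in the package.

```python
from typing import List, Set, Tuple

# Stage order and failure modes as read from the diagram (column mapping is an assumption).
STAGES = ["Entry", "validate", "transform", "persist", "output"]
SHADOW_PATHS = {
    "Entry": ["nil", "empty"],
    "validate": ["invalid", "wrong type"],
    "transform": ["exception", "timeout"],
    "persist": ["conflict", "duplicate"],
    "output": ["stale", "partial"],
}


def shadow_path_gaps(handled: Set[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (stage, failure) pairs the plan does not cover, in pipeline order."""
    return [
        (stage, failure)
        for stage in STAGES
        for failure in SHADOW_PATHS[stage]
        if (stage, failure) not in handled
    ]


if __name__ == "__main__":
    covered = {("Entry", "nil"), ("persist", "duplicate")}
    for stage, failure in shadow_path_gaps(covered):
        print(f"[gap] {stage}: {failure}")
```

Each remaining pair is a candidate question for the plan author rather than an automatic finding.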
package/.claude/skills/cc-review/references/review-methods.md CHANGED
@@ -1,13 +1,166 @@
1
1
  # Review Methods
2
2
 
3
- Pick only the methods that match the current risk:
3
+ Use this reference for every `cc-review` run. It defines the method library. Load branch-specific references for concrete workflow steps.
4
4
 
5
- - diff review
6
- - test-quality review
7
- - security review
8
- - performance review
9
- - API contract review
10
- - UI/browser review
11
- - documentation/PR body review
5
+ ## Method Selection
12
6
 
13
- Each finding needs evidence, impact, recommendation, and route. Do not write process files.
7
+ Pick every method needed by the current risk. This is a routing map, not a finding cap:
8
+
9
+ | Risk | Method |
10
+ | --- | --- |
11
+ | unclear goal | goal tree |
12
+ | repeated symptom | current reality tree |
13
+ | hidden tradeoff | conflict diagram |
14
+ | uncertain fix impact | future reality tree |
15
+ | implementation complexity | logic tree and smell scan |
16
+ | UI/runtime mismatch | E2E/plugin verification |
17
+ | code quality or simplification risk | cc-simplify reference plus smell scan |
18
+ | broad implementation diff | risk-lane review swarm profile |
19
+
20
+ Selected methods stay in scratch reasoning and final response/task updates. Do not write process files.
21
+
22
+ ## Review Nodes
23
+
24
+ Before findings, mentally create ordered review nodes:
25
+
26
+ ```text
27
+ R001 plan.strategy.outcome
28
+ target: task.md
29
+ method: goal tree
30
+ check: outcome and scope consistency
31
+
32
+ R101 implementation.contract.public-seam
33
+ target: changed code + tests
34
+ method: contract fidelity
35
+ check: public behavior matches task.md
36
+ ```
37
+
38
+ Node rules:
39
+
40
+ - one node reviews one coherent question, artifact, or changed surface
41
+ - every selected method creates at least one node
42
+ - every changed file or user-facing surface is assigned to a node or explicitly skipped
43
+ - every node ends as `checked`, `skipped`, or `blocked`
44
+ - no finding limit exists while nodes remain unchecked
45
+ - prior clean conclusions can be reused only when Git proves the target and dependencies did not change
46
+
47
+ ## Risk-Lane Review Swarm Profile
48
+
49
+ Use this profile when a broad implementation diff, PR landing review, or mixed review benefits from independent context. The profile is a default decomposition, not a requirement to manufacture findings.
50
+
51
+ | Lane | Reviewer question |
52
+ | --- | --- |
53
+ | intent-regression | Does the diff match the intended behavior without extra behavior drift, broken edge cases, fallback loss, or caller/callee contract drift? |
54
+ | security-privacy | Did the diff weaken auth, validation, secret handling, sensitive data boundaries, defaults, or trust of external input? |
55
+ | performance-reliability | Did the diff add duplicate work, hot-path cost, missing cleanup, retry storms, ordering races, or brittle failure handling? |
56
+ | contracts-coverage | Did the diff miss API/schema/type/config/flag alignment, migration fallout, regression tests, logs, metrics, assertions, or error paths? |
57
+
58
+ Small diffs may use one combined reviewer that covers all lanes. Large or multi-surface diffs should assign separate reviewers for the highest-risk lanes when the host supports subagents.
59
+
60
+ ## Aggregation
61
+
62
+ The main thread owns aggregation:
63
+
64
+ - merge duplicate findings under the clearest evidence
65
+ - reject style preferences, nits, and speculative concerns with no concrete impact
66
+ - downgrade low-confidence notes unless they point to critical impact
67
+ - convert intent-unclear claims into decision questions instead of findings
68
+ - order final findings by severity, confidence, and current-scope impact
69
+
70
+ Subagent output is evidence input, not verdict.
71
+
72
+ ## Thinking Tools
73
+
74
+ ### Goal Tree
75
+
76
+ Use when the plan has too many proposed actions and not enough outcome clarity.
77
+
78
+ ```text
79
+ GOAL
80
+ ├── necessary condition A
81
+ │ ├── measurable signal
82
+ │ └── blocked by
83
+ ├── necessary condition B
84
+ └── NOT IN SCOPE
85
+ ```
86
+
87
+ ### Current Reality Tree
88
+
89
+ Use for bugs and recurring failures.
90
+
91
+ ```text
92
+ SYMPTOM
93
+ ├── direct cause
94
+ │ └── deeper cause
95
+ ├── enabling condition
96
+ └── missing control
97
+ ```
98
+
99
+ ### Conflict Diagram
100
+
101
+ Use when two requirements appear incompatible.
102
+
103
+ ```text
104
+ Objective
105
+ ├── Need A -> Want X
106
+ └── Need B -> Want not-X
107
+ Assumption to break: ...
108
+ ```
109
+
110
+ ### Future Reality Tree
111
+
112
+ Use before recommending a non-trivial redesign.
113
+
114
+ ```text
115
+ CHANGE
116
+ ├── desired effect
117
+ ├── possible negative branch
118
+ │ └── prevention
119
+ └── verification signal
120
+ ```
121
+
122
+ ### Logic Tree
123
+
124
+ Use for implementation reviews.
125
+
126
+ ```text
127
+ Entry point
128
+ ├── path A
129
+ │ ├── happy
130
+ │ ├── empty
131
+ │ └── error
132
+ └── path B
133
+ ```
134
+
135
+ ## Code Smell Taxonomy
136
+
137
+ Only report smells inside the current requirement blast radius or smells made worse by the current work.
138
+
139
+ | Smell | Review question | Preferred fix shape |
140
+ | --- | --- | --- |
141
+ | rigidity | Does a small change force unrelated edits? | move decision to one owner |
142
+ | duplication | Is the same logic repeated with small variations? | reuse existing helper or make one narrow helper |
143
+ | cycle | Do modules know each other's internals? | invert dependency or extract boundary |
144
+ | fragility | Can one change break unrelated behavior? | isolate side effects and add focused tests |
145
+ | obscurity | Is intent hidden behind clever names or control flow? | rename, split, or make data shape explicit |
146
+ | data-clump | Do fields always travel together? | group them into one object/value |
147
+ | unnecessary-complexity | Is abstraction solving a hypothetical future? | delete seam or collapse to direct code |
148
+
149
+ ## Severity
150
+
151
+ - `critical`: ships wrong behavior, data/security risk, silent failure, broken root cause, or impossible verification.
152
+ - `important`: likely maintenance, test, UX, DX, performance, or operability problem in current scope.
153
+ - `advisory`: good improvement but not required for this change.
154
+
155
+ ## Confidence
156
+
157
+ - `9-10`: directly verified in code, artifact, command output, UI run, or log.
158
+ - `7-8`: strong evidence from nearby patterns and diff.
159
+ - `5-6`: plausible but needs confirmation; mark as verify-first.
160
+ - `<5`: do not put in main findings unless critical impact.
161
+
162
+ ## Decision Questions
163
+
164
+ Ask only when a finding requires user judgment. Do not stop the whole review at the first decision unless that answer blocks the next review node.
165
+
166
+ Plan decisions are written to `task.md`. Implementation repair choices stay in the response.
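Reviewer note: the Aggregation, Severity, and Confidence rules in this hunk combine into a small triage step. The sketch below is one possible reading, under stated assumptions: the `Finding` type and its `key` field are hypothetical, "clearest evidence" is approximated by the highest confidence score, and the "current-scope impact" ordering criterion is left out because the reference does not define how to measure it.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Lower rank sorts first; the three labels come from the Severity section.
SEVERITY_RANK = {"critical": 0, "important": 1, "advisory": 2}


@dataclass(frozen=True)
class Finding:
    key: str         # hypothetical dedup key, e.g. "file:symbol:issue"
    severity: str    # "critical" | "important" | "advisory"
    confidence: int  # 1-10, as in the Confidence section
    summary: str


def aggregate(findings: Iterable[Finding]) -> List[Finding]:
    """Merge duplicates, drop weak non-critical notes, order by severity then confidence."""
    best: Dict[str, Finding] = {}
    for finding in findings:
        # Confidence below 5 stays out of main findings unless the impact is critical.
        if finding.confidence < 5 and finding.severity != "critical":
            continue
        kept = best.get(finding.key)
        # Assumption: the duplicate with the highest confidence has the clearest evidence.
        if kept is None or finding.confidence > kept.confidence:
            best[finding.key] = finding
    return sorted(
        best.values(),
        key=lambda f: (SEVERITY_RANK[f.severity], -f.confidence),
    )


if __name__ == "__main__":
    raw = [
        Finding("auth:token", "important", 8, "token logged in debug path"),
        Finding("auth:token", "important", 9, "token logged in debug path (log verified)"),
        Finding("ui:spacing", "advisory", 3, "possible spacing drift"),
        Finding("db:migrate", "critical", 4, "rollback may drop a column"),
    ]
    for f in aggregate(raw):
        print(f.severity, f.confidence, f.summary)
```

Subagent output would feed `findings` as evidence input; the main thread still owns the final verdict, as the reference states.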
package/.claude/skills/cc-spec-init/PLAYBOOK.md CHANGED
@@ -14,6 +14,6 @@ Specs record current capability truth. Roadmap records future work. Git records
14
14
 
15
15
  ## Do Not
16
16
 
17
- - Do not create change-scoped JSON.
17
+ - Do not create extra change-scoped files.
18
18
  - Do not store workflow state in specs.
19
19
  - Do not turn one requirement's implementation detail into capability truth.
package/.claude/skills/cc-spec-init/SKILL.md CHANGED
@@ -28,7 +28,7 @@ Allowed outputs:
28
28
  - `devflow/specs/INDEX.md`
29
29
  - `devflow/specs/capabilities/<capability>.md`
30
30
 
31
- Do not create change-scoped JSON. Changes link to specs through `task.md`, roadmap text, PR text, and Git commits.
31
+ Changes link to specs through `task.md`, roadmap text, PR text, and Git commits.
32
32
 
33
33
  ## Use This Skill When
34
34
 
package/CHANGELOG.md CHANGED
@@ -9,6 +9,28 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
9
9
 
10
10
  ## [Unreleased]
11
11
 
12
+ ## [4.5.15] - 2026-05-14
13
+
14
+ ### Added
15
+
16
+ - Added Product / Creative Discovery and Second-Move Review gates to `cc-plan` so non-trivial plans confirm product value and shape before engineering details.
17
+ - Updated `cc-plan` task templates, planning contract, playbook, and examples with durable slots for product shape, narrowest wedge, better-version thinking, and question-quality review.
18
+
19
+ ### Removed
20
+
21
+ - Removed the `cc-devflow query` runtime surface and the `workflow-context` query.
22
+ - Removed workflow-context stage-transition requirements from distributed skills; stages now start from `task.md`, Git, and PR or handoff reality.
23
+
24
+ ## [4.5.14] - 2026-05-14
25
+
26
+ ### Changed
27
+
28
+ - Removed legacy process JSON and report-card assumptions from the public workflow contracts.
29
+ - Restored the artifact-light `cc-plan` planning dialogue flow inside `task.md` contracts.
30
+ - Restored `cc-investigate` root-cause proof, hypothesis, reroute, and feedback-loop guidance without bringing back separate analysis artifacts.
31
+ - Restored `cc-do` TDD execution discipline and `cc-check` fresh-evidence verification flow while keeping durable truth limited to `task.md`, Git, PR briefs, and postmortems.
32
+ - Restored `cc-review` node-by-node review, risk lanes, finding aggregation, and decision-question flow while routing plan findings into `task.md` and implementation findings through user-selected repair options.
33
+
12
34
  ## [4.5.13] - 2026-05-13
13
35
 
14
36
  ### Changed
package/README.md CHANGED
@@ -95,15 +95,15 @@ flowchart TD
95
95
  | --- | --- | --- |
96
96
  | `cc-roadmap` | You need product direction, staged scope, or backlog order | `devflow/roadmap.json`, `devflow/ROADMAP.md`, deprecated `devflow/BACKLOG.md` |
97
97
  | `cc-next` | You need to pick the next roadmap-aware ready target from roadmap, unarchived local changes, and issue truth | one Goal Packet for `cc-dev` |
98
- | `cc-dev` | A selected objective should be driven in the current worktree to a remote PR | PDCA/IDCA artifacts plus a PR or handoff |
99
- | `cc-plan` | A feature or change needs scope, design, and task freezing | `planning/tasks.md#Contract Summary`, `task-manifest.json`, `change-meta.json` |
100
- | `cc-investigate` | A bug needs symptom, reproduction, root cause, and repair boundary | `planning/tasks.md#Root Cause Contract`, `task-manifest.json`, `change-meta.json` |
101
- | `cc-do` | Planned or investigated work needs implementation | code, tests, task state, scratch runtime |
102
- | `cc-review` | Complex plans, investigations, or diffs need optional deep multi-round review before implementation or verification | `review-ledger.jsonl`, optional `review-findings.json`, optional rendered Markdown |
98
+ | `cc-dev` | A selected objective should be driven in the current worktree to a remote PR | `task.md`, Git commits, and a PR or handoff |
99
+ | `cc-plan` | A feature or change needs scope, design, and task freezing | `task.md#Contract Summary` |
100
+ | `cc-investigate` | A bug needs symptom, reproduction, root cause, and repair boundary | `task.md#Root Cause Contract` |
101
+ | `cc-do` | Planned or investigated work needs implementation | code, tests, `task.md` status, Git commit |
102
+ | `cc-review` | Complex plans, investigations, or diffs need optional deep review before implementation or verification | plan findings in `task.md`; implementation findings and repair options in the response |
103
103
  | `cc-pr-review` | A remote PR needs an independent review session before landing | PR review packet, findings, and landing verdict |
104
104
  | `cc-pr-land` | Reviewed PRs need rebase-first landing into main with parity proof | integrated main plus local/remote parity evidence |
105
- | `cc-check` | Work needs fresh verification evidence | `report-card.json` |
106
- | `cc-act` | Verified work needs a PR, local handoff, release note, or closeout | one final handoff file |
105
+ | `cc-check` | Work needs fresh verification evidence | pass/fail/blocked response and Git commit |
106
+ | `cc-act` | Verified work needs a PR, local handoff, or closeout | optional `handoff/pr-brief.md`, Git/PR truth, or incident postmortem |
107
107
 
108
108
  Maintenance skills are shipped with the pack:
109
109
 
@@ -114,19 +114,19 @@ Maintenance skills are shipped with the pack:
114
114
 
115
115
  `cc-roadmap` now records planning posture, evidence maturity, canonical project language, and durable decision context before recommending a route. That keeps idea-stage, active-user, paying-customer, infrastructure, and recovery work from being forced through the same questions, and prevents roadmap items from inventing a second vocabulary. Developer-facing or operator-facing roadmap items also carry target user, time to first value, magic moment, adoption bottleneck, and domain handoff into `cc-plan`.
116
116
 
117
- Canonical language and durable decisions stay inside cc-devflow-native sources: `devflow/specs/`, `devflow/roadmap.json`, `devflow/ROADMAP.md`, `planning/tasks.md`, and `change-meta.json`. Legacy `planning/design.md` and `planning/analysis.md` remain readable fallback inputs for older changes.
117
+ Canonical language and durable decisions stay inside cc-devflow-native sources: `devflow/specs/`, `devflow/roadmap.json`, `devflow/ROADMAP.md`, `task.md`, Git history, and PR truth. Legacy planning artifacts are readable fallback inputs only.
118
118
 
119
119
  `cc-plan` freezes more implementation decisions before `cc-do` starts. Non-trivial plans compare minimal viable and ideal architecture options, full designs include decision horizon plus error/rescue mapping, and test-first plans record test framework evidence, public test seams, spec-style test names, public verification paths, behavior assertions, mock boundaries, coverage quality, mandatory regression tests, interface depth, Green minimality guards, refactor candidates, and vertical tracer-bullet slices when existing behavior changes. Before handoff, `cc-plan` and `cc-investigate` also reconcile the source roadmap item so RM status, REQ/FIX binding, progress, and spec diagnosis do not drift from the frozen change artifacts.
120
120
 
121
- Every post-planning stage can start from `cc-devflow query workflow-context --change <id> --change-key <key> --data-only --no-trace --compact`. Treat the result as a context index, not semantic compression: it routes the next stage, names the current task, carries source hashes, `mustNotForget` constraints, default section/JSON refs, trusted commands, fail-closed rules, and machine-readable deep-open conditions. Source artifacts still decide disputed facts. This primarily reduces stage-routing and context-reset reads; end-to-end PDCA/IDCA savings depend on how often agents open `defaultOpen` and `deepOpen` refs. Use `npm run benchmark:workflow-context` to inspect token estimates plus routing correctness over the checked-in and synthetic examples. Use `npm run benchmark:skills` to keep public skill entrypoints thin; deeper planning rules should live behind conditional references instead of default context.
121
+ Every post-planning stage starts from `task.md`, current Git history/status, and PR or handoff truth when present. There is no runtime context query layer; disputed facts must be re-read from source artifacts. Use `npm run benchmark:skills` to keep public skill entrypoints thin; deeper planning rules should live behind conditional references instead of default context.
122
122
 
123
- `cc-review` is optional and deeper than `cc-check`. It can run immediately after `cc-plan` / `cc-investigate` to review the frozen plan or root-cause contract, or after `cc-do` to review the implementation. It reads prior review records and current git/artifact delta, then records review lifecycle events through `cc-devflow review start`, `record-node`, `add-finding`, and `close` into `review-ledger.jsonl`. Human Markdown reports are rendered on demand with `cc-devflow review render`. When the host supports subagents, selected nodes can be dispatched to independent read-only reviewers so strategy, engineering, design, DX, smell, test, and runtime checks do not share one contaminated context. Broad implementation reviews can use separate risk lanes for intent/regression, security/privacy, performance/reliability, and contracts/coverage before the main thread triages raw findings. Plan reviews borrow strategy/design/engineering/DX methods through progressive references, while implementation reviews inspect diff scope, code smells, tests, UI/runtime behavior, Browser/Computer Use evidence, and logs when applicable. Findings route back to `cc-plan` or `cc-do`; clean implementation reviews continue to `cc-check`.
123
+ `cc-review` is optional and deeper than `cc-check`. It can run immediately after `cc-plan` / `cc-investigate` to review the frozen plan or root-cause contract, or after `cc-do` to review the implementation. Plan and investigation review findings are written directly into `task.md`. Implementation review findings are returned in the response with repair options; the user chooses the repair path before code is edited. PR reviews stay in the response or GitHub review. No local review report, ledger, findings JSON, or other review output file is written.
124
124
 
125
125
  ## Verification And Ship Gates
126
126
 
127
127
  `cc-check` now treats QA as a feedback-loop problem, not only a green-test problem. Bugfix and behavior work records the loop used to prove reality, expected versus actual behavior, reproduction steps, test boundary quality, and architecture follow-ups when no clean public test seam exists.
128
128
 
129
- `cc-act` carries that evidence into PR briefs, handoffs, and release notes. It checks source roadmap progress during closeout, updates `devflow/roadmap.json`, and regenerates `devflow/ROADMAP.md` / `devflow/BACKLOG.md` when verified reality changes. Follow-ups must be durable behavior briefs with current behavior, desired behavior, key interfaces, acceptance criteria, and explicit out-of-scope notes before they are written back to roadmap or backlog.
129
+ `cc-act` carries that evidence into PR briefs, handoffs, or incident postmortems when needed. It checks source roadmap progress during closeout, updates `devflow/roadmap.json`, and regenerates `devflow/ROADMAP.md` / `devflow/BACKLOG.md` when verified reality changes. Follow-ups must be durable behavior briefs with current behavior, desired behavior, key interfaces, acceptance criteria, and explicit out-of-scope notes before they are written back to roadmap or backlog.
130
130
 
131
131
  ## Installation Modes
132
132
 
@@ -244,19 +244,17 @@ The currently distributed skill folders are:
244
244
 
245
245
  - `devflow/specs/` stores durable capability truth: `INDEX.md` plus `capabilities/*.md`.
246
246
  - New change directories use `REQ-<number>-<description>` for requirements or `FIX-<number>-<description>` for bug fixes. `REQ` and `FIX` numbers advance independently, so the same number may exist in both prefixes. Parallel worktrees may also create repeated numbers; the full change key must use a specific description to distinguish the work.
247
- - `devflow/changes/<change>/` stores durable change truth: CLI-generated `change-meta.json`, `planning/tasks.md`, CLI-generated `task-manifest.json`, review ledger/findings records, optional CLI logs for debug/failure, `report-card.json`, and one final handoff file. Task `context.md`, `checkpoint.json`, review markdown, and AI-written process files are not default durable truth.
248
- - New changes default to one human-authored Markdown artifact: `planning/tasks.md`. Feature plans put the frozen design in `## Contract Summary`; bug investigations put root-cause truth in `## Root Cause Contract`. Legacy `planning/design.md`, `planning/analysis.md`, and `cc-review-*.md` remain fallback inputs, not new default writes.
249
- - Machine JSON is CLI-owned: write the human contract in `planning/tasks.md`, then run `cc-devflow task-contract compile` / `validate`; do not handwrite `task-manifest.json` or `change-meta.json`.
250
- - Use `cc-devflow task-contract validate`, `npm run verify:artifacts`, `npm run benchmark:artifacts`, and `npm run benchmark:skills` to keep workflow artifacts and skill entrypoints small and measurable.
247
+ - `devflow/changes/<change>/` stores durable change truth in `task.md`, optional `handoff/pr-brief.md`, and Git commits. Real recurring failures may also write incident postmortems under `devflow/postmortems/`.
248
+ - New changes default to one human-authored Markdown artifact: `task.md`. Feature plans put the frozen design in `## Contract Summary`; bug investigations put root-cause truth in `## Root Cause Contract`. Legacy planning and review artifacts are readable fallback inputs only.
249
+ - Workflow state is Git-owned: keep `task.md` current, commit each completed stage/environment, and do not create extra process files.
250
+ - Use `npm run verify:examples` and `npm run benchmark:skills` to keep workflow truth and skill entrypoints small and measurable.
251
251
  - `devflow/workspaces/<change>/` stores ephemeral runtime scratch such as worker assignment, journals, prompts, and session logs.
252
252
  - Regenerable files should not be persisted under `devflow/changes/`.
253
253
 
254
254
  Artifact contract quick checks:
255
255
 
256
256
  ```bash
257
- npx cc-devflow task-contract validate --change REQ-001 --change-key REQ-001-copy-invite-link
258
- npm run verify:artifacts
259
- npm run benchmark:artifacts
257
+ npm run verify:examples
260
258
  npm run benchmark:skills
261
259
  ```
262
260
 
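The change-key convention in the README bullets above (`REQ-<number>-<description>` or `FIX-<number>-<description>`, with the description required to disambiguate repeated numbers) can be sketched as a small shell check. This is a hypothetical helper, not part of cc-devflow: the function name `is_change_key` is invented for illustration, and the assumption that `<number>` is digits and `<description>` is lowercase kebab-case is inferred only from the example key `REQ-001-copy-invite-link`.

```bash
# Hypothetical helper (not shipped with cc-devflow): succeeds only for a
# full change key of the form REQ-<number>-<description> or
# FIX-<number>-<description>.
# Assumption: <number> is digits, <description> is lowercase kebab-case.
is_change_key() {
  printf '%s\n' "$1" | grep -Eq '^(REQ|FIX)-[0-9]+-[a-z0-9]+(-[a-z0-9]+)*$'
}

# REQ and FIX numbers advance independently, so the same number may appear
# under both prefixes. A key without a description is rejected because the
# number alone cannot distinguish work created in parallel worktrees.
for key in REQ-001-copy-invite-link FIX-001-copy-invite-link REQ-001; do
  if is_change_key "$key"; then
    echo "ok       $key"
  else
    echo "rejected $key"
  fi
done
```

The check is deliberately stricter than the README text, which does not state a character set for the description; loosen the pattern if a project uses other characters.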
package/README.zh-CN.md CHANGED
@@ -95,15 +95,15 @@ flowchart TD
95
95
  | --- | --- | --- |
96
96
  | `cc-roadmap` | Product direction, stage scope, or backlog order is needed | `devflow/roadmap.json`, `devflow/ROADMAP.md`, deprecated `devflow/BACKLOG.md` |
97
97
  | `cc-next` | The next ready goal must be picked from the roadmap, unarchived local changes, and issue truth | Goal Packet handed to `cc-dev` |
98
- | `cc-dev` | A selected goal must be driven automatically to a remote PR within the current worktree | PDCA/IDCA artifacts plus PR or handoff |
99
- | `cc-plan` | A new feature or change needs scope clarification, a design, and frozen tasks | `planning/tasks.md#Contract Summary`, `task-manifest.json`, `change-meta.json` |
100
- | `cc-investigate` | A bug needs symptoms, reproduction, root cause, and fix boundaries | `planning/tasks.md#Root Cause Contract`, `task-manifest.json`, `change-meta.json` |
101
- | `cc-do` | Planned or investigated tasks need implementation | Code, tests, task status, scratch runtime |
102
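- | `cc-review` | A complex plan, investigated root cause, or diff needs an optional deep multi-round review before implementation or verification | `review-ledger.jsonl`, optional `review-findings.json`, Markdown rendered on demand |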
98
+ | `cc-dev` | A selected goal must be driven automatically to a remote PR within the current worktree | `task.md`, Git commit, PR or handoff |
99
+ | `cc-plan` | A new feature or change needs scope clarification, a design, and frozen tasks | `task.md#Contract Summary` |
100
+ | `cc-investigate` | A bug needs symptoms, reproduction, root cause, and fix boundaries | `task.md#Root Cause Contract` |
101
+ | `cc-do` | Planned or investigated tasks need implementation | Code, tests, `task.md` status, Git commit |
102
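+ | `cc-review` | A complex plan, investigated root cause, or diff needs an optional deep review before implementation or verification | Plan findings are written to `task.md`; execution findings and fix options return to the conversation |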
103
103
  | `cc-pr-review` | A remote PR needs a pre-merge review in a separate session | PR review packet, findings, and landing verdict |
104
104
  | `cc-pr-land` | A reviewed PR needs a rebase-first merge into main with proof of parity | Integrated main and local / remote parity evidence |
105
- | `cc-check` | Work needs fresh verification evidence | `report-card.json` |
106
- | `cc-act` | Verified work needs a PR, local handoff, release note, or closeout | One final handoff file |
105
+ | `cc-check` | Work needs fresh verification evidence | pass/fail/blocked reply and Git commit |
106
+ | `cc-act` | Verified work needs a PR, local handoff, or closeout | Optional `handoff/pr-brief.md`, Git/PR truth, or incident postmortem |
107
107
 
108
108
  The package also includes two maintenance skills:
109
109
 
@@ -114,13 +114,13 @@ flowchart TD
114
114
 
115
115
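  `cc-roadmap` now records planning posture, evidence maturity, the project's canonical language, and durable decision context before recommending a route. Idea, existing-user, paying-customer, infra, and recovery scenarios are no longer forced through the same set of questions, and roadmap items cannot invent a second vocabulary. Roadmap items aimed at developers or operators also hand the target user, time to first value, magic moment, adoption bottleneck, and domain handoff to `cc-plan`.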
116
116
 
117
- Canonical language and durable decisions converge only on cc-devflow's native sources of truth: `devflow/specs/`, `devflow/roadmap.json`, `devflow/ROADMAP.md`, `planning/tasks.md`, and `change-meta.json`. Legacy `planning/design.md` / `planning/analysis.md` serve only as readable fallbacks for old changes.
117
+ Canonical language and durable decisions converge only on cc-devflow's native sources of truth: `devflow/specs/`, `devflow/roadmap.json`, `devflow/ROADMAP.md`, `task.md`, Git history, and PR truth. Legacy planning artifacts serve only as readable fallback inputs.
118
118
 
119
119
  `cc-plan` freezes more implementation decisions before `cc-do` starts. Non-trivial plans must compare minimal viable and ideal architecture, and a full design must include an implementation decision horizon and an error/rescue map. The test plan records test framework evidence, public test seam, spec-style test name, public verification path, behavior assertion, mock boundary, coverage quality, mandatory regression test, interface depth, Green minimality guard, refactor candidates, and vertical tracer-bullet slices. Before handoff, `cc-plan` and `cc-investigate` also reconcile the source roadmap item so that RM status, REQ/FIX binding, progress, and spec diagnosis no longer drift.
120
120
 
121
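- Every stage after planning can first run `cc-devflow query workflow-context --change <id> --change-key <key> --data-only --no-trace --compact`. Treat the result as a context index, not a semantic compression: it routes the next stage, marks the current task, and carries source hashes, `mustNotForget` constraints, default section/JSON refs, trusted commands, fail-closed rules, and machine-readable deep-open conditions; disputed facts are still settled by the source artifact. It mainly lowers the read cost of stage routing and context resets; end-to-end PDCA/IDCA savings depend on how many `defaultOpen` and `deepOpen` refs the agent actually opens. Use `npm run benchmark:workflow-context` to see token estimates and routing correctness on the repository examples and synthetic cases. Use `npm run benchmark:skills` to keep public skill entrypoints thin; deep planning rules belong behind conditional references, not in the default context.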
121
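+ Every stage after planning starts from `task.md`, the current Git history/status, and PR handoff truth when it exists. The system no longer provides a runtime context query layer; disputed facts must be re-read from the source artifact. Use `npm run benchmark:skills` to keep public skill entrypoints thin; deep planning rules belong behind conditional references, not in the default context.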
122
122
 
123
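- `cc-review` is an optional deep review and does not replace `cc-check`. It can follow `cc-plan` / `cc-investigate` to review the frozen plan or root-cause contract, or follow `cc-do` to review the implementation. It first reads the previous review record and the current git/artifact delta, then writes lifecycle events to `review-ledger.jsonl` through `cc-devflow review start`, `record-node`, `add-finding`, and `close`. When a human-readable Markdown report is needed, render it on demand with `cc-devflow review render`. When the host supports subAgents, selected nodes can be dispatched to independent read-only reviewers so that strategy, engineering, design, DX, code smell, test, and runtime reviews do not share one contaminated context. Complex implementation reviews can split intent/regression, security/privacy, performance/reliability, and contracts/coverage into independent risk lanes, which the main thread then aggregates while filtering out weak findings. Plan reviews borrow strategy / design / engineering / DX methods through progressive references; implementation reviews check diff scope, code smells, tests, UI/runtime behavior, Browser/Computer Use evidence, and logs. Findings go back to `cc-plan` or `cc-do`; once the implementation review is clean, work proceeds to `cc-check`.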
123
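+ `cc-review` is an optional deep review and does not replace `cc-check`. It can follow `cc-plan` / `cc-investigate` to review the frozen plan or root-cause contract, or follow `cc-do` to review the implementation. Plan / investigation review findings are written directly into `task.md`. Execution review findings are organized into fix options in the current reply, and code changes only after the user chooses one. PR review stays in the conversation or in a GitHub review. No local review report, ledger, findings JSON, or other review artifact files are written.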
124
124
 
125
125
  ## Verification and Delivery Gates
126
126
 
@@ -244,19 +244,17 @@ npx cc-devflow config doctor --cwd /path/to/your/project
244
244
 
245
245
  - `devflow/specs/` stores durable capability truth: `INDEX.md` and `capabilities/*.md`.
246
246
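  - New change directories use `REQ-<number>-<description>` for requirements and `FIX-<number>-<description>` for bug fixes. `REQ` and `FIX` each advance their own numbers, so the same number may coexist under both prefixes. Parallel worktrees may also produce repeated numbers; the description in the full change key must distinguish the work.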
247
- - `devflow/changes/<change>/` stores durable change truth: CLI-generated `change-meta.json`, `planning/tasks.md`, CLI-generated `task-manifest.json`, review ledger / findings records, optional CLI logs for debug / failed runs, `report-card.json`, and one final handoff file. Task-level `context.md`, `checkpoint.json`, review markdown, and AI-written process files are not default durable truth.
248
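- - New changes default to one human-authored Markdown artifact: `planning/tasks.md`. Feature plans put the frozen design in `## Contract Summary`; bug investigations put root-cause truth in `## Root Cause Contract`. Legacy `planning/design.md`, `planning/analysis.md`, and `cc-review-*.md` remain fallback inputs for old changes only and are no longer new default writes.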
249
- - Machine JSON is CLI-owned: write the human contract in `planning/tasks.md` first, then run `cc-devflow task-contract compile` / `validate`; do not handwrite `task-manifest.json` or `change-meta.json`.
250
- - Use `cc-devflow task-contract validate`, `npm run verify:artifacts`, `npm run benchmark:artifacts`, and `npm run benchmark:skills` to keep workflow artifacts and skill entrypoints small and measurable.
247
+ - Durable change truth under `devflow/changes/<change>/` keeps only `task.md`, optional `handoff/pr-brief.md`, and Git commits. Real recurring failures may write incident postmortems under `devflow/postmortems/`.
248
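+ - New changes default to one human-authored Markdown artifact: `task.md`. Feature plans put the frozen design in `## Contract Summary`; bug investigations put root-cause truth in `## Root Cause Contract`. Legacy planning / review artifacts serve only as readable fallback inputs.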
249
+ - Workflow state is Git-owned: keep `task.md` current, commit each completed stage / execution environment, and do not create extra process files.
250
+ - Use `npm run verify:examples` and `npm run benchmark:skills` to keep workflow truth and skill entrypoints small and measurable.
251
251
  - `devflow/workspaces/<change>/` stores ephemeral runtime scratch such as worker assignments, journals, prompts, and session logs.
252
252
  - Files that can be regenerated from durable truth should not be persisted under `devflow/changes/`.
253
253
 
254
254
  Artifact contract quick checks:
255
255
 
256
256
  ```bash
257
- npx cc-devflow task-contract validate --change REQ-001 --change-key REQ-001-copy-invite-link
258
- npm run verify:artifacts
259
- npm run benchmark:artifacts
257
+ npm run verify:examples
260
258
  npm run benchmark:skills
261
259
  ```
262
260