cc-devflow 4.5.13 → 4.5.15 (npm package version diff)

Files changed (52)

package/.claude/skills/cc-act/SKILL.md +2 -2
package/.claude/skills/cc-check/CHANGELOG.md +6 -0
package/.claude/skills/cc-check/PLAYBOOK.md +18 -1
package/.claude/skills/cc-check/SKILL.md +59 -3
package/.claude/skills/cc-check/references/gate-contract.md +34 -0
package/.claude/skills/cc-check/references/review-contract.md +11 -0
package/.claude/skills/cc-dev/SKILL.md +1 -1
package/.claude/skills/cc-dev/scripts/resolve-cc-devflow.sh +8 -26
package/.claude/skills/cc-do/CHANGELOG.md +6 -0
package/.claude/skills/cc-do/PLAYBOOK.md +23 -6
package/.claude/skills/cc-do/SKILL.md +20 -4
package/.claude/skills/cc-do/references/execution-recovery.md +15 -3
package/.claude/skills/cc-investigate/CHANGELOG.md +6 -0
package/.claude/skills/cc-investigate/PLAYBOOK.md +24 -0
package/.claude/skills/cc-investigate/SKILL.md +39 -3
package/.claude/skills/cc-investigate/assets/TASKS_TEMPLATE.md +68 -1
package/.claude/skills/cc-investigate/references/investigation-contract.md +32 -0
package/.claude/skills/cc-plan/CHANGELOG.md +14 -0
package/.claude/skills/cc-plan/PLAYBOOK.md +21 -1
package/.claude/skills/cc-plan/SKILL.md +77 -10
package/.claude/skills/cc-plan/assets/TASKS_TEMPLATE.md +61 -3
package/.claude/skills/cc-plan/references/planning-contract.md +28 -4
package/.claude/skills/cc-review/CHANGELOG.md +6 -0
package/.claude/skills/cc-review/PLAYBOOK.md +9 -3
package/.claude/skills/cc-review/SKILL.md +52 -1
package/.claude/skills/cc-review/references/implementation-review-branch.md +106 -6
package/.claude/skills/cc-review/references/plan-review-branch.md +109 -4
package/.claude/skills/cc-review/references/review-methods.md +162 -9
package/.claude/skills/cc-spec-init/PLAYBOOK.md +1 -1
package/.claude/skills/cc-spec-init/SKILL.md +1 -1
package/CHANGELOG.md +22 -0
package/README.md +16 -18
package/README.zh-CN.md +15 -17
package/bin/cc-devflow-cli.js +8 -94
package/docs/examples/example-bindings.json +5 -5
package/docs/examples/full-design-blocked/README.md +1 -1
package/docs/examples/full-design-blocked/changes/REQ-002-bulk-invite-import/task.md +17 -5
package/docs/examples/local-handoff/README.md +1 -1
package/docs/examples/local-handoff/changes/REQ-003-audit-log-export/task.md +17 -5
package/docs/examples/pdca-loop/README.md +1 -1
package/docs/examples/pdca-loop/changes/REQ-001-copy-invite-link/task.md +17 -5
package/docs/guides/artifact-contract.md +1 -1
package/docs/guides/getting-started.md +3 -3
package/docs/guides/getting-started.zh-CN.md +3 -3
package/docs/guides/minimize-artifacts.md +1 -7
package/lib/skill-runtime/CLAUDE.md +1 -1
package/lib/skill-runtime/index.js +1 -9
package/package.json +1 -1
package/lib/skill-runtime/errors.js +0 -39
package/lib/skill-runtime/query-registry.js +0 -101
package/lib/skill-runtime/query.js +0 -126
package/lib/skill-runtime/trace.js +0 -22

package/.claude/skills/cc-review/references/implementation-review-branch.md CHANGED

@@ -1,12 +1,112 @@
 # Implementation Review Branch
-Read:
+Use this reference when the review target is code, tests, docs, UI behavior, or a current branch diff.
-1. current Git diff
-2. `task.md`
-3. changed code and tests
-4. fresh command output when available
+## Intake
-Review behavior, regression risk, security, reliability, test quality, and code smells inside the current blast radius.
+Read, in order:
+1. current branch and base branch
+2. `git diff <base>...HEAD --stat`
+3. full diff for changed files
+4. `task.md`
+5. changed code plus direct importers/callers for enum, state, API, and behavior changes
+6. fresh command output when available
+If no plan exists, infer intent from user request, commits, TODOs, and PR body if present. Mark intent confidence.
+## Scope Check
+Produce this in scratch reasoning before findings:
+```text
+Scope Check: CLEAN | DRIFT DETECTED | REQUIREMENTS MISSING
+Intent: ...
+Delivered: ...
+Diff surface: ...
+```
+Out-of-scope files are findings only when they change behavior or expand blast radius.
+## Diff Review Passes
+Turn these passes into review nodes before reporting findings. Every changed file, public behavior, test surface, documentation surface, and UI/runtime flow belongs to a node or has a skip reason.
+For broad or PR-landing diffs, use the risk-lane profile from `review-methods.md` before final findings:
+1. Intent and regression
+2. Security and privacy
+3. Performance and reliability
+4. Contracts and coverage
+### Contract Fidelity
+Check whether implementation matches `task.md` or investigation:
+- required tasks done
+- rejected scope not implemented
+- root cause still true
+- expected spec delta honored
+- behavior visible at public seam
+### Code Smell Scan
+Use `review-methods.md` smell taxonomy.
+Look for:
+- copy-paste helper logic
+- broad catch-all errors
+- parameter clumps
+- shallow pass-through modules
+- internal mocks driving production design
+- new branch forests where a data shape would collapse cases
+- hidden state or multiple truth sources
+- cycles between modules
+### Structural Risk
+Check:
+- security and trust boundaries
+- enum/value completeness outside the diff
+- migrations and rollback
+- concurrency and double-submit
+- external service failures
+- logs/metrics for new paths
+### Test Quality
+Build a coverage map:
+```text
+CODE PATHS                         USER/RUNTIME FLOWS
+file.ts                            feature flow
+├── [tested] happy                 ├── [tested] main path
+├── [gap] empty                    ├── [gap] double action
+└── [gap] upstream error           └── [gap] navigate away / timeout
+```
+Flag:
+- no regression test for changed behavior
+- tests only assert implementation shape
+- tests mock internal modules instead of public seam
+- fixture lies with missing fields or type casts
+- no UI/E2E proof for user-visible change
+### Documentation and DX
+If changed behavior affects README, guides, CLI help, package install, public API, agent skill usage, or examples, check whether docs changed too.
+## Fix Policy
 Findings stay in the response. Ask which repair option to apply before editing code. Do not write process files.
+Return:
+- findings ordered by severity
+- smallest safe repair option
+- broader cleanup option when the smell is real
+- defer option with explicit risk
+- recommendation and route
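
Note on the "Diff Review Passes" rule in the hunk above (every changed file belongs to a node or has a skip reason): the rule can be checked mechanically. The following is a minimal Python sketch, not part of cc-devflow. `ReviewNode`, `unassigned_files`, and the sample paths are hypothetical names chosen for illustration; the skill itself keeps nodes in scratch reasoning and writes no process files.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewNode:
    """Hypothetical in-memory review node (illustration only)."""
    node_id: str
    targets: list[str] = field(default_factory=list)


def unassigned_files(changed_files, nodes, skip_reasons):
    """Return changed files that no node reviews and that have no skip reason."""
    covered = {target for node in nodes for target in node.targets}
    return sorted(
        path for path in changed_files
        if path not in covered and path not in skip_reasons
    )


# Hypothetical diff surface: one node covers code and tests, one file is skipped.
changed = ["src/invite.ts", "src/invite.test.ts", "docs/guide.md", "package-lock.json"]
nodes = [ReviewNode("R101", ["src/invite.ts", "src/invite.test.ts"])]
skips = {"package-lock.json": "generated lockfile, no behavior change"}

print(unassigned_files(changed, nodes, skips))  # ['docs/guide.md']
```

Any path the function returns still needs a node or an explicit skip reason before findings are reported.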

package/.claude/skills/cc-review/references/plan-review-branch.md CHANGED

@@ -1,9 +1,114 @@
 # Plan Review Branch
-Read:
+Use this reference when the review target is a plan, investigation handoff, or mixed branch whose plan contract may be wrong.
+## Intake
+Read, in order:
 1. `task.md`
-2. relevant roadmap or issue text
-3. affected code/tests/docs
+2. relevant roadmap, issue, PR text, or user request
+3. affected code/tests/docs referenced by the plan
+4. existing command output only when it proves or disproves a planning assumption
+If no `task.md` exists, review the user-provided plan text and make missing durable task contract a finding.
+## Review Facets
+Select only applicable facets, but do not skip a selected facet to keep the answer short.
+### Strategy
+Check:
+- Is this the right problem?
+- Is the stated user/business outcome direct or only a proxy?
+- What happens if we do nothing?
+- What does the 12-month ideal look like?
+- What existing code or workflow already solves part of this?
+Useful shape:
+```text
+CURRENT -> THIS PLAN -> 12-MONTH IDEAL
+```
+### Engineering
+Check:
+- component boundaries
+- data flow and shadow paths
+- state transitions
+- security boundaries
+- rollback shape
+- testability seam
+- parallelization risk
+For non-trivial plans, reason through:
+```text
+Entry -> validate -> transform -> persist -> output
+  |        |            |            |          |
+ nil     invalid      exception    conflict   stale
+ empty   wrong type   timeout      duplicate  partial
+```
+### Design
+Run only for user-facing UI or interaction flows.
+Check:
+- first, second, third thing the user sees
+- loading / empty / error / success / partial states
+- responsive and accessibility intent
+- generic UI or AI slop risk
+- whether live design review will be needed after implementation
+### DX / Operator
+Run only for API, CLI, SDK, package, docs, agent skill, MCP, or developer/operator surfaces.
+Check:
+- target developer/operator persona
+- time to first value
+- install/run/debug/upgrade path
+- actionable errors: problem + cause + fix
+- copy-paste examples and escape hatches
+### TOC Root Cause
+For complex bugs:
+1. Current reality tree: symptoms, causes, enabling conditions.
+2. Conflict diagram: why the obvious fix conflicts with a real need.
+3. Future reality tree: what the proposed fix changes and what it may break.
+If the root cause is not proven, reroute to `cc-investigate`, not `cc-do`.
+## Planning Smells
+Plans can contain smells before code exists:
+- repeated implementation steps with slight variations
+- parallel data sources
+- task split by technical layer instead of behavior
+- fake abstraction or one-adapter seam
+- missing owner for shared state
+- hand-wavy "handle edge cases" or "add validation"
+Each planning smell becomes a finding in `task.md` and routes to `cc-plan`.
+## Output
+Write plan review findings directly into `task.md`:
+- scope or architecture finding
+- evidence and impact
+- required task or contract change
+- decision options when user judgment is needed
+- reroute recommendation
-Find scope, architecture, test-strategy, and ambiguity problems. Write findings into `task.md`; final response only summarizes the changed sections. Do not write separate files.
+Final response only summarizes changed `task.md` sections and next route. Do not write separate files.
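
Note on the Engineering facet in the hunk above: the stage and failure diagram can be read as a checklist of shadow paths. The following is a rough Python sketch under that reading. The stage names and failure modes are copied from the diagram; `SHADOW_PATHS`, `unaddressed_shadow_paths`, and the sample plan coverage are hypothetical and not part of the package.

```python
# Stages and failure modes copied from the Engineering facet diagram.
SHADOW_PATHS = {
    "entry": ["nil", "empty"],
    "validate": ["invalid", "wrong type"],
    "transform": ["exception", "timeout"],
    "persist": ["conflict", "duplicate"],
    "output": ["stale", "partial"],
}


def unaddressed_shadow_paths(addressed):
    """List (stage, failure) pairs the plan does not yet account for."""
    return [
        (stage, failure)
        for stage, failures in SHADOW_PATHS.items()
        for failure in failures
        if (stage, failure) not in addressed
    ]


# Hypothetical plan that only covers validation errors and persistence conflicts.
plan_covers = {("validate", "invalid"), ("validate", "wrong type"), ("persist", "conflict")}
gaps = unaddressed_shadow_paths(plan_covers)

print(len(gaps))  # 7
print(gaps[0])    # ('entry', 'nil')
```

Each remaining pair is a candidate question for the plan review, not automatically a finding; a pair can be dismissed when the stage does not exist in the plan under review.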

package/.claude/skills/cc-review/references/review-methods.md CHANGED

@@ -1,13 +1,166 @@
 # Review Methods
-Pick only the methods that match the current risk:
+Use this reference for every `cc-review` run. It defines the method library. Load branch-specific references for concrete workflow steps.
-- diff review
-- test-quality review
-- security review
-- performance review
-- API contract review
-- UI/browser review
-- documentation/PR body review
+## Method Selection
-Each finding needs evidence, impact, recommendation, and route. Do not write process files.
+Pick every method needed by the current risk. This is a routing map, not a finding cap:
+| Risk | Method |
+| --- | --- |
+| unclear goal | goal tree |
+| repeated symptom | current reality tree |
+| hidden tradeoff | conflict diagram |
+| uncertain fix impact | future reality tree |
+| implementation complexity | logic tree and smell scan |
+| UI/runtime mismatch | E2E/plugin verification |
+| code quality or simplification risk | cc-simplify reference plus smell scan |
+| broad implementation diff | risk-lane review swarm profile |
+Selected methods stay in scratch reasoning and final response/task updates. Do not write process files.
+## Review Nodes
+Before findings, mentally create ordered review nodes:
+```text
+R001 plan.strategy.outcome
+  target: task.md
+  method: goal tree
+  check: outcome and scope consistency
+R101 implementation.contract.public-seam
+  target: changed code + tests
+  method: contract fidelity
+  check: public behavior matches task.md
+```
+Node rules:
+- one node reviews one coherent question, artifact, or changed surface
+- every selected method creates at least one node
+- every changed file or user-facing surface is assigned to a node or explicitly skipped
+- every node ends as `checked`, `skipped`, or `blocked`
+- no finding limit exists while nodes remain unchecked
+- prior clean conclusions can be reused only when Git proves the target and dependencies did not change
+## Risk-Lane Review Swarm Profile
+Use this profile when a broad implementation diff, PR landing review, or mixed review benefits from independent context. The profile is a default decomposition, not a requirement to manufacture findings.
+| Lane | Reviewer question |
+| --- | --- |
+| intent-regression | Does the diff match the intended behavior without extra behavior drift, broken edge cases, fallback loss, or caller/callee contract drift? |
+| security-privacy | Did the diff weaken auth, validation, secret handling, sensitive data boundaries, defaults, or trust of external input? |
+| performance-reliability | Did the diff add duplicate work, hot-path cost, missing cleanup, retry storms, ordering races, or brittle failure handling? |
+| contracts-coverage | Did the diff miss API/schema/type/config/flag alignment, migration fallout, regression tests, logs, metrics, assertions, or error paths? |
+Small diffs may use one combined reviewer that covers all lanes. Large or multi-surface diffs should assign separate reviewers for the highest-risk lanes when the host supports subagents.
+## Aggregation
+The main thread owns aggregation:
+- merge duplicate findings under the clearest evidence
+- reject style preferences, nits, and speculative concerns with no concrete impact
+- downgrade low-confidence notes unless they point to critical impact
+- convert intent-unclear claims into decision questions instead of findings
+- order final findings by severity, confidence, and current-scope impact
+Subagent output is evidence input, not verdict.
+## Thinking Tools
+### Goal Tree
+Use when the plan has too many proposed actions and not enough outcome clarity.
+```text
+GOAL
+├── necessary condition A
+│   ├── measurable signal
+│   └── blocked by
+├── necessary condition B
+└── NOT IN SCOPE
+```
+### Current Reality Tree
+Use for bugs and recurring failures.
+```text
+SYMPTOM
+├── direct cause
+│   └── deeper cause
+├── enabling condition
+└── missing control
+```
+### Conflict Diagram
+Use when two requirements appear incompatible.
+```text
+Objective
+├── Need A -> Want X
+└── Need B -> Want not-X
+Assumption to break: ...
+```
+### Future Reality Tree
+Use before recommending a non-trivial redesign.
+```text
+CHANGE
+├── desired effect
+├── possible negative branch
+│   └── prevention
+└── verification signal
+```
+### Logic Tree
+Use for implementation reviews.
+```text
+Entry point
+├── path A
+│   ├── happy
+│   ├── empty
+│   └── error
+└── path B
+```
+## Code Smell Taxonomy
+Only report smells inside the current requirement blast radius or smells made worse by the current work.
+| Smell | Review question | Preferred fix shape |
+| --- | --- | --- |
+| rigidity | Does a small change force unrelated edits? | move decision to one owner |
+| duplication | Is the same logic repeated with small variations? | reuse existing helper or make one narrow helper |
+| cycle | Do modules know each other's internals? | invert dependency or extract boundary |
+| fragility | Can one change break unrelated behavior? | isolate side effects and add focused tests |
+| obscurity | Is intent hidden behind clever names or control flow? | rename, split, or make data shape explicit |
+| data-clump | Do fields always travel together? | group them into one object/value |
+| unnecessary-complexity | Is abstraction solving a hypothetical future? | delete seam or collapse to direct code |
+## Severity
+- `critical`: ships wrong behavior, data/security risk, silent failure, broken root cause, or impossible verification.
+- `important`: likely maintenance, test, UX, DX, performance, or operability problem in current scope.
+- `advisory`: good improvement but not required for this change.
+## Confidence
+- `9-10`: directly verified in code, artifact, command output, UI run, or log.
+- `7-8`: strong evidence from nearby patterns and diff.
+- `5-6`: plausible but needs confirmation; mark as verify-first.
+- `<5`: do not put in main findings unless critical impact.
+## Decision Questions
+Ask only when a finding requires user judgment. Do not stop the whole review at the first decision unless that answer blocks the next review node.
+Plan decisions are written to `task.md`. Implementation repair choices stay in the response.
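
Note on the Aggregation, Severity, and Confidence sections in the hunk above: taken together they describe a small triage procedure. The following Python sketch shows one possible reading. `Finding`, `aggregate`, the dedup key format, and the sample findings are hypothetical. "Clearest evidence" is approximated here by the highest confidence score, and current-scope impact and the verify-first marking for confidence 5-6 are left out for brevity.

```python
from dataclasses import dataclass

# Severity labels from the Severity section, ranked for sorting.
SEVERITY_RANK = {"critical": 0, "important": 1, "advisory": 2}


@dataclass(frozen=True)
class Finding:
    key: str         # hypothetical dedup key, e.g. "path:issue"
    severity: str    # "critical" | "important" | "advisory"
    confidence: int  # 1-10, per the Confidence scale
    summary: str


def aggregate(raw):
    """Merge duplicates, drop weak non-critical notes, order by severity then confidence."""
    best = {}
    for finding in raw:
        kept = best.get(finding.key)
        # Assumption: "clearest evidence" is approximated by the highest confidence.
        if kept is None or finding.confidence > kept.confidence:
            best[finding.key] = finding
    # Confidence below 5 stays out of main findings unless the impact is critical.
    main = [f for f in best.values() if f.confidence >= 5 or f.severity == "critical"]
    return sorted(main, key=lambda f: (SEVERITY_RANK[f.severity], -f.confidence))


raw = [
    Finding("auth.ts:missing-check", "critical", 4, "lane A: token check may be skipped"),
    Finding("auth.ts:missing-check", "critical", 9, "lane B: verified in diff"),
    Finding("cache.ts:dup-work", "important", 7, "duplicate fetch on hot path"),
    Finding("util.ts:naming", "advisory", 3, "unclear helper name"),
]
for f in aggregate(raw):
    print(f.severity, f.confidence, f.key)
```

With the sample input, the two `auth.ts` reports collapse into the confidence-9 entry, the low-confidence advisory note is dropped, and the critical finding sorts ahead of the important one.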

package/.claude/skills/cc-spec-init/PLAYBOOK.md CHANGED

@@ -14,6 +14,6 @@ Specs record current capability truth. Roadmap records future work. Git records
 ## Do Not
-- Do not create change-scoped JSON.
+- Do not create extra change-scoped files.
 - Do not store workflow state in specs.
 - Do not turn one requirement's implementation detail into capability truth.

package/.claude/skills/cc-spec-init/SKILL.md CHANGED

@@ -28,7 +28,7 @@ Allowed outputs:
 - `devflow/specs/INDEX.md`
 - `devflow/specs/capabilities/<capability>.md`
-Do not create change-scoped JSON. Changes link to specs through `task.md`, roadmap text, PR text, and Git commits.
+Changes link to specs through `task.md`, roadmap text, PR text, and Git commits.
 ## Use This Skill When

package/CHANGELOG.md CHANGED

@@ -9,6 +9,28 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [4.5.15] - 2026-05-14
+### Added
+- Added Product / Creative Discovery and Second-Move Review gates to `cc-plan` so non-trivial plans confirm product value and shape before engineering details.
+- Updated `cc-plan` task templates, planning contract, playbook, and examples with durable slots for product shape, narrowest wedge, better-version thinking, and question-quality review.
+### Removed
+- Removed the `cc-devflow query` runtime surface and the `workflow-context` query.
+- Removed workflow-context stage-transition requirements from distributed skills; stages now start from `task.md`, Git, and PR or handoff reality.
+## [4.5.14] - 2026-05-14
+### Changed
+- Removed legacy process JSON and report-card assumptions from the public workflow contracts.
+- Restored the artifact-light `cc-plan` planning dialogue flow inside `task.md` contracts.
+- Restored `cc-investigate` root-cause proof, hypothesis, reroute, and feedback-loop guidance without bringing back separate analysis artifacts.
+- Restored `cc-do` TDD execution discipline and `cc-check` fresh-evidence verification flow while keeping durable truth limited to `task.md`, Git, PR briefs, and postmortems.
+- Restored `cc-review` node-by-node review, risk lanes, finding aggregation, and decision-question flow while routing plan findings into `task.md` and implementation findings through user-selected repair options.
 ## [4.5.13] - 2026-05-13
 ### Changed

package/README.md CHANGED

@@ -95,15 +95,15 @@ flowchart TD
 | --- | --- | --- |
 | `cc-roadmap` | You need product direction, staged scope, or backlog order | `devflow/roadmap.json`, `devflow/ROADMAP.md`, deprecated `devflow/BACKLOG.md` |
 | `cc-next` | You need to pick the next roadmap-aware ready target from roadmap, unarchived local changes, and issue truth | one Goal Packet for `cc-dev` |
-| `cc-dev` | A selected objective should be driven in the current worktree to a remote PR | PDCA/IDCA artifacts plus a PR or handoff |
-| `cc-plan` | A feature or change needs scope, design, and task freezing | `planning/tasks.md#Contract Summary`, `task-manifest.json`, `change-meta.json` |
-| `cc-investigate` | A bug needs symptom, reproduction, root cause, and repair boundary | `planning/tasks.md#Root Cause Contract`, `task-manifest.json`, `change-meta.json` |
-| `cc-do` | Planned or investigated work needs implementation | code, tests, task state, scratch runtime |
-| `cc-review` | Complex plans, investigations, or diffs need optional deep multi-round review before implementation or verification | `review-ledger.jsonl`, optional `review-findings.json`, optional rendered Markdown |
+| `cc-dev` | A selected objective should be driven in the current worktree to a remote PR | `task.md`, Git commits, and a PR or handoff |
+| `cc-plan` | A feature or change needs scope, design, and task freezing | `task.md#Contract Summary` |
+| `cc-investigate` | A bug needs symptom, reproduction, root cause, and repair boundary | `task.md#Root Cause Contract` |
+| `cc-do` | Planned or investigated work needs implementation | code, tests, `task.md` status, Git commit |
+| `cc-review` | Complex plans, investigations, or diffs need optional deep review before implementation or verification | plan findings in `task.md`; implementation findings and repair options in the response |
 | `cc-pr-review` | A remote PR needs an independent review session before landing | PR review packet, findings, and landing verdict |
 | `cc-pr-land` | Reviewed PRs need rebase-first landing into main with parity proof | integrated main plus local/remote parity evidence |
-| `cc-check` | Work needs fresh verification evidence | `report-card.json` |
-| `cc-act` | Verified work needs a PR, local handoff, release note, or closeout | one final handoff file |
+| `cc-check` | Work needs fresh verification evidence | pass/fail/blocked response and Git commit |
+| `cc-act` | Verified work needs a PR, local handoff, or closeout | optional `handoff/pr-brief.md`, Git/PR truth, or incident postmortem |
 Maintenance skills are shipped with the pack:
@@ -114,19 +114,19 @@ Maintenance skills are shipped with the pack:
 `cc-roadmap` now records planning posture, evidence maturity, canonical project language, and durable decision context before recommending a route. That keeps idea-stage, active-user, paying-customer, infrastructure, and recovery work from being forced through the same questions, and prevents roadmap items from inventing a second vocabulary. Developer-facing or operator-facing roadmap items also carry target user, time to first value, magic moment, adoption bottleneck, and domain handoff into `cc-plan`.
-Canonical language and durable decisions stay inside cc-devflow-native sources: `devflow/specs/`, `devflow/roadmap.json`, `devflow/ROADMAP.md`, `planning/tasks.md`, and `change-meta.json`. Legacy `planning/design.md` and `planning/analysis.md` remain readable fallback inputs for older changes.
+Canonical language and durable decisions stay inside cc-devflow-native sources: `devflow/specs/`, `devflow/roadmap.json`, `devflow/ROADMAP.md`, `task.md`, Git history, and PR truth. Legacy planning artifacts are readable fallback inputs only.
 `cc-plan` freezes more implementation decisions before `cc-do` starts. Non-trivial plans compare minimal viable and ideal architecture options, full designs include decision horizon plus error/rescue mapping, and test-first plans record test framework evidence, public test seams, spec-style test names, public verification paths, behavior assertions, mock boundaries, coverage quality, mandatory regression tests, interface depth, Green minimality guards, refactor candidates, and vertical tracer-bullet slices when existing behavior changes. Before handoff, `cc-plan` and `cc-investigate` also reconcile the source roadmap item so RM status, REQ/FIX binding, progress, and spec diagnosis do not drift from the frozen change artifacts.
-Every post-planning stage can start from `cc-devflow query workflow-context --change <id> --change-key <key> --data-only --no-trace --compact`. Treat the result as a context index, not semantic compression: it routes the next stage, names the current task, carries source hashes, `mustNotForget` constraints, default section/JSON refs, trusted commands, fail-closed rules, and machine-readable deep-open conditions. Source artifacts still decide disputed facts. This primarily reduces stage-routing and context-reset reads; end-to-end PDCA/IDCA savings depend on how often agents open `defaultOpen` and `deepOpen` refs. Use `npm run benchmark:workflow-context` to inspect token estimates plus routing correctness over the checked-in and synthetic examples. Use `npm run benchmark:skills` to keep public skill entrypoints thin; deeper planning rules should live behind conditional references instead of default context.
+Every post-planning stage starts from `task.md`, current Git history/status, and PR or handoff truth when present. There is no runtime context query layer; disputed facts must be re-read from source artifacts. Use `npm run benchmark:skills` to keep public skill entrypoints thin; deeper planning rules should live behind conditional references instead of default context.
-`cc-review` is optional and deeper than `cc-check`. It can run immediately after `cc-plan` / `cc-investigate` to review the frozen plan or root-cause contract, or after `cc-do` to review the implementation. It reads prior review records and current git/artifact delta, then records review lifecycle events through `cc-devflow review start`, `record-node`, `add-finding`, and `close` into `review-ledger.jsonl`. Human Markdown reports are rendered on demand with `cc-devflow review render`. When the host supports subagents, selected nodes can be dispatched to independent read-only reviewers so strategy, engineering, design, DX, smell, test, and runtime checks do not share one contaminated context. Broad implementation reviews can use separate risk lanes for intent/regression, security/privacy, performance/reliability, and contracts/coverage before the main thread triages raw findings. Plan reviews borrow strategy/design/engineering/DX methods through progressive references, while implementation reviews inspect diff scope, code smells, tests, UI/runtime behavior, Browser/Computer Use evidence, and logs when applicable. Findings route back to `cc-plan` or `cc-do`; clean implementation reviews continue to `cc-check`.
+`cc-review` is optional and deeper than `cc-check`. It can run immediately after `cc-plan` / `cc-investigate` to review the frozen plan or root-cause contract, or after `cc-do` to review the implementation. Plan and investigation review findings are written directly into `task.md`. Implementation review findings are returned in the response with repair options; the user chooses the repair path before code is edited. PR reviews stay in the response or GitHub review. No local review report, ledger, findings JSON, or other review output file is written.
 ## Verification And Ship Gates
 `cc-check` now treats QA as a feedback-loop problem, not only a green-test problem. Bugfix and behavior work records the loop used to prove reality, expected versus actual behavior, reproduction steps, test boundary quality, and architecture follow-ups when no clean public test seam exists.
-`cc-act` carries that evidence into PR briefs, handoffs, and release notes. It checks source roadmap progress during closeout, updates `devflow/roadmap.json`, and regenerates `devflow/ROADMAP.md` / `devflow/BACKLOG.md` when verified reality changes. Follow-ups must be durable behavior briefs with current behavior, desired behavior, key interfaces, acceptance criteria, and explicit out-of-scope notes before they are written back to roadmap or backlog.
+`cc-act` carries that evidence into PR briefs, handoffs, or incident postmortems when needed. It checks source roadmap progress during closeout, updates `devflow/roadmap.json`, and regenerates `devflow/ROADMAP.md` / `devflow/BACKLOG.md` when verified reality changes. Follow-ups must be durable behavior briefs with current behavior, desired behavior, key interfaces, acceptance criteria, and explicit out-of-scope notes before they are written back to roadmap or backlog.
 ## Installation Modes
@@ -244,19 +244,17 @@ The currently distributed skill folders are:
 - `devflow/specs/` stores durable capability truth: `INDEX.md` plus `capabilities/*.md`.
 - New change directories use `REQ-<number>-<description>` for requirements or `FIX-<number>-<description>` for bug fixes. `REQ` and `FIX` numbers advance independently, so the same number may exist in both prefixes. Parallel worktrees may also create repeated numbers; the full change key must use a specific description to distinguish the work.
-- `devflow/changes/<change>/` stores durable change truth: CLI-generated `change-meta.json`, `planning/tasks.md`, CLI-generated `task-manifest.json`, review ledger/findings records, optional CLI logs for debug/failure, `report-card.json`, and one final handoff file. Task `context.md`, `checkpoint.json`, review markdown, and AI-written process files are not default durable truth.
-- New changes default to one human-authored Markdown artifact: `planning/tasks.md`. Feature plans put the frozen design in `## Contract Summary`; bug investigations put root-cause truth in `## Root Cause Contract`. Legacy `planning/design.md`, `planning/analysis.md`, and `cc-review-*.md` remain fallback inputs, not new default writes.
-- Machine JSON is CLI-owned: write the human contract in `planning/tasks.md`, then run `cc-devflow task-contract compile` / `validate`; do not handwrite `task-manifest.json` or `change-meta.json`.
-- Use `cc-devflow task-contract validate`, `npm run verify:artifacts`, `npm run benchmark:artifacts`, and `npm run benchmark:skills` to keep workflow artifacts and skill entrypoints small and measurable.
+- `devflow/changes/<change>/` stores durable change truth in `task.md`, optional `handoff/pr-brief.md`, and Git commits. Real recurring failures may also write incident postmortems under `devflow/postmortems/`.
+- New changes default to one human-authored Markdown artifact: `task.md`. Feature plans put the frozen design in `## Contract Summary`; bug investigations put root-cause truth in `## Root Cause Contract`. Legacy planning and review artifacts are readable fallback inputs only.
+- Workflow state is Git-owned: keep `task.md` current, commit each completed stage/environment, and do not create extra process files.
+- Use `npm run verify:examples` and `npm run benchmark:skills` to keep workflow truth and skill entrypoints small and measurable.
 - `devflow/workspaces/<change>/` stores ephemeral runtime scratch such as worker assignment, journals, prompts, and session logs.
 - Regenerable files should not be persisted under `devflow/changes/`.
 Artifact contract quick checks:
 ```bash
-npx cc-devflow task-contract validate --change REQ-001 --change-key REQ-001-copy-invite-link
-npm run verify:artifacts
-npm run benchmark:artifacts
+npm run verify:examples
 npm run benchmark:skills
 ```
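
Note on the change directory naming rule in the hunk above (`REQ-<number>-<description>` or `FIX-<number>-<description>`, with `REQ` and `FIX` numbers advancing independently): the following Python sketch shows why the full change key, not the number, identifies a change. `parse_change_key` and the regular expression are hypothetical; the lowercase kebab-case description is an assumption based on the shipped example names, and `FIX-001-copy-invite-link` is an invented key used only to show a number collision.

```python
import re

# Assumption: descriptions are lowercase kebab-case, as in the shipped examples.
CHANGE_KEY = re.compile(r"^(REQ|FIX)-(\d+)-([a-z0-9]+(?:-[a-z0-9]+)*)$")


def parse_change_key(key):
    """Split a change key into (prefix, number, description), or return None."""
    match = CHANGE_KEY.match(key)
    if match is None:
        return None
    prefix, number, description = match.groups()
    return prefix, int(number), description


print(parse_change_key("REQ-001-copy-invite-link"))  # ('REQ', 1, 'copy-invite-link')
print(parse_change_key("FIX-001-copy-invite-link"))  # ('FIX', 1, 'copy-invite-link')
print(parse_change_key("REQ-001"))                   # None
```

The first two keys share the number 1 but differ in prefix, so they name different changes; a key without a description is rejected because it cannot distinguish parallel work.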

package/README.zh-CN.md CHANGED

@@ -95,15 +95,15 @@ flowchart TD
 | --- | --- | --- |
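 | `cc-roadmap` | You need product direction, staged scope, or backlog order | `devflow/roadmap.json`, `devflow/ROADMAP.md`, deprecated `devflow/BACKLOG.md` |
 | `cc-next` | You need to pick the next ready target from the roadmap, unarchived local changes, and issue truth | a Goal Packet handed to `cc-dev` |
-| `cc-dev` | A selected objective should be driven automatically in the current worktree to a remote PR | PDCA/IDCA artifacts plus a PR or handoff |
-| `cc-plan` | A new feature or change needs scope clarification, a design, and frozen tasks | `planning/tasks.md#Contract Summary`, `task-manifest.json`, `change-meta.json` |
-| `cc-investigate` | A bug needs symptom, reproduction, root cause, and repair boundary | `planning/tasks.md#Root Cause Contract`, `task-manifest.json`, `change-meta.json` |
-| `cc-do` | Planned or investigated tasks need implementation | code, tests, task state, scratch runtime |
-| `cc-review` | Complex plans, investigated root causes, or diffs need optional deep multi-round review before implementation or verification | `review-ledger.jsonl`, optional `review-findings.json`, Markdown rendered on demand |
+| `cc-dev` | A selected objective should be driven automatically in the current worktree to a remote PR | `task.md`, Git commits, and a PR or handoff |
+| `cc-plan` | A new feature or change needs scope clarification, a design, and frozen tasks | `task.md#Contract Summary` |
+| `cc-investigate` | A bug needs symptom, reproduction, root cause, and repair boundary | `task.md#Root Cause Contract` |
+| `cc-do` | Planned or investigated tasks need implementation | code, tests, `task.md` status, Git commit |
+| `cc-review` | Complex plans, investigated root causes, or diffs need optional deep review before implementation or verification | plan findings are written to `task.md`; implementation findings and repair options return to the conversation |
 | `cc-pr-review` | A remote PR needs a separate session for pre-merge review | PR review packet, findings, and landing verdict |
 | `cc-pr-land` | Reviewed PRs need rebase-first merging into main with parity proof | integrated main plus local/remote parity evidence |
-| `cc-check` | Work needs fresh verification evidence | `report-card.json` |
-| `cc-act` | Verified work needs a PR, local handoff, release note, or closeout | a single final handoff file |
+| `cc-check` | Work needs fresh verification evidence | pass/fail/blocked response and Git commit |
+| `cc-act` | Verified work needs a PR, local handoff, or closeout | optional `handoff/pr-brief.md`, Git/PR truth, or incident postmortem |
 The pack also includes two maintenance skills: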
@@ -114,13 +114,13 @@ flowchart TD
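 `cc-roadmap` now records planning posture, evidence maturity, the project's canonical language, and durable decision context before recommending a route. Idea, existing-user, paying-customer, infra, and recovery scenarios are not forced into the same set of questions, and roadmap items are not allowed to invent a second vocabulary. Developer- or operator-facing roadmap items also hand the target user, time to first value, magic moment, adoption bottleneck, and domain handoff to `cc-plan`.
-Canonical language and durable decisions converge only on cc-devflow's native sources of truth: `devflow/specs/`, `devflow/roadmap.json`, `devflow/ROADMAP.md`, `planning/tasks.md`, and `change-meta.json`. Historical `planning/design.md` / `planning/analysis.md` serve only as a readable fallback for old changes.
+Canonical language and durable decisions converge only on cc-devflow's native sources of truth: `devflow/specs/`, `devflow/roadmap.json`, `devflow/ROADMAP.md`, `task.md`, Git history, and PR truth. Historical planning artifacts serve only as readable fallback input.
 `cc-plan` freezes more implementation decisions before `cc-do` starts. Non-trivial plans must compare minimal viable and ideal architecture, and full-design plans must include an implementation decision horizon and an error/rescue map; the test plan records test-framework evidence, public test seam, spec-style test name, public verification path, behavior assertion, mock boundary, coverage quality, mandatory regression test, interface depth, Green minimality guard, refactor candidates, and vertical tracer-bullet slices. Before handoff, `cc-plan` and `cc-investigate` also reconcile the source roadmap item so that RM status, REQ/FIX binding, progress, and spec diagnosis no longer drift.
-Every stage after planning can first run `cc-devflow query workflow-context --change <id> --change-key <key> --data-only --no-trace --compact`. Treat the result as a context index, not as semantic compression: it routes the next stage, marks the current task, and carries the source hash, `mustNotForget` constraints, default section/JSON refs, trusted commands, fail-closed rules, and machine-readable deep-open conditions; disputed facts are still adjudicated by the source artifact. It mainly lowers the read cost of stage routing and context resets; end-to-end PDCA/IDCA savings depend on how many `defaultOpen` and `deepOpen` refs the agent actually opens. Use `npm run benchmark:workflow-context` to see token estimates and routing correctness on repository examples and synthetic cases. Use `npm run benchmark:skills` to keep public skill entry points thin; deep planning rules belong behind conditional references, not in the default context.
+Every stage after planning starts from `task.md`, the current Git history/status, and PR or handoff truth when it exists. The system no longer provides a runtime context query layer; disputed facts must be re-read from the source artifact. Use `npm run benchmark:skills` to keep public skill entry points thin; deep planning rules belong behind conditional references, not in the default context.
-`cc-review` is an optional deep review and does not replace `cc-check`. It can follow `cc-plan` / `cc-investigate` to review the frozen plan or root-cause contract, or follow `cc-do` to review the implementation. It first reads the previous review record and the current git/artifact delta, then writes lifecycle events to `review-ledger.jsonl` through `cc-devflow review start`, `record-node`, `add-finding`, and `close`. When a human-readable Markdown report is needed, render it on demand with `cc-devflow review render`. When the host supports subAgents, selected nodes can be dispatched to independent read-only reviewers so that strategy, engineering, design, DX, code-smell, test, and runtime reviews do not share the same contaminated context. Complex implementation reviews can split intent/regression, security/privacy, performance/reliability, and contracts/coverage into independent risk lanes, with the main thread aggregating the results and filtering out weak findings. Plan review borrows strategy / design / engineering / DX methods through progressive references; implementation review checks diff scope, code smells, tests, UI/runtime behavior, Browser/Computer Use evidence, and logs. Findings go back to `cc-plan` or `cc-do`; once the implementation review is clean, work proceeds to `cc-check`.
+`cc-review` is an optional deep review and does not replace `cc-check`. It can follow `cc-plan` / `cc-investigate` to review the frozen plan or root-cause contract, or follow `cc-do` to review the implementation. Findings from plan / investigation review are written directly into `task.md`. Findings from execution review are organized into fix options in the current reply, and code is changed only after the user chooses one. PR review stays in the conversation or in GitHub review. No local review report, ledger, findings JSON, or other review artifact files are written.
 ## Verification and Delivery Gates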
@@ -244,19 +244,17 @@ npx cc-devflow config doctor --cwd /path/to/your/project
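 - `devflow/specs/` stores durable capability truth: `INDEX.md` and `capabilities/*.md`.
 - New change directories use `REQ-<number>-<description>` for requirements and `FIX-<number>-<description>` for bug fixes. `REQ` and `FIX` each increment their own number, and the same number may coexist across prefixes. Parallel worktrees can also produce duplicate numbers, so the description in the full change key must be used to tell the business content apart.
-- `devflow/changes/<change>/` stores durable change truth: the CLI-generated `change-meta.json`, `planning/tasks.md`, the CLI-generated `task-manifest.json`, review ledger / findings records, optional CLI logs for debug / failed runs, `report-card.json`, and the single final handoff file. Task-level `context.md`, `checkpoint.json`, review markdown, and AI-handwritten process files are not default durable truth.
-- A new change has only one human-authored Markdown artifact by default: `planning/tasks.md`. Feature plans write the frozen design into `## Contract Summary`; bug investigations write root-cause truth into `## Root Cause Contract`. Historical `planning/design.md`, `planning/analysis.md`, and `cc-review-*.md` serve only as fallback input for old changes and are no longer written by default.
-- Machine-state JSON is owned by the CLI: write the human contract into `planning/tasks.md` first, then run `cc-devflow task-contract compile` / `validate`; do not hand-write `task-manifest.json` or `change-meta.json`.
-- Use `cc-devflow task-contract validate`, `npm run verify:artifacts`, `npm run benchmark:artifacts`, and `npm run benchmark:skills` to keep workflow artifacts and skill entry points small and testable.
+- Durable change truth under `devflow/changes/<change>/` keeps only `task.md`, the optional `handoff/pr-brief.md`, and Git commits. Real recurring failures may get an incident postmortem under `devflow/postmortems/`.
+- A new change has only one human-authored Markdown artifact by default: `task.md`. Feature plans write the frozen design into `## Contract Summary`; bug investigations write root-cause truth into `## Root Cause Contract`. Historical planning / review artifacts serve only as readable fallback input.
+- Process state belongs to Git: keep `task.md` current, commit after each completed stage / execution environment, and create no extra process files.
+- Use `npm run verify:examples` and `npm run benchmark:skills` to keep workflow truth and skill entry points small and testable.
 - `devflow/workspaces/<change>/` stores ephemeral runtime scratch such as worker assignment, journals, prompts, and session logs.
 - Files that can be regenerated from durable truth should not be persisted under `devflow/changes/`.
 Artifact contract quick checks: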
 ```bash
-npx cc-devflow task-contract validate --change REQ-001 --change-key REQ-001-copy-invite-link
-npm run verify:artifacts
-npm run benchmark:artifacts
+npm run verify:examples
 npm run benchmark:skills
 ```