npm - @curdx/flow - Versions diffs - 1.1.11 → 2.0.0-beta.10 - Mend

@curdx/flow 1.1.11 → 2.0.0-beta.10

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (96) hide show

package/.claude-plugin/marketplace.json +3 -3
package/.claude-plugin/plugin.json +4 -11
package/CHANGELOG.md +99 -0
package/README.md +74 -102
package/README.zh.md +2 -2
package/agent-preamble/preamble.md +81 -11
package/agents/flow-adversary.md +41 -56
package/agents/flow-architect.md +24 -11
package/agents/flow-debugger.md +2 -2
package/agents/flow-edge-hunter.md +20 -6
package/agents/flow-executor.md +3 -3
package/agents/flow-planner.md +51 -48
package/agents/flow-product-designer.md +15 -2
package/agents/flow-qa-engineer.md +4 -4
package/agents/flow-researcher.md +18 -3
package/agents/flow-reviewer.md +5 -1
package/agents/flow-security-auditor.md +2 -2
package/agents/flow-triage-analyst.md +4 -4
package/agents/flow-ui-researcher.md +7 -7
package/agents/flow-ux-designer.md +3 -3
package/agents/flow-verifier.md +47 -14
package/bin/curdx-flow.js +13 -1
package/cli/doctor.js +28 -13
package/cli/install.js +62 -36
package/cli/protocols.js +63 -10
package/cli/registry.js +73 -0
package/cli/uninstall.js +9 -11
package/cli/upgrade.js +6 -10
package/cli/utils.js +104 -56
package/commands/debug.md +10 -10
package/commands/fast.md +1 -1
package/commands/help.md +109 -87
package/commands/implement.md +7 -7
package/commands/init.md +18 -7
package/commands/review.md +114 -130
package/commands/spec.md +131 -89
package/commands/start.md +130 -153
package/commands/verify.md +110 -92
package/gates/adversarial-review-gate.md +20 -20
package/gates/coverage-audit-gate.md +1 -1
package/gates/devex-gate.md +5 -6
package/gates/edge-case-gate.md +2 -2
package/gates/security-gate.md +3 -3
package/hooks/hooks.json +0 -11
package/hooks/scripts/quick-mode-guard.sh +12 -9
package/hooks/scripts/session-start.sh +2 -2
package/hooks/scripts/stop-watcher.sh +25 -15
package/knowledge/epic-decomposition.md +2 -2
package/knowledge/execution-strategies.md +10 -9
package/knowledge/planning-reviews.md +6 -6
package/knowledge/spec-driven-development.md +11 -10
package/knowledge/two-stage-review.md +6 -5
package/knowledge/wave-execution.md +5 -5
package/package.json +4 -2
package/skills/brownfield-index/SKILL.md +62 -0
package/skills/browser-qa/SKILL.md +50 -0
package/skills/epic/SKILL.md +68 -0
package/skills/security-audit/SKILL.md +50 -0
package/skills/ui-sketch/SKILL.md +49 -0
package/templates/config.json.tmpl +1 -1
package/templates/design.md.tmpl +32 -112
package/templates/requirements.md.tmpl +25 -43
package/templates/research.md.tmpl +37 -68
package/templates/tasks.md.tmpl +27 -84
package/agents/persona-amelia.md +0 -128
package/agents/persona-david.md +0 -141
package/agents/persona-emma.md +0 -179
package/agents/persona-john.md +0 -105
package/agents/persona-mary.md +0 -95
package/agents/persona-oliver.md +0 -136
package/agents/persona-rachel.md +0 -126
package/agents/persona-serena.md +0 -175
package/agents/persona-winston.md +0 -117
package/commands/audit.md +0 -170
package/commands/autoplan.md +0 -184
package/commands/design.md +0 -155
package/commands/discuss.md +0 -162
package/commands/doctor.md +0 -124
package/commands/index.md +0 -261
package/commands/install-deps.md +0 -128
package/commands/party.md +0 -241
package/commands/plan-ceo.md +0 -117
package/commands/plan-design.md +0 -107
package/commands/plan-dx.md +0 -104
package/commands/plan-eng.md +0 -108
package/commands/qa.md +0 -118
package/commands/requirements.md +0 -146
package/commands/research.md +0 -141
package/commands/security.md +0 -109
package/commands/sketch.md +0 -118
package/commands/spike.md +0 -181
package/commands/status.md +0 -139
package/commands/switch.md +0 -95
package/commands/tasks.md +0 -189
package/commands/triage.md +0 -160
package/hooks/scripts/fail-tracker.sh +0 -31

package/.claude-plugin/marketplace.json CHANGED Viewed

@@ -5,14 +5,14 @@
     "email": "bydongxin@gmail.com"
   },
   "metadata": {
-    "description": "CurDX-Flow marketplace — distributes the curdx-flow meta-framework plugin",
-    "version": "1.1.11"
+    "description": "Claude Code Discipline Layer — spec-driven workflow + goal-backward verification + Karpathy 4 principles enforced via gates. Stops Claude from faking \"done\" on non-trivial features.",
+    "version": "2.0.0-beta.10"
   },
   "plugins": [
     {
       "name": "curdx-flow",
       "source": "./",
-      "description": "AI engineering workflow meta-framework — spec-driven + autonomous execution + quality gates + multi-agent collaboration",
+      "description": "Claude Code Discipline Layer — spec-driven workflow + goal-backward verification + Karpathy 4 principles enforced via gates. Stops Claude from faking \"done\" on non-trivial features.",
       "keywords": [
         "workflow",
         "spec-driven",

package/.claude-plugin/plugin.json CHANGED Viewed

@@ -1,13 +1,13 @@
 {
   "name": "curdx-flow",
-  "version": "1.1.11",
-  "description": "AI engineering workflow meta-framework — spec-driven + autonomous execution + quality gates + multi-agent collaboration. Distills Karpathy + BMAD + GSD + Ralph + Superpowers + GStack and orchestrates context7 / sequential-thinking / chrome-devtools / frontend-design / claude-mem / pua dependencies.",
+  "version": "2.0.0-beta.10",
+  "description": "Claude Code Discipline Layer — spec-driven workflow + goal-backward verification + Karpathy 4 principles enforced via gates. Stops Claude from faking \"done\" on non-trivial features.",
   "author": {
     "name": "wdx",
     "email": "bydongxin@gmail.com"
   },
-  "homepage": "https://github.com/wdx/curdx-flow",
-  "repository": "https://github.com/wdx/curdx-flow",
+  "homepage": "https://github.com/curdx/curdx-flow",
+  "repository": "https://github.com/curdx/curdx-flow",
   "license": "MIT",
   "keywords": [
     "workflow",
@@ -31,13 +31,6 @@
         "-y",
         "@modelcontextprotocol/server-sequential-thinking"
       ]
-    },
-    "chrome-devtools": {
-      "command": "npx",
-      "args": [
-        "-y",
-        "chrome-devtools-mcp@latest"
-      ]
     }
   }
 }

package/CHANGELOG.md CHANGED Viewed

@@ -2,6 +2,105 @@
 All notable changes to CurDX-Flow will be documented here.
+## [Unreleased]
+### Fixed (P0)
+- `cli/utils.js` — `findRuntime()` referenced `existsSync` without importing it; any user whose `bun`/`uv` was not on PATH hit `ReferenceError` during install or doctor.
+- `hooks/scripts/stop-watcher.sh` — `export STATE_FILE` was placed after the python heredoc, so the stop-hook execution strategy never activated. Moved the export before the heredoc.
+- `package.json` — `skills/` directory was missing from `files[]`; the 5 bundled skills were stripped from the published tarball.
+- `cli/uninstall.js` + `cli/upgrade.js` — `chrome-devtools-mcp` (added as a recommended plugin in beta.8) was missing from uninstall/upgrade lists, making it installable but uninstallable.
+- `cli/protocols.js` — `injectGlobalProtocols()` returned action `"created"` on both ternary branches, silently collapsing the append-to-existing-file case. Atomic write + corrupted-block detection added.
+### Changed (structural)
+- New `cli/registry.js` is the single source of truth for recommended plugins. `install.js`, `uninstall.js`, `upgrade.js`, and `doctor.js` all import from it.
+- `commands/start.md` and `commands/spec.md` now produce `.state.json` files that match `schemas/spec-state.schema.json` (field names: `spec_name` / `created` / `updated` / `version`; initial phase is `research`, not the undefined `created`).
+- All python heredocs inside hook scripts use quoted delimiters (`<<'PY'`) and read `STATE_FILE` via `os.environ`, closing a shell→python code-injection surface triggered by unusual spec names.
+### Removed
+- `hooks/scripts/fail-tracker.sh` and its `PostToolUseFailure` registration — the counter was written but never read by any consumer (the intended pua escalation was never implemented). Can be reintroduced when the consumer exists.
+## [2.0.0-beta.1] - 2026-04-20
+### BREAKING — Major redesign: Discipline Layer, not meta-framework
+v2 reframes the project from "AI engineering workflow meta-framework distilling 6 methodologies" into a focused **Claude Code Discipline Layer** — spec-driven workflow + goal-backward verification + Karpathy discipline. 30 slash commands became 9. 24 agents became 15.
+See [MIGRATION.md](./MIGRATION.md) for the full v1 → v2 command mapping.
+### Removed — 21 slash commands
+Folded into `/curdx-flow:spec` via flags:
+- `/research`, `/requirements`, `/design`, `/tasks` → `/spec --phase=<name>`
+- `/plan-ceo`, `/plan-eng`, `/plan-design`, `/plan-dx`, `/autoplan` → `/spec --review[=<dim>]`
+- `/audit` → `/verify --strict`
+Moved to auto-invoked skills (also slash-callable by their skill name):
+- `/triage` → `epic` skill
+- `/qa` → `browser-qa` skill
+- `/sketch` → `ui-sketch` skill
+- `/security` → `security-audit` skill
+- `/index` → `brownfield-index` skill
+Moved to CLI-only (outside Claude Code):
+- `/doctor` → `npx @curdx/flow doctor`
+- `/install-deps` → `npx @curdx/flow install --all`
+Removed outright:
+- `/party` — gimmick, no measured value. Ask Claude directly for multi-perspective discussion.
+- `/discuss` — edit `.flow/STATE.md` directly to log decisions.
+- `/status` — use `/curdx-flow:start --list`, or read `.flow/specs/<name>/.state.json` directly.
+- `/switch` — folded into `/curdx-flow:start <spec-name>`.
+- `/spike` — folded into `/curdx-flow:fast "spike: <hypothesis>"`.
+### Removed — 9 persona agents
+Mary, John, Winston, Amelia, Rachel, David, Oliver, Serena, Emma. Their voices merged into the corresponding `flow-*` functional agents. `/party` is gone too.
+### Added — 5 auto-invoked skills
+Each skill has extensive trigger keywords in English and Chinese so Claude auto-invokes based on context. All are also slash-callable (`/curdx-flow:<skill-name>`).
+- `skills/epic/` — vertical-slice decomposition of large features
+- `skills/browser-qa/` — real-browser testing via chrome-devtools MCP
+- `skills/ui-sketch/` — UI design variant generation (uses `frontend-design` skill)
+- `skills/security-audit/` — OWASP + STRIDE + CVE scan
+- `skills/brownfield-index/` — map unfamiliar / legacy codebases
+### Changed — expanded flag surface on core commands
+- `/curdx-flow:spec` gains `--phase=<X[,Y,...]>`, `--until=<phase>`, `--review[=<dim>]`, `--regenerate`, `--resume`
+- `/curdx-flow:start` gains `--resume`, `--list`, `--mode=<fast|standard|enterprise>` and now handles spec switching (absorbing v1 `/switch`)
+- `/curdx-flow:verify` gains `--strict` (multi-source coverage audit, absorbing v1 `/audit`)
+- `/curdx-flow:review` normalizes flags to `--stage=<1|2|both>`, `--adversarial`, `--edge-case`
+### Changed — mode semantics
+Modes reduced from 4 (`sketch / fast / standard / enterprise`) to 3 (`fast / standard / enterprise`). `sketch` folded into the `ui-sketch` skill which runs regardless of mode.
+### Unchanged (intentionally)
+- `.flow/` directory structure — fully backwards-compatible with v1 specs
+- 3 auto-installed MCPs (context7, sequential-thinking, chrome-devtools)
+- 5 hook events (SessionStart, InstructionsLoaded, Stop, PreToolUse, PostToolUseFailure)
+- 8 composable gates (karpathy, verification, tdd, coverage-audit, adversarial-review, edge-case, security, devex)
+- Templates (`templates/*.tmpl`) and their rendering pipeline
+- The CLI (install, doctor, upgrade, uninstall)
+- Global `~/.claude/CLAUDE.md` protocol injection
+### Why
+Field feedback on v1:
+- Users couldn't discover commands — 30 options in the slash menu exceeds working memory
+- Claude Code auto-truncates slash descriptions once they exceed ~8 KB, hitting us at 30 commands
+- Peer tools (Spec Kit ~7, BMAD ~15, Aider ~15) ship with far fewer commands
+- "Meta-framework distilling 6 methodologies" is a maintainer concept, not a user value prop
+- The real value is **Karpathy discipline + goal-backward verification** — everything else is scaffolding
+v2 keeps the scaffolding (agents, gates, hooks, knowledge docs) but hides it behind the 9 commands that real users use.
 ## [1.1.1] - 2026-04-20
 ### Fixed

package/README.md CHANGED Viewed

@@ -1,7 +1,7 @@
 # CurDX-Flow
-> **AI Engineering Workflow Meta-Framework for Claude Code**
-> Turn Claude Code into a disciplined engineering team — orchestrate MCPs and plugins, enforce Karpathy's 4 principles, and ship quality software with spec-driven workflows.
+> **Stop Claude Code from faking "done".**
+> Spec-driven workflow + goal-backward verification + Karpathy discipline.
 [![npm version](https://img.shields.io/npm/v/@curdx/flow.svg)](https://www.npmjs.com/package/@curdx/flow)
 [![License](https://img.shields.io/badge/license-MIT-green)](./LICENSE)
@@ -10,148 +10,120 @@
 ---
-## What It Is
+## Why this exists
-CurDX-Flow is a Claude Code plugin that distills six proven AI-engineering workflows (Karpathy guidelines, BMAD-METHOD, get-shit-done, gstack, smart-ralph, superpowers) into a single composable system — then delegates infrastructure to specialized third-party plugins (context7, sequential-thinking, chrome-devtools, claude-mem, pua, frontend-design).
+Claude Code drifts. It claims features work without running the tests. It uses stale library APIs from its training data. It takes three questions worth of code and squeezes in six.
-**It doesn't reinvent wheels. It orchestrates good ones.**
+CurDX-Flow is a thin **discipline layer** on top of Claude Code that makes non-trivial feature work actually work:
-## At a Glance (v1.1.0)
+1. **A spec workflow** (research → requirements → design → tasks) for features too big to vibe-code.
+2. **Goal-backward verification** that scans your implementation against your own FR / AC / AD and catches stubs, fake completions, and untested acceptance criteria.
+3. **Karpathy's 4 principles** (think-first, simplicity, surgical changes, goal-driven) enforced via always-on hooks and gates.
-- **31 commands** — initialize, research, design, execute, verify, review, debug, QA, security, ...
-- **24 agents** — 15 functional + 9 personas (Mary, John, Winston, Amelia, Rachel, David, Oliver, Serena, Emma)
-- **8 composable gates** — Karpathy / Verification / TDD / Coverage / Adversarial / Edge-Case / Security / DevEx
-- **4 execution strategies** — linear / subagent / stop-hook / wave (auto-routed)
-- **10 knowledge docs** — spec-driven dev, POC-first, atomic commits, execution strategies, ...
-- **5 hook events** — SessionStart / InstructionsLoaded / Stop / PreToolUse / PostToolUseFailure
-- **3 auto-installed MCPs** — context7, sequential-thinking, chrome-devtools
-- **Graceful degradation** — missing deps → fallback modes with clear notices
-## Why Use It
-Existing AI-coding workflows suffer from:
+Not a framework. Not a methodology library. A discipline layer.
-| Problem | CurDX-Flow's Answer |
-|---------|---------------------|
-| Context rot (long sessions degrade) | Subagent isolation + stop-hook loops + claude-mem |
-| No discipline (AI skips, hallucinates) | Karpathy L1 + verification-gate + TDD-gate enforcement |
-| Scattered best practices | Six workflows distilled into one meta-framework |
-| Weak verification (claim "done" without proof) | Goal-backward verification + forbidden-word detection |
-| Single-viewpoint decisions | Party Mode (real multi-agent, not role-play) + adversarial review |
+## Install (one command)
-## Install
-### Recommended: one-shot via npm (works in any project, fast on Chinese npm mirrors)
 ```bash
-# Interactive install (picks recommended plugins)
-npx @curdx/flow install
-# Or install everything automatically
 npx @curdx/flow install --all
-# After the one-time install, initialize your project INSIDE Claude Code:
-#   $ claude
-#   /curdx-flow:init
-#   /curdx-flow:start my-feature "<description>"
-# Other CLI lifecycle operations
-npx @curdx/flow doctor      # Health check
-npx @curdx/flow upgrade     # Update all plugins
-npx @curdx/flow uninstall   # Remove curdx-flow
-```
-### From the Claude Code marketplace
-```
-/plugin marketplace add curdx/curdx-flow
-/plugin install curdx-flow
 ```
-### Local dev
-```bash
-git clone https://github.com/curdx/curdx-flow
-claude --plugin-dir ./curdx-flow
-```
+This installs the Claude Code plugin (fully offline from the npm package — no GitHub fetch), the 3 auto-registered MCP servers, and three recommended companion plugins (pua, claude-mem, frontend-design).
-3 MCPs auto-install. Then run:
+## 9 slash commands, that's it
 ```
-/curdx-flow:install-deps    # interactive install of pua / claude-mem / frontend-design
-/curdx-flow:doctor          # verify health
-/curdx-flow:init            # initialize your project
+/curdx-flow:init           Initialize .flow/ in the current project
+/curdx-flow:start          Create / resume / switch a feature spec
+/curdx-flow:spec           Write or refresh the spec (--phase, --review flags)
+/curdx-flow:implement      Execute the tasks
+/curdx-flow:verify         Goal-backward check ← the moat
+/curdx-flow:review         Two-stage code review
+/curdx-flow:fast           Skip spec for one-off tasks
+/curdx-flow:debug          Systematic 4-stage debugging
+/curdx-flow:help           Reference
 ```
-See [`docs/getting-started.md`](./docs/getting-started.md) for a 5-minute walkthrough.
+Plus 5 auto-invoked skills (Claude picks them up from context): `epic`, `browser-qa`, `ui-sketch`, `security-audit`, `brownfield-index`.
-## Quick Start
+## Quick start
 ```bash
-# 1. In your project
+# 1. In your project, initialize
+cd ~/your-project
+claude
 /curdx-flow:init
 # 2. Start a feature
-/curdx-flow:start jwt-auth "Add JWT authentication to REST API"
+/curdx-flow:start jwt-auth "Add JWT authentication to the REST API"
-# 3. Generate full spec (research → requirements → design → tasks)
+# 3. Generate the spec (research → requirements → design → tasks)
 /curdx-flow:spec
-# 4. Execute (auto-picks best strategy)
+# 4. Execute
 /curdx-flow:implement
 # 5. Verify + review
 /curdx-flow:verify
 /curdx-flow:review
-# Done. git push.
+# Done.
 ```
-For more scenarios (brownfield, epic, fast, UI-heavy), see [`docs/workflows.md`](./docs/workflows.md).
+See [`docs/getting-started.md`](./docs/getting-started.md) for a five-minute walkthrough, or [`docs/workflows.md`](./docs/workflows.md) for typical scenarios (greenfield / brownfield / epic / fast / enterprise).
-## Five Pillars
+## The three modes
-1. **Mind (L1 baseline)** — Karpathy 4 principles + mandatory tool rules + 3 red lines
-2. **Spec** — Research → Requirements → Design → Tasks (scalable, 4-file lite to 12-file enterprise)
-3. **Agents** — 15 functional + 9 persona (Mary / John / Winston / ...)
-4. **Flow** — 4 execution strategies, auto-routed
-5. **Memory** — claude-mem (implicit) + `.flow/STATE.md` (explicit decisions as D-NN)
+Set when you run `/curdx-flow:start --mode=<mode>`. Each mode decides which gates are enforced.
-See [`docs/ethos.md`](./docs/ethos.md) for design philosophy.
+| Mode | Workflow | Gates |
+|------|----------|-------|
+| `fast` | Skip the spec — just execute | karpathy + verification |
+| `standard` (default) | Full 4-phase spec | + tdd + coverage-audit |
+| `enterprise` | Full spec + planning review + adversarial + edge-case + security audit | All gates |
-## Documentation
+## Upgrading from v1.x
-| Doc | When to read |
-|-----|--------------|
-| [`docs/getting-started.md`](./docs/getting-started.md) | First time, 5-minute walkthrough |
-| [`docs/command-reference.md`](./docs/command-reference.md) | All 31 commands with usage |
-| [`docs/agent-reference.md`](./docs/agent-reference.md) | All 24 agents + personas |
-| [`docs/workflows.md`](./docs/workflows.md) | 5 typical scenarios (greenfield/brownfield/epic/fast/UI) |
-| [`docs/architecture.md`](./docs/architecture.md) | Internal design for extenders |
-| [`docs/ethos.md`](./docs/ethos.md) | Why it's built this way |
-| [`docs/troubleshoot.md`](./docs/troubleshoot.md) | Fix common problems |
+v2 is a major rewrite. Thirty slash commands became nine. If you're coming from v1, see [`MIGRATION.md`](./MIGRATION.md) for the mapping table. If you'd rather stay on v1:
-## What's Not Here (Yet)
+```bash
+npm i -g @curdx/flow@^1.1
+```
-**Intentionally skipped in v1.0**:
-- **Phase 6** (ship / land / canary / retro) — Low ROI for private-repo use cases. Future release if community asks.
-- **Integration checker** — For Epic cross-spec validation. Manual for now.
-- **Checkpoint save/restore** — Current `.flow/STATE.md` covers most needs.
-- **Multi-runtime support** — Claude Code only. Other tools later if their plugin API matures.
+## What's in the box
-See [`CHANGELOG.md`](./CHANGELOG.md) for full version history.
+- **9 slash commands** + **5 auto-invoked skills** (the complete public surface)
+- **15 internal agents** that the commands dispatch (no user-visible personas — that was v1)
+- **8 composable gates** — Karpathy / Verification / TDD / Coverage / Adversarial / Edge-Case / Security / DevEx
+- **4 execution strategies** for `/curdx-flow:implement` (linear / subagent / stop-hook / wave, auto-routed)
+- **5 hook events** that enforce the discipline without user action
+- **3 auto-installed MCPs** — context7 (latest docs), sequential-thinking (structured reasoning), chrome-devtools (browser automation)
+- **Offline-capable install** — the npm package ships the full plugin body; zero GitHub round-trips during install
+## Documentation
+| Doc | When to read |
+|-----|--------------|
+| [`docs/getting-started.md`](./docs/getting-started.md) | First time — five-minute walkthrough |
+| [`docs/command-reference.md`](./docs/command-reference.md) | All 9 commands + 5 skills with flags |
+| [`docs/agent-reference.md`](./docs/agent-reference.md) | The 15 internal agents |
+| [`docs/workflows.md`](./docs/workflows.md) | Typical scenarios with exact command sequences |
+| [`docs/architecture.md`](./docs/architecture.md) | Internal design for contributors and extenders |
+| [`docs/ethos.md`](./docs/ethos.md) | Why we built it this way |
+| [`docs/troubleshoot.md`](./docs/troubleshoot.md) | Symptom-indexed fixes |
+| [`docs/releasing.md`](./docs/releasing.md) | Maintainer: how to cut a new version |
+| [`MIGRATION.md`](./MIGRATION.md) | v1 → v2 command mapping |
 ## Credits
-CurDX-Flow is a distillation, not an invention. Deeply indebted to:
+CurDX-Flow v2 is a distillation, not an invention. Ideas we depend on:
-- [**andrej-karpathy-skills**](https://github.com/forrestchang/andrej-karpathy-skills) — the 4 principles
-- [**smart-ralph**](https://github.com/Nibzard/smart-ralph) — spec engine + stop-hook loops
+- [**Andrej Karpathy's 4 principles**](https://github.com/forrestchang/andrej-karpathy-skills) — the discipline foundation
+- [**GitHub Spec Kit**](https://github.com/github/spec-kit) — spec-driven development boiled down
 - [**superpowers**](https://github.com/obra/superpowers) — subagent discipline + TDD + two-stage review
-- [**BMAD-METHOD**](https://github.com/bmad-code-org/BMAD-METHOD) — personas + party mode + adversarial review
-- [**get-shit-done**](https://github.com/ryangentry/get-shit-done) — wave execution + decision locking + multi-source audit
-- [**gstack**](https://github.com/garrytan/gstack) (Garry Tan) — planning reviews + DX philosophy
-- [**pua**](https://github.com/tanweai/pua) — persistence + 3 red lines
-- [**claude-mem**](https://github.com/thedotmack/claude-mem) — automatic cross-session memory
-- [**frontend-design skill**](https://claude.ai/skills/frontend-design) — distinctive UI (Anthropic official)
-- **context7** (Upstash) + **sequential-thinking** (Anthropic) + **chrome-devtools-mcp** (Chrome team)
+- [**pua**](https://github.com/tanweai/pua) — the three red lines
+- [**claude-mem**](https://github.com/thedotmack/claude-mem) — cross-session memory
+- **Anthropic official** — context7 (Upstash), sequential-thinking, chrome-devtools-mcp, frontend-design skill
 ## License
@@ -159,9 +131,9 @@ MIT. See [`LICENSE`](./LICENSE).
 ## Contributing
-- Issues: https://github.com/wdx/curdx-flow/issues
-- PRs welcome but please walk through `/curdx-flow:spec` yourself first (eat your own dogfood)
-- Code of conduct: be kind + specific + don't rage at the LLM
+- Issues: https://github.com/curdx/curdx-flow/issues
+- PRs: please eat your own dogfood — run `/curdx-flow:spec` on your own PR first
+- Be specific. No rage-at-the-LLM. The point is to make Claude reliable, not to whine about when it isn't.
 ---

package/README.zh.md CHANGED Viewed

@@ -23,8 +23,8 @@ CurDX-Flow 是一个 Claude Code 插件，把 6 个验证过的 AI 工程工作
 - **8 个可组合 Gate** — Karpathy / Verification / TDD / Coverage / Adversarial / Edge-Case / Security / DevEx
 - **4 种执行策略** — linear / subagent / stop-hook / wave（自动路由）
 - **10 个知识文档** — 规格驱动 / POC-First / 原子提交 / 执行策略 / ...
-- **5 个 hook 事件** — SessionStart / InstructionsLoaded / Stop / PreToolUse / PostToolUseFailure
-- **3 个自动安装的 MCP** — context7 / sequential-thinking / chrome-devtools
+- **4 个 hook 事件** — SessionStart / InstructionsLoaded / Stop / PreToolUse
+- **2 个自动安装的 MCP + 1 个推荐插件** — context7 / sequential-thinking（plugin.json 内置）+ chrome-devtools-mcp（recommended，beta.8 解耦）
 - **优雅降级** — 依赖缺失时进入 fallback 模式并清晰告知
 ## 为什么用

package/agent-preamble/preamble.md CHANGED Viewed

@@ -30,35 +30,58 @@
 - Do not say done/fixed/working without evidence
 - Tests first, goals first
+### 5. Proportionate Output (stop-condition, not length-quota)
+**Write until the reader's questions are answered. Then stop.** There is no minimum length, no maximum length, no target range. Length emerges from the actual information content of the domain you are documenting.
+Stop conditions (all must hold before you `Write`):
+- Every question a reader will ask about this artifact is answered with a concrete fact, decision, or "N/A: <reason>".
+- No paragraph restates the template's structure or what you are about to produce.
+- No paragraph repeats upstream content (the goal from `.state.json`, a section of requirements.md in your design.md) — reference it instead.
+- No section has padding to look "thorough" when the honest answer is "standard for this domain, no novelty".
+Research reference: Anthropic's own prompt guidance — ["arbitrary iteration caps" are an anti-pattern](https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices); use a stop condition instead. Claude Opus 4.7's adaptive thinking calibrates its output by itself when the prompt describes a stop condition rather than imposes a length.
+Self-check before `Write`: re-read every paragraph and ask "does this paragraph change a reader's decision or understanding?" If no, delete it. Iterate.
 ---
 ## L2: Mandatory Tool Rules (enforced)
 ### Documentation lookup → context7 MCP
-For any question involving a library / framework / SDK / CLI / API:
+Query `context7` when EITHER is true:
+- The library API is version-sensitive (recent breaking change, typed API in a new version, deprecated method you're considering).
+- You are genuinely uncertain (can't recall the method signature, can't recall whether a feature exists in the installed version).
 ```
 1. mcp__context7__resolve-library-id("react")     → resolve library ID
 2. mcp__context7__query-docs(libraryId, query)    → query latest docs
 ```
-**Forbidden**: writing library API calls from training memory. Training data may be stale.
+Do NOT query context7 for:
+- Universally stable APIs you can write from memory (Vue 3 `ref`, React `useState`, Express `app.get`, SQL `SELECT`).
+- Syntax you would paste into a test file without thinking.
+- Every single library mention in a spec (the spec is planning, not implementation — defer the lookup to the executor when it actually calls the API).
+**Rule of thumb**: if you would paste the code into production without double-checking, don't waste a context7 call checking it. If you would hesitate, query. Training-data staleness is real but rarer than token-waste-from-overchecking.
-**Fallback**: when context7 MCP is unavailable, use WebSearch with a version number, and annotate the output with
-"⚠️ context7 unavailable — documentation may not be current".
+**Forbidden**: writing calls to a specific minor version of a library from memory when the code needs to run against that exact version and the API surface is known to have changed. Then you MUST query context7.
+**Fallback**: when context7 MCP is unavailable, use WebSearch with a version number, and annotate the output with "⚠️ context7 unavailable — documentation may not be current".
 ---
 ### Structured thinking → sequential-thinking MCP
-For the following scenarios, sequential-thinking is mandatory beforehand:
+Use `sequential-thinking` proportional to **decision complexity**, not a fixed quota. The numbers below are **ceilings for genuinely hard cases**, not floors to hit:
+| Task | Guideline |
+|------|-----------|
-- Planning (≥5 thoughts)
-- Architecture design (≥8 thoughts)
-- Epic decomposition (≥10 thoughts)
-- Adversarial review (≥6 thoughts)
-- Complex bug root-cause analysis (≥5 thoughts)
+**Principle**: running 8 thoughts to pick between Vue and React for a Todo is waste. Running 1 thought to architect a distributed queue is irresponsible. Match effort to stakes.
+Hard rule: do NOT emit empty thoughts ("Thought 4: let me also consider X… X is fine"). If you've reached the answer, stop.
 ```
 mcp__sequential-thinking__sequentialthinking({
@@ -70,7 +93,7 @@ mcp__sequential-thinking__sequentialthinking({
 ```
 **Fallback**: when seq-think is unavailable, simulate it inside `<thinking>...</thinking>` blocks
-in the response, still listing numbered thoughts (at least 5).
+in the response, still listing numbered thoughts proportional to real decision complexity.
 ---
@@ -210,5 +233,52 @@ When you need to delegate to a sub-agent:
 ---
+## L8: Long-artifact handling (truncation prevention)
+When your job is to produce a long Markdown artifact (`tasks.md`, `verification-report.md`, `review-report.md`, `research.md`, `requirements.md`, `design.md`, etc.), follow these rules. Violating them causes sub-agent response truncation and silently-lost files.
+### Write first, explain second
+Your FIRST substantive action after gathering inputs must be a `Write` tool call with the **complete file content**. Do NOT paste the content as assistant text before writing.
+- ✗ *"Here's the tasks.md I'll write:"* followed by a 500-line markdown code block, then a `Write` call containing the same 500 lines — this doubles the output tokens and usually hits the truncation limit mid-`Write`, leaving the file missing or partial.
+- ✓ Immediately `Write` the file with full content. Then output a ≤ 5-line summary.
+### Do not preview
+Never output the file's content in your response. The file IS the deliverable — the reader opens it. Your response is just the ack that you wrote it.
+### After write, summarize only
+After `Write` returns success, respond with **at most 5 lines** summarizing what you wrote:
+```
+✓ Wrote .flow/specs/<spec>/tasks.md
+  40 tasks across 5 phases
+  Coverage: FR 10/10, AC 12/12, AD 4/4
+  Next: /curdx-flow:implement
+```
+Do not re-paste any file contents. Do not narrate your reasoning. Do not list every task inline.
+### Split when a single `Write` call would approach the output budget
+If the artifact is large enough that one `Write` call risks truncation (sub-agent output tokens are finite), split it:
+- `tasks.md` references `tasks-phase-1.md` … `tasks-phase-5.md`
+- Each phase file is its own `Write` call
+- The index file is a short table linking to the phase files
+Judge by the nature of the content, not a hardcoded line count — the same content density varies wildly in line count depending on how many tables and lists it contains. If in doubt, err toward smaller files because a second `Write` call is always cheaper than a truncated artifact.
+### If you see a token-budget warning
+Stop narrating and call `Write` with whatever content is ready. Sub-agents do not have a "next response" — continuation is not possible after truncation. Save what you have, then return.
+### Why this matters
+Sub-agents invoked via the `Task` tool have a ~16 K output-token budget per invocation. A naive agent that previews then writes consumes those tokens twice — once as prose, once inside the tool call — and truncation typically lands inside the `Write` call itself. The parent command then reports "agent did not complete" and re-dispatches, burning compute for no new artifact. Writing first eliminates the failure mode at the source.
+---
 **Remember**: this preamble exists because, without discipline, AI tends to slack off, hallucinate, and over-engineer.
 These rules are not constraints — they are the tools that make you reliable.