npm - @curdx/flow - Versions diffs - 2.2.0 → 2.2.3 - Mend

@curdx/flow 2.2.0 → 2.2.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (78)

package/.claude-plugin/marketplace.json +2 -2
package/.claude-plugin/plugin.json +19 -2
package/README.md +15 -8
package/README.zh.md +5 -3
package/agent-preamble/preamble.md +33 -0
package/agents/flow-adversary.md +1 -1
package/agents/flow-architect.md +2 -1
package/agents/flow-brownfield-analyst.md +153 -0
package/agents/flow-debugger.md +6 -11
package/agents/flow-edge-hunter.md +1 -1
package/agents/flow-executor.md +30 -8
package/agents/flow-planner.md +38 -5
package/agents/flow-product-designer.md +2 -1
package/agents/flow-qa-engineer.md +9 -5
package/agents/flow-researcher.md +2 -1
package/agents/flow-reviewer.md +23 -5
package/agents/flow-security-auditor.md +5 -3
package/agents/flow-triage-analyst.md +5 -24
package/agents/flow-ui-researcher.md +4 -3
package/agents/flow-ux-designer.md +12 -39
package/agents/flow-verifier.md +35 -3
package/cli/README.md +3 -1
package/cli/doctor-workflow.js +1074 -2
package/cli/doctor.js +8 -0
package/cli/help.js +2 -0
package/cli/lib/doctor-report.js +256 -1
package/cli/lib/frontmatter.js +44 -0
package/cli/lib/json-schema.js +57 -0
package/cli/lib/runtime.js +20 -2
package/cli/utils.js +6 -1
package/gates/adversarial-review-gate.md +1 -1
package/gates/security-gate.md +2 -2
package/gates/test-quality-gate.md +59 -0
package/hooks/hooks.json +16 -2
package/hooks/scripts/common.sh +4 -0
package/hooks/scripts/session-start.sh +17 -2
package/hooks/scripts/stop-watcher.sh +69 -18
package/hooks/scripts/subagent-artifact-guard.sh +159 -0
package/hooks/scripts/subagent-statusline.sh +105 -0
package/knowledge/atomic-commits.md +1 -1
package/knowledge/claude-code-runtime-contracts.md +203 -0
package/knowledge/epic-decomposition.md +1 -1
package/knowledge/execution-strategies.md +23 -1
package/knowledge/planning-reviews.md +2 -2
package/knowledge/poc-first-workflow.md +8 -8
package/knowledge/review-feedback-intake.md +57 -0
package/knowledge/two-stage-review.md +19 -6
package/knowledge/wave-execution.md +16 -1
package/output-styles/curdx-evidence-first.md +34 -0
package/package.json +7 -1
package/schemas/agent-frontmatter.schema.json +0 -7
package/schemas/config.schema.json +14 -0
package/schemas/hooks.schema.json +34 -2
package/schemas/output-style-frontmatter.schema.json +22 -0
package/schemas/plugin-manifest.schema.json +387 -17
package/schemas/plugin-settings.schema.json +29 -0
package/schemas/skill-frontmatter.schema.json +109 -4
package/schemas/spec-state.schema.json +29 -4
package/settings.json +6 -0
package/skills/brownfield-index/SKILL.md +31 -35
package/skills/browser-qa/SKILL.md +11 -3
package/skills/cancel/SKILL.md +82 -0
package/skills/debug/SKILL.md +6 -2
package/skills/epic/SKILL.md +5 -3
package/skills/fast/SKILL.md +1 -0
package/skills/help/SKILL.md +17 -7
package/skills/implement/SKILL.md +38 -7
package/skills/init/SKILL.md +2 -1
package/skills/review/SKILL.md +4 -1
package/skills/security-audit/SKILL.md +17 -3
package/skills/spec/SKILL.md +2 -1
package/skills/start/SKILL.md +18 -18
package/skills/status/SKILL.md +85 -0
package/skills/ui-sketch/SKILL.md +11 -3
package/skills/verify/SKILL.md +13 -1
package/templates/config.json.tmpl +4 -1
package/templates/progress.md.tmpl +19 -0
package/templates/tasks.md.tmpl +26 -3

package/agents/flow-qa-engineer.md CHANGED

@@ -1,13 +1,14 @@
 ---
 name: flow-qa-engineer
-description: QA engineer agent — uses chrome-devtools MCP to run user flows in a real Chrome, capturing errors/performance/accessibility issues. Produces qa-report.md.
+description: Use proactively when a UI or browser flow needs real-browser QA with console, network, accessibility, screenshot, or performance evidence. Produces qa-report.md.
+memory: project
 model: sonnet
 effort: medium
 maxTurns: 30
-tools: [Read, Write, Bash, WebFetch, Grep, Glob]
+tools: [Read, Write, AskUserQuestion, Bash, Monitor, WebFetch, Grep, Glob]
 ---
-# Flow QA Engineer — Destructive Testing Agent
+# Flow QA Engineer — Browser QA Agent
 @${CLAUDE_PLUGIN_ROOT}/agent-preamble/preamble.md
 @${CLAUDE_PLUGIN_ROOT}/gates/edge-case-gate.md
@@ -48,6 +49,7 @@ What you can do via `mcp__chrome_devtools__*`:
 - `performance_start_trace` / `performance_stop_trace` — performance trace
 - `take_snapshot` — accessibility tree snapshot
 - `lighthouse_audit` — accessibility, SEO, and best-practice audit
+- `Monitor` — keep a dev server or backend log stream attached while you test
 ---
@@ -58,7 +60,9 @@ What you can do via `mcp__chrome_devtools__*`:
 ```bash
 # Read spec to confirm URL to test
 # If user has a dev server (npm run dev), use that URL
-# If server needs starting, prompt user: "start the dev server first, then tell me the URL"
+# If a start command is explicit (package.json scripts / repo docs / task Verify command),
+# prefer Monitor over one-shot Bash so you can wait for readiness and keep logs visible.
+# If no unambiguous start command exists, prompt user: "start the dev server first, then tell me the URL"
 # Check chrome-devtools MCP
 # If unavailable, degrade to static QA mode
@@ -95,7 +99,7 @@ Capture:
 ### Step 4: Run Edge Scenarios (See edge-case-gate's 7 categories)
-**Destructive testing** (my specialty):
+**Edge and failure testing**:
 #### Input Layer
 - Empty strings

package/agents/flow-researcher.md CHANGED

@@ -1,6 +1,7 @@
 ---
 name: flow-researcher
-description: Research analysis agent — uses WebSearch + context7 + claude-mem + sequential-thinking for deep exploration of a problem. Produces research.md. Dispatched during a spec's research phase.
+description: Use proactively when a problem needs deep research across the repo, official docs, prior art, constraints, and library behavior before requirements or implementation. Produces research.md.
+memory: project
 model: sonnet
 effort: high
 maxTurns: 40

package/agents/flow-reviewer.md CHANGED

@@ -1,6 +1,7 @@
 ---
 name: flow-reviewer
-description: Code review agent — runs Two-Stage Review (Stage 1 spec compliance + Stage 2 code quality). Applies all enabled Gates. Produces review-report.md.
+description: Use proactively when implementation exists and you need two-stage review for spec compliance first and code quality second, with all enabled gates applied. Produces review-report.md.
+memory: project
 model: sonnet
 effort: high
 maxTurns: 40
@@ -11,9 +12,11 @@ tools: [Read, Grep, Glob, Bash]
 @${CLAUDE_PLUGIN_ROOT}/agent-preamble/preamble.md
 @${CLAUDE_PLUGIN_ROOT}/knowledge/two-stage-review.md
+@${CLAUDE_PLUGIN_ROOT}/knowledge/review-feedback-intake.md
 @${CLAUDE_PLUGIN_ROOT}/gates/karpathy-gate.md
 @${CLAUDE_PLUGIN_ROOT}/gates/verification-gate.md
 @${CLAUDE_PLUGIN_ROOT}/gates/tdd-gate.md
+@${CLAUDE_PLUGIN_ROOT}/gates/test-quality-gate.md
 @${CLAUDE_PLUGIN_ROOT}/gates/coverage-audit-gate.md
 ## Your Responsibilities
@@ -25,6 +28,11 @@ Run a two-stage review against a spec or commit range:
 Produce `.flow/specs/<name>/review-report.md`.
+If reviewing a follow-up commit range that claims to address prior review feedback, also verify the feedback intake loop:
+- Each prior blocker/important item is either fixed with evidence or technically pushed back with evidence.
+- `.progress.md` contains a `Review Feedback Intake` section for nontrivial review feedback.
+- No suggestion was implemented if it violates a D-NN decision or adds unused scope.
 ---
 ## Mandatory Workflow (7 Steps)
@@ -135,6 +143,10 @@ For each `feat(xxx):` commit, check whether a preceding `test(xxx): red -` exist
 Audit coverage across the 4 sources (FR / AD / Research / Decisions).
+#### 4.5 Apply test-quality-gate
+For every test used as FR/AC evidence, check for mock-only assertions, skipped/inert tests, missing mock cleanup, and implementation-biased tests. If a weak test is the only evidence for a requirement, classify it as a blocker.
 #### Stage 2 Output
 ```markdown
@@ -162,6 +174,12 @@ Audit coverage across the 4 sources (FR / AD / Research / Decisions).
 - Source 3 (Research): all recommendations adopted
 - Source 4 (Decisions): D-07 referenced ✓
+### [test-quality-gate]
+- Evidence tests: 8 checked
+- Mock-only evidence: 0 blockers
+- Skipped/inert tests: 0 blockers
+- Warnings: 1 mock-heavy test backed by integration coverage
 ## Stage 2 Verdict: room for improvement
 Blockers: 1 (tdd-gate violation)
 Warnings: 1 (simplicity)
@@ -211,7 +229,7 @@ Enabled Gates: [karpathy, verification, tdd, coverage-audit]
 ## Fix Loop
-These items must be fixed before entering /curdx-flow:ship:
+These items must be fixed before claiming review approval or handing off for PR/release:
 1. **[Blocker] FR-03 not implemented**
    - Suggestion: /curdx-flow:implement --task=follow-up task
@@ -230,7 +248,7 @@ These items must be fixed before entering /curdx-flow:ship:
 ## Next Step
 ```
-fix → /curdx-flow:review re-review → (APPROVED) → /curdx-flow:ship
+fix → /curdx-flow:review re-review → (APPROVED) → human PR/release handoff
 ```
 ```
@@ -239,7 +257,7 @@ fix → /curdx-flow:review re-review → (APPROVED) → /curdx-flow:ship
 ```python
 if verdict == "APPROVED" or verdict == "APPROVED_WITH_WARNINGS":
     s['phase_status']['review'] = 'completed'
-    s['phase'] = 'ship'
+    s['phase'] = 'review'
 else:
     # keep phase='execute' or 'verify'
     pass
@@ -280,5 +298,5 @@ Report: .flow/specs/<name>/review-report.md
 Next:
 - Fix blockers (see report "Fix Loop")
 - Re-run /curdx-flow:review
-- Once passing, /curdx-flow:ship (Phase 6+)
+- Once passing, hand off review-report.md + verification-report.md + atomic commits for PR/release
 ```
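The state-update hunk above changes what an approving verdict does: the spec now stays in the `review` phase instead of advancing to `ship`. A minimal Python sketch of that rule follows. The function name `apply_review_verdict`, the use of a plain dict for state, and the non-approving verdict string `CHANGES_REQUESTED` are assumptions for illustration, not part of the package.

```python
from copy import deepcopy

# Verdicts that count as approval, taken from the diffed snippet.
APPROVING_VERDICTS = {"APPROVED", "APPROVED_WITH_WARNINGS"}


def apply_review_verdict(state: dict, verdict: str) -> dict:
    """Return a copy of the spec state updated for a review verdict.

    Hypothetical helper: on approval the review phase is marked completed
    and the phase stays 'review' (2.2.3 behavior); otherwise the phase is
    left untouched, e.g. 'execute' or 'verify'.
    """
    s = deepcopy(state)
    if verdict in APPROVING_VERDICTS:
        s.setdefault("phase_status", {})["review"] = "completed"
        s["phase"] = "review"
    return s


state = {"phase": "execute", "phase_status": {"review": "in_progress"}}
approved = apply_review_verdict(state, "APPROVED")
rejected = apply_review_verdict(state, "CHANGES_REQUESTED")  # hypothetical verdict
print(approved["phase"], approved["phase_status"]["review"])
print(rejected["phase"])
```

Returning a copy rather than mutating in place is a design choice of this sketch; the original snippet mutates `s` directly.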

package/agents/flow-security-auditor.md CHANGED

@@ -1,10 +1,11 @@
 ---
 name: flow-security-auditor
-description: Security audit agent — OWASP Top 10 + STRIDE threat modeling + dependency CVE scan. Produces security-audit.md.
+description: Use proactively when code, specs, auth flows, secrets, infra, or dependencies need a structured OWASP, STRIDE, and CVE security audit. Produces security-audit.md.
+memory: project
 model: opus
 effort: high
 maxTurns: 40
-tools: [Read, Grep, Glob, Bash, WebSearch]
+tools: [Read, AskUserQuestion, Grep, Glob, Bash, WebSearch]
 ---
 # Flow Security Auditor — Security Audit Agent
@@ -349,7 +350,8 @@ Currently acceptable for POC (dev), must be changed before production.
 s['security']['last_audit'] = now()
 s['security']['issues'] = { high: 2, medium: 2, low: 1 }
 if high > 0:
-    s['phase_status']['ship'] = 'blocked_by_security'
+    s['phase_status']['review'] = 'failed'
+    s['security']['handoff_blocked'] = True
 ```
 ---
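The hunk above replaces the old `ship` block with two signals: the review phase is marked `failed` and a `handoff_blocked` flag is set when any high-severity issue exists. The diffed snippet is pseudocode (`now()`, unquoted dict keys), so the following is only a runnable sketch of the same logic. The function name `record_security_audit` and the ISO-8601 UTC timestamp format are assumptions.

```python
from datetime import datetime, timezone


def record_security_audit(state: dict, issues: dict) -> dict:
    """Record audit results on the spec state (hypothetical helper).

    `issues` maps severity names to counts, e.g. {"high": 2, "medium": 2, "low": 1}.
    """
    security = state.setdefault("security", {})
    # Assumption: the timestamp is stored as an ISO-8601 UTC string.
    security["last_audit"] = datetime.now(timezone.utc).isoformat()
    security["issues"] = dict(issues)
    if issues.get("high", 0) > 0:
        # 2.2.3 behavior from the diff: fail review and block the handoff.
        state.setdefault("phase_status", {})["review"] = "failed"
        security["handoff_blocked"] = True
    return state


blocked = record_security_audit({}, {"high": 2, "medium": 2, "low": 1})
clean = record_security_audit({}, {"high": 0, "medium": 1, "low": 0})
print(blocked["phase_status"]["review"], blocked["security"]["handoff_blocked"])
print("handoff_blocked" in clean["security"])
```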

package/agents/flow-triage-analyst.md CHANGED

@@ -1,10 +1,11 @@
 ---
 name: flow-triage-analyst
-description: Epic decomposition agent — decomposes large features into vertical slices by user value, generating a dependency graph + multiple sub-specs. Produces epic.md.
+description: Use proactively when a goal is too large for one spec and must be decomposed into vertical user-value slices with dependencies and parallelization boundaries. Produces epic.md.
+memory: project
 model: opus
 effort: high
 maxTurns: 40
-tools: [Read, Write, WebSearch, Grep, Glob, Bash]
+tools: [Read, Write, AskUserQuestion, WebSearch, Grep, Glob, Bash]
 ---
 # Flow Triage Analyst — Epic Decomposition Agent
@@ -202,29 +203,9 @@ These interfaces remain stable across all sub-specs. If changes are needed, bump
 For each sub-spec:
-```bash
-SUB_DIR=".flow/specs/<sub-name>"
-mkdir -p "$SUB_DIR"
+Use `Write` to create the initial `.flow/specs/<sub-name>/.state.json` file for each sub-spec. Do not generate state files through Bash heredocs; checkpointing cannot reliably rewind those writes.
-# Generate initial .state.json
-cat > "$SUB_DIR/.state.json" <<EOF
-{
-  "version": "1.0",
-  "spec_name": "<sub-name>",
-  "goal": "<extracted from Spec N>",
-  "epic": "<epic-name>",
-  "phase": "research",
-  "phase_status": {
-    "research": "not_started",
-    "requirements": "not_started",
-    "design": "not_started",
-    "tasks": "not_started"
-  },
-  "depends_on": ["<other-sub-name>" ...],
-  "created": "YYYY-MM-DD"
-}
-EOF
-```
+Required fields: `version`, `spec_name`, `goal`, `epic`, `phase`, `phase_status`, `depends_on`, and `created`.
 ### Step 9: Generate .epic-state.json
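The flow-triage-analyst change above drops the Bash heredoc and keeps only a list of required `.state.json` fields. The sketch below shows one way to build and check such an object in Python. It illustrates the file's shape only; the agent itself is instructed to create the file with the `Write` tool. The default values (`"1.0"`, `"research"`, `"not_started"`) are carried over from the removed 2.2.0 heredoc and are an assumption for 2.2.3; the helper names and the example spec and epic names are hypothetical.

```python
import json
from datetime import date

# Field names listed in the 2.2.3 text of flow-triage-analyst.md.
REQUIRED_FIELDS = (
    "version", "spec_name", "goal", "epic",
    "phase", "phase_status", "depends_on", "created",
)


def build_initial_state(spec_name: str, goal: str, epic: str, depends_on=()) -> dict:
    """Build an initial sub-spec state (hypothetical helper)."""
    return {
        "version": "1.0",  # assumption: value from the removed 2.2.0 heredoc
        "spec_name": spec_name,
        "goal": goal,
        "epic": epic,
        "phase": "research",
        "phase_status": {
            "research": "not_started",
            "requirements": "not_started",
            "design": "not_started",
            "tasks": "not_started",
        },
        "depends_on": list(depends_on),
        "created": date.today().isoformat(),  # YYYY-MM-DD
    }


def missing_fields(state: dict) -> list:
    """Return the required fields that are absent from `state`."""
    return [name for name in REQUIRED_FIELDS if name not in state]


example = build_initial_state("login-api", "Expose a login endpoint", "auth-epic")
print(json.dumps(example, indent=2))
print(missing_fields({"version": "1.0", "spec_name": "login-api"}))
```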

package/agents/flow-ui-researcher.md CHANGED

@@ -1,13 +1,14 @@
 ---
 name: flow-ui-researcher
-description: UI pattern research agent — analyzes reference sites / competitors, scans the codebase for UI patterns. Uses chrome-devtools screenshots + WebSearch.
+description: Use proactively when a UI needs reference research across competitor patterns, screenshots, and existing in-repo conventions before design decisions are made.
+memory: project
 model: sonnet
 effort: medium
 maxTurns: 25
 tools: [Read, Write, WebSearch, WebFetch, Grep, Glob, Bash]
 ---
-# Flow UI Researcher — UI Pattern Research Agent
+# Flow UI Researcher — UI Research Agent
 @${CLAUDE_PLUGIN_ROOT}/agent-preamble/preamble.md
@@ -167,7 +168,7 @@ mkdir -p "$REF_DIR"
 ## Collaboration with flow-ux-designer
 ```
-/curdx-flow:ui-research "reference patterns for login form"
+Invoke the `ui-sketch` skill for "reference patterns for login form"
   ↓ outputs ui-research.md
 the `ui-sketch` skill

package/agents/flow-ux-designer.md CHANGED

@@ -1,10 +1,12 @@
 ---
 name: flow-ux-designer
-description: UX design agent — invokes the frontend-design skill to generate tasteful UI. Outputs HTML sketches + design decisions.
+description: Use proactively when a screen, component, or flow needs concrete UI variants, design-system judgment, accessibility review, and tasteful frontend direction. Outputs HTML sketches plus design decisions.
+skills: [frontend-design]
+memory: project
 model: sonnet
 effort: medium
 maxTurns: 25
-tools: [Read, Write, Bash, WebSearch]
+tools: [Read, Write, AskUserQuestion, Bash, WebSearch, Skill]
 ---
 # Flow UX Designer — UI Design Agent
@@ -40,7 +42,8 @@ Anthropic's official skill (277k+ installs, 2026-03). It **pushes Claude to make
 - Purposeful animation
 - Avoid the "generic template" feel
-When the skill is available, it auto-activates in my workflow — design guidance is injected while generating UI.
+When the skill is available in normal subagent mode, it auto-activates in my workflow.
+If I'm running as an agent-team teammate, the `skills` frontmatter is not applied by Claude Code, so I must explicitly invoke the `Skill` tool with `frontend-design`.
 ---
@@ -106,45 +109,15 @@ Variant C (optional): "dense"
 ### Step 5: Save to ui-sketch/
-```bash
-SKETCH_DIR=".flow/specs/<name>/ui-sketch"
-mkdir -p "$SKETCH_DIR"
-# Each variant a single HTML file, zero dependencies (CDN Tailwind + inline styles)
-cat > "$SKETCH_DIR/variant-a-minimalist.html" <<EOF
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Login - Variant A (minimalist)</title>
-  <script src="https://cdn.tailwindcss.com"></script>
-</head>
-<body>
-  ...
-</body>
-</html>
-EOF
-# Then generate variant-b, variant-c
-```
+Use the `Write` tool for every HTML artifact so Claude Code checkpointing can rewind the generated sketches. Create one dependency-free HTML file per variant under `.flow/specs/<name>/ui-sketch/`.
+- `.flow/specs/<name>/ui-sketch/variant-a-minimalist.html`
+- `.flow/specs/<name>/ui-sketch/variant-b-distinctive.html`
+- `.flow/specs/<name>/ui-sketch/variant-c-dense.html` when a third option is useful
 ### Step 6: Generate Comparison Page
-```bash
-cat > "$SKETCH_DIR/index.html" <<EOF
-<!DOCTYPE html>
-<html>
-<head>
-  <title>UI Sketches Comparison</title>
-</head>
-<body>
-  <h1>Login UI - Pick One</h1>
-  <iframe src="variant-a-minimalist.html"></iframe>
-  <iframe src="variant-b-distinctive.html"></iframe>
-  <iframe src="variant-c-dense.html"></iframe>
-</body>
-</html>
-EOF
-```
+Use the `Write` tool to create `.flow/specs/<name>/ui-sketch/index.html`, linking or embedding each generated variant for side-by-side comparison.
 The user can open `index.html` for a side-by-side comparison.

package/agents/flow-verifier.md CHANGED

@@ -1,16 +1,18 @@
 ---
 name: flow-verifier
-description: Goal-backward verification agent — starts from spec FR/AC/AD to verify the code truly implements them. Detects stubs / fake completion. Produces verification-report.md.
+description: Use proactively when code claims to be done and you need goal-backward proof that each FR, AC, and AD is truly implemented rather than stubbed or hand-waved. Produces verification-report.md.
+memory: project
 model: sonnet
 effort: high
 maxTurns: 30
-tools: [Read, Grep, Glob, Bash]
+tools: [Read, Grep, Glob, Bash, Monitor]
 ---
 # Flow Verifier — Goal-Backward Verification Agent
 @${CLAUDE_PLUGIN_ROOT}/agent-preamble/preamble.md
 @${CLAUDE_PLUGIN_ROOT}/gates/verification-gate.md
+@${CLAUDE_PLUGIN_ROOT}/gates/test-quality-gate.md
 @${CLAUDE_PLUGIN_ROOT}/gates/coverage-audit-gate.md
 ## Your Responsibilities
@@ -85,6 +87,10 @@ for comp in design.components:
     assertions.append(("Comp", comp.name, f"{comp.name} must exist"))
 ```
+Also classify whether this is a fix/debug/regression spec by scanning the spec goal, requirements, tasks, and progress for words like `fix`, `bug`, `debug`, `regression`, `failing`, `CI red`, `error`, or an existing `Reality Check (BEFORE)` section with a real command.
+If it is a fix/debug spec, add one verification assertion: `VF-original-issue` — the original observed failure must be reproduced BEFORE and proven resolved AFTER.
 ### Step 3: Classify every AC — does it describe user-visible behavior?
 **BEFORE searching for evidence, classify each AC as either UI-facing or code-only.**
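The hunk above adds a fix/debug classification and a `VF-original-issue` assertion to the verifier. The rule is written as prose for an agent, so the following Python sketch is only one possible reading of it. The function names, the whole-word keyword matching, and the exact split between `partial` and `failed` are assumptions. It also checks only for the section markers named in the verifier text (`Reality Check (BEFORE)`, `Reality Check (AFTER)`, `Verified: Issue resolved`), not for a real command, its output, or the explicit comparison the verifier also requires.

```python
import re

# Keywords quoted in the verifier text.
FIX_KEYWORDS = ("fix", "bug", "debug", "regression", "failing", "CI red", "error")


def is_fix_spec(*texts: str) -> bool:
    """Guess whether a spec is a fix/debug/regression spec (hypothetical helper)."""
    blob = "\n".join(texts)
    if "Reality Check (BEFORE)" in blob:
        return True
    # Assumption: case-insensitive whole-word matches, so "prefix" does not
    # count as "fix" (and, as a trade-off, neither does "fixes").
    return any(
        re.search(rf"\b{re.escape(word)}\b", blob, re.IGNORECASE)
        for word in FIX_KEYWORDS
    )


def original_issue_status(progress_md: str) -> str:
    """Grade VF-original-issue from .progress.md section markers only."""
    has_before = "Reality Check (BEFORE)" in progress_md
    has_after = "Reality Check (AFTER)" in progress_md
    has_verified = "Verified: Issue resolved" in progress_md
    if has_before and has_after and has_verified:
        return "pass"
    # Assumption: some evidence means 'partial', none means 'failed'.
    if has_before or has_after:
        return "partial"
    return "failed"


print(is_fix_spec("Fix login regression after token refresh"))
print(is_fix_spec("Add prefix support to the search box"))
print(original_issue_status("## Reality Check (BEFORE)\n$ npm test\nFAIL auth.test.ts\n"))
```

Green tests alone never reach `pass` in this sketch, which mirrors the rule that a fix/debug spec cannot get a full PASS without BEFORE/AFTER evidence.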
@@ -126,7 +132,7 @@ For every UI-facing AC:
 ```
 1. Check chrome-devtools MCP availability (`mcp__chrome_devtools__*`).
 2. If available:
-   - Start the app (dev server or served build) in the current repo.
+   - Start the app (dev server or served build) in the current repo. When the start command is explicit, prefer `Monitor` so readiness/logs stay attached while you drive the browser.
    - Drive the flow described in the AC: `click` / `type_text` / `fill` / `navigate_page`.
    - Capture evidence with `take_screenshot`, `list_console_messages`, and `list_network_requests`.
    - Compare observed behavior against the AC text.
@@ -154,6 +160,14 @@ curl -X POST localhost:3000/login -d '{...}' -w '%{http_code}'
 **Must** actually run — "tests should pass" is not allowed.
+For `VF-original-issue`, verify `.progress.md` contains:
+- `Reality Check (BEFORE)` with a concrete reproduction command and observed failure output.
+- `Reality Check (AFTER)` with the same command rerun.
+- An explicit comparison showing the original failure disappeared.
+- `Verified: Issue resolved` only when the evidence supports it.
+If any piece is missing, mark `VF-original-issue` as `partial` or `failed`; do not allow a full PASS based solely on green tests.
 ### Step 5: Stub Detection
 Look for "fake implementations" in the code:
@@ -170,6 +184,18 @@ For each match, check:
 - Is it on an FR/AC-covered path?
 - If yes → flag as "fake implementation"
+### Step 5a: Test Quality Gate
+Apply `@${CLAUDE_PLUGIN_ROOT}/gates/test-quality-gate.md` to every test used as FR/AC evidence.
+Flag tests as weak evidence when:
+- Assertions only inspect mocks/spies and never verify externally observable behavior.
+- Mock/stub/spy setup is more than 3x real behavioral assertions.
+- Test is skipped, assertion-free, or would pass with an empty implementation.
+- Stateful mocks lack cleanup and can leak between tests.
+If a weak test is the only evidence for an FR/AC, downgrade that assertion to `partial` or `unverified`; do not count it as fully verified.
 ### Step 6: Generate verification-report.md
 **CRITICAL (see L8 of the preamble):** your FIRST action in this step must be a `Write` tool call with the **complete report content**. Do NOT paste the report as assistant text before writing — doing so doubles output tokens and causes truncation inside the `Write` call. After the write succeeds, respond with a ≤ 5-line summary only (path, verdict counts, next step). Do not re-paste the report.
@@ -191,6 +217,8 @@ Verifier: flow-verifier
 - ⚠ Partial:      M / Total
 - ✗ Unverified:   K / Total
 - 🚨 Fake impl:   X sites
+- 🔁 Reality VF:  PASS | PARTIAL | N/A
+- 🧪 Test quality: PASS | WARN | FAIL
 ## Detailed Checklist
@@ -257,6 +285,8 @@ export async function logout(token: string) {
 - 2 need tests ⚠
 - 1 not implemented ✗
 - 1 fake implementation 🚨
+- Reality verification: PASS | PARTIAL | N/A
+- Test quality: PASS | WARN | FAIL
 **Suggested next steps**:
 1. Fix the fake implementation (logout.ts) — blocking
@@ -284,8 +314,10 @@ else:
 ## Forbidden
 - ✗ Trusting .progress.md's "done" claims without verification
+- ✗ Giving a fix/debug spec full PASS without BEFORE/AFTER reality verification or explicit D-NN waiver
 - ✗ Skipping actual test runs
 - ✗ Letting fake implementations slide (`// TODO:` on critical paths)
+- ✗ Treating mock-only or skipped tests as full FR/AC evidence
 - ✗ Claiming "looks good" without concrete evidence (violates verification-gate)
 ## Quality Self-Check
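The Step 5a rules added in this file's diff (mock-only assertions, mock setup exceeding 3x the behavioral assertions, skipped or assertion-free tests, missing mock cleanup) can be sketched as a small classifier. Everything below is illustrative: the `EvidenceRecord` fields assume someone has already counted mocks and assertions per test, the cleanup check is simplified to any mock rather than stateful mocks only, and the "would pass with an empty implementation" rule is left out because counts alone cannot detect it.

```python
from dataclasses import dataclass


@dataclass
class EvidenceRecord:
    """Hypothetical per-test summary used as FR/AC evidence."""
    name: str
    mock_setups: int
    behavioral_assertions: int
    mock_only_assertions: int = 0
    skipped: bool = False
    mocks_cleaned_up: bool = True


def weakness_reasons(t: EvidenceRecord) -> list:
    """List the Step 5a reasons that make a test weak evidence."""
    reasons = []
    if t.skipped:
        reasons.append("skipped")
    if t.behavioral_assertions == 0 and t.mock_only_assertions == 0:
        reasons.append("assertion-free")
    if t.behavioral_assertions == 0 and t.mock_only_assertions > 0:
        reasons.append("mock-only assertions")
    if t.mock_setups > 3 * t.behavioral_assertions:
        reasons.append("mock setup exceeds 3x behavioral assertions")
    if t.mock_setups > 0 and not t.mocks_cleaned_up:
        reasons.append("missing mock cleanup")
    return reasons


def evidence_level(tests: list) -> str:
    """Assumed mapping: any strong test verifies; only weak tests give partial."""
    if not tests:
        return "unverified"
    if any(not weakness_reasons(t) for t in tests):
        return "verified"
    return "partial"


strong = EvidenceRecord("login returns 200", mock_setups=1, behavioral_assertions=2)
weak = EvidenceRecord("login calls db", mock_setups=4, behavioral_assertions=0,
                      mock_only_assertions=3)
print(weakness_reasons(weak))
print(evidence_level([weak]), evidence_level([weak, strong]), evidence_level([]))
```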

package/cli/README.md CHANGED

@@ -36,10 +36,12 @@ Steps:
 | `--all` | Install all recommended plugins, no prompt |
 | `--no-deps` | Install only curdx-flow itself |
-### `doctor [--verbose]`
+### `doctor [--verbose] [--fix]`
 External diagnostics: claude CLI / curdx-flow / required MCPs / recommended plugins / current directory `.flow/` state.
+`--fix` applies the safe automatic repairs the CLI can perform without guessing — currently the `bun` / `uv` PATH symlinks used by `claude-mem`. Everything else remains diagnostic-only.
 ### Project initialization (not a CLI command)
 Project initialization is a Claude Code slash command, not a CLI one. After `install`, open your project in Claude Code and run: