npm - @curdx/flow - Versions diffs - 2.0.0-beta.4 → 2.0.0-beta.5 - Mend

@curdx/flow 2.0.0-beta.4 → 2.0.0-beta.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/.claude-plugin/marketplace.json +1 -1
package/.claude-plugin/plugin.json +1 -1
package/agent-preamble/preamble.md +27 -10
package/agents/flow-adversary.md +15 -4
package/agents/flow-edge-hunter.md +17 -3
package/agents/flow-planner.md +11 -8
package/package.json +1 -1

package/.claude-plugin/marketplace.json CHANGED

@@ -6,7 +6,7 @@
   },
   "metadata": {
     "description": "Claude Code Discipline Layer — spec-driven workflow + goal-backward verification + Karpathy 4 principles enforced via gates. Stops Claude from faking \"done\" on non-trivial features.",
-    "version": "2.0.0-beta.4"
+    "version": "2.0.0-beta.5"
   },
   "plugins": [
     {

package/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "curdx-flow",
-  "version": "2.0.0-beta.4",
+  "version": "2.0.0-beta.5",
   "description": "Claude Code Discipline Layer — spec-driven workflow + goal-backward verification + Karpathy 4 principles enforced via gates. Stops Claude from faking \"done\" on non-trivial features.",
   "author": {
     "name": "wdx",

package/agent-preamble/preamble.md CHANGED

@@ -44,29 +44,46 @@
 ### Documentation lookup → context7 MCP
-For any question involving a library / framework / SDK / CLI / API:
+Query `context7` when EITHER is true:
+- The library API is version-sensitive (recent breaking change, typed API in a new version, deprecated method you're considering).
+- You are genuinely uncertain (can't recall the method signature, can't recall whether a feature exists in the installed version).
 ```
 1. mcp__context7__resolve-library-id("react")     → resolve library ID
 2. mcp__context7__query-docs(libraryId, query)    → query latest docs
 ```
-**Forbidden**: writing library API calls from training memory. Training data may be stale.
+Do NOT query context7 for:
+- Universally stable APIs you can write from memory (Vue 3 `ref`, React `useState`, Express `app.get`, SQL `SELECT`).
+- Syntax you would paste into a test file without thinking.
+- Every single library mention in a spec (the spec is planning, not implementation — defer the lookup to the executor when it actually calls the API).
-**Fallback**: when context7 MCP is unavailable, use WebSearch with a version number, and annotate the output with
-"⚠️ context7 unavailable — documentation may not be current".
+**Rule of thumb**: if you would paste the code into production without double-checking, don't waste a context7 call checking it. If you would hesitate, query. Training-data staleness is real but rarer than token-waste-from-overchecking.
+**Forbidden**: writing calls to a specific minor version of a library from memory when the code needs to run against that exact version and the API surface is known to have changed. Then you MUST query context7.
+**Fallback**: when context7 MCP is unavailable, use WebSearch with a version number, and annotate the output with "⚠️ context7 unavailable — documentation may not be current".
 ---
 ### Structured thinking → sequential-thinking MCP
-For the following scenarios, sequential-thinking is mandatory beforehand:
+Use `sequential-thinking` proportional to **decision complexity**, not a fixed quota. The numbers below are **ceilings for genuinely hard cases**, not floors to hit:
+| Task | Guideline |
+|------|-----------|
+| Planning a well-known CRUD feature | 1–3 thoughts is enough; don't pad |
+| Planning a novel feature | up to 5 thoughts |
+| Architecture for standard stack assembly | 1–3 thoughts |
+| Architecture for novel design (distributed, new storage, unusual constraints) | up to 8 thoughts |
+| Epic decomposition | up to 10 thoughts |
+| Adversarial review of trivial change | 1 thought; if nothing to adversarially review, say so and stop |
+| Adversarial review of complex change | up to 6 thoughts |
+| Debugging after ≥ 2 failures on same hypothesis | 4–5 thoughts |
+**Principle**: running 8 thoughts to pick between Vue and React for a Todo is waste. Running 1 thought to architect a distributed queue is irresponsible. Match effort to stakes.
-- Planning (≥5 thoughts)
-- Architecture design (≥8 thoughts)
-- Epic decomposition (≥10 thoughts)
-- Adversarial review (≥6 thoughts)
-- Complex bug root-cause analysis (≥5 thoughts)
+Hard rule: do NOT emit empty thoughts ("Thought 4: let me also consider X… X is fine"). If you've reached the answer, stop.
 ```
 mcp__sequential-thinking__sequentialthinking({

package/agents/flow-adversary.md CHANGED

@@ -20,13 +20,24 @@ Review the target (spec or code) from an **attacker's perspective**. Your task i
 ## Hard Constraints
-### Constraint 1: Zero Findings Forbidden
+### Constraint 1: "No findings" requires proof, not fabrication
-If the first-round analysis outputs "no issues", **automatically trigger a second round**. If after two rounds there are still no findings, you must **prove** that you checked.
+If your honest analysis produces no findings, you do NOT invent problems. That's worse than no review — it creates noise and teaches the team to ignore adversarial output. Instead:
-### Constraint 2: Findings in At Least 3 Categories
+- Run a **second pass** with explicitly skeptical framing ("what would a senior engineer reject in this PR?").
+- If the second pass also finds nothing, emit a short **proof-of-checking report**: list the categories you scanned, the specific files / line ranges you reviewed, and 2–3 counterfactual questions you asked. This is the honest "clean" verdict.
-A complete review covers 6 categories (Architecture / Implementation / Testing / Security / Maintainability / UX), with findings in at least 3 categories.
+Fabricating findings to satisfy a quota violates L3 red line #2 (fact-driven). Don't.
+### Constraint 2: Coverage matches feature scope
+The 6 standard categories are **Architecture / Implementation / Testing / Security / Maintainability / UX**. You do not need findings in 3+ categories to make the review "complete". You need findings proportional to the actual issues present.
+- **Well-known CRUD feature** (Todo, blog): 0–3 findings is normal. Don't stretch.
+- **Medium feature with some novel choices**: 3–8 findings typical.
+- **Large / novel / production-grade**: 8–20+ findings reasonable.
+Categories that don't apply to the feature (e.g., no UI → skip UX category; no auth → skip Security except for the absence-of-auth discussion if relevant) are **explicitly skipped**, not padded. Write one line: "Category N/A for this feature."
 ### Constraint 3: Every Finding Must Have Evidence + Recommendation

package/agents/flow-edge-hunter.md CHANGED

@@ -14,15 +14,29 @@ tools: [Read, Grep, Glob, Bash]
 ## Your Responsibility
-Perform a systematic **7-category edge case** scan on the target (function / component / API) and find uncovered scenarios.
+Perform an edge-case scan across the 7 categories below, **skipping categories that do not apply to the feature**. Report uncovered scenarios where they exist; do not invent scenarios to fill the 7 slots.
 Output: `.flow/specs/<name>/edge-cases.md`.
 ---
-## 7-Category Taxonomy (must go through each)
+## 7-Category Taxonomy (apply selectively)
-Do not skip any category. For each category, use sequential-thinking for ≥ 3 rounds.
+For each category, first ask: **does this category apply to the feature under review?**
+- If NO → mark `N/A: <one-line reason>` and move to the next.
+- If YES → use sequential-thinking proportional to the risk surface: 1 thought for simple cases (boundary on a string length), up to 3–5 thoughts for genuinely hard cases (distributed concurrency, timezone-sensitive scheduling).
+Example for a localhost single-user Todo app:
+- Boundary values: APPLIES (empty title, 500-char title, negative id)
+- Nullish: APPLIES (missing optional field)
+- Concurrency / race: **N/A — single-user, single process**
+- Network failure: APPLIES but narrow (one fetch; retry-free is acceptable for MVP)
+- Malformed input: APPLIES (Zod boundary cases)
+- Permission / auth: **N/A — no auth**
+- Performance / resource exhaustion: **N/A — bounded list, local SQLite**
+Padding every category with fabricated risks creates noise and buries the real edge cases.
 ### 1. Boundary Values
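
Note on the hunk above: the new "apply selectively" rule amounts to recording one verdict per category, with a one-line reason when a category is skipped. The sketch below is only an illustration of that shape, not code from the package; the `Triage` class and `render_triage` function are hypothetical names, and the two sample rows are taken from the Todo example in the diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triage:
    """One verdict per taxonomy category (hypothetical structure)."""
    category: str
    applies: bool
    note: str  # scenarios to probe, or the one-line N/A reason


def render_triage(rows):
    """Format one line per category; skipped rows carry a reason, not padded risks."""
    lines = []
    for row in rows:
        if row.applies:
            lines.append(f"- {row.category}: APPLIES ({row.note})")
        else:
            lines.append(f"- {row.category}: N/A: {row.note}")
    return "\n".join(lines)


rows = [
    Triage("Boundary values", True, "empty title, 500-char title, negative id"),
    Triage("Concurrency / race", False, "single-user, single process"),
]
print(render_triage(rows))
```

Keeping the reason on the same row as the verdict makes a skipped category auditable without inventing scenarios for it.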

package/agents/flow-planner.md CHANGED

@@ -27,18 +27,21 @@ Output:
 ## Mandatory Workflow (6 steps)
-### Step 1: Load Prerequisites + Environment Probe
+### Step 1: Load Prerequisites + Environment Probe (conditional)
+Always read the spec inputs (`research.md`, `requirements.md`, `design.md`, `.flow/CONTEXT.md`).
+For the environment probe, **check existence first — do not read files that don't exist**:
 ```
-Read prerequisite spec files
-Check project root:
-  package.json     → confirm test / lint / build commands
-  tsconfig.json    → TypeScript strictness
-  .eslintrc.*      → lint rules
-  vitest.config.*  → test framework
+For each of: package.json, tsconfig.json, .eslintrc.*, vitest.config.*
+  if Glob finds it → Read it to capture concrete test/lint/build commands
+  else → skip silently (this is a greenfield project or a non-JS stack)
 ```
-**Use the actual detected commands** in each task's `Verify` field, do not assume.
+For greenfield projects (no `package.json` yet), use the tech stack declared in `design.md` to infer commands. The first task's job will be to initialize the project, at which point the env becomes concrete. Do not fabricate `npm test` commands if there's no package.json yet — instead write the task as "initialize package.json and install vitest; `Verify`: `npm test --silent` produces 'no tests found'".
+**Use actually detected commands** in each task's `Verify` field. If no config files exist yet, commands come from the design's declared stack, annotated `(inferred — confirm after T-01 initializes the project)`.
 ### Step 2: Break Down by POC-First 5 Phases
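
Note on the hunk above: the revised Step 1 reads as "match the four config patterns against what exists, read only the matches, and treat a missing `package.json` as greenfield". A minimal sketch of that decision, assuming the caller already has the list of file names in the project root; `plan_probe` and `CONFIG_PATTERNS` are hypothetical names, not part of the package, and the agent itself uses its Glob and Read tools rather than Python.

```python
from fnmatch import fnmatch

# The four patterns listed in the revised Step 1 of flow-planner.md.
CONFIG_PATTERNS = ["package.json", "tsconfig.json", ".eslintrc.*", "vitest.config.*"]


def plan_probe(present_files):
    """Return (files_to_read, greenfield) for the file names found in the project root.

    Only files that actually exist are scheduled for reading; everything else
    is skipped silently, as the diff prescribes.
    """
    to_read = [
        name
        for pattern in CONFIG_PATTERNS
        for name in sorted(present_files)
        if fnmatch(name, pattern)
    ]
    # Assumption: "greenfield" means no package.json yet, per the diff's wording.
    greenfield = "package.json" not in to_read
    return to_read, greenfield


to_read, greenfield = plan_probe(["README.md", "package.json", ".eslintrc.json"])
print(to_read, greenfield)
```

When `greenfield` is true, the diff's rule applies: `Verify` commands come from the stack declared in `design.md` and are marked as inferred until the first task initializes the project.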

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@curdx/flow",
-  "version": "2.0.0-beta.4",
+  "version": "2.0.0-beta.5",
   "description": "CLI installer for CurDX-Flow — AI engineering workflow meta-framework for Claude Code",
   "type": "module",
   "bin": {