npm - @curdx/flow - Versions diffs - 2.0.0-beta.1 → 2.0.0-beta.3 - Mend

@curdx/flow 2.0.0-beta.1 → 2.0.0-beta.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11) hide show

package/.claude-plugin/marketplace.json +1 -1
package/.claude-plugin/plugin.json +1 -1
package/agent-preamble/preamble.md +47 -0
package/agents/flow-planner.md +21 -39
package/agents/flow-reviewer.md +5 -1
package/agents/flow-verifier.md +6 -0
package/cli/install.js +16 -5
package/commands/review.md +9 -0
package/commands/spec.md +24 -0
package/commands/verify.md +13 -0
package/package.json +1 -1

package/.claude-plugin/marketplace.json CHANGED Viewed

@@ -6,7 +6,7 @@
   },
   "metadata": {
     "description": "Claude Code Discipline Layer — spec-driven workflow + goal-backward verification + Karpathy 4 principles enforced via gates. Stops Claude from faking \"done\" on non-trivial features.",
-    "version": "2.0.0-beta.1"
+    "version": "2.0.0-beta.3"
   },
   "plugins": [
     {

package/.claude-plugin/plugin.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "curdx-flow",
-  "version": "2.0.0-beta.1",
+  "version": "2.0.0-beta.3",
   "description": "Claude Code Discipline Layer — spec-driven workflow + goal-backward verification + Karpathy 4 principles enforced via gates. Stops Claude from faking \"done\" on non-trivial features.",
   "author": {
     "name": "wdx",

package/agent-preamble/preamble.md CHANGED Viewed

@@ -210,5 +210,52 @@ When you need to delegate to a sub-agent:
 ---
+## L8: Long-artifact handling (truncation prevention)
+When your job is to produce a long Markdown artifact (`tasks.md`, `verification-report.md`, `review-report.md`, `research.md`, `requirements.md`, `design.md`, etc.), follow these rules. Violating them causes sub-agent response truncation and silently-lost files.
+### Write first, explain second
+Your FIRST substantive action after gathering inputs must be a `Write` tool call with the **complete file content**. Do NOT paste the content as assistant text before writing.
+- ✗ *"Here's the tasks.md I'll write:"* followed by a 500-line markdown code block, then a `Write` call containing the same 500 lines — this doubles the output tokens and usually hits the truncation limit mid-`Write`, leaving the file missing or partial.
+- ✓ Immediately `Write` the file with full content. Then output a ≤ 5-line summary.
+### Do not preview
+Never output the file's content in your response. The file IS the deliverable — the reader opens it. Your response is just the ack that you wrote it.
+### After write, summarize only
+After `Write` returns success, respond with **at most 5 lines** summarizing what you wrote:
+```
+✓ Wrote .flow/specs/<spec>/tasks.md
+  40 tasks across 5 phases
+  Coverage: FR 10/10, AC 12/12, AD 4/4
+  Next: /curdx-flow:implement
+```
+Do not re-paste any file contents. Do not narrate your reasoning. Do not list every task inline.
+### Split if >200 lines
+If the artifact would exceed ~200 lines of Markdown, split it:
+- `tasks.md` references `tasks-phase-1.md` … `tasks-phase-5.md`
+- Each phase file is its own `Write` call
+- The index file is a short table linking to the phase files
+This keeps every individual `Write` call under the safe size budget.
+### If you see a token-budget warning
+Stop narrating and call `Write` with whatever content is ready. Sub-agents do not have a "next response" — continuation is not possible after truncation. Save what you have, then return.
+### Why this matters
+Sub-agents invoked via the `Task` tool have a ~16 K output-token budget per invocation. A naive agent that previews then writes consumes those tokens twice — once as prose, once inside the tool call — and truncation typically lands inside the `Write` call itself. The parent command then reports "agent did not complete" and re-dispatches, burning compute for no new artifact. Writing first eliminates the failure mode at the source.
+---
 **Remember**: this preamble exists because, without discipline, AI tends to slack off, hallucinate, and over-engineer.
 These rules are not constraints — they are the tools that make you reliable.

package/agents/flow-planner.md CHANGED Viewed

@@ -132,31 +132,23 @@ For each of the following sources, every item must be covered by tasks:
 ### Step 6: Write tasks.md + State
-Based on `${CLAUDE_PLUGIN_ROOT}/templates/tasks.md.tmpl`.
+**CRITICAL (see L8 of the preamble — long-artifact handling):**
+- Your FIRST action in this step must be a `Write` tool call with the full `tasks.md` content. Do NOT paste the file content as assistant text before writing.
+- Do NOT preview the tasks list in the response. The file itself is the deliverable.
+- If `tasks.md` would be >200 lines, split into `tasks-phase-1.md` … `tasks-phase-5.md` and make `tasks.md` a short index linking to them.
-Must include a **coverage audit table** at the end (from Step 5):
+Based on `${CLAUDE_PLUGIN_ROOT}/templates/tasks.md.tmpl`. Must include a **coverage audit table** at the end (from Step 5).
-```markdown
-## Coverage Audit
-| Requirement ID | Corresponding Tasks | Status |
-|--------|---------|------|
-| FR-01  | 1.2, 3.1 | ✓ |
-| FR-02  | 3.2     | ✓ |
-| AC-1.1 | 3.1     | ✓ |
-| AD-03  | 1.1, 2.1 | ✓ |
-```
-Then:
+After the `Write` succeeds:
+1. Update `.flow/specs/<name>/.state.json`:
+   ```
+   phase_status.tasks = "completed"
+   total_tasks = <N>
+   ```
+2. Append to `.flow/specs/<name>/.progress.md`:
+   `## tasks phase complete, total N tasks`
-```
-.flow/specs/<name>/.state.json:
-  phase_status.tasks = "completed"
-  total_tasks = <N>
-.flow/specs/<name>/.progress.md:
-  Append "## tasks phase complete, total N tasks"
-```
+Then emit the 5-line summary (see "Output to User" below). No inline task listing.
 ## Output Quality Bar (Self-Check)
@@ -182,23 +174,13 @@ Then:
 Based on `_` in `.flow/specs/<name>/.state.json` or `specs.default_task_size` in `.flow/config.json`.
-## Output to User
+## Output to User (5 lines max, after Write succeeds)
 ```
-✓ Task breakdown complete: .flow/specs/<name>/tasks.md
-N tasks total, across 5 Phases:
-  Phase 1 (POC):        X tasks
-  Phase 2 (Refactor):   Y tasks
-  Phase 3 (Testing):    Z tasks
-  Phase 4 (Quality):    W tasks
-  Phase 5 (PR):         V tasks
-Coverage audit: FR (A/B) | AC (C/D) | AD (E/F) all covered ✓
-Estimated effort: N tasks × 5 minutes ≈ M minutes
-Next:
-  - Review tasks.md
-  - /curdx-flow:implement — start execution (after Phase 2 is released)
+✓ Wrote .flow/specs/<name>/tasks.md
+  N tasks across 5 Phases (X/Y/Z/W/V)
+  Coverage: FR A/B | AC C/D | AD E/F
+  Next: /curdx-flow:implement
 ```
+**Do not re-paste the tasks.md content inline. Do not list every task. Just the summary.**

package/agents/flow-reviewer.md CHANGED Viewed

@@ -187,7 +187,11 @@ else:
 ### Step 6: Generate review-report.md
-Full structure:
+**CRITICAL (see L8 of the preamble):** your FIRST action in this step must be a `Write` tool call with the **complete report content**. Do NOT paste the report as assistant text before writing. After the write succeeds, respond with a ≤ 5-line summary only (path, verdict, blocker count, next step). Do not re-paste the report.
+If the report would exceed ~200 lines, split into `review-report.md` (short index + verdict) and `review-details.md` (full findings) — two `Write` calls.
+Full structure (use this as the content passed to `Write`, not as preview text):
 ```markdown
 # Review Report: <spec-name>

package/agents/flow-verifier.md CHANGED Viewed

@@ -145,6 +145,12 @@ For each match, check:
 ### Step 6: Generate verification-report.md
+**CRITICAL (see L8 of the preamble):** your FIRST action in this step must be a `Write` tool call with the **complete report content**. Do NOT paste the report as assistant text before writing — doing so doubles output tokens and causes truncation inside the `Write` call. After the write succeeds, respond with a ≤ 5-line summary only (path, verdict counts, next step). Do not re-paste the report.
+If the report would exceed ~200 lines, split into `verification-report.md` (short index + verdict) and `verification-details.md` (full findings table) — two `Write` calls.
+Required structure (use this as the content passed to `Write`, not as preview text):
 ```markdown
 # Verification Report: <spec-name>

package/cli/install.js CHANGED Viewed

@@ -237,15 +237,26 @@ export async function install(args = []) {
 }
 function printNextSteps() {
+  // Detect whether the CLI is globally installed (curdx-flow on PATH) or
+  // the user ran us via npx. Tell them the right invocation each time.
+  const cliOnPath = has("curdx-flow");
+  const cliCmd = cliOnPath ? "curdx-flow" : "npx @curdx/flow";
   console.log(`\n${color.bold("✅ Install complete")}\n`);
+  console.log(`${color.bold("Restart Claude Code")} so the plugin registers all its commands and hooks.\n`);
   console.log(`${color.bold("Next steps")}:\n`);
   console.log(`  ${color.dim("# Verify health")}`);
-  console.log(`  curdx-flow doctor\n`);
-  console.log(`  ${color.dim("# Initialize .flow/ in your project")}`);
-  console.log(`  cd ~/your-project && curdx-flow init\n`);
-  console.log(`  ${color.dim("# Start using it (inside Claude Code)")}`);
+  console.log(`  ${cliCmd} doctor\n`);
+  console.log(`  ${color.dim("# Inside any project, initialize and start a feature spec")}`);
+  console.log(`  ${color.cyan("cd ~/your-project")}`);
   console.log(`  ${color.cyan("claude")}`);
-  console.log(`  ${color.cyan("/curdx-flow:start my-feature \"<describe what to build>\"")}\n`);
+  console.log(`  ${color.cyan("/curdx-flow:init")}`);
+  console.log(`  ${color.cyan("/curdx-flow:start my-feature \"<one-line goal>\"")}\n`);
+  if (!cliOnPath) {
+    console.log(
+      `${color.dim("Tip: install the CLI globally for shorter commands —")} ${color.cyan("npm i -g @curdx/flow")}\n`
+    );
+  }
   console.log(
     `${color.bold("Learn more")}: https://github.com/curdx/curdx-flow/blob/main/docs/getting-started.md\n`
   );

package/commands/review.md CHANGED Viewed

@@ -91,6 +91,15 @@ Output: test-gap checklist with suggested test cases.
 ## Report
+**Landing check**: sub-agent responses can be truncated. After dispatching review agents, verify the report actually landed on disk:
+```bash
+REPORT=".flow/specs/$SPEC_NAME/review-report.md"
+if [ ! -f "$REPORT" ] || [ "$(wc -c < "$REPORT" 2>/dev/null | tr -d ' ')" -lt 300 ]; then
+  echo "⚠ Report missing or truncated. Re-dispatching flow-reviewer with a terse 'Write the report now, no narration' prompt."
+fi
+```
 Consolidated output: `.flow/specs/$SPEC_NAME/review-report.md`:
 ```markdown

package/commands/spec.md CHANGED Viewed

@@ -98,6 +98,30 @@ After each phase completes successfully, update `.state.json`:
 }
 ```
+### Artifact landing check (mandatory after every phase)
+Sub-agent responses can be truncated by the model's output-length limit, which means the `Write` tool call for the phase's Markdown artifact may never fire. Do NOT trust the agent's return value alone — always verify the file actually landed.
+For each phase just dispatched, run:
+```bash
+ARTIFACT=".flow/specs/$SPEC_NAME/<phase>.md"
+if [ ! -f "$ARTIFACT" ]; then
+  echo "⚠ $ARTIFACT did not land. Re-dispatching <phase> agent with an explicit 'write the file' prompt."
+  # Re-dispatch the same agent, but in the prompt, front-load:
+  #   "Your ONLY job is to call the Write tool with the full <phase>.md content now.
+  #    Do not explain. Do not narrate. Write the file and stop."
+  # This pattern produces an artifact even when prior verbosity caused truncation.
+fi
+# Minimum-size sanity check — if the file is <500 bytes, the write likely truncated
+if [ -f "$ARTIFACT" ] && [ "$(wc -c < "$ARTIFACT" | tr -d ' ')" -lt 500 ]; then
+  echo "⚠ $ARTIFACT looks truncated (<500 bytes). Re-dispatching to complete it."
+fi
+```
+Only advance `.state.json.phase` after both the file exists AND passes the size sanity check. If a re-dispatch also fails to produce the artifact, stop and surface the issue to the user instead of silently advancing — that prevents later phases from consuming an empty upstream file.
 ## Optional planning review
 If `--review` (or `--review=<dims>`) is present:

package/commands/verify.md CHANGED Viewed

@@ -67,6 +67,19 @@ If `--strict`:
 ### Step 4: Produce `verification-report.md`
+**Landing check**: sub-agent responses can be truncated by the model's output-length limit. After dispatching `flow-verifier`, verify the report actually landed:
+```bash
+REPORT=".flow/specs/$SPEC_NAME/verification-report.md"
+if [ ! -f "$REPORT" ] || [ "$(wc -c < "$REPORT" 2>/dev/null | tr -d ' ')" -lt 300 ]; then
+  echo "⚠ Report missing or truncated. Re-dispatching flow-verifier with a terse 'write the report now' prompt."
+  # Re-dispatch pattern:
+  #   "Your only job right now is to Write the verification-report.md using the
+  #    findings you already gathered. Do not re-scan. Do not narrate. Write
+  #    the file and stop."
+fi
+```
 Write to `.flow/specs/$SPEC_NAME/verification-report.md`:
 ```markdown

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@curdx/flow",
-  "version": "2.0.0-beta.1",
+  "version": "2.0.0-beta.3",
   "description": "CLI installer for CurDX-Flow — AI engineering workflow meta-framework for Claude Code",
   "type": "module",
   "bin": {