npm - @brunosps00/dev-workflow - Versions diffs - 0.15.0 → 1.0.1 - Mend

@brunosps00/dev-workflow 0.15.0 → 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (136) hide show

package/scaffold/en/commands/dw-autopilot.md CHANGED Viewed

@@ -15,18 +15,18 @@ A step that invokes a `/dw-xxx` command is ONLY considered complete when the art
 | Thought | Reality |
 |---------|---------|
 | "I already ran the tests manually" | The command produces structured artifacts. Run the command. |
-| "I validated via ad-hoc Playwright" | `/dw-run-qa` requires RF matrix, bugs.md, screenshots, scripts, logs, checklist. Run the command. |
-| "The implementation is obviously correct" | `/dw-review-implementation` requires a compliance matrix per RF/endpoint/task. Run the command. |
+| "I validated via ad-hoc Playwright" | `/dw-qa` requires RF matrix, bugs.md, screenshots, scripts, logs, checklist. Run the command. |
+| "The implementation is obviously correct" | `/dw-review --coverage-only` requires a compliance matrix per RF/endpoint/task. Run the command. |
 | "A strong manual validation is enough" | NO. Technical equivalence DOES NOT replace formal execution. |
 | "I already checked build and lint, that's enough" | Build/lint DO NOT replace review nor QA. Run the commands. |
-| "I wrote a summarized qa-report.md by hand" | A loose file IS NOT execution of `/dw-run-qa`. The full `QA/` tree is mandatory. |
+| "I wrote a summarized qa-report.md by hand" | A loose file IS NOT execution of `/dw-qa`. The full `QA/` tree is mandatory. |
 | "The autopilot already advanced, I don't need to go back" | If the artifact doesn't exist, the step didn't run. Go back and execute. |
-| "I fixed bugs along the way, so QA is already ok" | Fixing bugs does not replace running formal QA. Run `/dw-run-qa`. |</critical>
+| "I fixed bugs along the way, so QA is already ok" | Fixing bugs does not replace running formal QA. Run `/dw-qa`. |</critical>
 ## When to Use
 - Use when you want to go from an idea to a PR with minimal manual intervention
 - Use for complete features that go through the entire pipeline (research, planning, execution, quality)
-- Do NOT use for small, well-scoped one-off tasks — use `/dw-run-task` directly with a quick PRD instead
+- Do NOT use for small, well-scoped one-off tasks — use `/dw-run` directly with a quick PRD instead
 - Do NOT use to fix bugs (use `/dw-bugfix`)
 - Do NOT use when you want manual control between each phase (use individual commands)
@@ -72,12 +72,12 @@ If this command is re-invoked on the same PRD after interruption:
 - Query `.dw/intel/` via `/dw-intel` to understand project context
 - Identify: tech stack, existing patterns, related features
-- If `.dw/intel/` is absent, suggest running `/dw-map-codebase` first for richer downstream context
+- If `.dw/intel/` is absent, suggest running `/dw-intel --build` first for richer downstream context
 ### Step 2: Research (Conditional)
 Evaluate whether the topic requires deep research:
-- **YES** (run `/dw-deep-research`): new technology for the project, unknown domain, external API integrations, critical architectural decisions
+- **YES** (run `/dw-brainstorm --research`): new technology for the project, unknown domain, external API integrations, critical architectural decisions
 - **NO** (skip to step 3): simple feature in an already mapped domain, refactoring existing code, basic CRUD
   - If skipping, DOCUMENT the reason in the progress block. E.g.: "Research skipped — domain already mapped in .dw/rules/[file].md". The user must see the justification.
@@ -94,7 +94,7 @@ Run `/dw-brainstorm` with accumulated context (intel + research).
 <critical>The PRD MUST include an interactive interview with the user. Ask AT LEAST 7 clarification questions BEFORE writing the PRD. Do NOT answer questions automatically based on context — the user MUST respond.</critical>
-Run `/dw-create-prd` using brainstorm findings.
+Run `/dw-plan prd` using brainstorm findings.
 - Follow ALL command instructions, especially the clarification questions section
 - Ask at least 7 questions about: problem, target users, critical features, scope, constraints, design, integration
 - In each question, present a recommendation grounded in brainstorm and deep-research findings (if executed). E.g.: "Based on the research, I recommend X because [evidence]. Do you agree or prefer a different direction?"
@@ -115,7 +115,7 @@ Present to the user:
 <critical>The TechSpec MUST include an interactive interview with the user. Ask AT LEAST 7 technical clarification questions BEFORE writing the TechSpec. Do NOT answer questions automatically — the user MUST respond.</critical>
-Run `/dw-create-techspec` from the approved PRD.
+Run `/dw-plan techspec` from the approved PRD.
 - Follow ALL command instructions, especially the clarification questions section
 - Ask at least 7 questions about: preferred architecture, existing vs new libs, testing strategy, integration with existing systems, infrastructure constraints, performance, security
 - In each question, present a technical recommendation grounded in brainstorm, deep-research, and approved PRD findings. E.g.: "Research indicated lib X has better performance for this case [source]. Want to use X or have another preference?"
@@ -125,7 +125,7 @@ Run `/dw-create-techspec` from the approved PRD.
 ### Step 6: Tasks
-Run `/dw-create-tasks` from PRD + TechSpec.
+Run `/dw-plan tasks` from PRD + TechSpec.
 - Follow all command instructions
 - Generate individual tasks in `.dw/spec/prd-[name]/`
@@ -148,9 +148,9 @@ Evaluate whether tasks involve frontend:
 ### Step 8: Execution
-Run `/dw-run-plan` with the PRD path.
+Run `/dw-run` with the PRD path.
 - Follow ALL command instructions, including the native plan-checker gate (PASS required) and wave-based parallel execution via the bundled `dw-execute-phase` skill agents
-- Each task follows `/dw-run-task` with Level 1 validation
+- Each task follows `/dw-run` with Level 1 validation
 ### Step 9: Implementation Review (Loop)
@@ -164,7 +164,7 @@ Run the project's build and lint:
 5. Repeat until both build AND lint pass without errors
 6. Only then proceed to the review
-Run `/dw-review-implementation` to verify PRD compliance (Level 2).
+Run `/dw-review --coverage-only` to verify PRD compliance (Level 2).
 - If gaps found: fix automatically and re-run the review
 - Maximum 3 correction cycles
 - Do NOT advance to QA until the review passes
@@ -177,7 +177,7 @@ A short text review, a "looks good", or a conclusion "implementation is correct"
 ### Step 10: Visual QA
-Run `/dw-run-qa` with Playwright MCP.
+Run `/dw-qa` with Playwright MCP.
 - Test happy paths, edge cases, negative flows, accessibility
 - Document bugs with screenshots
@@ -188,12 +188,12 @@ Run `/dw-run-qa` with Playwright MCP.
 - `{{PRD_PATH}}/QA/screenshots/` — directory exists and contains at least 1 PNG per RF tested (format `RF-XX-[slug]-PASS.png` or `-FAIL.png`)
 - `{{PRD_PATH}}/QA/scripts/` — directory exists and contains Playwright `.spec.ts`/`.spec.js` scripts per RF
 - `{{PRD_PATH}}/QA/logs/` — directory exists with captured console/network logs
-Running Playwright ad-hoc, taking a few loose screenshots, or writing a short qa-report.md by hand DOES NOT replace this structure. If any artifact is missing or incomplete, the command did NOT run — invoke `/dw-run-qa` formally and follow its flow to completion.</critical>
+Running Playwright ad-hoc, taking a few loose screenshots, or writing a short qa-report.md by hand DOES NOT replace this structure. If any artifact is missing or incomplete, the command did NOT run — invoke `/dw-qa` formally and follow its flow to completion.</critical>
 ### Step 11: Fix QA (Conditional)
 If QA found bugs:
-- Run `/dw-fix-qa` to fix and retest
+- Run `/dw-qa --fix` to fix and retest
 - Loop until stable (maximum 5 cycles). After 5 cycles, STOP and ask the user how to proceed.
 ### Step 12: Implementation Review (Post-QA)
@@ -205,7 +205,7 @@ Run the project's build and lint (same sequence as Step 9):
 2. Build
 3. If any fail: fix and re-run until they pass
-Run `/dw-review-implementation` again to confirm QA fixes did not break PRD compliance.
+Run `/dw-review --coverage-only` again to confirm QA fixes did not break PRD compliance.
 - If gaps found: fix and re-run
 - Maximum 3 cycles
@@ -213,7 +213,7 @@ Run `/dw-review-implementation` again to confirm QA fixes did not break PRD comp
 ### Step 13: Code Review
-Run `/dw-code-review` (Level 3) for formal review.
+Run `/dw-review --code-only` (Level 3) for formal review.
 - Generate persisted report
 ### Step 14: Commit
@@ -227,7 +227,7 @@ Run `ls` on each path below and confirm existence. If ANY is missing, DO NOT com
 - `{{PRD_PATH}}/QA/screenshots/` (non-empty)
 - `{{PRD_PATH}}/QA/scripts/` (non-empty with `.spec.*` files)
 - `{{PRD_PATH}}/QA/logs/`
-- Evidence of the last `/dw-review-implementation` run with RF-by-RF matrix (session output or reference in `autopilot-state.json`)
+- Evidence of the last `/dw-review --coverage-only` run with RF-by-RF matrix (session output or reference in `autopilot-state.json`)
 Also verify `autopilot-state.json`:
 - Every step 1 through 13 that is NOT in `skipped_steps` must be in `completed_steps`
@@ -250,7 +250,7 @@ Ask the user: **"Commits completed. Do you want to generate the Pull Request?"**
 ## Native Engine
-The autopilot relies on dev-workflow-native infrastructure for codebase intelligence (`/dw-map-codebase` + `/dw-intel`) and bundled phase execution agents (plan-checker + executor in `.agents/skills/dw-execute-phase/agents/`). All bundled and require no external dependencies. See the `dw-codebase-intel` and `dw-execute-phase` bundled skills under `.agents/skills/` for details.
+The autopilot relies on dev-workflow-native infrastructure for codebase intelligence (`/dw-intel --build` + `/dw-intel`) and bundled phase execution agents (plan-checker + executor in `.agents/skills/dw-execute-phase/agents/`). All bundled and require no external dependencies. See the `dw-codebase-intel` and `dw-execute-phase` bundled skills under `.agents/skills/` for details.
 ## State Persistence

package/scaffold/en/commands/dw-brainstorm.md CHANGED Viewed

@@ -9,14 +9,36 @@ You are a brainstorming facilitator for the current workspace. This command exis
 - Do NOT use when you already have clear requirements ready for a PRD, or when you need to implement code
 ## Pipeline Position
-**Predecessor:** (user idea) | **Successor:** `/dw-create-prd`
+**Predecessor:** (user idea) | **Successor:** `/dw-plan prd`
-## Flags
+## How this command works (auto-dispatch, not flag switchboard)
-- **(default)**: normal brainstorm with 3-7 options (conservative, balanced, bold) and trade-offs. If the product has PRDs or rules, a **Product Inventory** is produced automatically and each option carries a classification tag.
-- **`--onepager`**: at the end of the brainstorm, generate a durable one-pager at `.dw/spec/ideas/<slug>.md` (using `.dw/templates/idea-onepager.md`) with Feature Inventory + Classification & Rationale + MVP Scope + Not Doing + Assumptions. Use when you want a persisted product artifact before moving to `/dw-create-prd`.
-- **`--council`**: after the normal brainstorm, invoke the `dw-council` skill to stress-test the top 2-3 options via 3-5 archetypes (pragmatic-engineer, architect-advisor, security-advocate, product-mind, devils-advocate). Useful when the choice is high-impact and there is genuine dissent between paths.
-- Flags are composable: `--onepager --council` produces the one-pager after the council debate.
+`/dw-brainstorm` runs **FULL** by default. It opens with a **Signal Reading** phase that inspects the user's request, the project state (PRDs, rules, intel, recent commits), and the conversation so far, then **dispatches one or more internal modes**. The user does not pick a mode — the command does.
+Internal modes (the dispatcher picks 1+):
+| Mode | Auto-fires when |
+|------|-----------------|
+| **option-matrix** (always-on default) | Default surface: 3-7 options (conservative / balanced / bold) with `[IMPROVES] / [CONSOLIDATES] / [NEW]` tags. Always runs unless a hard override says otherwise. |
+| **grill** | Vocabulary feels unsettled — user terms differ from `.dw/rules/` / `.dw/constitution.md`, or two synonyms compete in the same conversation, or a contributor proposes a name that clashes with the glossary. |
+| **prototype** | User asks "is this state model right?" / "what should this look like?" with no clear answer; or the next sensible step is to RUN code, not write words. |
+| **council** | Two or more competing approaches surface with no obvious winner; or consensus forms too quickly (false-consensus signal). |
+| **research** | The question depends on external state-of-the-art ("what's the current best practice for X", multi-source comparisons, regulatory or framework landscape). |
+| **refactor-audit** | User points at a directory or describes a code area as "messy", "needs cleanup", "tech debt"; or a quarterly health-check is requested. |
+| **onepager** | The conversation has converged enough to deserve a durable product artifact (`.dw/spec/ideas/<slug>.md`); or the user signals they're about to call `/dw-plan prd` next. |
+Modes can chain inside one session — grill can surface a design question that the dispatcher then sends to prototype; refactor-audit can produce findings that the dispatcher sends to council for stress-testing.
+### Optional overrides (rarely needed)
+- **`--mode=<name>`** — force a specific dispatch and skip Signal Reading. Names: `option-matrix`, `grill`, `prototype`, `council`, `research`, `refactor-audit`, `onepager`. Combine with `+` to chain explicitly: `--mode=grill+onepager`.
+- **`--quiet`** — skip Signal Reading entirely and run only `option-matrix` as a minimal facilitator.
+Power users who already know what they want can pass `--mode=`. Everyone else gets auto-dispatch by default — the command reads the room and acts.
+### Migration note (transitional)
+Old flag invocations (`--onepager`, `--council`, `--research`, `--refactor`, `--grill`, `--prototype`) remain accepted for one minor cycle and map to the equivalent `--mode=` value. New code should use `--mode=` or rely on auto-dispatch.
 ## Decision Flowchart: Brainstorm vs Direct PRD
@@ -27,7 +49,7 @@ digraph brainstorm_decision {
   Q1 [label="Are requirements\nclear and specific?"];
   Q2 [label="Are there multiple\nviable approaches?"];
   node [shape=box];
-  PRD [label="Go directly to\n/dw-create-prd"];
+  PRD [label="Go directly to\n/dw-plan prd"];
   BS [label="Start with\n/dw-brainstorm"];
   Q1 -> PRD [label="Yes"];
   Q1 -> Q2 [label="No"];
@@ -40,15 +62,16 @@ digraph brainstorm_decision {
 When available in the project under `./.agents/skills/`, use these skills to enrich ideation:
-- `dw-council` (opt-in via `--council`): multi-advisor stress-test of the most promising options with mandatory steel-manning and concession tracking. **DO NOT invoke by default** — only when the flag is present or when consensus forms too quickly (false-consensus signal).
-- `dw-ui-discipline`: use when brainstorming involves frontend or UI direction — its hard-gate (scene sentence, surface job) is a generative forcing function during ideation, not just a review check
-- `vercel-react-best-practices`: use when brainstorming React/Next.js architecture or performance trade-offs
-- `security-review`: use when brainstorming touches auth, data handling, or security-sensitive features
+- `dw-council`: invoked by the dispatcher's **council** mode — multi-advisor stress-test of the most promising options with mandatory steel-manning and concession tracking. The dispatcher fires it when 2+ paths tie OR consensus forms too quickly (false-consensus signal). Not invoked on every brainstorm — only when signals justify it.
+- `dw-simplification`: invoked by the dispatcher's **refactor-audit** mode — applies Chesterton's Fence + complexity metrics + the new **deep-modules** reference (deletion test, locality, leverage, interface depth) to every flagged smell.
+- `dw-ui-discipline`: use when brainstorming involves frontend or UI direction — its hard-gate (scene sentence, surface job) is a generative forcing function during ideation, not just a review check. Also picked up by the **prototype** mode's UI branch.
+- `vercel-react-best-practices`: use when brainstorming React/Next.js architecture or performance trade-offs.
+- `security-review`: use when brainstorming touches auth, data handling, or security-sensitive features.
 ## Template Reference
 - Brainstorm matrix template: `.dw/templates/brainstorm-matrix.md` (relative to workspace root)
-- Durable one-pager template: `.dw/templates/idea-onepager.md` (used with `--onepager` flag)
+- Durable one-pager template: `.dw/templates/idea-onepager.md` (used by the **onepager** mode)
 Use this command when the user wants to:
 - Generate ideas for product, UX, architecture, or automation
@@ -61,6 +84,19 @@ Use this command when the user wants to:
 <critical>The brainstorm is a **product-level** phase, not technical. DO NOT dive into architecture, stack, endpoints, schemas. That's the techspec's job. Here we work user journeys, value, features, and boundaries.</critical>
+### 0. Signal Reading (always first, unless `--quiet` or explicit `--mode=`)
+Before producing any output, **read the situation**:
+1. Inspect `.dw/spec/prd-*/`, `.dw/rules/`, `.dw/constitution.md`, `.dw/intel/` if they exist. Note the project's current vocabulary and recent PRDs.
+2. Inspect recent git activity (`git log --oneline -20`) to detect ongoing work.
+3. Re-read the user's request against the Auto-Dispatch table at the top of this file. Match signals to modes.
+4. Decide the dispatch: **option-matrix** always runs unless an explicit mode override skips it. Other modes (grill, prototype, council, research, refactor-audit, onepager) fire **additively** when their signals are present.
+5. State the dispatch decision to the user in one short line: e.g., "Dispatching: option-matrix + onepager (PRD is one step away)" or "Dispatching: grill (vocabulary unsettled in current PRD)". Do not bury this — surface it before running.
+6. Then proceed with the modes in this order (when chained): grill → research → option-matrix → council → refactor-audit → prototype → onepager. Skip modes not in the dispatch.
+### Standard flow (option-matrix mode)
 1. Start by summarizing the problem in 1 to 3 sentences.
 2. **Reframe as "How Might We"**: turn the raw idea into `How might we [verb] for [user] so that [outcome]?`. This pulls the team out of premature "solution mode".
 3. **Product Inventory (required if the product exists)**:
@@ -78,7 +114,7 @@ Use this command when the user wants to:
    - Approximate effort level
 7. Whenever it makes sense, include conservative, balanced, and bold alternatives.
 8. Close with a pragmatic recommendation and clear next steps.
-9. **If the `--onepager` flag is present**: at the end, generate `.dw/spec/ideas/<slug>.md` using `.dw/templates/idea-onepager.md`, filling Feature Inventory, Classification & Rationale, Recommended Direction (product language), MVP Scope (user stories), Not Doing, Key Assumptions, and Open Questions. Report the path to the user.
+9. **If the dispatcher selected `onepager` mode** (auto-fires when conversation has converged enough, or user signals they're heading to `/dw-plan prd` next): at the end, generate `.dw/spec/ideas/<slug>.md` using `.dw/templates/idea-onepager.md`, filling Feature Inventory, Classification & Rationale, Recommended Direction (product language), MVP Scope (user stories), Not Doing, Key Assumptions, and Open Questions. Report the path to the user.
 ## Preferred Response Format
@@ -102,16 +138,16 @@ Use this command when the user wants to:
 - Recommend 1 or 2 paths
 - Explain why they win in the current context
-### 6. One-pager (if `--onepager`)
+### 6. One-pager (if `onepager` mode fired)
 - Path of the created file at `.dw/spec/ideas/<slug>.md`
 ### 7. Next Steps
 - Short and actionable list
 - If appropriate, suggest which command to use next:
-  - `/dw-create-prd` (main successor; accepts the one-pager as input, reducing clarification questions)
-  - `/dw-run-task` (if it's a small IMPROVES that fits in a single task with a quick PRD)
-  - `/dw-create-techspec`
-  - `/dw-create-tasks`
+  - `/dw-plan prd` (main successor; accepts the one-pager as input, reducing clarification questions)
+  - `/dw-run` (if it's a small IMPROVES that fits in a single task with a quick PRD)
+  - `/dw-plan techspec`
+  - `/dw-plan tasks`
   - `/dw-bugfix`
 ## Heuristics
@@ -139,7 +175,263 @@ At the end, always leave the user in one of these situations:
 - With a clear recommendation (including an IMPROVES/CONSOLIDATES/NEW classification)
 - With better questions to decide
 - With a next workspace command to follow
-- With the one-pager at `.dw/spec/ideas/<slug>.md` (if `--onepager` was used)
+- With the one-pager at `.dw/spec/ideas/<slug>.md` (if **onepager** mode fired)
+- With the research report at `~/Documents/<Topic>_Research_<date>/` (if **research** mode fired)
+- With the refactor plan at `<target>/refactor-plan.md` (if **refactor-audit** mode fired)
+- With sharpened glossary entries in `.dw/rules/` (if **grill** mode fired)
+- With a runnable throwaway prototype + verdict template (if **prototype** mode fired)
+## Mode: research (multi-source research)
+Fires when the question depends on external state-of-the-art (multi-source comparisons, framework / regulatory landscape, decisions needing cited evidence). Override: `--mode=research`. Replaces the default option-matrix with a structured research pipeline that produces a cited document with verified claims.
+<critical>Every factual claim MUST be cited immediately with [N] in the same sentence</critical>
+<critical>NEVER fabricate citations — if no source is found, say so explicitly</critical>
+<critical>The bibliography MUST contain EVERY citation used in the body, no abbreviations or ranges</critical>
+### When to use research mode
+- Multi-source comparisons (e.g., "compare React Server Components vs Astro Islands").
+- State-of-the-art reviews of a topic.
+- Regulatory or industry context mapping.
+- Decisions needing cited evidence (not just an opinion).
+- Do NOT use research mode for simple lookups, debugging, or questions answerable in 1-2 web searches.
+### Sub-modes (research depth)
+```
+Selection
+├── Initial exploration → quick (3 phases, 2-5 min)
+├── Standard research → standard (6 phases, 5-10 min) [DEFAULT for research]
+├── Critical decision → deep (8 phases, 10-20 min)
+└── Comprehensive review → ultradeep (8+ phases, 20-45 min)
+```
+### Required reading
+Complementary skill **`dw-source-grounding`**: **ALWAYS** — apply Detect → Fetch → Implement → Cite protocol with strict source hierarchy (official versioned docs > changelogs > web standards > compat tables; Stack Overflow / blogs / training data are discovery only). Every finding ends with `[source: <url>, version: X.Y, retrieved: YYYY-MM-DD]`; bibliography built from these citations.
+### Pipeline phases
+**Phase 1 — SCOPE** | Frame the question. Decompose into core components. Identify stakeholder perspectives. Define scope boundaries. List key assumptions to validate.
+**Phase 2 — PLAN** | Identify primary + secondary sources. Map knowledge dependencies. Create search strategy with variants. Plan triangulation approach. Define quality gates.
+**Phase 3 — RETRIEVE** | Parallel information gathering. Decompose into 5-10 independent search angles (semantic, keyword, date-filtered, academic, alternative perspectives, statistics, industry analysis, critical analysis). Execute ALL searches in parallel via multiple tool calls in a single message. First Finish Search pattern: proceed when first threshold reached (quick: 10+ sources avg credibility >60/100; standard: 15+ >60; deep: 25+ >70; ultradeep: 30+ >75).
+**Phase 4 — TRIANGULATE** | Identify claims requiring verification. Cross-check facts across 3+ independent sources. Flag contradictions. Document verification status per claim.
+**Phase 5 — OUTLINE REFINEMENT** | Compare initial scope to actual findings. Adapt structure based on evidence. Targeted searches to fill gaps.
+**Phase 6 — SYNTHESIZE** | Identify cross-source patterns. Map concept relationships. Generate insights beyond source material. Build evidence hierarchies.
+**Phase 7 — CRITIQUE** (deep/ultradeep only) | Review logical consistency. Verify citation completeness. Identify gaps or weaknesses. Simulate 2-3 critic personas.
+**Phase 8 — REFINE** (deep/ultradeep) | Strengthen weak arguments. Add missing perspectives. Resolve contradictions.
+**Phase 9 — PACKAGE** | Generate report progressively, section by section.
+### Output
+Saved to `~/Documents/<Topic>_Research_<YYYYMMDD>/`. Mandatory sections:
+1. Executive Summary (200-400 words)
+2. Introduction (scope, methodology, assumptions)
+3. Main Analysis (4-8 findings, 600-2000 words each, all cited)
+4. Synthesis and Insights
+5. Limitations and Caveats
+6. Recommendations
+7. Bibliography (COMPLETE — every citation, no placeholders)
+8. Methodological Appendix
+Target lengths: quick 2-4k words; standard 4-8k; deep 8-15k; ultradeep 15-20k+.
+### Quality standards
+- Narrative: flowing prose, beginning/middle/end. Min 80% prose, max 20% bullets.
+- Each factual statement cited immediately with [N].
+- Distinguish fact from synthesis.
+- No vague attributions ("studies show...", "experts believe..." without citation).
+- Label speculation explicitly.
+- Admit uncertainty: "No sources found for X."
+## Mode: refactor-audit (code-smell catalog + deep-modules)
+Fires when the user points at a directory or describes a code area as "messy" / "needs cleanup" / "tech debt", or when a quarterly health-check is requested. Override: `--mode=refactor-audit`. Audits the target area for refactoring opportunities using Martin Fowler's smell taxonomy combined with the deep-modules analysis (deletion test, locality, leverage, interface depth) bundled in the `dw-simplification` skill.
+<critical>ASK EXACTLY 3 CLARIFICATION QUESTIONS BEFORE STARTING THE ANALYSIS</critical>
+### When to use refactor mode
+- Pre-implementation audit of tech debt in the area you're about to touch.
+- Quarterly code-health review.
+- Pre-migration scoping (e.g., before a framework upgrade).
+- Do NOT use refactor mode if `/dw-review` already flagged the same area (avoid duplicate findings).
+### Required reading
+Complementary skills:
+- **`dw-review-rigor`**: **ALWAYS** — applies de-duplication (same smell in N files = 1 entry with affected list), severity ordering P0-P3, signal-over-volume (max ~20 findings; keep criticals, prune marginal ones). Smells with a justifying ADR drop to `low` at most.
+- **`dw-simplification`**: **ALWAYS** — every flagged smell is filtered through Chesterton's Fence (what does the construct DO, why was it added, what breaks if removed). Smells with no clear "why-was-it-there" answer get downgraded to `info` with a research note instead of a refactor proposal. Complexity metrics (cognitive complexity ≥16 or nesting depth ≥4 = `high` candidate; <10 = `low` at most) anchor severity.
+- **`security-review`**: defer security concerns to this skill — do not duplicate.
+- **`vercel-react-best-practices`** + its `perf-discipline.md`: defer React/Next.js performance patterns to this skill.
+### Pipeline
+1. Three clarification questions (scope, priorities, constraints).
+2. Identify the target area (PRD-scoped directory, specific module, or whole codebase).
+3. Scan for smells using Fowler's taxonomy:
+   - **Bloaters** — Long Method, Large Class, Long Parameter List, Data Clumps, Primitive Obsession.
+   - **Object-Orientation Abusers** — Switch Statements, Refused Bequest, Alternative Classes with Different Interfaces, Temporary Field.
+   - **Change Preventers** — Divergent Change, Shotgun Surgery, Parallel Inheritance Hierarchies.
+   - **Dispensables** — Comments, Duplicate Code, Lazy Class, Data Class, Dead Code, Speculative Generality.
+   - **Couplers** — Feature Envy, Inappropriate Intimacy, Message Chains, Middle Man.
+   - **Conditional complexity** — high cyclomatic/cognitive, deep nesting.
+4. Apply `dw-review-rigor` de-duplication + `dw-simplification` Chesterton filter.
+5. For each surviving smell, map to a refactoring technique with before/after sketches.
+6. Severity-order P0-P3 (impact × likelihood × maintenance cost).
+7. Plus: coupling/cohesion metrics, SOLID analysis.
+### Output
+Saved to `<target>/refactor-plan.md`:
+```markdown
+# Refactoring Opportunities — <target>
+## Summary
+- Smells found: N (after de-dup)
+- P0 (do this sprint): N
+- P1 (this quarter): N
+- P2 (when convenient): N
+- P3 (informational): N
+## Findings (severity-ordered)
+### P0 — <smell name>
+**Files:** <list> (de-duplicated)
+**Symptom:** <description>
+**Why fix:** <impact analysis>
+**Suggested refactor:** <Fowler technique>
+**Before:** <code sketch>
+**After:** <code sketch>
+**Effort:** S / M / L
+**Risk:** Low / Medium / High
+**Tests required:** <list>
+...
+```
+### Analysis tools
+- React projects: `npx react-doctor@latest --verbose` for health score.
+- Angular projects: `ng lint` for static issues.
+### Anti-patterns
+- Listing every cyclomatic complexity hit > threshold without context → noise.
+- Suggesting "extract method" everywhere a function is over N lines → mechanical, not insight.
+- Proposing refactors that aren't tested or testable → high risk, won't ship.
+- Ignoring documented architectural decisions in `.dw/rules/` → flagging intentional design as smell.
+## Mode: grill (domain-grilling)
+Fires when vocabulary feels unsettled — user terms differ from `.dw/rules/` / `.dw/constitution.md`, two synonyms compete in the same conversation, or a contributor proposes a name that clashes with the glossary. Override: `--mode=grill`. Replaces the option-generation default with an **interview-style stress-test** of the plan/PRD against the project's domain vocabulary. Each round of questions sharpens one piece. Updates `.dw/rules/` (or `.dw/constitution.md`) inline as terms crystallize — never deferred to "after the conversation."
+<critical>Ask ONE question at a time. Wait for the answer. Don't dump a list of 5 questions and hope for the best.</critical>
+### When to use grill mode
+- Before `/dw-plan prd` when the domain feels unsettled or the team uses competing terms.
+- After `/dw-plan prd` when reviewers flag ambiguous language in the PRD.
+- During architectural discussion when "module", "service", "component" are being used interchangeably and you need to pin the canonical term.
+- When a contributor proposes a name that doesn't match the project's existing glossary.
+### During-session disciplines
+1. **Challenge against the glossary.** Read `.dw/rules/index.md` + per-module `.dw/rules/<module>.md` + `.dw/constitution.md`. Flag terminology conflicts the instant the user uses a term that differs from (or contradicts) what's already documented.
+2. **Sharpen fuzzy language.** When the user says "the user thing" or "the order stuff", propose a precise canonical term. Don't pretend you understood — push back.
+3. **Discuss concrete scenarios.** Force precision via specific edge cases: "What happens to the Order in state X when event Y arrives during retry Z?" Vague answers go back as further questions.
+4. **Cross-reference code.** When the user states a behavior, glance at the codebase to confirm it. Surface contradictions: "You said the API returns `OrderId` but `src/api/orders.ts:42` returns `{ order_id, status }`." Don't argue from generalities.
+5. **Update `.dw/rules/` inline.** When a term crystallizes, write it into the appropriate rules file in the same conversation turn. Lazy file creation: if the file doesn't exist, create it. Format follows the glossary discipline established by the project (see `.dw/rules/index.md` for shape).
+6. **Skip implementation details in the glossary.** `.dw/rules/` and `.dw/constitution.md` describe vocabulary and principles — not implementation. "Order: a customer's request to purchase one or more items, in one of these states: pending, paid, shipped, delivered, refunded" is fine. "Order: a TypeScript class in `src/orders/`" is implementation leak.
+### ADR creation discipline
+Only propose an ADR via `/dw-adr` when **all three** hold:
+| Criterion | Test |
+|-----------|------|
+| **Hard to reverse** | If we change this in 6 months, does it cost >1 week of work? |
+| **Surprising without context** | Would a new contributor reasonably reach a different decision? |
+| **Genuine trade-off** | Was there a real alternative we considered and chose against? |
+If any is missing, skip the ADR. Don't ADR every casual decision — that turns the ADR folder into noise.
+### Output
+grill mode produces:
+- **Updated `.dw/rules/<module>.md`** or `.dw/constitution.md` with crystallized terms.
+- **Updated PRD / TechSpec** if grilling happens mid-plan (terms in the artifact are aligned with the glossary).
+- **Optional `.dw/spec/<prd>/adrs/adr-NNN.md`** if criteria above hold.
+- **NO** option matrix or recommendation (that's option-matrix mode; grill is purely about sharpening). If the dispatcher chained grill+option-matrix, the option matrix runs in a separate phase.
+### When the discipline bends
+- **Greenfield project with no `.dw/rules/`**: grill anyway; the conversation produces the FIRST entries in `.dw/rules/index.md`. That's the value.
+- **Cosmetic terminology disagreements** ("should we call it `userId` or `user_id`?"): skip grill mode; use a coding-conventions ADR or `.dw/rules/index.md` Naming section.
+## Mode: prototype (throwaway prototype)
+Fires when the user asks "is this state model right?" / "what should this look like?" with no clear answer — i.e., the next sensible step is to RUN code, not write words. Override: `--mode=prototype`. Builds a **throwaway prototype that answers a single question**. The question decides the shape — pick a branch.
+<critical>The prototype is throwaway from day one. Don't polish. Don't add tests. Don't extract abstractions. The point is to LEARN something fast and then DELETE or absorb.</critical>
+### Pick a branch
+| User's question | Branch |
+|-----------------|--------|
+| "Does this state/logic model feel right?" | **LOGIC** — interactive terminal app that pushes the state machine through edge cases that are hard to reason about on paper. |
+| "What should this look like?" | **UI** — several radically different UI variations on a single route, switchable via a URL search param and a floating bottom bar. |
+If the question is ambiguous, ask the user. If user not reachable: default by surrounding code (backend module → LOGIC; page/component → UI) and state the assumption at the top of the prototype.
+### Rules (apply to both branches)
+1. **Throwaway from day one, clearly marked.** Place the prototype next to the module/page it's prototyping for (so context is obvious) but name it so a casual reader can see it's a prototype (`prototype-<slug>.ts`, `prototype-route.tsx`, etc.).
+2. **One command to run.** Whatever the project's task runner supports — `pnpm <name>`, `python <path>`, `bun <path>`, etc. The user must start it without thinking.
+3. **No persistence by default.** State lives in memory. Persistence is what the prototype is CHECKING, not what it depends on. If the question explicitly involves a database, hit a scratch DB or a local file with a clear `PROTOTYPE — wipe me` name.
+4. **Skip the polish.** No tests, no error handling beyond what makes the prototype runnable, no abstractions.
+5. **Surface the state.** After every action (LOGIC) or on every variant switch (UI), print or render the full relevant state so the user can see what changed.
+6. **Delete or absorb when done.** When the prototype has answered its question, either delete it or fold the validated decision into real code. Don't leave it rotting in the repo.
+### When done
+The **answer** is the only thing worth keeping. Capture it durably:
+- Commit message that closes the prototype: "removed prototype X; decided <answer> based on <observation>"
+- Or an ADR (if the criteria from grill mode hold)
+- Or `.dw/spec/<prd>/NOTES.md` if mid-PRD
+- Or an issue comment if user-driven
+If the user isn't around, leave a `PROTOTYPE VERDICT: <pending>` placeholder so the next pass can fill it in before deletion.
+### Output
+prototype mode produces:
+- **Throwaway code file(s)** in the appropriate location.
+- A `NOTES.md` next to the prototype with the QUESTION it's answering.
+- After the user runs it and answers the question, instructions to remove the prototype + capture the verdict.
+### Anti-patterns
+- Building a prototype that's actually a feature in disguise — production-quality code, tests, deployment config. That's not a prototype; it's a first draft.
+- Leaving the prototype in the repo "in case we need it later" — six months later it's load-bearing.
+- Not capturing the verdict — the prototype answered the question and the answer evaporated.
+- Multiple prototypes stacked up at once — pick one question, answer it, move.
 ## Inspired by
@@ -148,8 +440,10 @@ The codebase-grounded idea refinement pattern is inspired by [`addyosmani/agent-
 - **Product level, not code level**: while `idea-refine` uses Glob/Grep/Read over `src/*`, here we read **PRDs + rules + intel** to map the **feature inventory** of the product. The brainstorm stays product-focused.
 - **Explicit classification** (IMPROVES / CONSOLIDATES / NEW) as dev-workflow-native discipline — forces the team to decide whether the idea is new, consolidates existing features, or improves one, before opening a PRD.
 - Output at `.dw/spec/ideas/<slug>.md` (sibling of `prd-<slug>/`) instead of `docs/ideas/` — preserves dev-workflow path conventions.
-- Integration with the existing pipeline: `/dw-create-prd` accepts the one-pager as input, reducing clarification questions.
+- Integration with the existing pipeline: `/dw-plan prd` accepts the one-pager as input, reducing clarification questions.
+The **grill** and **prototype** modes are adapted from [`mattpocock/skills/grill-with-docs`](https://github.com/mattpocock/skills/tree/main/grill-with-docs) and [`mattpocock/skills/prototype`](https://github.com/mattpocock/skills/tree/main/prototype) (Matt Pocock, MIT). dev-workflow adaptation: integrated as INTERNAL auto-dispatched modes rather than separate skills, paths rebased on `.dw/rules/` + `.dw/spec/<prd>/`, ADR creation gated on the 3-criteria test (hard-to-reverse + surprising + genuine trade-off).
-Credit: Addy Osmani and the `idea-refine` pattern.
+Credit: Addy Osmani (idea-refine) and Matt Pocock (grill-with-docs, prototype).
 </system_instructions>

package/scaffold/en/commands/dw-bugfix.md CHANGED Viewed

@@ -5,8 +5,8 @@
     ## When to Use
     - Use when fixing a reported bug with automatic triage to distinguish bug vs feature vs excessive scope
-    - Do NOT use when implementing a new feature (use `/dw-create-prd` instead)
-    - Do NOT use when fixing bugs found during QA testing (use `/dw-fix-qa` instead)
+    - Do NOT use when implementing a new feature (use `/dw-plan prd` instead)
+    - Do NOT use when fixing bugs found during QA testing (use `/dw-qa --fix` instead)
     ## Pipeline Position
     **Predecessor:** (bug report) | **Successor:** `/dw-commit` then `/dw-generate-pr`
@@ -48,7 +48,7 @@
     In this mode:
     1. Follow the normal question and analysis flow
     2. Instead of executing, generate a document at `.dw/spec/dw-bugfix-[name]/prd.md`
-    3. The file is named `prd.md` to maintain compatibility with the create-techspec/dw-create-tasks pipeline
+    3. The file is named `prd.md` to maintain compatibility with the create-techspec/dw-plan tasks pipeline
     4. Then the user can run `create-techspec .dw/spec/dw-bugfix-[name]`
     5. And then `create-tasks .dw/spec/dw-bugfix-[name]`
@@ -109,7 +109,7 @@
     ---
     **Do you want me to start the PRD creation flow?**
-    - `yes` - I will follow `.dw/commands/dw-create-prd.md` for this feature
+    - `yes` - I will follow `.dw/commands/dw-plan prd.md` for this feature
     - `no, it's a bug` - Explain further why you consider it a bug
     - `no, cancel` - End
@@ -361,7 +361,7 @@
         q2 [label="Does it require\nnew functionality?", shape=diamond];
         q3 [label="Scope <= 5 files\nand no migration?", shape=diamond];
         bug [label="BUG\n(continue bugfix flow)"];
-        feature [label="FEATURE\n(redirect to /dw-create-prd)"];
+        feature [label="FEATURE\n(redirect to /dw-plan prd)"];
         excessive [label="EXCESSIVE SCOPE\n(redirect to PRD or\nuse --analysis mode)"];
         start -> q1;

package/scaffold/en/commands/dw-commit.md CHANGED Viewed

@@ -7,7 +7,7 @@
     - Do NOT use when creating a PR (use `/dw-generate-pr` instead)
     ## Pipeline Position
-    **Predecessor:** `/dw-run-task` or `/dw-bugfix` | **Successor:** `/dw-generate-pr`
+    **Predecessor:** `/dw-run` or `/dw-bugfix` | **Successor:** `/dw-generate-pr`
     ## Complementary Skills