npm - @pavp/storywright - Versions diffs - 1.5.0 → 1.6.0 - Mend

@pavp/storywright 1.5.0 → 1.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/package.json +1 -1
package/skills/story-from-figma/SKILL.md +170 -51
package/skills/story-generate/SKILL.md +197 -106
package/skills/story-split/SKILL.md +240 -115

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@pavp/storywright",
-  "version": "1.5.0",
+  "version": "1.6.0",
   "description": "PM Skills pack for Claude Code — turn ambiguous inputs (prompts, screenshots, Figma links) into Jira-ready user stories.",
   "keywords": [
     "claude",

package/skills/story-from-figma/SKILL.md CHANGED

@@ -1,30 +1,75 @@
 ---
 name: story-from-figma
-description: Generate user stories from a Figma file or frame URL. Uses an MCP Figma server to enumerate frames, components, navigation, and states; falls back to asking for screenshots if MCP is unavailable.
+description: Generate Cohn+Gherkin user stories from a Figma URL. Maps prototype flows to stories (one per user goal, not per frame). Asks ONLY in terminal.
 trigger: "/story-from-figma | story from figma | generate from figma | analizar figma | https://www.figma.com/"
-intent: Multimodal entrypoint skill. Inspects a Figma design, infers user flows and screens, and produces one or more stories via story-generate.
-version: 1.0.0
+intent: Multimodal entrypoint. Inspects a Figma design, infers flows and screens, and emits one canonical story per logical user goal. Honors v2.2 hard rules (terminal-only Q, no mini-PRD, mechanical deps for split, V audit per child).
+version: 2.2.0
 inputs:
   - figma-link
 outputs:
-  - story.jira-wiki.md (per generated story)
-  - story.standard.md (per generated story)
-  - clarifications.md
+  - story-<N>.standard.md
+  - story-<N>.jira-wiki.md
+  - flow-summary.md
+  - .storywright-context.json
 composes:
   - _components/clarification-questions
   - _components/acceptance-criteria
   - _components/invest-checklist
-  - _components/definition-of-done
-  - _components/business-rules
-  - _components/edge-cases
-  - _components/analytics-events
-  - _components/risks-and-dependencies
   - _components/jira-wiki-formatter
 ---
 ## Purpose
-A Figma file usually represents N screens / N flows. This skill maps that visual structure into stories — one story per logical user goal, not one per frame.
+A Figma file usually represents N screens / N flows. This skill maps that visual structure into Cohn+Gherkin stories — **one story per logical user goal, not one per frame**. Splits aggressively if any flow has multiple outcomes.
+## Hard rules (v2.2 parity with refine/generate/split)
+1. **Terminal-only clarifications.** Never write any sidecar question file (no `clarifications.md`). All gap questions through `AskUserQuestion`, batched ≤4. Non-blocking gaps → `⚠️ Assumed` inline.
+2. **Cohn + Gherkin canonical per story.** Each generated story has ONE Use Case + ONE AC scenario (one Given chain + one `When` + one `Then`). If a flow naturally has >1 `When`/`Then` → STOP and hand off to `[[story-split]]`.
+3. **No mini-PRDs.** Prohibited in each generated story:
+   - NFR blocks — DoD only
+   - Edge Cases enumerations as a section — fold into AC failure paths
+   - Dependencies as prose — Jira links only
+   - Per-claim visual specs — single banner (rule 5)
+   - Generation logs >3 lines (>5 if SPLIT recommended)
+4. **Output language matches the user's chat language**, not the Figma file's. Auto-detect via rule 4a; ask only if signals split. Persist via rule 9.
+5. **Visual inference confidence — single banner only.** Since the source IS Figma, the banner is always `**Source: Figma → values can be tokenized at implementation.**`. If MCP Figma is unavailable and the user falls back to PNG exports, switch the banner to raster (`pixel-derived, not token-confirmed`).
+6. **Sibling task IDs.** When the inventory phase produces multiple flows, ticket slugs follow rule F (naming pattern). Do NOT invent slugs without consulting `.storywright-context.json`.
+7. **Mockup chrome detection — closed list** (nav rail, top bar, footer, toast slot, modal scrim, app tabs). If a frame shows chrome that's NOT explicitly part of a flow, ask via `AskUserQuestion` whether it's a separate story, shared shell, or out-of-scope.
+8. **Anti-PRD is part of each story's INVEST `Small` criterion** — see `[[invest-checklist]]` Small.
+9. **Cross-skill context persistence.** Read `<output-folder>/.storywright-context.json` first (exact folder only). Write resolved answers back. Schema same as other v2.2 skills, plus:
+   ```json
+   {
+     "extra": {
+       "figma_url": "<url>",
+       "figma_scope": "file | page | frame",
+       "mcp_available": true | false
+     }
+   }
+   ```
+10. **Mechanical NxN dep matrix when emitting multiple stories (rule A).** If N>1 flows produce N stories, parse each story's `Given` lines for surface nouns owned by sibling flows. Mark `DEP(Sj → Si)` per match. Render the matrix in `flow-summary.md`. No intuition.
+11. **Per-story V audit (rule C).** For each candidate story, one-line test: "If only this story ships and no sibling flow exists, does a real user complete a real task?". If no → `WEAK · merge-upstream-candidate`. Recommend merging in `flow-summary.md`.
+12. **Passive-goal downstream prompt (rule G).** If a story's `I want to` verb is observational AND `so that` lacks downstream action → ask once via `AskUserQuestion`.
+### 4a. Language auto-detect — expanded signals (E)
+Same weighted table as refine/generate/split (Gherkin keywords, persona phrasing, column names, domain verbs, title). Plus Figma-specific signal: **frame names** and **layer text** in M = medium weight.
+### Rule F. Naming pattern — ask once, persist
+Same options. Persist in `naming_pattern` inside `.storywright-context.json`. Used for all story slugs in this run.
+### Rule D. Surface vs styling
+Same deterministic rule. A frame counts as a separate flow ONLY if it has its own user goal (verb where the user *does something*). A purely visual variant (color theme, light/dark) is NOT a new flow.
 ## When to use
@@ -40,86 +85,160 @@ A Figma file usually represents N screens / N flows. This skill maps that visual
 ## Application (step-by-step)
-### Phase 0 — MCP availability check
-1. Verify an MCP Figma server is connected to Claude Code. See `mcp-figma-notes.md` in this skill folder for setup options.
-2. If MCP is unavailable:
-   - Ask the user to either (a) install an MCP Figma server, (b) export the relevant frames as PNGs and drop them in chat, or (c) paste a textual description of the flows.
-   - Continue under the chosen fallback. The rest of the skill works with screenshots via Claude vision.
+### Phase 0 — MCP availability + context load
-### Phase 0.5 — Detect companion inputs
+1. Verify MCP Figma server is connected. See `mcp-figma-notes.md`.
+2. **Read prior context.** If `<output-folder>/.storywright-context.json` exists (exact folder only), load it.
+3. If MCP unavailable:
+   - Ask via `AskUserQuestion`: install MCP / export PNGs / paste textual flow descriptions.
+   - If user falls back to PNG, set `design_source = raster` in context file and use the raster banner per rule 5.
-Before extracting from Figma, check whether the user attached:
-- **Accompanying text** (goal, story draft, constraints) — treat as canonical for `User Story / Scope / Business Goal`. Figma is canonical for `Components / States / Flows`.
-- **Reference screenshots** — usually redundant when Figma is available; use only if Figma frames are missing states the screenshots show.
+### Phase 0.5 — Companion inputs + language
-If text + Figma both describe the same flow but disagree (e.g., text says "single page form", Figma shows multi-step wizard), **surface the conflict** in clarifications and ask before drafting. Do NOT silently pick a winner. See `[[story-generate]]` "Mixed inputs" section for source priority.
+1. Detect companion text or PNGs.
+2. Run language auto-detect (rule 4a). Adopt silently if signals agree; ask if split.
+3. Persona sharpening: if persona ambiguous, ask via `AskUserQuestion`.
-### Phase 1 — Inventory
+### Phase 1 — Inventory (MCP)
 1. List pages in the file (if MCP allows).
 2. For the target page, list frames with:
    - Frame name
    - Frame type (entry, modal, error state, empty state, success state, loading, etc.)
-   - Outgoing prototype links (which other frame each interactive element points to)
-3. Identify the **flows** by grouping frames connected by prototype links. One flow = one candidate story (or epic if large).
+   - Outgoing prototype links
+3. Identify **flows** by grouping frames connected by prototype links. One flow = one candidate story (or epic if large).
+4. Apply rule D: visual variants of the same flow do NOT count as new flows.
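
Reviewer note: the Phase 1 grouping step ("group frames connected by prototype links") amounts to finding connected components in a graph of frames. The sketch below is one possible reading, not the package's implementation. It assumes prototype links arrive as `(source_frame, target_frame)` pairs and treats them as undirected for grouping; the function name `group_flows` and the input shape are hypothetical, since the diff does not show what the MCP Figma server returns.

```python
from collections import defaultdict


def group_flows(frames, links):
    """Group frames into candidate flows (connected components over links)."""
    # Assumption: a link in either direction puts two frames in the same flow.
    neighbors = defaultdict(set)
    for src, dst in links:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    seen = set()
    flows = []
    for frame in frames:  # input order keeps the result deterministic
        if frame in seen:
            continue
        seen.add(frame)
        stack, component = [frame], []
        while stack:
            current = stack.pop()
            component.append(current)
            for nxt in sorted(neighbors[current]):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        flows.append(sorted(component))
    return flows


# Frame IDs borrowed from the flow-summary example; the link is illustrative.
frames = ["AUTH-001", "AUTH-002", "AUTH-101", "AUTH-201"]
links = [("AUTH-001", "AUTH-002")]
flows = group_flows(frames, links)
print(flows)
```

Rule D (visual variants are not new flows) is not modeled here; it would have to merge or drop components before they become candidate stories.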
 ### Phase 2 — Per-flow inference
 For each flow:
-1. Identify the **goal** (what user outcome does this flow achieve?).
-2. Identify the **entry point** (where does the user start?).
-3. Enumerate **states**: empty, loading, success, error, edge.
-4. Identify **components** (forms, lists, modals) and their **inputs/outputs**.
-5. Score confidence per inference: HIGH (visible in design), MEDIUM (implied), LOW (assumed). Anything below HIGH gets `> ⚠️ Assumed:` in the output.
-6. Pass the structured inference to `[[story-generate]]` to produce the full story.
+1. Identify the **goal** (what user outcome).
+2. Identify the **entry point**.
+3. Enumerate **states**: empty / loading / success / error / edge. Fold into AC failure paths (rule 3); do NOT emit an Edge Cases section.
+4. Identify **components** and their **inputs/outputs**.
+5. Score confidence per inference: HIGH / MEDIUM / LOW. Below HIGH → `⚠️ Assumed` inline.
+6. Run passive-goal check (rule G).
+7. Run pre-split deterministic counter (rule D) on the flow. If count ≥2 → recommend `[[story-split]]` for THAT flow before drafting.
+### Phase 3 — Draft canonical block per flow
+```markdown
+### [Flow Title]
+#### Use Case
+- **As a** [persona]
+- **I want to** [action]
+- **so that** [outcome with downstream action — rule G]
-### Phase 3 — Splitting check
+#### Acceptance Criteria
+- **Scenario:** [single-outcome scenario name]
+- **Given:** [context — surface nouns drive flow-summary dep matrix]
+- **When:** [single trigger]
+- **Then:** [single observable outcome]
-If a single flow has too many states or branches, run `[[invest-checklist]]` first. If it returns `SPLIT RECOMMENDED`, hand off to `[[story-split]]` before generating.
+#### Design Reference
+**Source: Figma → values can be tokenized at implementation.**
+- <frame URL or frame names>
-### Phase 4 — Output
+#### INVEST
+- I/N/V/E/S/T — one line each.
+- **Verdict:** READY | SPLIT RECOMMENDED | NEEDS REFINEMENT | NOT A STORY
+#### Generation log (≤3 lines)
+- Mapped from frames: <FRAME-IDs>; pattern: <if any>.
+```
+### Phase 4 — Multi-flow analysis (if N>1)
+1. **Mechanical NxN dep matrix (rule 10).** Parse each story's Given lines for surface nouns owned by sibling flows. Emit in `flow-summary.md`.
+2. **Per-story V audit (rule 11).** Flag merge-upstream candidates loudly.
+3. **Coherence check** — verify the union of stories covers the user journey shown in Figma. Flag gaps.
+### Phase 5 — Output
 For each story:
-- Emit `story.jira-wiki.md` + `story.standard.md` per the formatter.
-- Emit a single `flow-summary.md` listing all stories produced and the frames they map to, so reviewers can audit traceability.
+- `story-<N>.standard.md` + `story-<N>.jira-wiki.md`.
+Plus single `flow-summary.md`:
+```markdown
+### Flow Summary — <Figma file/page>
+| # | Story | Frames | INVEST verdict | V audit |
+|---|---|---|---|---|
+| 1 | login-google-web | AUTH-001, AUTH-002 | READY | PASS |
+| 2 | login-google-mobile | AUTH-101 | READY (after #1) | PASS |
+| 3 | recovery-flow | AUTH-201..AUTH-212 | SPLIT RECOMMENDED | — |
+**Dependency matrix (rule 10):**
+|     | #1 | #2 | #3 |
+|-----|----|----|----|
+| #1  | —  |    |    |
+| #2  |DEP | —  |    |
+| #3  |    |    | —  |
-If any inference is LOW confidence, add to `clarifications.md`.
+**Build order:** #1 → #2 (parallel: #3 after its own split).
+**Design source:** Figma (or raster, if PNG fallback).
+```
+Plus `.storywright-context.json` updated.
+NO `clarifications.md`. NO Edge Cases sections. NO NFR blocks. NO per-claim visual tags.
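
Reviewer note: rule 10 describes the dependency matrix as purely mechanical: scan each story's `Given` lines for surface nouns owned by a sibling flow and mark `DEP(Sj → Si)` on a match. A minimal sketch of that idea follows. How a story declares the nouns it "owns" is not specified in the diff, so the `owns` list, the case-insensitive substring match, and the sample `Given` text are all assumptions made for illustration.

```python
def dep_matrix(stories):
    """Return the set of (j, i) pairs where story j depends on story i."""
    deps = set()
    for j, story_j in enumerate(stories):
        given = " ".join(story_j["given"]).lower()
        for i, story_i in enumerate(stories):
            if i == j:
                continue
            # Assumption: a dependency is a plain substring match on an owned noun.
            if any(noun.lower() in given for noun in story_i["owns"]):
                deps.add((j, i))
    return deps


def render_matrix(count, deps):
    """Render the matrix as markdown table rows (row depends on column)."""
    lines = [
        "| | " + " | ".join(f"#{k + 1}" for k in range(count)) + " |",
        "|---|" + "---|" * count,
    ]
    for j in range(count):
        cells = []
        for i in range(count):
            if i == j:
                cells.append("-")
            else:
                cells.append("DEP" if (j, i) in deps else "")
        lines.append(f"| #{j + 1} | " + " | ".join(cells) + " |")
    return lines


# Slugs come from the flow-summary example; owned nouns and Given text are made up.
stories = [
    {"slug": "login-google-web", "owns": ["web sign-in page"],
     "given": ["a signed-out visitor"]},
    {"slug": "login-google-mobile", "owns": ["mobile sign-in screen"],
     "given": ["the web sign-in page already supports Google"]},
    {"slug": "recovery-flow", "owns": ["recovery form"],
     "given": ["a user who forgot their password"]},
]
deps = dep_matrix(stories)
table = render_matrix(len(stories), deps)
print("\n".join(table))
```

With this sample data only story #2 depends on story #1, which matches the shape of the matrix shown in the `flow-summary.md` example.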
 ## Examples
 ### Good
 Input: Figma file URL with 3 flows on the "Auth" page (login, signup, password reset).
 Output:
-- 3 stories (one per flow).
-- Each story references the frames it derived from: `Maps to frames: AUTH-001, AUTH-002, AUTH-003`.
-- Account-recovery flow flagged for `[[story-split]]` because it had 12 frames covering multiple recovery methods.
+- 3 canonical stories.
+- `flow-summary.md` with mechanical matrix (#2 DEP #1; #3 independent).
+- V audit: all PASS.
+- Recovery flow has 12 frames → pre-split counter ≥2 → recommend `[[story-split]]` for #3 only.
+### Good — PNG fallback
+MCP unavailable. User exports 3 PNGs. Skill switches banner to raster, marks `design_source = raster`, generates 3 stories with `[mockup-pixel-derived]`-style inheritance (single banner per Design Reference).
+### Good — passive goal fires
+Flow goal inferred as "view dashboard". Skill detects passive verb → asks: "What does the user do with the dashboard data?". User: "Spot anomalies and drill into them." So-that strengthened.
 ### Bad
+One story per frame. Frames are screens; stories are user goals.
-One story per frame. Frames are screens; stories are user goals. A single goal may span 5 frames.
+### Bad
+Emitting `clarifications.md`. Violates rule 1.
+### Bad
+Skipping the mechanical matrix in `flow-summary.md` when N>1.
 ## Common Pitfalls
 - Treating each frame as a story.
-- Skipping prototype-link analysis — without flow structure, inferred user goals are guesses.
-- Ignoring empty/error/loading states. Designers usually include them; PMs often miss them.
-- Trusting MEDIUM/LOW inferences silently. Always surface them.
-- Generating stories without verifying that the design covers all error paths (often the design only shows the happy path).
+- Skipping prototype-link analysis — without flow structure, user goals are guesses.
+- Ignoring empty/error/loading states. Designers usually include them; fold into AC failure paths.
+- Trusting MEDIUM/LOW inferences silently — mark `⚠️ Assumed`.
+- Generating stories without verifying coverage of all error paths.
+- Skipping per-story V audit (rule 11) — figma flows are easy to over-split.
+- Re-asking questions already in `.storywright-context.json`.
+- Tagging every visual claim instead of using the single Figma banner.
 ## References
 - [[story-generate]]
 - [[story-split]]
+- [[story-refine]]
 - [[clarification-questions]]
-- `./mcp-figma-notes.md` (setup of MCP server)
+- `./mcp-figma-notes.md` (MCP server setup)
 <claude-specific>
-- Use Claude's native vision when MCP is unavailable and the user drops PNGs.
+- Use Claude vision when MCP is unavailable and the user drops PNGs.
 - Use extended thinking for flow grouping — prototype links can be ambiguous.
 - Cache the Phase 2 inference checklist across calls.
-- When MCP Figma is available, batch frame metadata fetches into one round trip to minimize tool round-trips.
+- When MCP Figma is available, batch frame metadata fetches into one round trip.
+- Read `.storywright-context.json` ONLY from the exact target output folder.
+- Build the multi-story dependency matrix from Given-text parsing (rule 10), not intuition.
+- Never call Write for any sidecar question file. Use `AskUserQuestion`.
 </claude-specific>

package/skills/story-generate/SKILL.md CHANGED

@@ -1,17 +1,17 @@
 ---
 name: story-generate
-description: Transform an ambiguous prompt, half-baked story, screenshot, or Figma link into a Jira-ready user story with acceptance criteria, DoD, edge cases, and risks. Ask only critical clarifying questions.
+description: Transform an ambiguous prompt, half-baked story, screenshot, or Figma link into a Jira-ready user story. Cohn+Gherkin canonical. Asks clarifications ONLY in terminal.
 trigger: "/story-generate | generate a user story | write a user story | turn this into a story | crear historia de usuario"
-intent: Top-level orchestrator skill that drives the full story generation flow by composing component skills.
-version: 1.0.0
+intent: Top-level orchestrator that drafts a fresh story from any input. Follows the same hard rules as story-refine v2.2 (Cohn philosophy, terminal-only Q, no mini-PRD, deterministic split gate).
+version: 2.2.0
 inputs:
   - text
   - image
   - figma-link
 outputs:
-  - story.jira-wiki.md
   - story.standard.md
-  - clarifications.md
+  - story.jira-wiki.md
+  - .storywright-context.json
 composes:
   - _components/clarification-questions
   - _components/acceptance-criteria
@@ -26,7 +26,82 @@ composes:
 ## Purpose
-Take whatever the PM has — a one-liner, a half-baked story, a screenshot, a Figma link — and produce a story that an engineer can pick up and ship without follow-up questions. Always output two artifacts (Jira wiki + CommonMark).
+Take whatever the PM has — a one-liner, a half-baked story, a screenshot, a Figma link — and produce a Cohn+Gherkin story an engineer can pick up and ship without follow-up questions. If the input is too broad, recommend `/story-split` instead of producing a mini-PRD.
+## Hard rules (no exceptions)
+1. **Terminal-only clarifications.** Never write any sidecar question file (no `clarifications.md`). All gap questions go through `AskUserQuestion` (batch in groups of ≤4). Non-blocking gaps → mark `⚠️ Assumed` inline.
+2. **Cohn + Gherkin canonical.** One Use Case block. One AC scenario per story (one Given chain, one `When`, one `Then`). If the input naturally needs >1 `When`/`Then` → STOP drafting, recommend `/story-split`.
+3. **No mini-PRDs.** Prohibited in story output:
+   - NFR blocks (a11y/i18n/perf/tokens) — these live in the team's DoD
+   - Edge Cases enumerations as a section — surface inside AC failure paths only
+   - Dependencies as prose — Jira links only
+   - Per-claim visual specs — use single banner (rule 5)
+   - Refinement logs >3 lines (>5 if SPLIT)
+4. **Output language matches the user's chat language**, not the input's. Auto-detect first (see rule 4a); only ask via `AskUserQuestion` if signals split.
+5. **Visual inference confidence — single banner only.** ONE banner at the top of the Design Reference block declares the source type. Claims under it inherit confidence:
+   - Raster source (PNG/JPG) → `**Source: raster mockup → all visual specs are pixel-derived, not token-confirmed.**`
+   - Figma source → `**Source: Figma → values can be tokenized at implementation.**`
+   - Design-token source → `**Source: design tokens → values are authoritative.**`
+   - Never assert hex / px / spacing from raster without the raster banner.
+6. **Sibling task IDs.** If the draft references "next task / future task / another story" → check `<output-folder>/.storywright-context.json` first. If unresolved, ask. Tentative slugs follow rule F.
+7. **Mockup chrome detection — closed list.** Chrome = `left nav rail / sidebar`, `top bar`, `footer`, `persistent toast/snackbar slot`, `persistent modal scrim`, `app-level tabs`. If image shows any AND the input does not mention them, ask via `AskUserQuestion` whether each is in-scope, sibling-scope, or out-of-scope. Anything not on the list is NOT chrome.
+8. **Anti-PRD is part of INVEST `Small`.** See `[[invest-checklist]]` Small criterion (line-count ceiling lives there).
+9. **Cross-skill context persistence.** When the skill resolves clarifications, write answers to `<output-folder>/.storywright-context.json`. Read only from the exact output folder of the current invocation; never search siblings or parents. Schema:
+   ```json
+   {
+     "version": 1,
+     "decided_at": "<ISO date>",
+     "decided_by_skill": "story-generate",
+     "language": "EN | ES | ...",
+     "chrome_scope": "in-scope | in-scope-placeholder | sibling | out-of-scope",
+     "siblings": "TODO | <list of IDs> | not-applicable",
+     "design_source": "raster | figma | tokens",
+     "naming_pattern": "<see rule F>",
+     "extra": {}
+   }
+   ```
+10. **Mixed input conflict detection.** When text + image + Figma disagree, surface as BLOCKING `AskUserQuestion`. Never silently pick a winner. (See source priority below.)
+11. **Passive-goal downstream prompt (G).** If `I want to` verb is observational (`view, see, read, browse, look at, inspect, monitor`) and `so that` lacks a follow-up user action → ask once via `AskUserQuestion`: "What does the user do with this?". Strengthen the `so that` accordingly.
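
Reviewer note: rule 9 asks for context to be read only from the exact output folder and for resolved answers to be written back. The sketch below shows one way that could look, assuming a shallow merge of new answers into the existing file; the helper names `load_context` and `save_context` are hypothetical, and the diff does not say how conflicting keys should be handled.

```python
import json
import tempfile
from pathlib import Path

CONTEXT_FILE = ".storywright-context.json"


def load_context(output_folder):
    """Load context from the exact folder only; never walk parents or siblings."""
    path = Path(output_folder) / CONTEXT_FILE
    if not path.is_file():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_context(output_folder, updates):
    """Merge resolved answers into the context file (assumed: shallow merge)."""
    context = load_context(output_folder)
    context.update(updates)
    path = Path(output_folder) / CONTEXT_FILE
    path.write_text(json.dumps(context, indent=2, ensure_ascii=False), encoding="utf-8")
    return context


with tempfile.TemporaryDirectory() as folder:
    save_context(folder, {"version": 1, "language": "ES", "design_source": "figma"})
    save_context(folder, {"naming_pattern": "kebab-case feature-action"})
    reloaded = load_context(folder)
    # A nested folder does not inherit its parent's context.
    nested_is_empty = load_context(Path(folder) / "child") == {}

print(reloaded, nested_is_empty)
```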
+### 4a. Language auto-detect — expanded signals (E)
+| Signal | Where | Weight |
+|---|---|---|
+| Gherkin keywords ("Given/When/Then") | AC block | high |
+| Persona phrasing ("As a user" vs "Como un usuario") | Use Case | high |
+| Column / field names ("Phone - primary", "Teléfono - principal") | AC bullets | medium |
+| Domain verbs ("clicking" vs "hacer clic") | AC bullets | medium |
+| Title language | header | low |
+**Decision:**
+- High+medium signals agree on M → adopt M silently. Mark inline `⚠️ Assumed: output language = <M> (auto-detected from <signals>)`.
+- Signals split → ask once.
+- Persist via rule 9.
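
Reviewer note: the decision rule above can be read as "adopt a language silently only when every high- and medium-weight signal points to the same one". The sketch below encodes that reading. It assumes low-weight signals never block a decision and that having no high or medium signal at all counts as a split (ask the user); neither case is spelled out in the diff, and `resolve_language` is a hypothetical name.

```python
def resolve_language(signals):
    """Return the agreed language code, or None when the skill should ask.

    signals: list of (weight, language) tuples, weight in {"high", "medium", "low"}.
    """
    decisive = {lang for weight, lang in signals if weight in ("high", "medium")}
    if len(decisive) == 1:
        return next(iter(decisive))
    return None  # split signals, or nothing decisive: ask once


agree = resolve_language([("high", "ES"), ("medium", "ES"), ("low", "EN")])
split = resolve_language([("high", "EN"), ("medium", "ES")])
empty = resolve_language([("low", "EN")])
print(agree, split, empty)
```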
+### Rule F. Naming pattern — ask once, persist
+When the skill needs to invent a tentative ticket slug AND `.storywright-context.json` has no `naming_pattern`, ask once:
+- kebab-case feature-action (`customer-search-bar-wire`)
+- verb-noun (`wire-search-bar`)
+- domain-action (`search.customer.wire-input`)
+- Jira prefix + numeric (`CSB-001`)
+Persist in `.storywright-context.json`. Reuse for all sibling slugs.
+### Rule D. Surface vs styling (deterministic)
+A "named UI surface" counts as a separate outcome ONLY if it has its own user goal (verb where the user *does something with it*). If a noun is mentioned only in a styling context (color, padding, background) or as a sub-component of a parent surface (column inside a grid) → it is NOT a surface, it is styling. Count = 0.
 ## When to use
@@ -34,139 +109,155 @@ Take whatever the PM has — a one-liner, a half-baked story, a screenshot, a Fi
 - The user pastes a vague story and wants it production-ready.
 - The user drops an image/Figma link and asks for stories.
+For inputs that clearly cover multiple outcomes → run the deterministic split gate (step 6 below) and recommend `/story-split` instead of drafting.
 ## Inputs & how to interpret each
 ### Text prompts
-Anything from a single phrase to a paragraph. If the prompt names only a feature, infer the implicit user goal.
+Anything from a phrase to a paragraph. If only a feature is named, infer the implicit user goal and confirm via rule G if passive.
 ### Local images (PNG/JPG)
-Use vision. Extract:
-- UI elements (buttons, fields, navigation)
-- Visible states (loading, error, success)
-- Inferred flow (what does each element trigger?)
-- Confidence per inference (high / medium / low). Anything below high → add `> ⚠️ Assumed:` blockquote in the output and surface in clarifications.
+Use vision. Extract UI elements, visible states, inferred flow, confidence per inference. Anything below high confidence → mark inline `⚠️ Assumed`. NEVER assert pixel-precise visual specs inline with each claim — use the single banner (rule 5).
 ### Figma links
-If MCP Figma is available (see `[[story-from-figma]]`), use it to enumerate frames, components, navigation. If not, fall back to asking the user to drop screenshots.
-### Mixed inputs (text + image + Figma)
-The skill is designed to **fuse multiple sources** in a single invocation. Common pairings:
-- **Text + screenshot** — text states the goal, image shows the proposed UI. Use text for `User Story / Goal / Scope`, image for `Components / States / Edge cases / UX flow`.
-- **Text + Figma link** — text gives intent, Figma gives implementation surface. Use text for `User Story / Business goal`, Figma for `Technical considerations / Edge cases / Components / Multi-screen flows`.
-- **Text + image + Figma** — full triangulation. Highest fidelity; also highest chance of conflict.
+If MCP Figma is available (see `[[story-from-figma]]`), use it. If not, fall back to asking the user for screenshots.
-**Source priority (when sources disagree):**
+### Mixed inputs (text + image + Figma) — source priority
 | Section | Primary | Secondary | Tertiary |
 |---|---|---|---|
-| User Story / Goal | Text | Figma (frame titles, callouts) | Image |
-| Business Rules / Scope | Text | Figma | Image |
+| User Story / Goal | Text | Figma frame titles | Image |
+| Scope | Text | Figma | Image |
 | UI Components / States | Figma | Image | Text |
-| Edge Cases | Figma + Image (states shown) | Text | — |
-| Technical Considerations | Figma (component naming, design system refs) | Text | Image |
-| Acceptance Criteria | Triangulate all three | — | — |
+| AC observable outcomes | Triangulate | — | — |
-**Conflict handling:**
+**Conflicts → BLOCKING `AskUserQuestion`.** Never silently pick a winner.
-1. **Detect the conflict explicitly.** Example: text says "Google only" but Figma shows Google + Facebook buttons.
-2. **Do NOT silently pick a winner.** Surface the conflict in `clarifications.md` as a BLOCKING question: *"Text says X but design shows Y — which is canonical?"*
-3. **If the user is in-session, ask immediately** before drafting. If running batch, mark the story `DRAFT` and write both options in scope/out-of-scope with `> ⚠️ Conflict:` annotation.
-4. **Scope coverage check:** if Figma shows N flows but text describes 1, ask whether to (a) generate 1 story bounded to text, (b) generate N stories from Figma, or (c) generate 1 story + flag remaining flows as roadmap.
+## Canonical output shape
+```markdown
+### [Title]
+#### Use Case
+- **As a** [persona — never just "user"]
+- **I want to** [action]
+- **so that** [outcome with downstream action — rule G]
+#### Preconditions (optional)
+- ...
+#### Out of Scope (optional)
+- ...
+#### Acceptance Criteria
+- **Scenario:** [single-outcome scenario name]
+- **Given:** [context — surface nouns drive downstream dep matrix]
+- **and Given:** [context]
+- **When:** [single trigger]
+- **Then:** [single observable outcome]
+#### Design Reference (optional)
+**Source: <raster | figma | tokens> → <banner from rule 5>**
+- [link or path]
+- visual notes: [...]
+#### INVEST
+- I/N/V/E/S/T — one line each.
+- **Verdict:** READY | SPLIT RECOMMENDED | NEEDS REFINEMENT | NOT A STORY
+#### Generation log (≤3 lines; ≤5 if SPLIT)
+- ...
+```
+Nothing else. No NFR. No edge-cases enumeration. No deps prose. No Assumptions block.
 ## Application (step-by-step)
-1. **Detect input types present** — text, image, figma-link, or any combination. Branch accordingly:
-   - **Single source** → process as before.
-   - **Mixed sources** → run the "Mixed inputs" protocol above, including source-priority lookup and explicit conflict detection BEFORE drafting.
-2. **Intake gap check** — invoke `[[clarification-questions]]`. If it returns BLOCKING questions, **ask first** before drafting.
-3. **Detect language** of input (es | en | other). Output in the input language.
-4. **Draft skeleton** of the structured story (all 15 sections from the template).
-5. **Fill the CORE first** (always required, in order):
-   1. **Title** — concise, ≤8 words.
-   2. **Summary** — single value-focused sentence ("Enable Google login for trial users to reduce signup friction"), NOT a feature label ("Add Google button"). Elevator pitch.
-   3. **User Story** (As a / I want to / so that).
-      - **Persona check:** if role is "user" or "customer", push for sharper ("trial user", "Workspace admin"). Generic personas hide motivation.
-      - **"So that" check:** outcome must be distinct from action. "So I can save my work" = restating; "So I don't lose progress if tab crashes" = real motivation.
-   4. **Acceptance Criteria** via `[[acceptance-criteria]]` — at minimum the happy path + one failure mode.
-   5. **Definition of Done** via `[[definition-of-done]]`.
-6. **Fill OPTIONAL sections only if they have real content.** Drop any that would be empty or boilerplate:
-   - Contexto / Business goal — include when there's a stated trigger or KPI
-   - Scope / Out of scope — include when boundaries are non-obvious
-   - `[[business-rules]]` — include when invariants exist beyond the ACs
-   - Technical considerations — include when surface/SDK/flag matters
-   - `[[edge-cases]]` — include when ≥3 high-impact edges exist
-   - `[[analytics-events]]` — include when story has measurable funnel
-   - `[[risks-and-dependencies]]` — include when there are real blockers or unknowns
-   The bias is **less is more**. A clean 4-section story beats a 15-section one full of `N/A`.
-6. **Run INVEST self-check** via `[[invest-checklist]]`:
-   - `READY` → continue.
-   - `NOT A STORY` (V failed) → STOP. Tell the user this is a tech task, not a user story. Suggest reframing or combining with user-facing work.
-   - `NEEDS REFINEMENT` (T or N failed) → revise the failing sections in place.
-   - `RUN A SPIKE` (E failed on unknowns) → recommend a 1–2 day investigation; do not split or generate yet.
-   - `SPLIT RECOMMENDED` (I, E, or S failed) → STOP. Hand off to `[[story-split]]`. **Never auto-split.**
-7. **Render outputs** via `[[jira-wiki-formatter]]`:
-   - `story.jira-wiki.md` — Jira wiki markup
-   - `story.standard.md` — CommonMark
-8. **If clarifications remain unresolved** (user skipped them, or low-confidence visual inferences exist):
-   - Emit `clarifications.md` with the outstanding questions
-   - Mark the story output with a `DRAFT` banner at the top
-   - Tell the user explicitly what would unblock promoting from DRAFT to READY
-9. **Present both artifacts** as fenced code blocks. Ask the user whether to save to disk (offer paths under `./stories/<slug>/`).
+0. **Detect input types** — text / image / figma-link / combination. Run conflict detection (rule 10) BEFORE drafting. Run chrome detection (rule 7).
-## Examples
+1. **Read prior context.** If `<output-folder>/.storywright-context.json` exists (exact folder only), load it.
-### Good — text prompt
-Input: *"Permitir login con Google"*
+2. **Language resolution** (rule 4 + 4a). Auto-detect using expanded signals; ask only on split.
-Flow:
-1. Run gap check → 3 BLOCKING questions: scope of accounts, account linking, surface (web/mobile/both).
-2. Ask the 3 questions, wait for answers.
-3. Draft + fill all 15 sections.
-4. INVEST → `READY`.
-5. Render both outputs.
-6. Done.
+3. **Persona sharpening.** If persona is "user" / "customer" / "person", ask via `AskUserQuestion` for the specific role (e.g., "Sales person", "Workspace admin"). Generic personas hide motivation.
-### Good — image input
-Input: screenshot of a dashboard with a filter sidebar.
+4. **Passive-goal check (rule G).** If `I want to` verb is observational + `so that` lacks downstream action → ask once.
-Flow:
-1. Vision: extract filter categories, infer apply/reset actions.
-2. Confidence on "filters persist across navigation" → MEDIUM → mark as `⚠️ Assumed` and surface in clarifications.
-3. Run gap check → 1 BLOCKING (does this replace or augment current filters?).
-4. Ask, draft, fill, INVEST, render.
+5. **Gap-check** via `[[clarification-questions]]`. BLOCKING gaps → `AskUserQuestion` batched ≤4. Non-blocking → fill inline `⚠️ Assumed`.
-### Bad
+6. **Deterministic pre-split test.** Count outcomes using the same rule as `[[story-refine]]`:
+   - +1 per AC bullet with action verb at user level
+   - +1 per distinct `When [event]`
+   - +1 per named UI surface with its own user goal (rule D)
+   - 0 for styling, sub-components, passive layout assertions
+   - Count ≥2 → STOP. Recommend `/story-split`. List candidate children + per-pair dep notes (rule A) + V audit (rule C from refine).
+7. **Draft the canonical block** (Use Case + AC + Design Ref + INVEST). Preserve user wording where good.
+8. **Run INVEST** via `[[invest-checklist]]`.
+   - `READY` → render.
+   - `SPLIT RECOMMENDED` → STOP, recommend split.
+   - `NEEDS REFINEMENT` → iterate failing dimension, max 1 cycle, then STOP.
+   - `NOT A STORY` → tell user it's a tech task and stop.
+9. **Render** both outputs via `[[jira-wiki-formatter]]`. Files: `story.standard.md` + `story.jira-wiki.md`. Plus `.storywright-context.json`. No other files.
+10. **Generation log** ≤3 bullets (≤5 if SPLIT) at end of story.
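Reviewer sketch (not part of the package): the step 6 counter can be made concrete as below. Deciding whether a bullet is a user-level action or mere styling remains a judgment call, so this sketch assumes that classification has already been done and only shows the counting and the gate. The `Signal` type, the `outcome` label, and the rule that several signals pointing at the same outcome count once are all assumptions; the examples in this file (count = 1 for a single auth flow) suggest that rule, but the skill text does not state it.

```python
from dataclasses import dataclass

# Signal kinds that add to the count in step 6; anything else scores 0.
COUNTING_KINDS = {"ac_action", "when_event", "ui_surface"}


@dataclass(frozen=True)
class Signal:
    kind: str     # "ac_action", "when_event", "ui_surface", or "styling"
    outcome: str  # hypothetical label for the user outcome this signal points at


def pre_split_count(signals: list[Signal]) -> int:
    """Count distinct outcomes backed by at least one counting signal."""
    return len({s.outcome for s in signals if s.kind in COUNTING_KINDS})


def should_stop(signals: list[Signal]) -> bool:
    """Step 6 gate: a count of 2 or more means recommend /story-split."""
    return pre_split_count(signals) >= 2


login = [
    Signal("ac_action", "sign in with Google"),
    Signal("when_event", "sign in with Google"),
    Signal("styling", "button colour"),
]
dashboard = [
    Signal("ui_surface", "filter results"),
    Signal("ui_surface", "export report"),
]
```

With these inputs, `should_stop(login)` is false and `should_stop(dashboard)` is true, which matches the "Good" and "Bad" examples that follow.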
+## Examples
+### Good — text prompt
+Input: *"Permitir login con Google"*
+1. Language auto-detect → ES (persona "usuario", verbs "permitir").
+2. Persona sharpening → ask: trial user? admin? signed-out visitor?
+3. Pre-split count = 1 (one auth flow). Continue.
+4. Draft Use Case + 1 AC (happy path, failure as `and Given`).
+5. INVEST → READY.
+6. Render.
+### Good — image input
+Input: dashboard screenshot with filter sidebar.
+1. Vision extracts filters; one inference at MEDIUM confidence → mark `⚠️ Assumed` inline.
+2. Pre-split count = 1 (one filter interaction surface).
+3. Draft + INVEST → READY.
+### Good — passive-goal prompt fires
+Input: "As a user, I want to view list of customers, so that I find details."
+- Detected: `view` (passive) + thin `so that`.
+- Ask: "What does the user do with the customer they find?"
+- User: "Call them."
+- Refined `so that`: "so that I can find and call a customer to schedule a service."
+### Bad — broad input drafted as one story
 Input: *"Build the new dashboard"*
+- Pre-split count ≥2 → STOP. Recommend `/story-split`. Do NOT draft a 15-section story.
-Don't draft. The scope is too broad. Run gap check → propose splitting into smaller stories at the **clarification step**, before the story is drafted. (Effectively delegates to `[[story-split]]` upfront.)
+### Bad — clarifications.md
+Writing any sidecar question file. Violates rule 1.
+### Bad — per-claim visual tag
+`[mockup-pixel-derived]` on every line instead of the single banner. Violates rule 5.
 ## Common Pitfalls
-- Drafting before asking the critical questions. Always run intake first.
-- Ignoring confidence in image inferences. If you guessed, say so.
-- Auto-splitting. Never. Propose, wait, then split.
-- Mixing English and Spanish in the output. Pick the input language.
-- Skipping the `clarifications.md` file when assumptions remain.
+- Drafting before running the deterministic split gate (step 6).
+- Auto-splitting. Never. Propose, wait for `/story-split`.
+- Mixing languages. Pick one via rule 4 + 4a.
+- Re-asking questions already resolved in `.storywright-context.json`.
+- Letting per-claim `[mockup-pixel-derived]` tags litter the output.
+- Treating image visual specs as authoritative without the rule-5 banner.
 ## References
-- [[story-refine]] (use when input is an existing story to improve)
-- [[story-split]] (use when INVEST fails on Independent/Estimable/Small)
-- [[story-from-figma]] (use when input is a Figma link)
+- [[story-refine]] (when input is an existing story)
+- [[story-split]] (when INVEST fails on I/E/S)
+- [[story-from-figma]] (when input is Figma URL)
 - [[clarification-questions]]
-## Output templates
-See `templates/story.jira-wiki.md` and `templates/story.standard.md` in this skill's folder for the canonical section ordering and formatting.
 <claude-specific>
-- Use extended thinking for INVEST check and for vision confidence scoring.
-- Cache the 15-section taxonomy and component invocation order across calls.
-- When input includes images, attach them to the same message as the prompt to use Claude's native vision (do not describe-then-reason in two steps).
-- Use prompt caching on the component skill bodies (they're long and reused).
+- Use extended thinking for INVEST + pre-split counting.
+- Attach images in the same message for native vision; don't describe-then-reason in two steps.
+- Read `.storywright-context.json` ONLY from the exact target output folder.
+- Never call Write for any sidecar question file. Use `AskUserQuestion`.
+- Treat step 6 (deterministic pre-split test) as a hard gate; do not skip even when the user wants a single story.
 </claude-specific>
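
Reviewer sketch (not part of the package): the "exact folder only" rule for `.storywright-context.json` could look roughly like this. The function names `load_context` and `unresolved_keys` are hypothetical; the only details taken from the diff are the file name and the requirement not to look outside the target output folder or re-ask resolved questions.

```python
import json
from pathlib import Path
from typing import Optional

CONTEXT_FILENAME = ".storywright-context.json"


def load_context(output_folder: str) -> Optional[dict]:
    """Return the saved context from exactly this folder, or None if absent."""
    path = Path(output_folder) / CONTEXT_FILENAME
    if not path.is_file():
        return None  # deliberately no search in parent or sibling folders
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def unresolved_keys(context: Optional[dict], needed: list[str]) -> list[str]:
    """Keys that still need a question; already-resolved keys are not re-asked."""
    resolved = context or {}
    return [key for key in needed if key not in resolved]
```

A caller would ask only about `unresolved_keys(load_context(folder), [...])`, which keeps the "do not re-ask" pitfall mechanical rather than a matter of memory.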

package/skills/story-split/SKILL.md CHANGED

@@ -1,196 +1,321 @@
 ---
 name: story-split
-description: Detect when a story is too big to ship in one sprint and propose an INVEST-driven split into an epic with sub-stories. Never auto-splits — always proposes, then waits for user confirmation.
+description: Split an oversize story into an epic plus Cohn+Gherkin children. Mechanical NxN dep matrix and per-child V audit. Asks ONLY in terminal. Never auto-splits.
 trigger: "/story-split | split this story | divide this story | dividir historia | this is too big"
-intent: Splitting skill that uses the INVEST failure reasons (from invest-checklist) as the rationale for decomposition. Produces an epic skeleton plus N child story stubs, ready to feed back into story-generate.
-version: 1.0.0
+intent: Splitting skill driven by INVEST failure reasons. Produces an epic plus N child story stubs in the v2.2 canonical Cohn+Gherkin shape, with mechanical dependency matrix and Valuable audit per child.
+version: 2.2.0
 inputs:
   - text
   - image
   - figma-link
 outputs:
-  - split-plan.md
   - epic.md
-  - story-1.md, story-2.md, ... (N child stubs)
+  - story-1.md
+  - story-2.md
+  - .storywright-context.json
 composes:
   - _components/invest-checklist
   - _components/clarification-questions
+  - _components/acceptance-criteria
+  - _components/jira-wiki-formatter
 ---
 ## Purpose
-When a story is an epic in disguise, splitting badly is worse than not splitting. This skill uses **established INVEST-compatible patterns** (workflow steps, business rules, user roles, data variations, happy/sad paths, simple/complex) to propose a clean decomposition. The user always approves the plan before any child stories are written.
+When a story is an epic in disguise, splitting badly is worse than not splitting. This skill uses established INVEST-compatible patterns to propose a clean decomposition, then mechanically verifies each child's independence and value before saving. The user always approves the plan before any child is written.
-## When to use
+## Hard rules (v2.2 parity with refine/generate)
-- `[[invest-checklist]]` returned `SPLIT RECOMMENDED` (failures on I, E, or S).
-- User explicitly asks: "this story is too big, split it".
-- Input visibly mixes ≥2 flows.
+1. **Terminal-only clarifications.** Never write any sidecar question file. All gap questions through `AskUserQuestion`, batched ≤4. Non-blocking gaps → `⚠️ Assumed` inline.
-## Inputs & interpretation
+2. **Children are Cohn+Gherkin canonical.** Each child has ONE Use Case block + ONE AC scenario (one Given chain + one `When` + one `Then`). If a child still needs >1 `When`/`Then` → recursive re-split.
-- **text** — the oversize story (or a one-line goal that's clearly epic-scoped).
-- **image (optional)** — companion mockup. Use to validate scope and reveal hidden sub-flows the text doesn't mention.
-- **figma-link (optional)** — companion design. Often makes splitting easier: prototype links and frame structure reveal natural flow boundaries that map to children.
+3. **No mini-PRDs in children.** Same prohibition list as refine v2.2 — no NFR blocks, no Edge Cases enumerations, no Dependencies prose, no per-claim visual specs, refinement logs ≤3 lines (≤5 if recursive split).
-### Mixed inputs
+4. **Output language matches the user's chat language.** Auto-detect via rule 4a; ask only if signals split. Persist via rule 9.
-When the user provides text + image / Figma alongside the story:
-- **Text is canonical for User Story / Scope / Business Goal** of the epic.
-- **Figma / image is canonical for flow structure** — use prototype links to enumerate candidate sub-flows. Each flow is a candidate child story.
-- **Conflict handling:** if Figma shows N flows but text only mentions K (K < N), surface this in the gap-check via `[[clarification-questions]]` BEFORE drafting the split plan. Ask: "Figma includes flows for X, Y, Z — include in this epic, or scope out?" Never silently expand or shrink scope.
-- See `[[story-generate]]` "Mixed inputs" for the full source-priority matrix.
+5. **Visual inference confidence — single banner only** in each child's Design Reference block. No per-claim `[mockup-pixel-derived]` tags.
-## Application (step-by-step)
+6. **Sibling task IDs.** When referencing tickets that don't exist yet, follow rule F (naming pattern) instead of inventing slugs.
+7. **Chrome detection — closed list** (nav rail, top bar, footer, toast slot, modal scrim, app tabs). If image shows chrome and the input doesn't mention it, ask whether to add as its own child, attach to an existing child, or scope out.
+8. **Anti-PRD is part of each child's INVEST `Small` criterion** — see `[[invest-checklist]]` Small.
+9. **Cross-skill context persistence.** Read `<output-folder>/.storywright-context.json` first (exact folder only). Write updated answers back at the end. Schema:
+   ```json
+   {
+     "version": 1,
+     "decided_at": "<ISO date>",
+     "decided_by_skill": "story-split",
+     "language": "EN | ES | ...",
+     "chrome_scope": "in-scope | in-scope-placeholder | sibling | out-of-scope",
+     "siblings": "TODO | <list of IDs> | not-applicable",
+     "design_source": "raster | figma | tokens",
+     "naming_pattern": "<see rule F>",
+     "extra": { "split_pattern": "...", "core_complexity": "..." }
+   }
+   ```
-### Pre-split gate — STOP conditions
+10. **Children independence — mechanical detection (A).** For each child Cj, parse its `Given:` lines for surface nouns owned by sibling Ci. If matched → `DEP(Cj → Ci)`. The dependency matrix IS the union of those matches. No intuition-based deps. Affected child's INVEST `Independent` becomes `PARTIAL · depends on <Ci>`. Build order in epic.md is a topological sort of the matrix.
-Before splitting anything, run `[[invest-checklist]]` and apply these gates:
+11. **Per-child V audit (C).** For each candidate child, run one-line test: "If only this child ships and no sibling exists, does a real user complete a real task?". If answer is "no, useless until <other> ships" → mark V = `WEAK · merge-upstream-candidate` and recommend merging into the parent surface. Do not let stylistic/UI-fragment children survive the split.
-- **Valuable FAILS** → **STOP. Do not split.** A non-valuable item is a technical task, not a story. Combine it with related user-facing work; don't decompose it.
-- **Testable FAILS** → **fix in place** via `[[story-refine]]` first. Splitting an untestable story produces untestable children.
-- **Negotiable FAILS** → fix in place; the story is over-prescriptive, not too big.
-- **Estimable FAILS due to unknowns** → run a **spike** (Pattern 9 below), not a split.
-- **Independent / Estimable / Small FAIL** → continue to pattern selection.
+12. **Passive-goal downstream prompt (G).** If any child's `I want to` verb is observational (view/see/read/browse/look/inspect/monitor) AND `so that` lacks a follow-up action → ask once via `AskUserQuestion` per child (batched ≤4 across children).
-### Pattern catalog (apply in order; stop at first that fits)
+13. **Determinism on counts.** Use the same deterministic surface-vs-styling counter (rule D from refine v2.2) inside child re-split checks.
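Reviewer sketch (not part of the package): rule 12 has one mechanical half (the verb list) and one judgment half (whether `so that` names a follow-up action). The sketch below implements only the mechanical half and takes the judgment as a caller-supplied flag; `leading_verb`, `needs_downstream_prompt`, and the `so_that_has_action` parameter are hypothetical names.

```python
import re

# Observational verbs exactly as listed in rule 12.
OBSERVATIONAL_VERBS = {"view", "see", "read", "browse", "look", "inspect", "monitor"}


def leading_verb(i_want_to: str) -> str:
    """First word after an optional 'I want to' prefix, lowercased."""
    text = re.sub(r"^\s*i want to\s+", "", i_want_to, flags=re.IGNORECASE)
    words = text.split()
    return words[0].lower() if words else ""


def needs_downstream_prompt(i_want_to: str, so_that_has_action: bool) -> bool:
    """Ask only when the verb is observational AND no follow-up action exists."""
    return leading_verb(i_want_to) in OBSERVATIONAL_VERBS and not so_that_has_action
```

For "I want to view list of customers" with a thin `so that`, the function returns true, so the child would be queued for one batched `AskUserQuestion`.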
-Based on the Humanizing Work splitting methodology (Richard Lawrence & Peter Green).
+### 4a. Language auto-detect — expanded signals (E)
+Same weighted signal table as refine v2.2 (Gherkin keywords, persona phrasing, column names, domain verbs, title). Adopt silently when high+medium signals agree; ask only if split.
-1. **Workflow steps — thin end-to-end slices.**
-   **Critical:** this is NOT "step 1 / step 2 / step 3" of the journey. Each child must deliver the **full** workflow with increasing sophistication.
-   - ❌ Wrong: Story 1 = editorial review, Story 2 = legal approval, Story 3 = publish. (Story 1 alone delivers nothing observable to the user.)
-   - ✅ Right: Story 1 = publish post immediately, no reviews. Story 2 = add editorial review step. Story 3 = add legal step. Each story produces visible behavior.
+### Rule F. Naming pattern — ask once, persist
+Same kebab / verb-noun / domain-action / Jira-prefix options. Persist in `naming_pattern`. Reuse across this run AND future skills.
+### Rule D. Surface vs styling (deterministic)
+A noun counts as a separate surface ONLY if it has its own user goal (verb). Styling, sub-components, and passive layout assertions do not count.
+## When to use
+- `[[invest-checklist]]` returned `SPLIT RECOMMENDED` (I, E, or S fail).
+- User explicitly asks: "this story is too big, split it".
+- `[[story-refine]]` or `[[story-generate]]` deterministic pre-split test ≥2.
+## Inputs & interpretation
-2. **CRUD operations.** When the input says "manage" / "handle" / "maintain", it bundles operations. Split into Create / Read / Update / Delete.
+- **text** — the oversize story (or epic-scoped one-liner).
+- **image (optional)** — companion mockup. Use to validate scope and reveal hidden sub-flows.
+- **figma-link (optional)** — companion design. Prototype links / frame structure reveal natural flow boundaries.
-3. **Business rule variations.** Same feature, different rules → one story per rule (members / VIP / first-time discounts).
+### Mixed inputs source-priority
-4. **Data type variations.** One story per data shape (counties / cities / custom areas; or jpg / pdf / mp4). Deliver simplest first.
+- Text canonical for `User Story / Scope / Business Goal` of the epic.
+- Figma / image canonical for `flow structure / candidate children`.
+- Conflicts → BLOCKING `AskUserQuestion`. Never silently expand or shrink scope.
-5. **Data entry / UI complexity.** Basic input first (`YYYY-MM-DD` text); fancy UI (calendar picker, autocomplete, drag-drop) as follow-ups.
+## Pre-split gate (STOP conditions)
-6. **Major effort.** First implementation does the heavy infrastructure lift; subsequent stories are trivial additions (build Visa payments + infra in story 1; add Mastercard/Amex in story 2).
+Run `[[invest-checklist]]` first:
-7. **Simple / complex.** Strip variations from the core. Story 1 = simplest case that still delivers value; stories 2..N = each variation.
+- **V FAILS** → STOP. Not a story. Combine with related user-facing work. Do not split.
+- **T FAILS** → fix in place via `[[story-refine]]`. Splitting untestable input produces untestable children.
+- **N FAILS** → fix in place. Story is over-prescriptive, not too big.
+- **E FAILS due to unknowns** → recommend a spike, not a split.
+- **I / E (size) / S FAIL** → proceed to pattern selection.
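Reviewer sketch (not part of the package): the five STOP conditions read like a small decision function. The version below assumes the bullets are checked top to bottom and that the first match wins; the diff lists them in this order but does not say so explicitly. The `estimable_reason` parameter and the final "NO SPLIT NEEDED" result for a story with no failures are also assumptions.

```python
from typing import Optional


def pre_split_gate(failed: set[str], estimable_reason: Optional[str] = None) -> str:
    """Map failed INVEST letters to the gate's action, in the listed order."""
    if "V" in failed:
        return "STOP: not a story; combine with related user-facing work"
    if "T" in failed:
        return "FIX IN PLACE: run story-refine first"
    if "N" in failed:
        return "FIX IN PLACE: story is over-prescriptive"
    if "E" in failed and estimable_reason == "unknowns":
        return "SPIKE: recommend a spike, not a split"
    if failed & {"I", "E", "S"}:
        return "PROCEED: pattern selection"
    return "NO SPLIT NEEDED"  # assumption: nothing failed, so nothing to do
```

For example, `pre_split_gate({"S", "E"}, "size")` proceeds to pattern selection, while `pre_split_gate({"V", "S"})` stops before any split is attempted.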
-8. **Defer performance.** "Make it work" before "make it fast". Story 1 = functional, no SLA. Story 2 = optimize to <100ms / add caching / scale.
+## Pattern catalog (apply in order; stop at first that fits)
-9. **Spike (last resort).** None of 1–8 apply because the unknown blocks decomposition. Run a 1–2 day time-boxed investigation answering a specific question ("is this feasible on our stack?", "what does the third-party API actually return?"). A spike is **not a story** — it produces learning, not shippable code. After the spike, restart at pattern 1.
+Humanizing Work methodology (Lawrence & Green).
-**Anti-patterns (these are NOT splits):**
-- Horizontal slicing (frontend story + backend story) — neither child has user value.
-- Task decomposition ("set up DB" / "write endpoint" / "build form").
-- Meaningless halves ("first half of feature" / "second half").
+1. **Workflow steps — thin end-to-end slices.** NOT step1/step2 of the journey. Each child delivers the FULL workflow with increasing sophistication.
+   - ❌ Wrong: editorial / legal / publish. Story 1 alone delivers nothing.
+   - ✅ Right: publish immediately. Story 2 adds editorial. Story 3 adds legal.
+2. **CRUD operations.** "Manage" / "handle" / "maintain" → split into C/R/U/D.
+3. **Business rule variations.** Same feature, different rules (members / VIP / first-time).
+4. **Data type variations.** One story per data shape (jpg / pdf / mp4).
+5. **Data entry / UI complexity.** Basic input first; fancy UI (calendar, autocomplete) as follow-ups.
+6. **Major effort.** First implementation does the heavy infrastructure lift; subsequent stories are trivial additions.
+7. **Simple / complex.** Strip variations from the core. Story 1 = simplest case that still delivers value.
+8. **Defer performance.** Make-it-work before make-it-fast.
+9. **Spike (last resort).** Time-boxed investigation. Not a story.
-### Cynefin domain calibration
+**Anti-patterns (NOT splits):**
+- Horizontal slicing (frontend / backend) — no user value per child.
+- Task decomposition ("set up DB", "write endpoint").
+- Meaningless halves.
-Adjust the splitting strategy to uncertainty:
+## Cynefin domain calibration
-- **Obvious / Complicated** — known problem, just engineering. Enumerate all children, prioritize by value/risk.
-- **Complex** — unclear what users want or what will work. Don't enumerate exhaustively; produce **1–2 learning stories** that ship something observable, then let real usage teach what to write next.
-- **Chaotic** — priorities shifting daily, fires burning. **Defer splitting** until stability returns. Stabilize first.
+- **Obvious / Complicated** — enumerate all children, prioritize by value/risk.
+- **Complex** — produce 1–2 learning stories that ship something observable; let usage teach the rest.
+- **Chaotic** — defer splitting; stabilize first.
-### Meta-pattern (applies across every pattern)
+## Meta-pattern (every pattern)
-For any pattern you pick:
 1. Name the **core complexity** that makes the story big.
 2. List **all variations** of that complexity.
 3. Pick **one variation** as the simplest complete vertical slice.
 4. Each other variation becomes its own story.
-### Procedure
+## Application (step-by-step)
+0. **Read prior context.** Load `<output-folder>/.storywright-context.json` if present (exact folder only). Apply resolved answers.
+1. **Detect input types + companion sources.** Run conflict detection (mixed inputs). Run chrome detection (rule 7).
+2. **Language resolution** via rule 4a.
+3. **Pre-split gate.** Run `[[invest-checklist]]`. Honor STOP conditions above.
+4. **Pattern selection.** Apply catalog in order. Name first fit. Name the meta-pattern's "core complexity". Note Cynefin domain.
+5. **Draft split plan** as a terminal table (no file yet):
-1. Run `[[invest-checklist]]` and apply the pre-split gates above. If you should not split, say so explicitly and stop.
-2. Apply the pattern catalog in order. Name the first pattern that fits and the meta-pattern's "core complexity".
-3. **Draft a split plan** as a Markdown table:
    ```
    ### Split Plan
+   Rationale: <INVEST failure reasons>
+   Core complexity: <meta-pattern>
+   Pattern(s): <names>
+   Cynefin: <domain>
-   **Rationale (from INVEST failure):**
-   - S — FAIL: covers web + mobile + account-linking
-   - E — FAIL: account-linking edge cases not scoped
+   | # | Proposed child | Pattern | V audit (rule 11) |
+   |---|---|---|---|
+   | 1 | ... | Workflow simple | PASS / WEAK·merge |
+   | 2 | ... | Data variation | PASS |
+   ```
-   **Core complexity (meta-pattern):** authenticating new users + reconciling them with pre-existing accounts.
-   **Pattern(s) applied:** Workflow steps (thin end-to-end) + Simple→Complex
-   **Cynefin domain:** Complicated (known problem, just engineering)
+6. **Strategic check before approval:**
+   - Does the split reveal low-value work we can deprioritize or kill?
+   - Are the children roughly equal in size?
+   If neither holds, try a different pattern.
-   | # | Proposed child story | Pattern | INVEST hint |
-   |---|---|---|---|
-   | 1 | Login Google — simplest path (new account, web) | Workflow / Simple | Smallest complete vertical slice |
-   | 2 | Login Google — mobile | Data variation (surface) | Small, depends on #1 |
-   | 3 | Account linking — Google ↔ existing email/password | Major effort | Independent flow |
-   | 4 | Workspace domain restriction | Business rule variation | Independent |
+7. **STOP and ask the user to approve via `AskUserQuestion`:**
+   - Approve → proceed to step 8.
+   - Adjust → edit, re-loop.
+   - Cancel → mark original as `NEEDS REFINEMENT`, stop.
+8. **For each approved child, write the canonical block** (Use Case + AC + Design Ref + INVEST):
+   ```markdown
+   ### [Child Title]
+   #### Use Case
+   - **As a** [persona]
+   - **I want to** [action]
+   - **so that** [outcome — rule G applied]
-   **Proposed epic title:** Login con Google (multi-surface + linking)
+   #### Acceptance Criteria
+   - **Scenario:** [single outcome]
+   - **Given:** [context — surface nouns drive dep matrix]
+   - **When:** [single trigger]
+   - **Then:** [single observable outcome]
+   #### Design Reference (optional)
+   **Source: <raster | figma | tokens> → <banner from rule 5>**
+   - [link / path]
+   #### INVEST
+   - I — <PASS | PARTIAL · depends on <Ci>>
+   - N/V/E/S/T — one line each
+   - **Verdict:** READY | READY (after <Ci> builds) | WEAK·merge-upstream-candidate
+   #### Refinement log (≤3 lines)
+   - Split from parent; pattern: <name>.
    ```
-4. **Split evaluation (strategic check before approval).** Ask:
-   - **Does the split reveal low-value work we can deprioritize or kill?** Good splits surface 80/20 — e.g., after splitting flight search, "flexible dates" turns out to be rarely used → drop it.
-   - **Are the children roughly equal in size?** Equal-sized children give PMs prioritization flexibility mid-sprint.
-   If neither holds, try a different pattern.
-5. **STOP and ask the user to approve the plan.** Options:
-   - "Approve plan" → proceed to step 6
-   - "Adjust" → user edits the table or merges/splits rows
-   - "Cancel" → leave the story unsplit (mark it as `NEEDS REFINEMENT` for the team to negotiate)
-6. **After approval:**
-   - Write `epic.md` — title, description, child story list, business goal, dependencies between children
-   - Write one stub per child: title + 1-line user story + open clarifications. **Do not run the full story-generate flow yet** — stubs are placeholders; user invokes `[[story-generate]]` per child when ready.
-7. **Coherence check** — verify the children together cover the original scope. If any child overlaps another or the union has gaps, flag it before saving.
-8. **Recursive re-split check.** For each child, ask: still >1 sprint? If yes, restart at the pattern catalog **for that child**. Keep splitting until every leaf is sprint-shippable. Surface the tree visually in the plan.
-9. **Save all artifacts** under `./stories/<epic-slug>/`:
-   - `epic.md`
-   - `story-1.md`, `story-2.md`, …
-   - `split-plan.md` (the decision trail)
-## Examples
+9. **Build the dependency matrix mechanically (rule 10).** Parse each child's `Given` lines for surface nouns owned by other children. Emit the matrix in `epic.md`.
-### Good
+10. **Run V audit per child (rule 11).** Flag merge-upstream candidates in `epic.md`. Recommend merging instead of keeping standalone.
+11. **Recursive re-split check.** For each child, run the deterministic counter (rule D). If count ≥2 for any child → recursive split of that child. Surface the tree in `epic.md`.
+12. **Coherence check** — verify children together cover the original scope. Flag gaps or overlaps before saving.
+13. **Write `epic.md`** at `<output-folder>/`:
+    ```markdown
+    ### EPIC: <title>
+    **Why split:** <pattern + core complexity + INVEST failure reasons>
+    **Cynefin domain:** <domain>
+    **Children independence matrix (mechanical, rule 10):**
+    |        | C1 | C2 | C3 | ... |
+    |--------|----|----|----|-----|
+    | C1     | —  |    |    |     |
+    | C2     |DEP | —  |    |     |
+    | ...    |    |    |    |     |
+    **Build order (topological):** C1 → C2 → ...
-Original: "Permitir login con Google" with INVEST failing on Small + Estimable.
+    **V audit (rule 11):**
+    - C1 — PASS
+    - C2 — PASS
+    - C3 — WEAK · merge-upstream-candidate (merge into C2)
+    **Children:**
+    1. story-1.md — <slug per naming pattern F>
+    2. story-2.md
+    ...
+    **Design source:** <raster | figma | tokens>
+    ```
+14. **Write all `story-N.md` files** and `.storywright-context.json` updated with `split_pattern` and `core_complexity` under `extra`.
+## Validate every child (must pass all 6)
+1. Delivers user value independently (rule 11 V audit PASS).
+2. Developable with explicit build order from the matrix (no implicit deps).
+3. Testable: single Given/When/Then with observable outcome.
+4. Sprintable (1–5 days work).
+5. Union equals original scope (coherence check).
+6. ≤60 lines per child story (anti-PRD via INVEST Small).
+A "no" on any line → revise the split.
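Reviewer sketch (not part of the package): two of the six checks are purely mechanical, namely the single `When`/`Then` requirement and the 60-line cap. A minimal check could look like this, assuming children follow the `- **When:** ...` bullet shape from the canonical block in step 8. The function name `check_child` is hypothetical, and the remaining four checks still need the matrix, the V audit, and human judgment.

```python
import re

MAX_LINES = 60  # validation item 6


def check_child(markdown: str) -> list[str]:
    """Return the mechanical checks a child story fails (empty list = pass)."""
    problems = []
    lines = markdown.splitlines()
    if len(lines) > MAX_LINES:
        problems.append(f"too long: {len(lines)} lines")
    for keyword in ("When", "Then"):
        # Matches bullets shaped like "- **When:** ..." from the canonical block.
        hits = [ln for ln in lines if re.match(rf"\s*-\s*\*\*{keyword}:\*\*", ln)]
        if len(hits) != 1:
            problems.append(f"expected exactly one {keyword}, found {len(hits)}")
    return problems
```

A child that reports two `When` bullets is a candidate for the recursive re-split in step 11 rather than a formatting fix.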
+## Examples
+### Good
+Original: "Permitir login con Google" with INVEST Small + Estimable FAIL.
 Split:
 1. Web — new accounts only (Simple)
 2. Mobile — new accounts only (Simple)
-3. Account linking with existing email/password (Complex, depends on #1)
-4. Workspace domain restriction (Independent rule)
+3. Account linking with existing email/password (Major effort)
+4. Workspace domain restriction (Business rule variation)
-Each child is shippable in one sprint, each has clear value, each is testable.
+Mechanical matrix (rule 10):
+- C2.Given mentions "Google sign-in handshake" owned by C1 → DEP(C2 → C1)
+- C3.Given mentions "Google account exists" owned by C1 → DEP(C3 → C1)
+- C4 independent.
-### Bad
+V audit: all PASS.
-Splitting "Permitir login con Google" into "Backend auth endpoint" and "Frontend login button" — that's a **task split**, not a story split. Both halves have no user value alone.
+Build order: C1 → {C2, C3}; C4 is independent and can be built in parallel at any time.
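Reviewer sketch (not part of the package): the example above can be reproduced with a plain substring match plus the standard-library `graphlib.TopologicalSorter` (Python 3.9+). The two owned phrases come from the example; the full `Given` sentences, the `build_deps` helper, and the choice of case-insensitive substring matching are assumptions made for illustration.

```python
from graphlib import TopologicalSorter


def build_deps(givens: dict[str, list[str]], owned: dict[str, list[str]]) -> dict[str, set[str]]:
    """DEP(Cj -> Ci) when a Given line of Cj mentions a surface noun owned by Ci."""
    deps: dict[str, set[str]] = {child: set() for child in givens}
    for cj, lines in givens.items():
        text = " ".join(lines).lower()
        for ci, nouns in owned.items():
            if ci != cj and any(noun.lower() in text for noun in nouns):
                deps[cj].add(ci)
    return deps


# Hypothetical Given lines; only the quoted C1-owned phrases come from the example.
givens = {
    "C1": ["the visitor is on the web sign-in page"],
    "C2": ["the Google sign-in handshake is available on mobile"],
    "C3": ["a Google account exists for the same email"],
    "C4": ["the workspace has an allowed-domain list"],
}
owned = {"C1": ["Google sign-in handshake", "Google account exists"]}

deps = build_deps(givens, owned)
# TopologicalSorter expects node -> predecessors, which is what deps holds.
build_order = list(TopologicalSorter(deps).static_order())
```

Here `deps` contains DEP(C2 → C1) and DEP(C3 → C1) and nothing for C4, so any valid `build_order` places C1 before C2 and C3 while leaving C4 free to appear anywhere. If the matches ever form a cycle, `static_order()` raises `graphlib.CycleError`, which would be a signal to revisit the split.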
-## Validate every child (must pass all 5)
+### Good — merge recommendation
+Original story produced child "results counter" + child "grid". Counter's V audit:
+- "If only counter ships and no grid exists, does a user complete a task?" → no.
+- V = WEAK · merge-upstream-candidate.
+- Epic recommends merging counter into grid child.
-1. **Delivers user value** independently? (not just "frontend done")
-2. **Developable in isolation** with no hard ordering dependency?
-3. **Testable** with concrete ACs?
-4. **Sprintable** (1–5 days of work, single sprint)?
-5. **Union equals original** — together do they cover the original scope?
+### Bad
+Splitting "Permitir login con Google" into "Backend auth endpoint" + "Frontend login button". Task split, not story split. Both fail rule 11.
-A "no" on any line means revise the split.
+### Bad
+Claiming all 5 children are Independent without running the rule 10 matrix.
+### Bad
+Writing any sidecar question file. Violates rule 1.
 ## Common Pitfalls
-- **Skipping the pre-split INVEST gate.** Splitting a non-Valuable item produces non-Valuable children. Splitting an untestable item produces untestable children. Fix first, split second.
-- **Workflow done step-by-step instead of thin end-to-end.** Story 1 = "review" / Story 2 = "approve" / Story 3 = "publish" means Story 1 alone is invisible to the user. Each child must deliver visible behavior.
-- **Horizontal slicing** (frontend / backend / DB). Each child must have user value.
-- **Task splits** ("set up DB", "wire API to button"). Tasks aren't stories.
-- **Splitting on size alone**, without naming the core complexity or pattern.
-- **Forcing a pattern that doesn't fit.** If pattern N doesn't apply, say no and move on; never bend.
-- **Auto-splitting without user approval.**
-- **Forgetting the coherence check** — losing scope in the split.
-- **Skipping the strategic evaluation** (low-value reveal, equal sizing).
-- **Letting the tree go >5 children** — that's an initiative, not an epic.
-- **Splitting in chaos.** Stabilize first; splitting amid shifting priorities just multiplies churn.
+- Skipping the INVEST pre-split gate. Splitting non-V or non-T input.
+- Workflow split done step-by-step instead of thin end-to-end.
+- Horizontal slicing (frontend / backend / DB).
+- Task splits.
+- Splitting on size alone without naming pattern or core complexity.
+- Forcing a pattern that doesn't fit.
+- Auto-splitting without user approval.
+- Skipping the mechanical matrix (rule 10).
+- Skipping per-child V audit (rule 11).
+- Letting the tree grow past 5 children: that's an initiative, not an epic.
+- Splitting in the Chaotic Cynefin domain: stabilize first.
+- Re-asking questions already in `.storywright-context.json`.
 ## References
 - [[invest-checklist]]
 - [[story-generate]]
+- [[story-refine]]
 - [[clarification-questions]]
 <claude-specific>
-- Use extended thinking — pattern selection benefits from explicit comparison of options.
-- Cache the 8-pattern catalog.
+- Use extended thinking for pattern selection (compare options explicitly).
+- Cache the 9-pattern catalog and the v2.2 hard-rule list.
+- Build the dependency matrix from Given-text parsing (rule 10), not intuition.
+- Run V audit per child (rule 11) and flag merge candidates loudly.
+- Never call Write for any sidecar question file. Use `AskUserQuestion`.
+- Read `.storywright-context.json` ONLY from the exact target output folder.
 </claude-specific>