@kiwidata/grimoire 0.3.0 → 0.3.1

package/AGENTS.md CHANGED
@@ -103,22 +103,22 @@ User has a request
103
103
  ├─ "I want to add / change / remove functionality"
104
104
  │ │
105
105
  │ ├─ Adding new behavior?
106
- │ │ → /grimoire:draft → write new .feature file
106
+ │ │ → /grimoire:draft → design the new behavior (plan projects the .feature)
107
107
  │ │
108
108
  │ ├─ Changing existing behavior?
109
- │ │ → /grimoire:draft → modify existing .feature file
109
+ │ │ → /grimoire:draft → design the change (plan projects it into the .feature)
110
110
  │ │
111
111
  │ ├─ Removing a feature?
112
112
  │ │ → /grimoire:remove → tracked removal with impact assessment
113
113
  │ │
114
114
  │ └─ Does it also involve a technology/architecture choice?
115
- │ → Draft BOTH: .feature file + MADR decision record in the same change
115
+ │ → Draft BOTH the behavior and the decision in one change
116
116
 
117
117
  ├─ "We should use X instead of Y" / "How should we architect this?"
118
- │ → /grimoire:draft → MADR decision record (not a feature)
118
+ │ → /grimoire:draft → design the decision (plan projects a MADR, not a feature)
119
119
 
120
120
  ├─ "We need to handle X concurrent users / meet Y compliance"
121
- │ → /grimoire:draft → MADR decision record (non-functional requirement)
121
+ │ → /grimoire:draft → design the requirement (plan projects a MADR / constraint)
122
122
 
123
123
  ├─ "What do we have? What's documented?"
124
124
  │ → /grimoire:audit → discover undocumented features and decisions
@@ -156,9 +156,11 @@ The end-to-end flow for adding or modifying behavior is six stages, each owned b
156
156
 
157
157
  **Draft** (`/grimoire:draft`) → **Plan** (`/grimoire:plan`) → **Review** (`/grimoire:review`, optional) → **Apply** (`/grimoire:apply`) → **Verify** (`/grimoire:verify`) → **PR** (`grimoire pr`).
158
158
 
159
+ Draft's single job is to design the change on `draft.md`. **Projection** — turning that agreed design into features, constraints, MADRs, `data.yml`, and the manifest — is the **first step of Plan**, not the end of Draft.
160
+
159
161
  Each skill's SKILL.md is the authoritative home for that stage's mechanics; the README "Workflow" section is the narrative walkthrough. Do not re-derive stage steps here — invoke the skill. The operational invariants that bind every stage:
160
162
 
161
- - **Manifest status tracks progress:** `approved` after draft, `implementing` during apply, `accepted` at PR.
163
+ - **Manifest status tracks progress:** the manifest is created at plan's projection step; `approved` once the design is agreed and projected, `implementing` during apply, `accepted` at PR.
162
164
  - **Live on the branch.** Features, decisions, constraints, and schema are edited directly on the feature branch — no copy-into-change-folder, no promote step.
163
165
  - **No archive step.** The PR diff *is* the change; git history plus the `Change: <id>` commit trailer are the record. PR finalize just flips decision status to `accepted` and removes the ephemeral change folder.
164
166
  - **The user drives the pace.** Review mode (default) approves every file change before writing; autonomous mode works the full task list, stopping only on blockers.
@@ -204,9 +206,9 @@ project-root/
204
206
  ## Conventions
205
207
 
206
208
  ### Manifest Status Lifecycle
207
- Every manifest has a `status` field in YAML frontmatter:
208
- - `draft` — being written, not yet reviewed
209
- - `approved` — reviewed by user, ready for planning/implementation
209
+ Every manifest has a `status` field in YAML frontmatter (the manifest is created at plan's projection step, from the already-agreed design):
210
+ - `draft` — just projected from the agreed `draft.md`
211
+ - `approved` — design agreed and projected, ready for implementation
210
212
  - `implementing` — tasks are being worked on
211
213
 
212
214
  Update the status as the change progresses. The CLI reads this to report change state. There is no `complete`/archive state — finalize removes the ephemeral change folder once the PR is opened; git history is the record.
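As a rough illustration of the lifecycle described above, the status field can be read as a small linear state machine. This is a minimal sketch under assumptions: the `ManifestStatus` enum and `advance` helper are hypothetical names, the `accepted` state is taken from the operational invariants earlier in AGENTS.md, and the strictly linear order is inferred from the prose rather than from the CLI's actual code.

```python
from enum import Enum


class ManifestStatus(Enum):
    """Status values named in AGENTS.md (the enum itself is hypothetical)."""
    DRAFT = "draft"
    APPROVED = "approved"
    IMPLEMENTING = "implementing"
    ACCEPTED = "accepted"


# Assumed linear order, read off the lifecycle text; the real CLI may differ.
_NEXT = {
    ManifestStatus.DRAFT: ManifestStatus.APPROVED,
    ManifestStatus.APPROVED: ManifestStatus.IMPLEMENTING,
    ManifestStatus.IMPLEMENTING: ManifestStatus.ACCEPTED,
}


def advance(status: ManifestStatus) -> ManifestStatus:
    """Return the next status, or raise if there is nothing after it."""
    if status not in _NEXT:
        # No complete/archive state exists; git history is the record.
        raise ValueError(f"{status.value!r} has no successor")
    return _NEXT[status]
```

For example, `advance(ManifestStatus.APPROVED)` yields `ManifestStatus.IMPLEMENTING`, while advancing past `accepted` raises, which mirrors the "no archive state" rule.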
package/README.md CHANGED
@@ -77,8 +77,8 @@ Then talk to your AI assistant:
77
77
  ```
78
78
  You: "Users should be able to log in with 2FA"
79
79
 
80
- → /grimoire:draft Creates login.feature with Given/When/Then scenarios
81
- → /grimoire:plan Generates tasks: write the test, then production code
80
+ → /grimoire:draft Designs the change on one living draft.md (Given/When/Then take shape here)
81
+ → /grimoire:plan Projects the design into login.feature + decisions, then generates tasks
82
82
  → /grimoire:review (optional) Product, security, engineering + principles review
83
83
  → /grimoire:apply Implements test-first (BDD for behavior, unit for invariants)
84
84
  → /grimoire:verify Confirms all scenarios pass, no regressions
@@ -115,11 +115,11 @@ Grimoire routes your request to its one correct home (an admission test keeps ea
115
115
  - **"The login page is broken"** → `/grimoire:bug` (reproduce first, then fix)
116
116
  - **"A tester found a problem"** → `/grimoire:bug-report` → `/grimoire:bug-triage` → routed fix
117
117
 
118
- A `.feature` is allowed only if it has an external actor, is observable without reading code/logs, uses domain language, and survives a reimplementation. Security controls, NFRs, and observability guarantees are invariants → they live in the constraints register. Produces `.feature` files (with security tags like `@security`, `@auth`, `@pii`, `@pci-dss` when applicable), constraint entries, decision records, `data.yml` for schema changes, and a manifest tracking the change.
118
+ A `.feature` is allowed only if it has an external actor, is observable without reading code/logs, uses domain language, and survives a reimplementation. Security controls, NFRs, and observability guarantees are invariants → they live in the constraints register. You design all this on one living `draft.md`; at the **start of Plan** it is **projected** into its homes: `.feature` files (with security tags like `@security`, `@auth`, `@pii`, `@pci-dss` when applicable), constraint entries, decision records, `data.yml` for schema changes, and a manifest. Draft's one job is to design the change.
119
119
 
120
- ### 2. Plan — Generate concrete tasks
120
+ ### 2. Plan — Project the design, then generate concrete tasks
121
121
 
122
- Every scenario becomes a pair: write the step definition (test), then write the production code. Tasks reference exact file paths, exact assertions, and real patterns from area docs. Data changes (models, migrations) are ordered before feature code.
122
+ Plan opens by **projecting** the agreed `draft.md` into its homes (features, constraints, decisions, `data.yml`, manifest), running the admission test and principles gate as it goes. Then every scenario becomes a pair: write the step definition (test), then write the production code. Tasks reference exact file paths, exact assertions, and real patterns from area docs, ordered along the technical spine (dependencies → data → API → logic → UI → verification).
123
123
 
124
124
  The plan skill reads area docs for conventions and boundaries, and queries the code graph for reusable utilities and exact symbols — so the AI plans with real codebase knowledge, not guesses. Each task is tagged with its verification level: `scenario` (behavior), `unit-invariant` (a constraint), or `characterization` (internal/refactor).
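One way to picture the task tagging and ordering described in this section is a small task record sorted by its layer. This is only a sketch: the `Task` class, the `order_tasks` helper, and the lowercase layer names are hypothetical, and the real `tasks.md` format is defined by the plan skill, not by this code.

```python
from dataclasses import dataclass

# Layer order as listed in the Plan section; the identifiers are illustrative.
SPINE = ["dependencies", "data", "api", "logic", "ui", "verification"]

# Verification levels named in the README.
VERIFY_LEVELS = {"scenario", "unit-invariant", "characterization"}


@dataclass(frozen=True)
class Task:
    title: str
    layer: str   # one of SPINE
    verify: str  # one of VERIFY_LEVELS

    def __post_init__(self) -> None:
        if self.layer not in SPINE:
            raise ValueError(f"unknown layer: {self.layer}")
        if self.verify not in VERIFY_LEVELS:
            raise ValueError(f"unknown verification level: {self.verify}")


def order_tasks(tasks: list[Task]) -> list[Task]:
    """Sort tasks along the spine; ties keep their original order."""
    return sorted(tasks, key=lambda t: SPINE.index(t.layer))
```

Because `sorted` is stable, two tasks on the same layer stay in the order they were written, so a test-then-code pair is not reshuffled.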
125
125
 
@@ -176,7 +176,7 @@ Grimoire treats design as a first-class spec input, not an afterthought.
176
176
  - **Brand capture at init** — `grimoire init` offers to capture colors, type, spacing, and voice into `.grimoire/brand/` (DTCG tokens). Skip-able; can be added later via `grimoire-design --capture-brand`.
177
177
  - **Consult (optional)** — `/grimoire:design-consult` runs a pre-design Q&A. Security and data personas interview the designer about the proposed change *before* any artifacts exist, surfacing assumptions and constraints early. No findings, no blockers — just questions whose answers will shape the design.
178
178
  - **Design** — `/grimoire:design` walks: problem statement → user flow & pain points → variants (Figma MCP, static HTML, or ASCII) → required component states (default/loading/empty/error) → proposed Gherkin scenarios for each (component × state).
179
- - **Handoff** — accepted scenarios feed `/grimoire:draft` (manifest + ADRs), then `/grimoire:plan` (tasks), then `/grimoire:review` — **mandatory at complexity 4** with surface-conditional adversarial personas (keyboard, screen-reader, contrast on web; touch + gesture on mobile; keyboard-only on TUI).
179
+ - **Handoff** — accepted scenarios feed `/grimoire:draft` (design), then `/grimoire:plan` (projects manifest + ADRs, then tasks), then `/grimoire:review` — **mandatory at complexity 4** with surface-conditional adversarial personas (keyboard, screen-reader, contrast on web; touch + gesture on mobile; keyboard-only on TUI).
180
180
  - **Revision** — `/grimoire:design --revise` re-enters an existing design without restarting. Shows current variants and Gherkin, asks what to change, regenerates only the affected artifacts. Previously-accepted scenarios are not overwritten without confirmation.
181
181
 
182
182
  Brand-drift lint (`grimoire-design --lint`) cross-references hardcoded colors / px / fonts against `.grimoire/brand/tokens.json` and suggests token replacements. Wired into precommit-review when tokens exist.
@@ -194,19 +194,14 @@ Full grimoire cycle end-to-end — adding two-factor authentication to an existi
194
194
  You: "Users should verify their identity with a TOTP code after entering their password"
195
195
  ```
196
196
 
197
- The AI runs `/grimoire:draft` and produces:
197
+ The AI runs `/grimoire:draft` and designs the change on one living `draft.md` — the decision ledger (Y-statements), behavioral sketches, and an open/decided ledger — iterating with you until the design is agreed:
198
198
 
199
199
  ```
200
200
  .grimoire/changes/add-2fa-login/
201
- ├── manifest.md # Why, what's changing, scope
202
- ├── features/
203
- │ └── auth/
204
- │ └── login.feature # Updated with 2FA scenarios
205
- └── decisions/
206
- └── 0003-totp-library.md # Chose pyotp over django-otp
201
+ └── draft.md # the agreed design; scenarios take shape here before projection
207
202
  ```
208
203
 
209
- **login.feature:**
204
+ The scenarios that emerge (projected into `login.feature` at the start of Plan):
210
205
  ```gherkin
211
206
  Feature: Login with two-factor authentication
212
207
  As a user
@@ -235,11 +230,11 @@ Feature: Login with two-factor authentication
235
230
  And I should remain on the verification page
236
231
  ```
237
232
 
238
- You review and approve. Manifest status: `draft` → `approved`.
233
+ You review and approve the **design**. Nothing is written to `features/` or `decisions/` yet — projection happens next, in Plan.
239
234
 
240
235
  ### Plan
241
236
 
242
- The AI runs `/grimoire:plan`, reads the approved features + area docs + data schema, and generates `tasks.md`:
237
+ The AI runs `/grimoire:plan`, which **first projects** the agreed `draft.md` into its homes — `manifest.md`, `features/auth/login.feature` (the scenarios above), and `decisions/0003-totp-library.md` — running the admission test as it routes each fact. It then reads those homes + area docs + the code graph and generates `tasks.md`, ordered along the technical spine:
243
238
 
244
239
  ```markdown
245
240
  # Tasks: add-2fa-login
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@kiwidata/grimoire",
3
- "version": "0.3.0",
3
+ "version": "0.3.1",
4
4
  "description": "Gherkin + MADR spec-driven development for AI coding assistants",
5
5
  "type": "module",
6
6
  "bin": {
@@ -384,6 +384,7 @@ Present a brief summary:
384
384
  ## Important
385
385
  - **Tests are not optional.** Every task produces both production code and passing step definitions. No exceptions.
386
386
  - **Red-green is mandatory, not aspirational.** A test must fail before it passes. If it doesn't fail, it's not a real test. Fix it before moving on.
387
+ - **Code-before-test is the most common bypass.** "I'll add the test after" / "let me see it work first" are the *Code before the test* rationalization in `../references/red-flags.md`. If you wrote code before the test, delete the code and start from red.
387
388
  - **A test that always passes is worse than no test.** It gives false confidence. If you can't make a step definition fail, you don't understand what it's testing.
388
389
  - The feature file is the spec. If a test fails, fix the code, not the feature.
389
390
  - If implementation reveals that a scenario is wrong or missing, STOP and go back to draft. Don't silently change features.
@@ -61,6 +61,8 @@ Render the chosen template for the user to fill in. Write the result to `problem
61
61
  ### 3. User Flow & Success Metrics
62
62
  Ask for a **friction-log narrative** as the default minimum-viable level — a short prose description of the current user journey and where it hurts. Offer (but never force) two upgrades: Mermaid journey diagram, then service blueprint.
63
63
 
64
+ Walk the flow on the **UX-workflow spine** (`../references/design-spine.md`): pick a direction — **backward** from the goal ("what must be true for the user to reach here?") when the goal is clear but the path isn't, or **forward** from what the user already knows when documenting an existing happy path — then validate by walking the other way; gaps where the two don't meet are missing or assumed steps. Reconstruct the *real* sequence with laddering / the Mom Test (`../references/elicitation-personas.md`), not an imagined one. Keep the simplicity bias: a surfaced step is a candidate, and a step that serves no part of the stated goal is cut, not designed.
65
+
64
66
  Separately — not bundled — ask "What are the user's current pain points?" Accept a bulleted list, free text, or "none known". Capture pain points in `problem.md` under a dedicated `## Pain Points` section. Variants generated in step 6 must each state which pain points they address (or explicitly mark "deferred").
65
67
 
66
68
  Ask for at least one measurable success metric (e.g., "reduce support tickets about lockouts by 50%"). If the user cannot articulate one, note `no success metric — design effectiveness will be hard to evaluate` as an assumption in `problem.md`.
@@ -73,7 +75,7 @@ If `.grimoire/docs/components.md` is absent AND the project has UI code, ask the
73
75
  - MUI / Chakra / Mantine / Radix imports in `package.json`
74
76
  - `*.stories.{ts,tsx,jsx,js}` (Storybook stories)
75
77
 
76
- Write findings to `.grimoire/docs/components.md` listing detected components with file paths and known variants. Skip the scan entirely if no UI signals are present (greenfield or non-UI surface). Subsequent variants prefer existing components over net-new designs and flag net-new explicitly.
78
+ Read at **interface altitude** — detect components from these manifests, imports, and stories, not by reading component source bodies (`../references/artifact-map.md` → Reading altitude). Write findings to `.grimoire/docs/components.md` listing detected components with file paths and known variants. Skip the scan entirely if no UI signals are present (greenfield or non-UI surface). Subsequent variants prefer existing components over net-new designs and flag net-new explicitly.
77
79
 
78
80
  ### 5. Brand Grounding
79
81
  Read `.grimoire/brand/tokens.json` and `.grimoire/brand/voice.md` if they exist. Use the format documented in `../references/brand-tokens-format.md`. Required groups: `color.*`, `font.family.*`, `font.size.*`, `spacing.*`.
@@ -128,7 +130,7 @@ For HTML output, write a single `.grimoire/changes/<change-id>/designs/preview.h
128
130
  Skip preview rendering when output is Figma (the Figma file IS the preview) or ASCII (the markdown table IS the preview).
129
131
 
130
132
  ### 9. Derive Gherkin
131
- Propose draft scenarios as a **design artifact** at `.grimoire/changes/<change-id>/designs/scenarios.feature` (a proposal, not the live baseline). One Scenario per (component × state), Given / When / Then grounded in the design. `grimoire-draft` writes the user-approved scenarios live into `features/` — design does not edit the live baseline directly. Every proposed scenario must still pass draft's feature admission test (external actor, observable, domain language).
133
+ Propose draft scenarios as a **design artifact** at `.grimoire/changes/<change-id>/designs/scenarios.feature` (a proposal, not the live baseline). One Scenario per (component × state), Given / When / Then grounded in the design. These are a proposal: `grimoire-draft` carries them into the design, and `grimoire-plan` projects the user-approved scenarios live into `features/` — design does not edit the live baseline directly. Every proposed scenario must still pass the feature admission test (external actor, observable, domain language).
132
134
 
133
135
  Apply surface-conditional adversarial scenarios per `../references/adversarial-personas.md`:
134
136
 
@@ -143,7 +145,7 @@ Present the proposed scenarios for review: "Review proposed scenarios — accept
143
145
  ### 10. Handoff
144
146
  When the user accepts proposed scenarios, the change folder is populated. Suggest the next step:
145
147
 
146
- > Run `grimoire-draft` to refine the manifest and ADRs, or `grimoire-plan` to break into tasks.
148
+ > Run `grimoire-draft` to refine the behavioral design, then `grimoire-plan` to project it (features/constraints/ADRs/manifest) and break into tasks.
147
149
 
148
150
  Skill is done.
149
151
 
@@ -256,4 +258,4 @@ Do not error — absence is a valid state for projects that haven't onboarded br
256
258
  - **Brand drift findings are suggestions, not blockers.** Lint mode proposes token replacements; it does not auto-rewrite code. The user decides whether to apply.
257
259
 
258
260
  ## Done
259
- When the user accepts proposed Gherkin scenarios and the change folder contains `problem.md`, `designs/`, and `features/`, the workflow is complete. Suggest `grimoire-draft` (manifest + ADRs) or `grimoire-plan` (task breakdown) next.
261
+ When the user accepts proposed Gherkin scenarios and the change folder contains `problem.md` and `designs/`, the workflow is complete. Suggest `grimoire-draft` (refine the behavioral design) next, then `grimoire-plan` (project to features/ADRs/manifest + task breakdown).
@@ -181,9 +181,9 @@ Do not invent content for empty sections. If the designer skipped "Proposed user
181
181
  ### 6. Handoff
182
182
  Tell the user what runs next and what those skills will do with `consult.md`:
183
183
 
184
- - **`grimoire-design`** on the same `change-id` will read `consult.md` first, propagate assumptions and givens into prompts for variant generation (e.g., "exclude patterns that violate givens"), and copy the lists into `manifest.md` when the designer accepts a direction.
185
- - **`grimoire-draft`** on the same `change-id` will read `consult.md` and copy "Inferred assumptions" + "Inferred givens" into `manifest.md` (Assumptions section, plus a new Givens section) at level 3-4 complexity.
186
- - **Open questions** travel into `manifest.md` as unvalidated assumptions for the designer/PM to resolve before plan.
184
+ - **`grimoire-design`** on the same `change-id` will read `consult.md` first and propagate assumptions and givens into prompts for variant generation (e.g., "exclude patterns that violate givens"); the lists carry into the design and are projected into `manifest.md` by `grimoire-plan`.
185
+ - **`grimoire-draft`** on the same `change-id` will read `consult.md` and carry "Inferred assumptions" + "Inferred givens" into the `draft.md` design (assumptions Decided/Open; givens Decisions-ledger context). `grimoire-plan` then projects these into the manifest's Assumptions section at level 3-4 complexity.
186
+ - **Open questions** stay in `consult.md` as designer follow-up items; they are not copied forward.
187
187
 
188
188
  Do not invoke the next skill automatically. Confirm with the user, then suggest the next command.
189
189
 
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: grimoire-draft
3
- description: Design a change collaboratively on one living draft.md, then project it into Gherkin features, constraints, and MADR decisions. Use when the user describes new functionality, requirements, or architecture choices.
3
+ description: Design a change collaboratively on one living draft.md. grimoire-plan then projects it into Gherkin features, constraints, and MADR decisions. Use when the user describes new functionality, requirements, or architecture choices.
4
4
  compatibility: Designed for Claude Code (or similar products)
5
5
  metadata:
6
6
  author: kiwi-data
@@ -9,12 +9,13 @@ metadata:
9
9
 
10
10
  # grimoire-draft
11
11
 
12
- Design a change on **one living document** (`draft.md`), iterating with the user, then
13
- **project** the agreed design into its durable homes (features, constraints, decisions).
12
+ Design a change on **one living document** (`draft.md`), iterating with the user until the
13
+ design is agreed. `grimoire-plan` then **projects** that design into its durable homes
14
+ (features, constraints, decisions) — draft itself does not write them.
14
15
 
15
16
  The core idea: spread-out artifacts hinder the thinking. So you do all the designing in a
16
17
  single coherent doc — diagram/sketch, a decision ledger, pseudo-code, an open-question
17
- ledger — and only fragment it into separate homes **after agreement**. `draft.md` is
18
+ ledger — and it is fragmented into separate homes (by plan) only **after agreement**. `draft.md` is
18
19
  ephemeral: retained as reference through the pipeline, deleted when `grimoire-apply` clears
19
20
  the change folder. Git history preserves it.
20
21
 
@@ -27,7 +28,7 @@ the change folder. Git history preserves it.
27
28
  ## Routing (coarse — up front)
28
29
 
29
30
  Decide only whether to design at all, and in which skill. The **fine** routing (which fact
30
- becomes a feature vs. a constraint vs. a decision) happens later, at projection (step 7).
31
+ becomes a feature vs. a constraint vs. a decision) happens later, at projection — now `grimoire-plan`'s first step.
31
32
 
32
33
  - Bug report ("something is broken") → `grimoire-bug` or `grimoire-bug-report`
33
34
  - Pure refactoring (no behavior change) → no grimoire artifact needed. Suggest an ADR only if architecturally significant.
@@ -40,7 +41,7 @@ becomes a feature vs. a constraint vs. a decision) happens later, at projection
40
41
 
41
42
  Confirm this is a change worth designing, and which skill owns it (table above). You do
42
43
  **not** need to assign each fact to a home yet — during design everything lives in one
43
- `draft.md`; per-fact routing is a projection concern (step 7, D13).
44
+ `draft.md`; per-fact routing is a projection concern, handled in `grimoire-plan`.
44
45
 
45
46
  The one up-front question that matters: **is this a behavior/feature/architecture change**
46
47
  (→ design it here), or a bug / pure refactor / config tweak (→ route away, per the table)?
@@ -54,9 +55,9 @@ the design exists. Up front, make only one binary call:
54
55
  - **Trivial** — config, typo, copy change, single-file fix, dependency bump. Skip the
55
56
  `draft.md` loop: make the change directly, record a minimal `manifest.md` (Why + file
56
57
  list), done.
57
- - **Non-trivial** — anything else. Build a `draft.md` and design the change (steps 3–8).
58
+ - **Non-trivial** — anything else. Build a `draft.md` and design the change (steps 3–7).
58
59
 
59
- The full **complexity level (1–4)** is scored at **projection** (step 7), once the design
60
+ The full **complexity level (1–4)** is scored at **projection** (`grimoire-plan`'s first step), once the design
60
61
  is settled, and written to `manifest.md` — not before (a premature number biases the design
61
62
  to fit it). During design, use the table below only as a rough guide for how deep to
62
63
  research and elicit; depth grows with the change, it is not pre-allocated.
@@ -79,7 +80,7 @@ Before designing, research what already exists. Do not ask the user to research
79
80
  built-ins / first-party check; architecture decisions, new dependencies, or cross-cutting
80
81
  concerns need full research across all categories.
81
82
 
82
- Follow the methodology in `../references/build-vs-buy.md`. The findings feed the `draft.md`
83
+ Follow the methodology in `../references/build-vs-buy.md`. Read candidates at **interface altitude** — public API, types, and docs, not their source or tests (`../references/artifact-map.md` → Reading altitude). The findings feed the `draft.md`
83
84
  **Why** (and, for an adopt/build/hybrid call, the manifest **Prior Art** at projection).
84
85
  Present findings to the user and get agreement on direction before designing deeply.
85
86
 
@@ -111,17 +112,29 @@ This single loop replaces what used to be separate "elicit requirements", "draft
111
112
  "collaborate" steps. **Interviewing IS iterating on `draft.md`.** There is no gather-then-
112
113
  transcribe split — requirements surface, get questioned, and resolve inside the doc.
113
114
 
115
+ **Walk the spine.** Pick the spine this change rides (`../references/design-spine.md`): the **technical spine** (process/constraints → data model → API/contract → UI) for behavioral/technical work, or the **UX-workflow spine** (backward from the goal, or forward from what the user knows) for a user-facing flow. Then **always walk its layers in order** — at each layer: elicit with that layer's lens, record its decisions, and validate the prior layer against what you just learned (a required field must trace to a downstream need; the data model must satisfy the process constraints). An empty layer is a one-line skip, not a silent omission. Ceremony scales to constraints: lightweight by default, but once the change introduces **more than 2 constraints**, walk every layer formally — that 3rd constraint is complexity surfacing.
116
+
114
117
  Iterate with the user, directly on `draft.md`:
115
118
 
116
119
  ```
117
120
  loop:
118
- propose → decisions into the Decisions ledger; shapes into Sketches; a diagram/sketch into At a glance
121
+ propose → decisions into the Decisions ledger as Y-statements (../references/design-spine.md); shapes into Sketches; a diagram/sketch into At a glance
119
122
  question → unknowns become rows under Open (use ../references/elicitation-personas.md as lenses)
120
123
  user reacts → answers / edits the doc
121
124
  resolve → strike the Open row IN PLACE: `RESOLVED: <answer> (Dn)` — never delete it
122
125
  until Decided is stable AND Open is empty-or-deferred.
123
126
  ```
124
127
 
128
+ **Explore before you converge.** When the design approach is genuinely open — more than one
129
+ reasonable shape exists — sketch **2–3 candidate approaches** at a high level (one or two
130
+ lines each: the idea + its main trade-off) and let the user pick a direction *before* you
131
+ deep-dive the ledger on one. Don't silently commit to the first idea that works; the first
132
+ idea is rarely the best, and an unexamined commitment is the *Silently filling a gap* red
133
+ flag at design scale (`../references/red-flags.md`). Keep this lightweight: it is a quick
134
+ approach-level fork, not a variants matrix — **visual/UI variants are `grimoire-design`'s
135
+ job**. When the approach is obvious or forced (one viable shape), say so in one line and
136
+ proceed; don't manufacture alternatives.
137
+
125
138
  Discipline for the loop:
126
139
 
127
140
  1. **Outcome & Non-goals first.** Pin these (into *Why*) before anything else — they set scope. Restate them back to the user.
@@ -131,124 +144,44 @@ Discipline for the loop:
131
144
  5. **Disambiguate immediately.** If an answer is vague ("handle errors gracefully"), ask the specific follow-up and record the concrete answer. Never leave a vague answer in the ledger.
132
145
  6. **Capture, don't extrapolate.** "Out of scope for now" → record as a non-goal and stop. Don't design a scenario "just in case".
133
146
  7. **When the user delegates** ("just write something reasonable"), record it explicitly as an Open→RESOLVED row: "Defaulting to <choice> per user delegation — flag in review if wrong." The assumption stays visible.
134
- 8. **Sort facts by kind as they emerge.** An invariant (security control, NFR, performance budget, observability guarantee) is not a behavior — capture it in the *Constraints* section, not as a behavioral sketch. Apply the rough behaviour-vs-invariant test as you design (does an external actor observe it without reading code/logs?) so projection's admission test (step 7) gets clean input instead of slop to reroute. The fine fact-to-home routing still happens at projection; this just keeps the design honest while you think.
147
+ 8. **Sort facts by kind as they emerge.** An invariant (security control, NFR, performance budget, observability guarantee) is not a behavior — capture it in the *Constraints* section, not as a behavioral sketch. Apply the rough behaviour-vs-invariant test as you design (does an external actor observe it without reading code/logs?) so projection's admission test (in `grimoire-plan`) gets clean input instead of slop to reroute. The fine fact-to-home routing still happens at projection; this just keeps the design honest while you think.
135
148
 
136
149
  **Never silently fill an open question.** Either ask it (as an *Open* row), defer it to a non-goal, or record the inference explicitly in *Decided*. The *Decided/Open* ledger IS the requirements summary — before declaring the design done, walk it back to the user so they see every call and every guess.
137
150
 
138
151
  **Nothing is written to `features/`, `.grimoire/docs/constraints.md`, or `.grimoire/decisions/` during this loop.** Everything lives in `draft.md`. The design is "done" when *Decided* is stable and *Open* is empty-or-deferred — and the user agrees.
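The "Open is empty-or-deferred" half of the done condition can be sketched as a check over the Open rows. The assumptions here are loud ones: the `is_settled` helper is hypothetical, rows are treated as plain strings, and the `DEFERRED` prefix is an invented marker for deferred rows (the skill text only fixes the `RESOLVED: <answer> (Dn)` form). It does not cover the other half, a stable *Decided* list plus the user's agreement, which remains a human call.

```python
import re

# Matches the in-place resolution form from the loop: RESOLVED: <answer> (Dn)
RESOLVED = re.compile(r"RESOLVED: .+ \(D\d+\)")


def is_settled(open_rows: list[str]) -> bool:
    """True when every Open row is resolved in place or explicitly deferred."""
    return all(
        bool(RESOLVED.search(row)) or row.lstrip().startswith("DEFERRED")
        for row in open_rows
    )
```

An empty Open list counts as settled, and a row with no resolution or deferral keeps the design open.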
139
152
 
140
- Do NOT proceed to projection without explicit user approval of the design.
141
-
142
- ### 7. Projection — generate the homes from draft.md
143
-
144
- Once the user agrees the design is settled, project `draft.md` into its durable homes. This
145
- is where the **fine routing** happens (each fact → its one home) and where the admission
146
- test + principles gate run. Artifacts are written **live in their real locations** on the
147
- branch — `git diff` is the staging area; there is no copy-into-the-change-folder.
148
-
149
- First, **score the complexity level (1–4)** now that the design is settled, and write it to
150
- `manifest.md` frontmatter as `complexity: <1-4>`.
151
-
152
- Then project each kind of fact:
153
-
154
- **Behaviors → `features/*.feature`.** For each behavioral fact in the design:
155
-
156
- *The feature-file admission test* — a scenario may be written **only if it passes all four gates**; if it fails any, it is a constraint or a decision, not a feature:
157
- 1. **External actor, outside the system boundary** — an end user, an operator, or a *third-party* system integrating with you does the thing. "External" means outside *your* system, not outside one module: a sibling service, an internal queue consumer, or another module in the same repo calling this one is **internal**, even though it's a separate process. Internal actor → contract test or constraint/decision, never a `.feature`.
158
- 2. **Observable** — the actor sees the outcome without reading code or logs. "<200ms", "logs scrubbed of PII" → fails → constraint.
159
- 3. **Domain language** — domain nouns, zero implementation detail. Names a library/log-level/table (`loguru`, `INFO`, `bcrypt`, `users` table) → fails → leaking implementation.
160
- 4. **Survives reimplementation** — rewrite the internals from scratch; would the scenario still read the same? If it would change, it's pinned to implementation → not a feature.
161
-
162
- **Internal protocols and service-to-service contracts are NOT features.** A change to how two of your own components talk — an internal RPC/queue/event shape, a module API, a wire format between your services — is a *contract*, verified by a contract/integration test (`verify: unit-invariant` at plan stage), not by Gherkin. It fails gate 1: there is no external actor, only your own code on both ends. If a third-party integrates against the protocol it's external and may be a feature; two of your own services is internal. This is the second-biggest source of feature-file slop after invariants.
163
-
164
- Common slop this catches: invariants (→ `constraints.md`) — "PII is scrubbed from logs", "all endpoints require auth", "responses are gzipped", "errors logged with a trace id"; internal protocols (→ contract test) — "service A publishes an OrderPlaced event B consumes", "the worker accepts a job payload with these fields", "module X returns this struct to module Y".
165
-
166
- *Extend vs. new — default is always extend; new files are the exception and require justification.* List existing feature files first (**required, not skippable** — do not write any scenario until this triage table is complete):
167
-
168
- ```
169
- Existing feature files:
170
- features/auth/login.feature — "User Login"
171
- features/billing/invoices.feature — "Invoice Management"
172
- ```
173
-
174
- For each scenario, decide extend-or-new and show it:
175
-
176
- ```
177
- "Admin resets a user's password" → extend features/auth/login.feature (same actor domain: auth)
178
- "User configures SSO provider" → NEW (no existing file owns SSO configuration)
179
- ```
180
-
181
- Signals to extend: same actor, same domain object, same entry point, same HTTP resource or screen. Signals genuinely new: new actor type with no existing file, entirely new domain object, or the existing Feature title would need "and" to cover both. If unsure, extend. A new file requires stating which files were considered and why none fit.
182
-
183
- Then write Gherkin (Feature title + user story; Background for shared preconditions; one scenario per behavior; Given/When/Then describing WHAT, never HOW). Apply security tags per `../references/security-compliance.md` (only when there's a security surface; compliance tags only when `project.compliance` is set). When design input grounded the scenarios (step 4): use brand-token **names** not hex values when `.grimoire/brand/tokens.json` applies; prefer existing component names when `.grimoire/docs/components.md` exists, and flag any net-new component ("new component required — confirm before plan stage").
184
-
185
- **Constraints → `.grimoire/docs/constraints.md`.** Every invariant that failed the admission test (it's a security control / NFR / observability / compliance rule, not an actor-observable behavior) becomes one row: **assertion · rationale · how-verified · links**. The assertion is a flat statement ("Log output never contains PII or secrets"), not Given/When/Then. `how-verified` names the test that proves it (a `unit-invariant` the plan stage will create) — never a Gherkin scenario. If it stems from a decision, link the MADR; don't restate it. Create the file from `templates/constraints.md` if absent.
186
-
187
- **Decisions → `.grimoire/decisions/NNNN-*.md`.** Project each Decisions-ledger entry, applying the **novelty gate**: a MADR is for a decision with a real, project-specific trade-off between viable alternatives — not for industry-default tooling picks or ecosystem-forced conventions. Ask: *would a competent engineer on this stack make a different choice, and need our reasoning to understand ours?* If no, skip it. Obvious tooling/convention picks fold into the existing `Tooling and convention baseline` ADR (one line: choice → why), not a new sequential record. Genuine trade-offs get the next sequential number, status `proposed` (`grimoire-apply` flips to `accepted` at finalize), using `.grimoire/decisions/template.md`.
188
-
189
- **Data changes → `.grimoire/changes/<change-id>/data.yml`.** If the change adds/modifies/removes data models, fields, indexes, or external API integrations, write `data.yml` (same YAML shape as `schema.yml`, only what's changing, `action:` on each entry):
190
-
191
- ```yaml
192
- # Proposed data changes for: add-user-profiles
193
- users:
194
- action: modify
195
- source: src/models/user.py
196
- fields:
197
- avatar_url: { action: add, type: varchar, nullable: true }
198
- legacy_name: { action: remove }
199
- profiles:
200
- action: add
201
- type: collection
202
- fields:
203
- user_id: { type: objectId, ref: users }
204
- bio: { type: string, max_length: 500 }
205
- github_api:
206
- action: add
207
- type: external_api
208
- provider: GitHub
209
- schema_ref: https://docs.github.com/en/rest
210
- client: src/integrations/github.py
211
- endpoints:
212
- get_user:
213
- method: GET
214
- path: /users/{username}
215
- request:
216
- headers: { Authorization: "Bearer {token}" }
217
- response:
218
- login: { type: string, required: true }
219
- avatar_url: { type: string, required: true }
220
- name: { type: string, nullable: true }
221
- error_response:
222
- message: { type: string }
223
- status: { type: integer }
224
- ```
153
+ Do NOT hand off to `grimoire-plan` without explicit user approval of the design.
225
154
 
226
- **Contract documentation is mandatory for external APIs.** Every endpoint must document `request` (what you send), `response` (fields you read, `required: true` for those your code depends on), and `error_response` (the error shape you handle). Downstream skills generate contract tests from this. If you don't know the exact shape, reference `schema_ref` and document the subset your client uses; that subset is the contract. No data impact → skip `data.yml` entirely.
155
+ ### 7. Hand off: projection happens in plan
227
156
 
228
- **Manifest (`manifest.md`).** Generate it from `draft.md` as the durable plan-input glue: `complexity` (just scored), Why + Non-goals, the artifact list (added/modified/removed features, decisions, constraints), and a **Prior Art** section summarizing step 3's research (what was found/evaluated, why adopt/build/hybrid; if building, what's borrowed). **Level 3–4** also carry **Assumptions** (what must be true; mark evidence vs. unvalidated; flag unvalidated ones on the critical path) and a **Pre-Mortem** (2–5 plausible failure modes 6 months out, with mitigations or "accepted"). These come straight from the `draft.md` Decided/Open and Cut sections.
157
+ Draft ends when the design on `draft.md` is agreed. **Projection (turning the design into its
158
+ durable homes: features, constraints, MADRs, `data.yml`, manifest) is now the first step of
159
+ `grimoire-plan`**, co-located with the planning that consumes those homes. A two-phase draft
160
+ (design *then* project) was one job too many; draft now does one thing — design the change —
161
+ and hands the agreed `draft.md` to plan.
229
162
 
230
- **Do NOT delete `draft.md`.** Retain it read-only as the agreed reference through plan → … → apply. `grimoire-apply` removes it with the change folder at finalize.
163
+ So draft does **not** write `features/`, `.grimoire/docs/constraints.md`,
164
+ `.grimoire/decisions/`, `data.yml`, or the full `manifest.md`. What it leaves for plan:
231
165
 
232
- ### 8. Validate (at projection)
166
+ - `draft.md`, the agreed design: the Decisions ledger (Y-statements), Decided/Open, Sketches,
167
+ Constraints, and Cut sections. This is the single source plan projects from.
168
+ - The change folder and the feature branch.
233
169
 
234
- - `.feature` files have valid Gherkin; every Feature has a user story; every Scenario has at least Given + When + Then.
235
- - MADR records have valid YAML frontmatter (status, date).
236
- - Manifest is complete and accurate; `complexity` is set.
237
- - **Re-run the admission test on every scenario you wrote**: external actor, observable, domain language, survives reimplementation. Any scenario that now fails is slop — move it to `constraints.md` or a MADR.
238
- - **Principles gate** (`../references/principles.md`): no fact written to two homes (DRY), no second way to do an existing thing (one right way), no reinvented wheel, no artifact created past the stated scope (KISS). Note: `draft.md` co-existing with the homes is **not** a DRY violation — it is the (soon-deleted) source the homes were projected from, not a parallel authority.
170
+ **Exception: trivial changes** (the step-2 triviality gate) skip plan entirely: draft makes
171
+ the change directly and records the minimal `manifest.md` (Why + file list) itself.
239
172
 
240
173
  ## Important
241
174
  - ONE change at a time. Don't combine unrelated changes.
242
- - **`draft.md` is the only surface you design on.** Features, constraints, MADRs, and the manifest are **generated from it** at projection, never authored by hand in parallel during design.
175
+ - **Catch the rationalizations.** "Too small to spec", "I'll just assume a reasonable default" are the named excuses in `../references/red-flags.md` (*Skipping the spec*, *Silently filling a gap*). The urge to skip is the signal to do the step — not skip it.
176
+ - **`draft.md` is the only surface you design on.** Features, constraints, MADRs, and the manifest are **generated from it** at projection (`grimoire-plan`'s first step) — never authored by hand in parallel during design, and not written by draft at all.
243
177
  - **Features describe actor-observable behavior, not implementation, and not invariants.** No external actor, not observable, or names a library/log-level/table → it's a constraint (→ `constraints.md`) or a decision (→ MADR). An internal protocol or service-to-service contract (your own components talking) is a contract test, not a `.feature` — "external" means outside your system, not outside one module. These two — invariants and internal protocols — are the top sources of feature-file slop.
244
178
  - **One fact, one home** (`../references/principles.md`). A capability lives in one `.feature`; a control in one constraint row; a decision in one MADR. Never the same fact in two homes (at rest).
245
- - Decisions live in **one inline ledger** in `draft.md` while designing; they project to separate MADRs only at step 7. This is how coupled decisions stay legible during the thinking.
246
- - Artifacts (post-projection) are edited **live on the branch** — never copied into `.grimoire/changes/`. `git diff` is the staging area.
179
+ - Decisions live in **one inline ledger** in `draft.md` while designing (Y-statements); they project to separate MADRs only at projection, in `grimoire-plan`. This is how coupled decisions stay legible during the thinking.
180
+ - Projected artifacts are edited **live on the branch** — never copied into `.grimoire/changes/`. `git diff` is the staging area.
247
181
  - **Figma access token is read from `FIGMA_ACCESS_TOKEN` by the MCP server.** Never log it, never write it to config or any artifact (`manifest.md`, `consult.md`, `figma-snapshot.json`, `draft.md`). The MCP handles auth transparently.
248
182
 
249
183
  ## Done
250
- When the user approves the design and it has been projected, the workflow is complete.
251
- `draft.md` remains as reference until `grimoire-apply` clears it. Present the change
252
- directory path and suggest next steps:
253
- - `grimoire-plan` to generate implementation tasks
184
+ When the user approves the design on `draft.md`, the workflow is complete — draft does not
185
+ project. Present the change directory path and suggest next steps:
186
+ - `grimoire-plan` projects the design into features/constraints/MADRs/manifest, then generates tasks
254
187
  - Or further iteration on `draft.md` if the user wants changes
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: grimoire-plan
3
- description: Derive implementation tasks from approved Gherkin features and MADR decisions. Use when features are approved and ready for task breakdown.
3
+ description: Project an agreed draft.md into its homes (features, constraints, decisions, manifest), then derive implementation tasks from them. Use after the design is approved in grimoire-draft.
4
4
  compatibility: Designed for Claude Code (or similar products)
5
5
  metadata:
6
6
  author: kiwi-data
@@ -9,7 +9,7 @@ metadata:
9
9
 
10
10
  # grimoire-plan
11
11
 
12
- Derive implementation tasks from approved Gherkin features and MADR decisions. The output must be detailed enough that any LLM can execute the tasks without further planning.
12
+ Plan opens by **projecting** the agreed `draft.md` into its durable homes (features, constraints, decisions, `data.yml`, manifest), then derives implementation tasks from them. The output must be detailed enough that any LLM can execute the tasks without further planning.
13
13
 
14
14
  ## Triggers
15
15
  - User has approved a grimoire draft and wants to plan implementation
@@ -22,9 +22,7 @@ Derive implementation tasks from approved Gherkin features and MADR decisions. T
22
22
  - User wants to review the design → `grimoire-review` (after plan, before apply)
23
23
 
24
24
  ## Prerequisites
25
- - A change exists in `.grimoire/changes/<change-id>/` with:
26
- - `manifest.md` (approved)
27
- - At least one `.feature` file or decision record
25
+ - A change exists in `.grimoire/changes/<change-id>/` with an agreed `draft.md` — the user has approved the design in `grimoire-draft`. Plan's first step **projects** that design into its homes (features, constraints, MADRs, `data.yml`, manifest); those do not need to exist yet.
28
26
 
29
27
  ## Workflow
30
28
 
@@ -56,14 +54,104 @@ The plan implements what's approved. It does not expand scope to hit a checklist
56
54
 
57
55
  These are gates, not aspirations — a task that adds a duplicate home or a reinvented wheel is rejected, not refined.
58
56
 
59
- ### 1. Select Change
60
- - List active changes in `.grimoire/changes/`
61
- - If multiple, ask user which one to plan
62
- - If only one, confirm it
57
+ ### 1. Project the Design from draft.md
58
+
59
+ **Select the change.** List active changes in `.grimoire/changes/`; if multiple, ask which to plan; if one, confirm it. Read its `draft.md` — the agreed design is the source of truth for this step. (If a change arrives already projected and has no `draft.md` — e.g. from `grimoire-refactor`, which authors its own register and artifacts — there is nothing to project: skip to step 2.)
60
+
61
+ **Project `draft.md` into its durable homes.** This is where the **fine routing** happens (each fact → its one home) and where the admission test + principles gate run. Artifacts are written **live in their real locations** on the branch — `git diff` is the staging area; there is no copy-into-the-change-folder. (Projection used to close `grimoire-draft`; it now opens plan, co-located with the planning that consumes these homes.)
62
+
63
+ First, **score the complexity level (1–4)** now that the design is settled, and write it to `manifest.md` frontmatter as `complexity: <1-4>` (use the level table in `grimoire-draft` step 2 as the rubric). Then project each kind of fact:
64
+
65
+ **Behaviors → `features/*.feature`.** For each behavioral fact in the design:
66
+
67
+ *The feature-file admission test* — a scenario may be written **only if it passes all four gates**; if it fails any, it is a constraint or a decision, not a feature:
68
+ 1. **External actor, outside the system boundary** — an end user, an operator, or a *third-party* system integrating with you does the thing. "External" means outside *your* system, not outside one module: a sibling service, an internal queue consumer, or another module in the same repo calling this one is **internal**, even though it's a separate process. Internal actor → contract test or constraint/decision, never a `.feature`.
69
+ 2. **Observable** — the actor sees the outcome without reading code or logs. "<200ms", "logs scrubbed of PII" → fails → constraint.
70
+ 3. **Domain language** — domain nouns, zero implementation detail. Names a library/log-level/table (`loguru`, `INFO`, `bcrypt`, `users` table) → fails → leaking implementation.
71
+ 4. **Survives reimplementation** — rewrite the internals from scratch; would the scenario still read the same? If it would change, it's pinned to implementation → not a feature.
72
+
73
+ **Internal protocols and service-to-service contracts are NOT features.** A change to how two of your own components talk — an internal RPC/queue/event shape, a module API, a wire format between your services — is a *contract*, verified by a contract/integration test (`verify: unit-invariant`), not by Gherkin. It fails gate 1: there is no external actor, only your own code on both ends. If a third-party integrates against the protocol it's external and may be a feature; two of your own services is internal. This is the second-biggest source of feature-file slop after invariants.
74
+
75
+ Common slop this catches: invariants (→ `constraints.md`) — "PII is scrubbed from logs", "all endpoints require auth", "responses are gzipped", "errors logged with a trace id"; internal protocols (→ contract test) — "service A publishes an OrderPlaced event B consumes", "the worker accepts a job payload with these fields", "module X returns this struct to module Y".
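The four-gate admission test above can be sketched as a small checklist. This is a minimal illustration, not part of grimoire: the `Candidate` shape and the `failed_gates` / `admit` helpers are hypothetical names, and the gate answers are still human (or agent) judgments that the code only records.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate scenario plus the answers to the four gates (hypothetical shape)."""
    text: str
    external_actor: bool             # gate 1: actor is outside the system boundary
    observable: bool                 # gate 2: outcome visible without code or logs
    domain_language: bool            # gate 3: no library/log-level/table names
    survives_reimplementation: bool  # gate 4: reads the same after a rewrite


# Gate names in the order the text lists them.
GATES = (
    "external_actor",
    "observable",
    "domain_language",
    "survives_reimplementation",
)


def failed_gates(candidate: Candidate) -> list[str]:
    """Return the names of every gate the candidate fails, in gate order."""
    return [gate for gate in GATES if not getattr(candidate, gate)]


def admit(candidate: Candidate) -> bool:
    """A scenario is admitted only if it passes all four gates."""
    return not failed_gates(candidate)
```

A candidate that fails any gate would be routed to `constraints.md`, a MADR, or a contract test rather than a `.feature` file; the list of failed gates is a reasonable thing to show the user when explaining that routing.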
76
+
77
+ *Extend vs. new — default is always extend; new files are the exception and require justification.* List existing feature files first (**required, not skippable** — do not write any scenario until this triage table is complete):
78
+
79
+ ```
80
+ Existing feature files:
81
+ features/auth/login.feature — "User Login"
82
+ features/billing/invoices.feature — "Invoice Management"
83
+ ```
84
+
85
+ For each scenario, decide extend-or-new and show it:
86
+
87
+ ```
88
+ "Admin resets a user's password" → extend features/auth/login.feature (same actor domain: auth)
89
+ "User configures SSO provider" → NEW (no existing file owns SSO configuration)
90
+ ```
91
+
92
+ Signals to extend: same actor, same domain object, same entry point, same HTTP resource or screen. Signals genuinely new: new actor type with no existing file, entirely new domain object, or the existing Feature title would need "and" to cover both. If unsure, extend. A new file requires stating which files were considered and why none fit.
93
+
94
+ Then write Gherkin (Feature title + user story; Background for shared preconditions; one scenario per behavior; Given/When/Then describing WHAT, never HOW). Apply security tags per `../references/security-compliance.md` (only when there's a security surface; compliance tags only when `project.compliance` is set). When design input grounded the scenarios (the change's `designs/` folder): use brand-token **names** not hex values when `.grimoire/brand/tokens.json` applies; prefer existing component names when `.grimoire/docs/components.md` exists, and flag any net-new component ("new component required — confirm before generating tasks").
95
+
96
+ **Constraints → `.grimoire/docs/constraints.md`.** Every invariant that failed the admission test (it's a security control / NFR / observability / compliance rule, not an actor-observable behavior) becomes one row: **assertion · rationale · how-verified · links**. The assertion is a flat statement ("Log output never contains PII or secrets"), not Given/When/Then. `how-verified` names the test that proves it (a `unit-invariant` this plan creates) — never a Gherkin scenario. If it stems from a decision, link the MADR; don't restate it. Create the file from `templates/constraints.md` if absent.
97
+
98
+ **Decisions → `.grimoire/decisions/NNNN-*.md`.** Project each Decisions-ledger entry (a Y-statement in `draft.md`), applying the **novelty gate**: a MADR is for a decision with a real, project-specific trade-off between viable alternatives — not for industry-default tooling picks or ecosystem-forced conventions. Ask: *would a competent engineer on this stack make a different choice, and need our reasoning to understand ours?* If no, skip it. Obvious tooling/convention picks fold into the existing `Tooling and convention baseline` ADR (one line: choice → why), not a new sequential record. Genuine trade-offs get the next sequential number, status `proposed` (`grimoire-apply` flips to `accepted` at finalize), using `.grimoire/decisions/template.md` — the Y-statement's context clause becomes the ADR's *Context and Problem Statement*.
99
+
100
+ **Data changes → `.grimoire/changes/<change-id>/data.yml`.** If the change adds/modifies/removes data models, fields, indexes, or external API integrations, write `data.yml` (same YAML shape as `schema.yml`, only what's changing, `action:` on each entry):
101
+
102
+ ```yaml
103
+ # Proposed data changes for: add-user-profiles
104
+ users:
105
+ action: modify
106
+ source: src/models/user.py
107
+ fields:
108
+ avatar_url: { action: add, type: varchar, nullable: true }
109
+ legacy_name: { action: remove }
110
+ profiles:
111
+ action: add
112
+ type: collection
113
+ fields:
114
+ user_id: { type: objectId, ref: users }
115
+ bio: { type: string, max_length: 500 }
116
+ github_api:
117
+ action: add
118
+ type: external_api
119
+ provider: GitHub
120
+ schema_ref: https://docs.github.com/en/rest
121
+ client: src/integrations/github.py
122
+ endpoints:
123
+ get_user:
124
+ method: GET
125
+ path: /users/{username}
126
+ request:
127
+ headers: { Authorization: "Bearer {token}" }
128
+ response:
129
+ login: { type: string, required: true }
130
+ avatar_url: { type: string, required: true }
131
+ name: { type: string, nullable: true }
132
+ error_response:
133
+ message: { type: string }
134
+ status: { type: integer }
135
+ ```
136
+
137
+ **Contract documentation is mandatory for external APIs.** Every endpoint must document `request` (what you send), `response` (fields you read, `required: true` for those your code depends on), and `error_response` (the error shape you handle). The task-generation step below turns this into contract tests. If you don't know the exact shape, reference `schema_ref` and document the subset your client uses — that subset is the contract. No data impact → skip `data.yml` entirely.
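The contract-documentation rule above could be checked mechanically once `data.yml` has been parsed. The sketch below assumes the file is already loaded into a plain dict with the shape shown in the example; `missing_contract_sections` is a hypothetical helper, not a grimoire command.

```python
# Sections every external-API endpoint must document, per the rule above.
REQUIRED_SECTIONS = ("request", "response", "error_response")


def missing_contract_sections(data: dict) -> dict[str, list[str]]:
    """Map 'entry.endpoint' to its missing sections, for external_api entries only."""
    problems: dict[str, list[str]] = {}
    for name, entry in data.items():
        # Data models and collections are not subject to this rule.
        if not isinstance(entry, dict) or entry.get("type") != "external_api":
            continue
        for endpoint_name, endpoint in (entry.get("endpoints") or {}).items():
            endpoint = endpoint or {}  # tolerate an endpoint declared with no body
            missing = [s for s in REQUIRED_SECTIONS if s not in endpoint]
            if missing:
                problems[f"{name}.{endpoint_name}"] = missing
    return problems
```

An empty result means every external endpoint documents all three sections; it says nothing about whether the documented shapes are accurate.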
138
+
139
+ **Manifest (`manifest.md`).** Generate it from `draft.md` as the durable plan glue: `complexity` (just scored), Why + Non-goals, the artifact list (added/modified/removed features, decisions, constraints), and a **Prior Art** section summarizing the build-vs-buy research captured in `draft.md` (what was found/evaluated, why adopt/build/hybrid; if building, what's borrowed). **Level 3–4** also carry **Assumptions** (what must be true; mark evidence vs. unvalidated; flag unvalidated ones on the critical path) and a **Pre-Mortem** (2–5 plausible failure modes 6 months out, with mitigations or "accepted"). These come straight from the `draft.md` Decided/Open and Cut sections.
140
+
141
+ **Do NOT delete `draft.md`.** Retain it read-only as the agreed reference through the rest of plan → apply. `grimoire-apply` removes it with the change folder at finalize.
142
+
143
+ **Validate the projection** before moving on:
144
+ - `.feature` files have valid Gherkin; every Feature has a user story; every Scenario has at least Given + When + Then.
145
+ - MADR records have valid YAML frontmatter (status, date).
146
+ - Manifest is complete and accurate; `complexity` is set.
147
+ - **Re-run the admission test on every scenario you wrote**: external actor, observable, domain language, survives reimplementation. Any scenario that now fails is slop — move it to `constraints.md` or a MADR.
148
+ - **Principles gate** (`../references/principles.md`): no fact written to two homes (DRY), no second way to do an existing thing (one right way), no reinvented wheel, no artifact created past the stated scope (KISS). `draft.md` co-existing with the homes is **not** a DRY violation — it is the (soon-deleted) source the homes were projected from.
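The first validation bullet (every Scenario has at least Given + When + Then) can be approximated with a line-based scan. This is a rough sketch under stated assumptions: it is not a real Gherkin parser, it only recognises English `Scenario:` / `Scenario Outline:` headers, and `scenarios_missing_steps` is a hypothetical name.

```python
import re

# Matches scenario headers and captures the scenario title.
SCENARIO_RE = re.compile(r"^\s*(Scenario|Scenario Outline):\s*(.*)$")
# Matches the three step keywords the validation rule requires.
STEP_RE = re.compile(r"^\s*(Given|When|Then)\b")
REQUIRED_STEPS = ("Given", "When", "Then")


def scenarios_missing_steps(feature_text: str) -> dict[str, list[str]]:
    """Map each scenario title to the required step keywords it lacks."""
    steps_by_scenario: dict[str, set[str]] = {}
    current = None
    for line in feature_text.splitlines():
        header = SCENARIO_RE.match(line)
        if header:
            current = header.group(2).strip()
            steps_by_scenario[current] = set()
            continue
        step = STEP_RE.match(line)
        # Steps before the first scenario (e.g. in a Background) are not counted.
        if step and current is not None:
            steps_by_scenario[current].add(step.group(1))
    return {
        title: [kw for kw in REQUIRED_STEPS if kw not in seen]
        for title, seen in steps_by_scenario.items()
        if any(kw not in seen for kw in REQUIRED_STEPS)
    }
```

Because Background steps are deliberately ignored, a scenario that relies on a Background for its only `Given` is reported as missing one, which matches the rule as written.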
149
+
150
+ The homes now exist; the rest of plan reads and breaks them into tasks.
63
151
 
64
152
  ### 2. Read All Artifacts
65
153
 
66
- Read the change's artifacts following `../references/artifact-map.md` — it defines what each file is, the grimoire-docs-first / graph-for-structure discipline, and the staleness gate. Plan-specific reading on top of that:
154
+ Read the change's artifacts following `../references/artifact-map.md` — it defines what each file is, the grimoire-docs-first / graph-for-structure discipline, the **reading-altitude** rule (read contracts and signatures, not internal source or unit tests), and the staleness gate. Plan-specific reading on top of that:
67
155
 
68
156
  - `.grimoire/docs/constraints.md` — any constraints (security/NFR/observability) this change touches. These produce `unit-invariant` tasks, not scenarios.
69
157
  - The current baseline (`features/`, `.grimoire/decisions/`) via `git diff main` — exactly what this change adds vs. what already existed.
@@ -120,6 +208,8 @@ Level 1-2 changes with minor gaps may proceed; level 3-4 with multiple gaps shou
120
208
  ### 4. Generate Tasks
121
209
  Create `.grimoire/changes/<change-id>/tasks.md`. **Every task produces both production code AND a test — but the test level matches the artifact the task derives from.** Tasks are structured as pairs: the failing test first, then the production code.
122
210
 
211
+ **Order tasks by the technical spine** (`../references/design-spine.md`): dependencies → data/schema → API/contract → business logic → UI by component → verification, **test-first within each layer**. This is the same order the change was designed on, so the plan's shape mirrors the design and stays predictable across changes.
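The spine ordering described above amounts to a two-key sort: layer first, then test before production code within a layer. The sketch below assumes tasks are represented as dicts with hypothetical `layer` and `kind` keys; the real `tasks.md` is markdown, so this only illustrates the ordering rule.

```python
# Layers in technical-spine order, as listed in the text above.
SPINE = [
    "dependencies",
    "data/schema",
    "api/contract",
    "business logic",
    "ui",
    "verification",
]


def order_tasks(tasks: list[dict]) -> list[dict]:
    """Sort tasks by spine layer, placing tests before code within each layer."""
    rank = {layer: index for index, layer in enumerate(SPINE)}
    return sorted(
        tasks,
        # sorted() is stable, so tasks with equal keys keep their input order.
        key=lambda task: (rank[task["layer"]], 0 if task["kind"] == "test" else 1),
    )
```

A task whose `layer` is not in `SPINE` raises `KeyError`, which is deliberate in this sketch: an unplaced task is a planning gap, not something to sort silently to the end.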
212
+
123
213
  **Tag every implementation task with a `verify:` level** — this tells `grimoire-apply` which test vehicle to use. Match the artifact:
124
214
 
125
215
  | Task derives from | `verify:` | Test vehicle |
@@ -341,7 +431,7 @@ Before presenting to the user, verify the plan:
341
431
  - [ ] Every test task describes what to assert (no "write a test")
342
432
  - [ ] Every implementation task describes what to create/modify (no "add the code")
343
433
  - [ ] The verification section has the exact commands to run
344
- - [ ] Tasks are ordered: shared steps → test → production code → verification
434
+ - [ ] Tasks follow the technical-spine order (`../references/design-spine.md`): dependencies → data/schema → API/contract → logic → UI → verification, test-first within each layer
345
435
  - [ ] No task requires the LLM to make architectural decisions — those should already be in the ADR
346
436
  - [ ] **Principles gate** (`../references/principles.md`): no task introduces a duplicate home for an existing fact (DRY), a second way to do an existing thing (one right way), a reinvented wheel where a tool/library/proven pattern exists (don't reinvent), or an abstraction/dependency justified only by a hypothetical (KISS). Any that does has a stated reason.
347
437
 
@@ -367,10 +457,10 @@ Check `.grimoire/config.yaml` for the configured agents:
367
457
  - If the user has configured separate thinking/coding agents, note this in the tasks.md header so the apply stage knows which agent to use
368
458
 
369
459
  ## Important
370
- - **Specificity is the whole point.** A vague plan is worse than no plan — it gives false confidence and the LLM will re-plan anyway. Every task must be executable without thinking.
460
+ - **Specificity is the whole point.** A vague plan is worse than no plan — it gives false confidence and the LLM will re-plan anyway. Every task must be executable without thinking. "Implement the feature" is not a task — it's the *Skipping the plan / vague tasks* rationalization in `../references/red-flags.md`.
371
461
  - Tasks should be small and specific — one logical unit of work each
372
462
  - Every task traces back to a scenario or decision
373
- - Order matters: dependencies first, verification last
463
+ - Order matters: tasks follow the technical-spine order (`../references/design-spine.md`); verification last
374
464
  - Don't generate tasks for things that already work (check the baseline)
375
465
  - Read the actual codebase before writing tasks. Reference real file paths, real patterns, real conventions. Don't guess.
376
466
 
@@ -177,9 +177,11 @@ Recommendation: Fix blockers, then proceed to apply.
177
177
  ## Important
178
178
  - This is a design review, not a code review. Focus on the specifications and plan, not hypothetical implementation details.
179
179
  - Be direct. Don't pad findings with praise or soften blockers. The goal is to catch problems before code is written, when they're cheap to fix.
180
+ - **Ground every finding — ladder it.** Use the interview techniques in `../references/elicitation-personas.md` pointed at the design: ladder a decision *up* to the goal it serves (or expose it as Tunnel Vision), ladder a finding *down* to the concrete behavior that breaks, 5-Whys it to root cause. Also check each decision walks the spine in context (`../references/design-spine.md`) — names its layer, validates the prior. A laddered finding earns its severity; a bare verdict doesn't.
180
181
  - A blocker means "if we code this as-is, we'll have to come back and redo work." A suggestion means "this would improve the design but isn't blocking."
181
182
  - Keep each persona's review focused and short. Three bullet points that matter are better than ten that don't.
182
183
  - If the change is trivial (e.g., rename a field, fix a typo in a feature), say so and don't manufacture issues.
184
+ - **Don't self-exempt by feel.** "It looks fine" / "I reviewed as I wrote it" are the *Skipping review* rationalization in `../references/red-flags.md`. Trivial-exempt is the skill's call, not a vibe.
183
185
  - All persona evaluation criteria, the materiality gate, the briefing structure, and the complexity-depth table live in `../references/review-personas.md`. Don't duplicate them here — read that file when running a persona.
184
186
 
185
187
  ## Done
@@ -274,6 +274,7 @@ Based on the report:
274
274
 
275
275
  ## Important
276
276
  - Verify is read-only. Do NOT fix issues — only report them. The user decides what to do.
277
+ - **"Should pass" is not evidence.** Declaring done without running is the *Declaring done without verifying* rationalization in `../references/red-flags.md`. Observe state, don't predict it.
277
278
  - Be specific: reference file paths and line numbers for every issue.
278
279
  - A scenario without a step definition is always CRITICAL — the spec is not tested.
279
280
  - A step definition with no assertions is always CRITICAL — it's a false positive.
@@ -8,7 +8,7 @@ Loaded by skills that read a change's specs before acting (`grimoire-plan`, `gri
8
8
 
9
9
  Per-change (under `.grimoire/changes/<change-id>/`):
10
10
 
11
- - **`draft.md`** — the living design doc the change was designed on (diagram/sketch, decision ledger, pseudo-code, Decided/Open ledger). The single source the other artifacts were **projected** from at the end of `grimoire-draft`. Ephemeral: retained read-only as the agreed-design reference through the pipeline, deleted when `grimoire-apply` clears the change folder. Read it for the *intent and rationale* behind the projected artifacts; the features/constraints/decisions remain the authoritative homes.
11
+ - **`draft.md`** — the living design doc the change was designed on (diagram/sketch, decision ledger of Y-statements, pseudo-code, Decided/Open ledger). The single source the other artifacts are **projected** from at the start of `grimoire-plan`. Ephemeral: retained read-only as the agreed-design reference through the pipeline, deleted when `grimoire-apply` clears the change folder. Read it for the *intent and rationale* behind the projected artifacts; the features/constraints/decisions remain the authoritative homes.
12
12
  - **`manifest.md`** — change summary, complexity level, and the Why. Level 3-4 also carry Assumptions, Pre-Mortem, and **Prior Art** (the build-vs-buy rationale). Generated from `draft.md` at projection.
13
13
  - **`features/*.feature`** — behavioral specifications. Edited live in `features/` on the branch.
14
14
  - **decision records** — architectural choices for this change, edited live in `.grimoire/decisions/`, including Cost of Ownership sections.
@@ -35,6 +35,18 @@ Project-wide (under `.grimoire/`):
35
35
 
36
36
  ---
37
37
 
38
+ ## Reading altitude — design reads contracts, debugging reads internals
39
+
40
+ When you read code during **design** (`grimoire-draft`, `grimoire-design`, `grimoire-plan`), read at the **published-interface altitude** — what a caller needs to integrate, not how the callee works inside:
41
+
42
+ - **Third-party library or service** — its public API, types, and docs. Not its source, and not its tests. The contract is what you design against; the internals are the maintainer's concern.
43
+ - **Your own system** — the touched area's exported symbols, API endpoints, and data schema, plus the relevant feature files and `constraints.md`. Not the whole backend's source, and not its unit tests.
44
+ - **Prefer the graph for structure without bodies.** `search_graph` / `get_architecture` give signatures, callers, and call edges — the *shape* of the interface — without spending context on implementation bodies. That is the altitude design needs.
45
+
46
+ **Reading full source bodies and unit tests is a *debugging* activity** — justified when a behavior is wrong and you need root cause (`grimoire-bug`), not when you need to know how an interface is used. In design the question is "what is the contract?", and the contract lives in signatures, schemas, and specs — not in line-by-line implementation. Exhaustively reading internals at design time burns context and rarely improves the design. This is the rule above, sharpened: even when you *do* read source, read the seam, not the guts.
47
+
48
+ ---
49
+
38
50
  ## Staleness gate
39
51
 
40
52
  For each area doc you load, compare its `last_updated` against `git log -1 --format=%ci <directory>`. If the doc is older than the most recent commit to its directory, it's stale — its paths, utility names, and patterns may be wrong.
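The comparison in the staleness gate can be sketched in a few lines of Python. This is a minimal illustration, not part of grimoire: the names `is_stale` and `last_commit_ci` are hypothetical, and it assumes `last_updated` is an ISO 8601 date or datetime (the doc's actual format is not stated here). Only the `git log -1 --format=%ci <directory>` command comes from the text above.

```python
import subprocess
from datetime import datetime, timezone

# Shape of `git log --format=%ci` output, e.g. "2024-05-02 11:00:00 +0200".
GIT_CI_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def last_commit_ci(directory: str) -> str:
    """Return the committer date of the latest commit touching `directory`."""
    result = subprocess.run(
        ["git", "log", "-1", "--format=%ci", "--", directory],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def is_stale(last_updated: str, commit_ci: str) -> bool:
    """True when the doc's last_updated predates the directory's latest commit.

    Assumption: last_updated is ISO 8601. A value without a timezone is
    treated as UTC, and a date-only value means midnight, so a doc dated the
    same day as a later commit counts as stale (the conservative reading).
    """
    doc_time = datetime.fromisoformat(last_updated)
    if doc_time.tzinfo is None:
        doc_time = doc_time.replace(tzinfo=timezone.utc)
    commit_time = datetime.strptime(commit_ci, GIT_CI_FORMAT)
    return doc_time < commit_time
```

A caller would pass `last_commit_ci("some/directory")` as the second argument; the pure `is_stale` function is kept separate so the comparison can be checked without a repository.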
@@ -0,0 +1,166 @@
1
+ # Design Spine
2
+
3
+ The ordered path a design walks — in the interview (`grimoire-draft`, `grimoire-design`), in
4
+ evaluation (`grimoire-review`), and in task order (`grimoire-plan`). One spine, walked the
5
+ same way everywhere, so the structure is predictable: the user learns where each kind of
6
+ decision happens.
7
+
8
+ This is the single home for *how* a design proceeds. Skills cite the relevant section; none
9
+ restate it (DRY). It pairs with `principles.md` (what every artifact must satisfy) and
10
+ `red-flags.md` (the excuses to skip a stage) — this file is the *sequence* the work follows.
11
+
12
+ The methods below are named on purpose. They are established practice — lean on the name
13
+ (Working Backwards, inside-out layering, stepwise refinement, Y-statement) so the work
14
+ inherits a known, well-understood discipline instead of an ad-hoc one.
15
+
16
+ ---
17
+
18
+ ## Bias: complete, not over-built
19
+
20
+ The spine and its personas are a **surfacing** tool — they raise candidates (steps,
21
+ constraints, "what happens if" cases). Surfacing tools bias toward *more*: every question
22
+ invites a handler, every constraint feels like rigor. Left ungoverned, walking the spine
23
+ *manufactures* the over-engineering YAGNI warns against. So the walk has one governing rule:
24
+
25
+ **Surface broadly, build narrowly.** The spine raises the candidate; `principles.md` §4
26
+ (KISS/YAGNI) decides whether it earns a place. They are a pair — never run the surfacing
27
+ without the prune.
28
+
29
+ - **Every surfaced item gets a disposition, and the default is *don't build it*.** Each
30
+ candidate resolves to one of: *build* (a present, concrete need requires it), *won't build*
31
+ (record as a one-line non-goal), or *defer*. When unsure, it's a non-goal. Recording "we
32
+ considered X and chose not to" is completeness; building X "just in case" is not.
33
+ - **Lean simple when you must lean.** Under-building is cheap to add later; over-building is
34
+ expensive to remove. If the call is genuinely balanced, choose the simpler, less-complete
35
+ option and say so — complexity layers in cleanly later, but rarely comes back out.
36
+ - **"Complete" means the stated outcome plus the failures whose cost the user would actually
37
+ feel — not every conceivable case.** Completeness is measured against the outcome, not
38
+ against an exhaustive enumeration of edge cases.
39
+ - **Constraints are surface area, not virtue.** Each must earn its place with a present,
40
+ concrete *why* (a downstream need, a real corruption risk). A constraint justified only by
41
+ "might need to" is YAGNI — cut it. This is why the ceremony gate counts constraints as a
42
+ *cost* signal, not a rigor score.
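The disposition rule above can be read as a small decision function. The sketch below is only an illustration under assumptions: `Disposition` and `resolve` are hypothetical names, and the inputs are a simplification of the judgment the text describes. The point it shows is that "unsure" falls through to *won't build*.

```python
from enum import Enum
from typing import Optional


class Disposition(Enum):
    BUILD = "build"
    WONT_BUILD = "won't build"  # recorded as a one-line non-goal
    DEFER = "defer"


def resolve(present_concrete_need: Optional[bool], deferred: bool = False) -> Disposition:
    """Resolve a surfaced candidate; the default is not building it.

    present_concrete_need is None when the answer is unsure, which is
    treated the same as False.
    """
    if present_concrete_need is True:
        return Disposition.BUILD
    if deferred:
        return Disposition.DEFER
    return Disposition.WONT_BUILD
```

Only an explicit `True` reaches `BUILD`; every other input needs a deliberate choice to become anything but a non-goal.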
43
+
44
+ ---
45
+
46
+ ## Pick the spine
47
+
48
+ Two spines. Pick by what the change touches; **once picked, always walk it in order** (next
49
+ section). A mixed change uses the technical spine and expands its UI layer with the UX spine.
50
+
51
+ | Change | Spine | Home skill |
52
+ |--------|-------|-----------|
53
+ | User-facing flow / screen / UI | **UX-workflow spine** | `grimoire-design` |
54
+ | Behavioral / technical (API, data, logic) | **technical spine** | `grimoire-draft` |
55
+ | Mixed (UI + backend) | **technical**, UI layer via UX | draft + design for the UI layer |
56
+
57
+ ### UX-workflow spine — traversal direction
58
+ Walk the user's process in one of two directions. State which you're using.
59
+
60
+ - **Backward — "Working Backwards"** (Cooper goal-directed design; Amazon PR/FAQ). Start at
61
+ the **goal / end-state** (a JTBD outcome: "when *situation*, I want *motivation*, so I can
62
+ *outcome*"). At each step ask **"what must be true for the user to reach here?"** Best when
63
+ the goal is clear but the path is contested — it surfaces unknown prerequisites and prunes
64
+ steps that serve no downstream need.
65
+ - **Forward — "forward chaining"** (skills-forward). Start from **what the user reasonably
66
+ knows or has at the start** and step toward the goal. Best when the starting state is well
67
+ defined but the goal is emergent, or when documenting an existing happy path.
68
+
69
+ **Reconcile (the discipline):** define the end-state, chain **backward** to the required
70
+ prerequisites, then **validate forward** by walking a real user from their actual starting
71
+ knowledge (the Mom Test — see `elicitation-personas.md`). Where the two traversals don't meet
72
+ are the missing or assumed steps — capture each as an Open row.
73
+
74
+ ### Technical spine — layer order
75
+ Design **process/constraints → data model → API/contract → UI, component by component.** This
76
+ is **inside-out** layering (DDD layered architecture / Clean Architecture dependency rule):
77
+ the domain and its rules sit at the core; the API and UI are outer detail that depend inward,
78
+ never the reverse.
79
+
80
+ | Layer | What you settle here |
81
+ |-------|----------------------|
82
+ | **1 · Process / constraints** | The invariants and limits that must always hold — business rules, security controls, NFRs, what must *never* happen (data-corruption guards). These bound everything downstream. |
83
+ | **2 · Data model** | Entities, fields, relationships, and each field's constraints (required / unique / nullable / range). Every constraint states its *why* — the downstream need or corruption risk that justifies it (→ a `constraints.md` row). |
84
+ | **3 · API / contract** | The interface other code or clients use. Design it as a deliberate contract that **hides** the data model — this reconciles "data-first" with "API-first": the API is a versioned abstraction over the schema, not a mirror of it. |
85
+ | **4 · UI, by component** | The surface, one component at a time. For a user-facing surface, expand this layer with the UX-workflow spine above. |
86
+
87
+ **Each layer constrains the next** (*stepwise refinement* — every decision narrows the
88
+ solution space for the layers below). **Building a layer validates the prior**
89
+ (*consumer-driven contracts* / model–implementation feedback): designing the API tests the data model;
90
+ designing the UI tests the API. When a lower layer needs something the layer above it can't
91
+ provide, the upper one was wrong: go back and fix it; don't patch around it downstream.
92
+
93
+ ## Walk it — always, in order
94
+
95
+ Whatever spine is chosen, **traverse its layers/steps in order; do not jump around.** At each
96
+ layer:
97
+
98
+ 1. **Elicit** with that layer's lens — personas are the *who* (`elicitation-personas.md`),
99
+ techniques are the *how* (laddering / Mom Test / 5 Whys, same file).
100
+ 2. **Record decisions** for the layer in the `draft.md` ledger as Y-statements (below).
101
+ 3. **Validate the prior** layer against what you just learned — restate the check explicitly
102
+ ("this required field traces to *downstream need X*"; "this data model satisfies process
103
+ constraint *C*"). A failed validation sends you back up, not forward.
104
+
105
+ An **empty layer is fine** — say so in one line and skip (a change with no data impact skips
106
+ layer 2). Skipping is a stated call, never a silent omission.
107
+
108
+ ## Ceremony gate — scale to the constraints, not first impressions
109
+
110
+ Full layer-by-layer ceremony is for changes that earn it; trivial ones don't.
111
+
112
+ - **Default (lightweight):** walk the spine, but elicit only what the change needs and skip
113
+ empty layers in one line. Most level-1–2 changes finish here.
114
+ - **Escalate to full ceremony when the change introduces more than 2 constraints** (data
115
+ invariants, security controls, cross-layer dependencies). The **3rd** new constraint is the
116
+ signal that real complexity is hiding — from there, walk every layer formally, record a
117
+ decision per layer, and (level 3–4) add the manifest Pre-Mortem.
118
+
119
+ Constraint count is a measurable trigger that complements the complexity level: complexity is
120
+ an *output* of design, and the count is that output surfacing mid-interview. A nominally
121
+ "simple" change that trips the gate is not simple — let the count, not the first impression,
122
+ set the depth.
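The gate is a threshold on the count of new constraints, so it can be sketched directly. This is a minimal illustration, assuming a hypothetical `ceremony_depth` helper and result keys; only the threshold (more than 2 constraints) and the level 3-4 Pre-Mortem condition come from the text above.

```python
# More than this many new constraints triggers full ceremony (the 3rd is the signal).
FULL_CEREMONY_THRESHOLD = 2


def ceremony_depth(new_constraints: int, complexity_level: int) -> dict:
    """Return what the ceremony gate asks for, given the running constraint count."""
    full = new_constraints > FULL_CEREMONY_THRESHOLD
    return {
        "mode": "full" if full else "lightweight",
        # Full ceremony records a decision for every layer walked.
        "decision_per_layer": full,
        # On escalation, level 3-4 changes also add the manifest Pre-Mortem.
        "gate_adds_pre_mortem": full and complexity_level >= 3,
    }
```

Because the count is re-evaluated as constraints surface, a change that starts lightweight can cross the threshold mid-interview.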
123
+
124
+ ## Decisions — Y-statement, in context
125
+
126
+ Every decision in the `draft.md` ledger is recorded as a **Y-statement**, so its context is
127
+ forced into the record. This defeats the *Tunnel Vision* anti-pattern — a decision that reads
128
+ well in isolation but is wrong for the surrounding workflow:
129
+
130
+ > **D*n*:** In the context of *spine layer / use-case*, facing *concern / force*, we chose
131
+ > *option* over *alternatives*, to achieve *quality*, accepting *downside* — **because
132
+ > *why*.**
133
+
134
+ The **because** is mandatory. The **context** clause ties the decision to the spine layer it
135
+ emerged from, so the user evaluates it *in the situation it serves*, not as an abstract claim
136
+ ("sounds good" decided in a vacuum is exactly how ADRs go wrong). Coupled decisions
137
+ cross-reference by ID (D7 cites D3). At projection (`grimoire-plan`'s first step) each ledger entry
138
+ maps cleanly to a MADR: the context clause becomes the ADR's *Context and Problem Statement*,
139
+ the chosen/neglected options its *Considered Options / Decision Outcome*, the accepted
140
+ downside its *Consequences*. The novelty gate still applies — industry-default picks fold into
141
+ the baseline ADR, not a new record.
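As an illustration of the Y-statement shape and its mapping to MADR sections, here is a hedged Python sketch. The `YStatement` class, its field names, and the exact section strings it produces are assumptions made for the example; grimoire's own projection may differ. The rendered sentence uses a comma before "because" purely as a formatting choice of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class YStatement:
    number: int
    context: str            # spine layer / use-case
    concern: str            # the force being faced
    option: str             # what was chosen
    alternatives: List[str]
    quality: str            # what the choice achieves
    downside: str           # what is accepted in exchange
    why: str                # the mandatory "because"

    def render(self) -> str:
        """Render the ledger entry; refuse an entry with no 'because'."""
        if not self.why.strip():
            raise ValueError("the 'because' clause is mandatory")
        alts = ", ".join(self.alternatives)
        return (
            f"D{self.number}: In the context of {self.context}, facing {self.concern}, "
            f"we chose {self.option} over {alts}, to achieve {self.quality}, "
            f"accepting {self.downside}, because {self.why}."
        )

    def to_madr_sections(self) -> Dict[str, str]:
        """Map the clauses onto the MADR headings named in the text above."""
        return {
            "Context and Problem Statement": f"{self.context}: {self.concern}",
            "Considered Options": ", ".join([self.option, *self.alternatives]),
            "Decision Outcome": f"Chosen option: {self.option}, because {self.why}",
            "Consequences": self.downside,
        }
```

Making `why` a required field that `render` checks is one way to enforce the rule that the because clause cannot be skipped.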
142
+
143
+ ## Plan task order follows the technical spine
144
+
145
+ `grimoire-plan` emits tasks in the **same layer order** the change was designed on, so the
146
+ plan's shape mirrors the design:
147
+
148
+ ```
149
+ dependencies → data/schema → API/contract → business logic → UI by component → verification
150
+ ```
151
+
152
+ Within each layer, **test first** — the failing test for that layer-pair before its code.
153
+ This is the per-layer red-green unit, not a single global acceptance test up front. The order
154
+ is fixed on purpose: over time the user learns that schema changes always land before API
155
+ changes, contract tests before clients, UI last.
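The fixed ordering can be sketched as a stable sort keyed on layer position, with the failing test placed ahead of its code inside each layer. This is an assumption-laden illustration: `Task`, `LAYER_ORDER`, and `order_tasks` are hypothetical names, and the layer labels are copied from the order shown above rather than from any grimoire data structure.

```python
from dataclasses import dataclass
from typing import List

# Layer labels taken from the task order stated above.
LAYER_ORDER = [
    "dependencies",
    "data/schema",
    "API/contract",
    "business logic",
    "UI by component",
    "verification",
]


@dataclass
class Task:
    title: str
    layer: str
    is_test: bool = False


def order_tasks(tasks: List[Task]) -> List[Task]:
    """Sort tasks by spine layer, test first within each layer.

    sorted() is stable, so tasks that tie keep their original relative order.
    An unknown layer name raises KeyError rather than being silently placed.
    """
    rank = {layer: i for i, layer in enumerate(LAYER_ORDER)}
    return sorted(tasks, key=lambda t: (rank[t.layer], not t.is_test))
```

Raising on an unknown layer keeps a mislabeled task from quietly landing at an arbitrary point in the plan.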
156
+
157
+ ---
158
+
159
+ ## How skills cite this
160
+
161
+ - **design** — UX-workflow spine (traversal direction) at the user-flow step.
162
+ - **draft** — pick + walk the spine in the design loop; Y-statement ledger; ceremony gate.
163
+ - **plan** — task order = technical-spine order; test-first per layer.
164
+ - **review / verify** — check that decisions name their context and each layer validates the prior.
165
+
166
+ Each skill links its own section; none restate the spine. This is the one home (DRY).
@@ -6,7 +6,7 @@ Persona-driven questions to surface requirements. Used by draft (gather requirem
6
6
 
7
7
  - **In draft**: Ask these questions to gather requirements before drafting.
8
8
  - **In plan**: Use as a completeness checklist — flag gaps in the specs, don't ask the user.
9
- - **In review**: Use as evaluation criteria — check if the design addresses each concern.
9
+ - **In review**: Use as evaluation criteria — check if the design addresses each concern. Apply the interview techniques below to *interrogate the design itself* (not the user) so a finding is reasoned, not asserted.
10
10
 
11
11
  ## Depth by Complexity Level
12
12
 
@@ -19,6 +19,27 @@ Persona-driven questions to surface requirements. Used by draft (gather requirem
19
19
 
20
20
  Don't ask every question — only ask questions whose answers aren't already clear.
21
21
 
22
+ **Questions surface candidates, not requirements.** A persona question raises something to *decide*, and the default decision is "out of scope" — record it as a one-line non-goal and move on. Don't convert every answered question into built handling or a new constraint; that is how eliciting turns into over-engineering. Promote to a scenario or constraint only what the stated outcome actually needs — everything else is a non-goal. See the simplicity bias in `design-spine.md`.
23
+
24
+ ## Interview Techniques — How to Ask
25
+
26
+ The personas are the *who* (which concerns to probe); these are the *how* (question shapes that surface a real answer). Reach for them especially when walking a spine in `design-spine.md`.
27
+
28
+ - **Laddering** — the bidirectional workhorse. Ladder **up** ("why is that important to you?") to climb from a feature to the goal it serves; ladder **down** ("how would you do that?" / "what would that look like?") to descend from a goal to a concrete step. Up drives the *backward* UX traversal; down drives the *forward* one.
29
+ - **The Mom Test** — anchor every question in **past behavior, not hypotheticals**: "When was the last time you hit this? Walk me through exactly what you did." Beats "would you use…?" / "is this a good idea?" — those invite flattery, not facts. The best way to reconstruct the *real* workflow instead of the imagined one.
30
+ - **Critical Incident Technique** — ask for a **specific real incident** in detail ("tell me about the last time X went wrong"). People recall concrete incidents far more accurately than general process, so this surfaces edge cases and the actual happy path.
31
+ - **5 Whys** — repeat "why" to climb from a symptom or step to its root motivation. Use it to find the end-state a backward traversal should start from.
32
+
33
+ Don't run all four — pick the one that fits: laddering to trace a workflow up or down the spine, Mom Test / CIT to reconstruct what really happens, 5 Whys to find the underlying goal.
34
+
35
+ **In review, point these inward at the design.** They make a critique credible — reasoned rather than asserted:
36
+
37
+ - **Ladder a decision up** — does it trace to a real goal, or is it *Tunnel Vision* (reads well in isolation, wrong for the surrounding workflow)? A decision whose ladder-up dead-ends serves no goal.
38
+ - **Ladder a finding down** — "how does this concretely fail?" A finding you can ladder to a specific broken behavior is real; one you can't is a hunch — say so or drop it.
39
+ - **5-Whys a finding to root cause** before grading it, so the blocker names the underlying defect, not a symptom.
40
+
41
+ A laddered finding ("fails because → because → because") earns its severity. A bare verdict ("[blocker] this is wrong") does not — and the materiality gate in `review-personas.md` rejects it.
42
+
22
43
  ## Outcome & Scope — Always Ask First
23
44
 
24
45
  Before diving into persona questions, establish the outcome and boundaries. These two questions prevent the most common spec failures — building the wrong thing and building too much:
@@ -48,6 +69,7 @@ Ask when: the change has user-facing behavior.
48
69
 
49
70
  Ask when: the change introduces new components, services, dependencies, or data flows.
50
71
 
72
+ - **Follow the technical spine** (`design-spine.md`): design inside-out — process/constraints → data model → API contract → UI, each layer constraining the next. Flag anything that inverts it: a UI shape dictating the data model, or an API that mirrors the schema instead of abstracting over it.
51
73
  - What's the deployment context? Does this run in the same service or cross service boundaries?
52
74
  - What existing components does this touch? Are there shared modules, APIs, or databases involved?
53
75
  - Are there concurrency concerns? Multiple users or processes acting on the same data?
@@ -75,6 +97,7 @@ Ask when: the change involves authentication, authorization, user input, sensiti
75
97
 
76
98
  Ask when: the change has complex behavior, multiple paths, or integration points.
77
99
 
100
+ - **Walk the workflow step by step and ask *what happens if* at each step** — the user abandons here, the input is invalid, the network drops, the action runs twice, they arrive without the previous step's precondition? This surfaces candidates, not requirements: **most answers are "nothing — out of scope," recorded as a one-line non-goal.** Promote a "what happens if" to a scenario or constraint only when its failure carries a cost the user would actually feel for *this* change (the simplicity bias in `design-spine.md`).
78
101
  - What are the boundary values? Min/max lengths, zero vs. one, empty collections, null states?
79
102
  - What are the timing edge cases? Concurrent edits, race conditions, timeout during processing?
80
103
  - What external dependencies could fail? How should the system behave when they do — retry, fallback, error?
@@ -85,10 +108,11 @@ Ask when: the change has complex behavior, multiple paths, or integration points
85
108
  Ask when: the change creates, modifies, or removes data models, or integrates with external APIs.
86
109
 
87
110
  - What data entities are involved? What are the relationships between them?
88
- - What are the field constraints? Required, unique, nullable, max length, valid ranges, enums?
111
+ - What are the field constraints — required, unique, nullable, max length, valid ranges, enums? For each, **what is the *why*** — the downstream need it serves or the corruption risk it prevents ("required because the billing job reads it", "unique to stop duplicate ledgers")? Record that justification as the rationale on a `constraints.md` row, not as an unstated assumption.
89
112
  - How does this data grow? Is there a retention policy, archival strategy, or cleanup needed?
90
113
  - Is there existing data that needs migrating? Can the migration run live or does it need downtime?
91
114
  - Are there external API contracts? What fields does the client read, and what happens if the schema changes?
115
+ - **Does the data model supply everything the API and UI need?** For each field an endpoint or screen requires, confirm a backing source exists in the model with the right type and nullability; designing the API and UI is what *validates* the data model (`design-spine.md`). A required API/UI value with no data-model source is a gap to fix in the model now, not a `null` to discover in production.
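The last check in this list (does the model back every field the API and UI require?) lends itself to a mechanical sketch. The helper below is hypothetical: `find_model_gaps` and the `(type name, nullable)` tuple shape are assumptions for the example, not a grimoire API, and the field names in any usage are illustrative.

```python
from typing import Dict, List, Tuple

# (type name, nullable) for a single field; a simplification for this sketch.
FieldSpec = Tuple[str, bool]


def find_model_gaps(required: Dict[str, FieldSpec], model: Dict[str, FieldSpec]) -> List[str]:
    """List fields a consumer (endpoint or screen) needs that the model can't back."""
    gaps = []
    for name, (want_type, want_nullable) in required.items():
        if name not in model:
            gaps.append(f"{name}: no data-model source")
            continue
        have_type, have_nullable = model[name]
        if have_type != want_type:
            gaps.append(f"{name}: type {have_type}, consumer needs {want_type}")
        elif have_nullable and not want_nullable:
            # The consumer requires a value, but the model allows it to be absent.
            gaps.append(f"{name}: nullable in model, required by consumer")
    return gaps
```

An empty result means every required field has a backing source with a compatible type and nullability; each entry otherwise is a gap to resolve in the model during design.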
92
116
 
93
117
  ## Requirements Summary Template
94
118
 
@@ -0,0 +1,62 @@
1
+ # Anti-Rationalization Red Flags
2
+
3
+ Under time pressure, sunk cost, or false confidence, an AI talks itself out of the
4
+ discipline it was told to follow — and does it convincingly. The rationalizations are
5
+ predictable and few. This file names them.
6
+
7
+ This is the single home for the "excuses to skip a stage." It is the sibling of the
8
+ **Anti-Loop Protocol** in `AGENTS.md`: that governs loops *inside* a stage; this governs
9
+ skipping a stage *entirely*. Skills cite the relevant section rather than restating it.
10
+
11
+ **The rule:** when you catch yourself forming one of these thoughts, that is the signal to
12
+ *do the step*, not skip it. The urge to skip is the evidence the step is needed. A stage is
13
+ skipped only by an explicit, recorded decision (a gate that says skip) — never by a silent
14
+ rationalization that it "isn't worth it this time."
15
+
16
+ ---
17
+
18
+ ## Skipping the spec (→ grimoire-draft)
19
+ **Catch yourself saying:** "Too small / too obvious to draft." · "I already know what to build." · "I'll spec it after I see it work."
20
+ **Why it's wrong:** "Too small" is a complexity judgment, and complexity is an *output* of design, not an input — you can't score it honestly before designing. Code-first means the spec gets reverse-engineered to match whatever you built, so it documents the bug instead of catching it.
21
+ **Instead:** Run the triviality gate (draft step 2). If it's genuinely trivial (config/typo/single-file) the gate says skip — that's a *recorded* call. Anything else: draft it.
22
+
23
+ ## Silently filling a gap (→ grimoire-draft)
24
+ **Catch yourself saying:** "A reasonable default is obvious here." · "I understand enough, no need to ask." · "I'll just assume X."
25
+ **Why it's wrong:** A silent assumption the user would have corrected becomes a bug whose paper trail says "intended." One question costs seconds; a wrong guess costs a rebuild.
26
+ **Instead:** Ask it (an *Open* row), defer it to a non-goal, or record the inference explicitly as `RESOLVED: defaulting to X per delegation`. Never leave an unrecorded guess in the design.
27
+
28
+ ## Skipping the plan / vague tasks (→ grimoire-plan)
29
+ **Catch yourself saying:** "Planning is overhead, I'll work it out as I go." · "The task is just 'implement the feature'."
30
+ **Why it's wrong:** A vague plan is worse than none — it gives false confidence and you re-plan mid-implementation anyway, now with code already written the wrong way. "Implement the feature" is not a task; it restates the goal.
31
+ **Instead:** Every task names exact files and one approach, small enough to execute without thinking (grimoire-plan). If a task needs thought to start, it isn't planned yet.
32
+
33
+ ## Code before the test (→ grimoire-apply)
34
+ **Catch yourself saying:** "I'll add the test after." · "Let me just see it work first." · "The test is trivial, red-first is ceremony."
35
+ **Why it's wrong:** A test written after the code is shaped to pass the code, not to catch its bugs — it asserts what you built, not what was required. A test that never failed has never proven anything. This is the single most common discipline bypass.
36
+ **Instead:** Red first — watch it fail for the right reason, then make it pass (grimoire-apply). If you wrote code before the test, the honest move is to delete the code and start from red.
37
+
38
+ ## Skipping review (→ grimoire-review)
39
+ **Catch yourself saying:** "It looks fine." · "I reviewed it as I wrote it." · "Too small to need review."
40
+ **Why it's wrong:** "Looks fine" is the feeling that precedes every shipped bug. Reviewing your own work as you write it is the weakest review there is — same blind spots, same assumptions, at the moment you're most committed to the approach.
41
+ **Instead:** Run the persona pass at the depth the complexity calls for (grimoire-review). Trivial changes are exempt *by the skill's own rule* — say so and move on; don't self-exempt by feel.
42
+
43
+ ## Declaring done without verifying (→ grimoire-verify)
44
+ **Catch yourself saying:** "Tests should pass." · "That obviously works." · "I'll trust the run I did earlier."
45
+ **Why it's wrong:** "Should pass" is a prediction, not evidence. Done is a claim about observed state; an unobserved claim is a guess in a confident voice.
46
+ **Instead:** Run it. Confirm every scenario has a real step definition with real assertions and no regressions (grimoire-verify). Evidence over claims — every time.
47
+
48
+ ## Doing more than the task (→ principles.md §4)
49
+ **Catch yourself saying:** "While I'm in here I'll also…" · "We'll need it later."
50
+ This is scope creep / YAGNI — its home is **`principles.md` §4 (KISS/YAGNI)**, not here. The named flags ("while I'm here", "for a future caller") live there; cut the speculative work and say so in one line.
51
+
52
+ ---
53
+
54
+ ## How skills cite this
55
+
56
+ - **draft** — *Skipping the spec*, *Silently filling a gap*
57
+ - **plan** — *Skipping the plan / vague tasks*
58
+ - **apply** — *Code before the test*
59
+ - **review** — *Skipping review*
60
+ - **verify** — *Declaring done without verifying*
61
+
62
+ Each skill links its own section; none restate the list. This is the one home (DRY).
@@ -60,9 +60,14 @@ kind: greenfield | refactor
60
60
  and cross-references (D7 cites D3) freely — this is how coupled decisions stay legible
61
61
  in one place. At projection, each NOVEL decision becomes a MADR (novelty gate applies —
62
62
  obvious tooling picks fold into the baseline ADR, they don't mint a record).
63
+
64
+ Phrase each row as a Y-statement (see the design-spine reference): the Decision cell states
65
+ "in the context of <spine layer / use-case>, facing <force>, chose <option> over
66
+ <alternatives>, accepting <downside>"; the Why cell is the "because". The context clause
67
+ ties the decision to the layer it serves so it's judged in situation, not in a vacuum.
63
68
  -->
64
69
 
65
- | # | Decision | Why |
70
+ | # | Decision (Y-statement: context · option over alternatives · accepting downside) | Why (because…) |
66
71
  |----|----------|-----|
67
72
  | D1 | | |
68
73