npm - @tgoodington/intuition - Versions diffs - 11.8.0 → 11.9.0 - Mend

@tgoodington/intuition 11.8.0 → 11.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10) hide show

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@tgoodington/intuition",
-  "version": "11.8.0",
+  "version": "11.9.0",
   "description": "Domain-adaptive workflow system for Claude Code. Includes the Enuncia pipeline (discovery, compose, design, execute, verify) and the classic pipeline (prompt, outline, assemble, detail, build, test, implement).",
   "keywords": [
     "claude-code",

package/skills/intuition-enuncia-compose/SKILL.md CHANGED Viewed

@@ -19,11 +19,11 @@ You are a decomposition thinker. You see a vision and ask "what needs to be true
 1. You MUST read `.project-memory-state.json` and resolve context_path before anything else.
 2. You MUST read `{context_path}/discovery_brief.md`. If missing, stop: "No discovery brief found. Run `/intuition-enuncia-discovery` first."
 3. During dialogue, you MUST ask questions as plain text. AskUserQuestion is ONLY for the approval gate at the end.
-4. You MUST NOT make technical decisions. Architecture, technology choices, and implementation approaches belong to specialists.
+4. You MUST NOT make technical decisions. Architecture, technology choices, and implementation approaches belong to specialists. (Express exception: when `track == "express"` and a task carries no genuine technical decision, you may record the mechanical approach and fold design in — see EXPRESS MODE. Any real choice still goes to design.)
 5. You MUST produce experience slices that are stakeholder-perspective-in, not component-out.
 6. You MUST decompose tasks until each one passes the producer-ready test (see SIZING CHECK). There is no "Deep" or "Standard" — every task should be light enough to build directly.
 7. You MUST write `tasks.json`, `docs/project_notes/project_map.md`, and update state before routing.
-8. You MUST route to `/intuition-enuncia-design`. NEVER to `/intuition-enuncia-handoff`.
+8. You MUST route to `/intuition-enuncia-design` — EXCEPT on the express track when you fold design in (no genuine technical decision), where you route to `/intuition-enuncia-execute` (see EXPRESS MODE). NEVER to `/intuition-enuncia-handoff`.
 9. You MUST load three goals at activation (Project North Star, Branch Goal if on a branch, and your own Skill Goal) and hold them in working memory throughout the skill's run. See GOAL ALIGNMENT.
 10. You MUST NOT overwrite the `## Project North Star`, `## Branch Goals`, or `## Operational Foundation` sections in `project_map.md`. The first two are owned by discovery; the third is owned by design. You own `## Overview`, `## Capabilities`, and `## Component Reference`. Append history rows to `docs/project_notes/map_history.md`, not inside `project_map.md`.
 11. You MUST run the alignment check before routing — output must serve Project North Star, Branch Goal (if branch), and Skill Goal. See GOAL ALIGNMENT → Alignment Check.
@@ -72,9 +72,24 @@ Before writing outputs or routing, run the last-stop check:
 - **Project North Star**: Does the outline's set of experience slices, taken together, serve the Project North Star? If a slice doesn't trace to it, it doesn't belong.
 - **Branch Goal** (if branch): Does the outline advance what this branch is on the hook for, at altitude? Branch work that drifts into trunk-level concerns is a misalignment.
 - **Skill Goal**: Is the output decomposition-level work (experience slices, producer-ready tasks, project map updates) — not technical decisions that belong to design, not specs that belong to producers?
+- **Project map consistency**: Does every slice you wrote to `## Capabilities` trace to a stakeholder in the brief, and does every component you listed in `## Component Reference` correspond to the output of at least one task in tasks.json? No orphan slices, no components without a producing task.
 If any check fails, correct the output before presenting. If you cannot correct it within this skill (e.g., a missing stakeholder coverage that needs to go back to discovery), flag it explicitly to the user rather than shipping misaligned output.
+## EXPRESS MODE
+Read `track` from the active context (missing = `"full"`). When `track == "express"`:
+- Skip Phase 2's full experience-mapping dialogue. Draft the experience slices directly from the brief and confirm them in a single pass.
+- Decompose into tasks as usual — expect few. The producer-ready test still applies.
+- Do NOT create test tasks (see Test Coverage).
+- **Fold design in when there is no genuine technical decision.** If the work has an obvious, mechanical approach — no stack choice, no architecture decision, no external-tech selection the user should weigh in on — then enrich each task with a one-line `technical_approach` and a `files` list yourself, and skip the design phase:
+  - Write `technical_approach` and `files` onto each task in tasks.json.
+  - In state, beyond the normal compose transition, also set `status = "execute"`, `workflow.design.completed = true`, `workflow.design.completed_at = now`, `workflow.execute.started = true`, and `last_handoff_transition = "compose_to_execute"`.
+  - Route to `/intuition-enuncia-execute` (NOT design).
+- If the express work DOES carry a real technical decision, do NOT fold — leave it for design and route to `/intuition-enuncia-design` as normal. Design will run, just compressed.
+When `track == "full"` (or unset), run compose exactly as specified below.
 ## PROTOCOL
 ```
@@ -98,11 +113,13 @@ Read `{context_path}/discovery_brief.md`. Extract:
 ### Codebase Research (Conditional)
-Research is needed ONLY when ALL of these are true:
+Research is needed when ALL of these are true:
 - This is trunk (not a branch)
-- No `docs/project_notes/project_map.md` exists
+- The project map is still the blank scaffold — its `## Capabilities` and `## Component Reference` sections are empty/placeholder. **`initialize` always creates the map file, so file existence is NOT the test** — read the map and check whether those two sections still hold only template/placeholder text.
 - The project has an existing codebase (check: Glob for source files in common locations — `src/`, `app/`, `lib/`, `*.py`, `*.js`, `*.ts`, etc.)
+This condition is what keeps the map connected to reality on brownfield projects: without it, compose would author `## Capabilities` and `## Component Reference` without ever reading the existing code.
 If all conditions are met, launch ONE `intuition-researcher` agent:
 ```
@@ -246,6 +263,14 @@ Tag decisions on tasks ONLY when they are obvious from the discovery brief and t
 Do NOT pre-classify decisions you're uncertain about. Only tag decisions that are clearly visible at the outline level. Most tasks will have none.
+### Test Coverage (full track, code projects)
+When `track == "full"` and the project produces code with non-trivial behavior, add `test/*` tasks so the work has an automated regression net. Create one test task per experience slice that has logic worth protecting — data transforms, business rules, API contracts, validation, state machines. Do NOT create test tasks for purely static or presentational slices.
+A `test/*` task lists the `code/*` / `ui/*` tasks it covers as dependencies, and its acceptance criteria are outcome-based: "automated tests cover [the slice's key behaviors] and pass." Leave the framework and approach to design — compose only establishes that the coverage must exist.
+When `track == "express"`, do NOT create test tasks. Express relies on live verification only; this is a deliberate tradeoff (record it in the brief's Out-of-scope or a note).
 ### Present to User
 Walk the user through the task breakdown conversationally. Show how experience slices became tasks. Ask if the decomposition makes sense before moving to approval.

package/skills/intuition-enuncia-design/SKILL.md CHANGED Viewed

@@ -72,9 +72,22 @@ Before writing outputs or routing, run the last-stop check:
 - **Project North Star**: Does the technical approach serve the project's stated outcome? Speed, accuracy, stakeholder experience tradeoffs — all check against this.
 - **Branch Goal** (if branch): Does the design advance what this branch is on the hook for? Branch design that pulls in trunk-level architecture changes beyond scope is a misalignment.
 - **Skill Goal**: Is the output design-level specs — not code (that's producers), not decomposition (that's compose), not implementation (that's execute)?
+- **Project map consistency**: Do the components and connections you wrote into `## Operational Foundation`, `## Capabilities`, and `## Component Reference` match the interfaces and files in the enriched tasks.json? Every component you name should be produced by some task's `files`; every cross-task interface should surface as a "Key connection" in the relevant slice. The map and tasks.json must tell the same architectural story.
 If any check fails, correct before proceeding. If you cannot correct within this skill, flag it explicitly to the user.
+## EXPRESS MODE
+Read `track` from the active context (missing = `"full"`). On the express track, compose may have already folded design in and skipped this phase — if you were routed here at all, the work carries a real technical decision that needs you.
+When `track == "express"`:
+- Design only what's necessary. Skip Phase 2.5 (Operational Foundation) unless the change actually touches deployment/infra.
+- Collapse Phase 3's per-group loop — there is usually a single small group. Surface the one decision that matters, get the user's call, enrich.
+- Still run 3b.5 (Verify Against Current Landscape) for any external-tech choice; currency checks are cheap and matter even for small changes.
+- No `test/*` tasks exist on the express track, so there is no test enrichment.
+When `track == "full"` (or unset), run design exactly as specified below.
 ## PROTOCOL
 ```
@@ -278,6 +291,17 @@ When enriching `ui/*` tasks, the design fields describe **functional requirement
 The principle: design tells the UI producer what must be TRUE about the interface. The UI producer decides what it LOOKS like.
+### Test Task Enrichment (`test/*` domains)
+`test/*` tasks exist only on the full track (compose creates them for logic-bearing slices). When enriching one, specify the **testing framework and how the suite runs** so execute can produce it and verify can run it:
+- **`technical_approach`**: the test framework/runner (pytest, vitest, Jest, Go `testing`, etc.), where test files live, and the exact command that runs the suite.
+- **`interfaces`**: what the tests exercise — the modules, endpoints, or components under test, named concretely.
+- **`files`**: the test file paths to create.
+- **`producer_notes`**: fixtures, mocks, or seed data the tests need; and what NOT to test (don't test framework internals or third-party code).
+Record the suite run command in `## Operational Foundation` (or `key_facts.md`) so verify can find and run it later — verify's regression step depends on knowing how to invoke the suite.
 After enrichment, each task object should contain everything a producer needs. No ambiguity, no open questions.
 ## PHASE 5: USER REVIEW

package/skills/intuition-enuncia-discovery/SKILL.md CHANGED Viewed

@@ -30,6 +30,7 @@ You help users figure out what they're building. You do this through focused con
 12. On branches, you MUST revisit the existing Project North Star at the start of the conversation and confirm it still stands (or propose a trunk-level amendment). See BRANCH MODE → Revisit Project North Star.
 13. You MUST run the Exit Protocol after writing the brief and updating the map. Route to `/intuition-enuncia-compose`.
 14. You MUST NOT launch research agents proactively. Research fires ONLY when the user asks something you cannot confidently answer (see REACTIVE RESEARCH).
+15. You MUST read `track` from the active context and, when `track == "express"`, compress the conversation per EXPRESS MODE. A missing `track` defaults to `"full"`.
 ## THE FOUR DIMENSIONS
@@ -81,6 +82,19 @@ When authoring a Project North Star or a Branch Goal, apply these three rules. T
 **When your own draft is too prescriptive and the user pushes back**, take it as the signal to abstract. Don't defend prescriptive framing.
+## EXPRESS MODE
+Read `track` from the active context in `.project-memory-state.json` at activation (a missing value means `"full"`). If `track` is unset AND discovery hasn't started, the entry bypassed start's triage — ask the one Scope question from start's TRIAGE yourself, record the answer as `track` on the active context, then proceed.
+When `track == "express"`, compress hard:
+- Open by stating your read of all four dimensions inferred from the user's request, and ask only what you genuinely cannot infer — target one question, two at most.
+- Do NOT walk dimensions one at a time. Confirm the whole picture in a single pass.
+- The brief still covers all four dimensions but is minimal — a line each is fine. The North Star / Branch Goal is still authored at altitude (the altitude rules always apply); everything else can be a sentence.
+- Default Decision Posture to "Just flag surprises" unless the user signals otherwise. Still record it.
+- Still write `discovery_brief.md` and still propagate the goal into `project_map.md`. Express changes depth, never artifacts.
+When `track == "full"` (or unset), run discovery exactly as specified below.
 ## CONVERSATION FLOW
 Discovery operates in two modes depending on context: **trunk** (fresh foundation) and **branch** (delta from parent).

package/skills/intuition-enuncia-execute/SKILL.md CHANGED Viewed

@@ -71,6 +71,7 @@ Before writing build outputs or routing:
 - **Project North Star**: Do the produced deliverables, taken together, advance the project's stated outcome? The brief_alignment.north_star field in build_output.json reflects this check.
 - **Branch Goal** (if branch): Do the deliverables advance what this branch is on the hook for? Branch work that inadvertently affects trunk scope is a misalignment.
 - **Skill Goal**: Is the output built-and-verified artifacts — not new design decisions, not unreviewed code?
+- **Project map consistency**: If the build produced or revealed components/connections not in `docs/project_notes/project_map.md`, did you add them at the map's terseness? Every file you created should map to a component the map names; correct any stale entry for something you replaced.
 If any check fails, re-delegate, escalate, or flag to the user. Do not ship misaligned output.
@@ -95,6 +96,8 @@ Read these files:
 Validate: every task in `tasks.json` has a `technical_approach` field (indicating design enrichment). If any task lacks a `technical_approach`, inform the user and ask whether to proceed with partial specs or run design first.
+Also read `track` from the active context (missing = `"full"`). On the express track, expect a small plan and no `test/*` tasks — keep the plan confirmation brief. On the full track, `test/*` tasks should be present for logic-bearing slices; produce them after the code they cover (their dependencies enforce the order).
 From each task object, extract:
 - `task.technical_approach` and `task.files`
 - `task.acceptance_criteria` (refined from compose)
@@ -136,22 +139,36 @@ For each task per dependency order (parallelize independent tasks):
 ### Producer Selection
-Determine the producer type from the task's domain and the spec's technical approach:
-- `ui/*` domains → `ui-writer` (load from producer registry)
-- Code domains → `intuition-code-writer`
-- Document/report domains → load producer profile from registry if available
-- Other formats → load producer profile from registry if available
+Map each task's domain to a producer profile:
+| Domain | Producer profile |
+|--------|------------------|
+| `ui/*` | `ui-writer` |
+| `code/*` | none — `intuition-code-writer` is itself the implementer |
+| `test/*` | none — `intuition-code-writer` writes the tests (full track only) |
+| Document / report | `document-writer` |
+| Spreadsheet | `spreadsheet-builder` |
+| Presentation | `presentation-creator` |
+| Data file | `data-file-writer` |
+| Form | `form-filler` |
+| Other | best-matching `*.producer.md`, if one exists |
+**Producers are profiles, not subagents.** A producer such as `ui-writer` is a `.producer.md` file installed under `~/.claude/producers/` — it is NOT a registered agent, so you CANNOT pass it as `subagent_type`; that call would fail. To use a producer you MUST load its profile body and inject it as the subagent's system instructions, then spawn the work as an `intuition-code-writer` agent (the registered implementer). **This injection is the step that makes `ui-writer`'s aesthetic rules — and every other producer's domain rules — actually take effect. Skipping it silently discards the entire reason `ui/*` tasks are separated from `code/*`.**
-Registry scan order (for non-code producers):
+**Load the profile.** Scan in order, first found wins:
 1. Project: `.claude/producers/{producer-name}/{producer-name}.producer.md`
 2. User: `~/.claude/producers/{producer-name}/{producer-name}.producer.md`
-3. Framework: scan package root `producers/` directory
+3. Framework: scan the package root `producers/` directory
-If no matching producer profile is found, default to `intuition-code-writer` with the spec as instructions.
+Read the matched profile. Use the `model` declared in its frontmatter for the spawned `intuition-code-writer` agent. If the domain maps to no profile (plain `code/*`) or no profile file is found, spawn `intuition-code-writer` directly with no injected profile.
 ### Delegation Format
+The prompt has two parts: the **producer profile body** (when one was loaded above) followed by the **task spec**. When no profile was loaded (plain `code/*`), omit the profile block and begin at "You are building…".
 ```
+{PRODUCER PROFILE BODY — when a profile was loaded, paste everything after its YAML frontmatter here, verbatim. This makes the subagent act as that producer and apply its domain rules. Omit this block entirely for plain code tasks.}
 You are building a deliverable from a design spec. Follow the spec exactly.
 ## Your Task
@@ -209,6 +226,7 @@ Read the produced deliverable and verify against the spec and outline:
 - **Interface compliance**: Do the outputs match the interfaces defined in the spec? Would the components that depend on this task's output actually work?
 - **Decision compliance**: Were decisions from the spec honored? If the spec says "use approach X," did the producer use approach X?
 - **Unanticipated decisions**: Did the producer make choices not covered by the spec? If those choices affect stakeholder experience (check against discovery brief's Who and North Star), escalate to user. If internal/technical, log and continue.
+- **Test deliverables (full track)**: For a `test/*` task, verification IS running the tests — execute the produced suite and confirm it passes against the code it covers. A suite that doesn't run, or passes vacuously (no real assertions), FAILS Layer 1.
 If FAIL → re-delegate with specific issues. Max 2 retries, then escalate to user.
 If PASS → proceed to Layer 2.

package/skills/intuition-enuncia-handoff/SKILL.md CHANGED Viewed

@@ -49,6 +49,7 @@ If the user hasn't provided branch details, collect via AskUserQuestion:
   "created_at": "ISO timestamp",
   "purpose": "User-provided purpose",
   "status": "none",
+  "track": "full",
   "workflow": {
     "discovery": { "started": false, "completed": false, "completed_at": null },
     "compose": { "started": false, "completed": false, "completed_at": null },

package/skills/intuition-enuncia-initialize/SKILL.md CHANGED Viewed

@@ -133,6 +133,7 @@ The state file uses the Enuncia v11 schema:
   "active_context": "trunk",
   "trunk": {
     "status": "none",
+    "track": "full",
     "workflow": {
       "discovery": {
         "started": false,

package/skills/intuition-enuncia-initialize/references/state_template.json CHANGED Viewed

@@ -5,6 +5,7 @@
   "active_context": "trunk",
   "trunk": {
     "status": "none",
+    "track": "full",
     "workflow": {
       "discovery": {
         "started": false,

package/skills/intuition-enuncia-start/SKILL.md CHANGED Viewed

@@ -28,8 +28,10 @@ Detect where the project is in the Enuncia pipeline and route the user to the co
 1. You MUST read `.project-memory-state.json` before doing anything else.
 2. You MUST detect the current phase using the decision tree below.
-3. You MUST NOT write or modify files EXCEPT to bootstrap a missing state file.
+3. You MUST NOT write or modify files EXCEPT to (a) bootstrap a missing state file, or (b) record the `track` value during TRIAGE.
 4. You MUST suggest the correct next enuncia skill based on the detected phase.
+5. For brand-new work on a context (discovery not yet started), you MUST run TRIAGE to set the work `track` before routing.
+6. When the project already has content, you MUST open `docs/project_notes/project_map.md` and lead your output with a glanceable PROJECT SUMMARY. The map is the front door — make it the first thing the user sees.
 ## PROTOCOL
@@ -39,7 +41,9 @@ Step 1: Read .project-memory-state.json
 Step 2: Bootstrap if missing
 Step 3: Resolve active context
 Step 4: Detect phase
-Step 5: Route to next skill
+Step 5: Triage — for brand-new work, set track (express | full)
+Step 6: Project summary — read project_map.md, summarize (when content exists)
+Step 7: Route to next skill
 ```
 ## VERSION CHECK (Step 0)
@@ -70,6 +74,7 @@ Default state schema:
   "active_context": "trunk",
   "trunk": {
     "status": "none",
+    "track": "full",
     "workflow": {
       "discovery": { "started": false, "completed": false, "completed_at": null },
       "compose": { "started": false, "completed": false, "completed_at": null },
@@ -127,6 +132,37 @@ ELSE apply against context_workflow:
     → complete
 ```
+## TRIAGE (brand-new work only)
+Run TRIAGE only when the detected phase is `first_time` or `needs_discovery` AND `workflow.discovery.started` is false for the active context — i.e., work on this context hasn't begun. Otherwise skip: the track was already set and must not be re-asked.
+Ask the user ONE question via AskUserQuestion:
+- Question: "How big is this piece of work? This sets how much process we use — you can change it later."
+- Header: "Scope"
+- Options:
+  - "Focused change" — "A small, well-understood change. Compress the pipeline: quick discovery, inline planning, straight to build."
+  - "New build or substantial work" — "A new project or significant feature. Run the full pipeline: discovery → compose → design → execute → verify."
+Map the answer to a track and write it to the active context object in `.project-memory-state.json` (`state.trunk.track` or `state.branches[active_context].track`):
+- "Focused change" → `track = "express"`
+- "New build…" → `track = "full"`
+This is the one file write start is permitted besides bootstrap. Then route to discovery — discovery and every downstream skill read `track` and adjust depth. Express compresses discovery, folds design into compose when there are no technical decisions, and scopes verification; full runs every phase in full.
+## PROJECT SUMMARY (when content exists)
+When the project already has content (any phase past discovery started, or a populated map), open `docs/project_notes/project_map.md` and LEAD your output with a tight, glanceable summary so the map is the first thing the user sees each session:
+```
+[Project title or North Star one-liner]
+What it is: [from ## Overview — one sentence]
+Capabilities: [from ## Capabilities — comma-joined slice names, stakeholder-facing]
+Track: [express | full]   Where you are: [phase status line]
+```
+Keep it to ~4 lines. Pull only from the map's `## Project North Star`, `## Overview`, and `## Capabilities` headings — never dump the full map. If the map is still the blank scaffold (first_time / pre-compose), skip the summary and give only the status line.
 ## STEP 5: ROUTING TABLE
 | Phase | Status Line | Route |
@@ -168,5 +204,5 @@ Use AskUserQuestion:
 ## VOICE
-- Concise — one line of status, one routing suggestion.
-- No analysis, no opinions. Just detect and route.
+- Concise — a 3–4 line project summary when context exists, one status line, one routing suggestion.
+- Extract and route. No analysis, no opinions — the summary is pulled from the map, not composed.

package/skills/intuition-enuncia-verify/SKILL.md CHANGED Viewed

@@ -2,8 +2,8 @@
 name: intuition-enuncia-verify
 description: Integration and verification for code projects. Walks the user through every manual step until the app is online, then systematically tests every interaction surface from a UX perspective. Not satisfied until the user can access the landing page AND every button, link, and flow works as expected.
 model: opus
-tools: Read, Write, Edit, Glob, Grep, Task, AskUserQuestion, Bash, Agent, WebFetch, mcp__ide__getDiagnostics
-allowed-tools: Read, Write, Edit, Glob, Grep, Task, Bash, Agent, WebFetch, mcp__ide__getDiagnostics
+tools: Read, Write, Edit, Glob, Grep, Task, AskUserQuestion, Bash, WebFetch, mcp__ide__getDiagnostics
+allowed-tools: Read, Write, Edit, Glob, Grep, Task, Bash, WebFetch, mcp__ide__getDiagnostics
 ---
 # Verify Protocol
@@ -18,6 +18,8 @@ Two jobs, done relentlessly:
 No mocks. No synthetic verification. The real system, used the way a real user uses it.
+*How* you exercise it depends on the browser-control tier available this session (see BROWSER CONTROL CAPABILITY). Whatever the tier, scope your claims to what you actually exercised — never report an interaction as "working" if you couldn't reach it.
 ## CRITICAL RULES
 1. You MUST read `.project-memory-state.json` and resolve context_path before anything else.
@@ -25,7 +27,7 @@ No mocks. No synthetic verification. The real system, used the way a real user u
 3. You MUST integrate before anything else. Code that isn't wired in can't run.
 4. You MUST NOT begin UX validation until the app is online and the user confirms they can access it.
 5. You MUST NOT consider Phase 1 complete until the landing page (or primary entry point) is reachable and the user confirms it.
-6. You MUST NOT consider Phase 2 complete until every implemented interaction surface has been tested from a UX perspective.
+6. You MUST NOT consider Phase 2 complete until every implemented interaction surface has been tested from a UX perspective — to the extent the available browser-control tier allows. Where a tier can't reach a surface (e.g. WebFetch on a client-rendered SPA), the user's confirmation stands in, and the limit MUST be stated in the results.
 7. You MUST NOT fix failures that violate user decisions from the specs. Escalate immediately.
 8. You MUST delegate integration tasks and code fixes to subagents. Do not write code yourself.
 9. You MUST verify against the discovery brief after UX validation — does the system deliver the North Star?
@@ -74,6 +76,7 @@ Before presenting final verification results and committing:
 - **Project North Star**: Walking the live system as a real user, does it deliver the project's stated outcome? This is the ultimate test of the whole pipeline.
 - **Branch Goal** (if branch): Does the system, on this branch, advance what the branch is on the hook for?
 - **Skill Goal**: Is the system actually online and every interaction surface tested — not just compiling cleanly?
+- **Project map accuracy**: After the Map Integrity Pass, does `project_map.md` truthfully describe the system you just walked — every Capability real, every Component present, every Key connection confirmed? The map is the next session's starting truth; leave it accurate.
 If any check fails, keep fixing until it passes or flag the gap explicitly in the final results.
@@ -162,7 +165,7 @@ Run the project's toolchain to verify basic code health. Execute in order:
 1. **Type check / lint** (if applicable): `[type check command]`, `[lint command]`
 2. **Build / compile** (if applicable): `[build command]`
-3. **Existing tests**: `[test command]` — run the FULL existing test suite to catch regressions
+3. **Test suite (if one exists)**: On the full track, design recorded the suite command in `## Operational Foundation` / `key_facts.md` and execute produced `test/*` deliverables — run the FULL suite to catch regressions. On the express track, or any project with no authored tests, there is no suite to run: skip this step and note in the final results that there is no automated regression net. Do NOT hand-write tests here — verify validates, it does not author tests.
 Also run `mcp__ide__getDiagnostics` to catch IDE-visible issues.
@@ -274,7 +277,19 @@ Do NOT proceed to Phase 2 until the user confirms they can access the landing pa
 The app is online. Now systematically verify that every implemented interaction works from a real user's perspective.
-This is NOT writing automated test files. This is YOU walking through the application as a user would — fetching pages, analyzing what's rendered, verifying links go where they should, checking that actions produce the expected results.
+This is NOT writing automated test files. This is YOU walking through the application as a user would — driving the live app, analyzing what's rendered, verifying links go where they should, checking that actions produce the expected results.
+## BROWSER CONTROL CAPABILITY
+UX validation means actually exercising the running app. *How* depends on what's available this session — detect it once, at the start of Phase 2, and record which tier you used so "verified" means something specific.
+**Tier 1 — Real browser automation (preferred).** If a browser-driver tool is available — a Playwright or Puppeteer MCP server (tools named like `mcp__playwright__*` / `mcp__puppeteer__*`), or the host app's built-in browser control — USE IT. This is the only way to truly test client-rendered apps (React/Vue/Svelte/Angular SPAs): it executes JavaScript, renders the real DOM, clicks elements, fills and submits forms, and captures screenshots. Drive the app like a user and read the rendered result. Capture a screenshot of each page so you — and the user — can judge whether it actually *looks* right, not just whether it returns 200. (If the user must pre-authorize their browser MCP to avoid prompts, tell them once; browser control is interactive by nature.)
+**Tier 2 — WebFetch + direct API (fallback).** If no browser driver exists, you CANNOT render or click a client-rendered SPA — WebFetch returns server HTML only, which for a CSR app is a near-empty shell. Be honest about this:
+- **Server-rendered / multi-page apps** (Jinja, Rails, Django, server-side templates): WebFetch sees real HTML, so most of the walkthrough works — load pages, follow links, check rendered content.
+- **Client-rendered SPAs**: WebFetch can't exercise the UI. Test the backing API endpoints directly (call them with test data, verify responses), confirm the app builds and serves, then hand the visual/interaction walkthrough to the user with a concrete click-list and collect their confirmation. Do NOT claim an interaction works that you could not actually exercise.
+**Detect the render model** before choosing how far Tier 2 can go: a CSR SPA (client bundle, `<div id="root">`-style shell) vs an SSR/MPA (server returns full HTML). Note it in the results.
 ### STEP 7: BUILD THE INTERACTION MAP
@@ -316,7 +331,7 @@ Header: "Interaction Map"
 ### STEP 8: SYSTEMATIC WALKTHROUGH
-Work through the interaction map methodically. For each page/route:
+Work through the interaction map methodically, using your Tier 1 browser driver where available and the Tier 2 fallback where not (see BROWSER CONTROL CAPABILITY). The steps below describe *what* to verify; the *how* — real clicks vs. WebFetch + direct API calls — follows your tier. On the **express track**, scope the walkthrough to the surfaces this work actually changed plus their immediate neighbors; you do not need to re-walk the entire app. For each page/route:
 #### 8a. Load the Page
@@ -403,6 +418,13 @@ After the walkthrough is clean (all interactions work):
 If something drifts, flag it: "All interactions work, but [specific concern about North Star alignment]."
+**Map Integrity Pass.** You already have a full code inventory from Step 7b (routes, components, endpoints) and you've walked the live system — so this is nearly free. Reconcile `docs/project_notes/project_map.md` against reality:
+- **Ghosts** (in code, missing from map): real components or capabilities the map doesn't reflect → add them at the map's terseness.
+- **Orphans** (in map, gone from code): components or connections the map lists that no longer exist → remove them.
+- **Broken connections**: each "Key connection" the map asserts should correspond to something real you observed (a call, a route, an integration). Flag any you could not confirm.
+Correct the map in place. If reconciliation reveals real architectural drift the user should know about — a capability silently dropped, a connection that never actually worked — surface it in the final results. A quietly wrong map is worse than no map: it's the next session's starting truth.
 **Update `docs/project_notes/project_map.md`** if integration or testing revealed anything new — refine `## Capabilities` slices or `## Component Reference` entries. Keep the shape and terseness. **Do not touch `## Project North Star` or `## Branch Goals`.**
 **Run the GOAL ALIGNMENT → Alignment Check** against the three loaded goals. Any failures must be fixed or explicitly flagged in the final results.
@@ -440,6 +462,8 @@ Question: "Verification complete — every interaction tested against the live s
 **Buttons/actions verified**: [N — N working, N fixed, N escalated]
 **Forms verified**: [N — N working, N fixed, N escalated]
 **User flows verified**: [N experience slices — N working, N fixed, N escalated]
+**Method**: [Tier 1 browser automation | Tier 2 WebFetch + direct API — render model: SSR / CSR]
+**Automated regression net**: [suite of N tests, passing | none — express track / no authored tests]
 **North Star alignment**: [met / concerns]
 [If escalated issues exist, list them]
@@ -460,7 +484,7 @@ If committing: stage files from build output + integration changes + fixes, comm
 ## BRANCH MODE
 When verifying on a branch:
-- Run the FULL test suite (parent + branch tests) to catch compatibility issues
+- If an automated suite exists (full-track projects), run the FULL suite (parent + branch) to catch compatibility regressions. If there is none (express-track work), rely on the live walkthrough and state the absence of a regression net in the results.
 - Integration must be compatible with parent architecture
 - Update `docs/project_notes/project_map.md`