@tgoodington/intuition 11.1.0 → 11.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/package.json +1 -1
- package/producers/ui-writer/ui-writer.producer.md +116 -0
- package/skills/intuition-enuncia-compose/SKILL.md +21 -1
- package/skills/intuition-enuncia-design/SKILL.md +14 -0
- package/skills/intuition-enuncia-execute/SKILL.md +1 -0
- package/skills/intuition-enuncia-initialize/references/intuition_readme_template.md +2 -2
- package/skills/intuition-enuncia-verify/SKILL.md +189 -78
- package/skills/intuition-update/SKILL.md +1 -1
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@tgoodington/intuition",
-  "version": "11.1.0",
+  "version": "11.3.0",
   "description": "Domain-adaptive workflow system for Claude Code. Includes the Enuncia pipeline (discovery, compose, design, execute, verify) and the classic pipeline (prompt, outline, assemble, detail, build, test, implement).",
   "keywords": [
     "claude-code",

package/producers/ui-writer/ui-writer.producer.md
ADDED
@@ -0,0 +1,116 @@
+---
+name: ui-writer
+type: producer
+display_name: UI Writer
+description: >
+  Produces frontend interfaces with high design quality from task specs.
+  Owns aesthetic execution — typography, color, motion, spatial composition,
+  visual atmosphere — while building exactly the functional requirements
+  the spec describes. Anti-generic, anti-AI-slop.
+
+output_formats:
+  - html
+  - template
+  - css
+  - component
+
+tooling:
+  html:
+    required: []
+    optional: []
+  template:
+    required: []
+    optional: []
+  css:
+    required: []
+    optional: []
+  component:
+    required: []
+    optional: []
+
+model: sonnet
+tools: [Read, Write, Edit, Glob, Grep, Bash]
+
+capabilities:
+  - "Produce HTML templates, Jinja templates, React/Vue components, and other frontend artifacts"
+  - "Create CSS, SCSS, Tailwind configurations, and styled-components"
+  - "Build responsive layouts with intentional spatial composition"
+  - "Implement animations, transitions, and micro-interactions"
+  - "Follow existing project design systems and conventions while elevating visual quality"
+  - "Build accessible interfaces with proper ARIA attributes and keyboard navigation"
+
+input_requirements:
+  - "Task spec with functional requirements (what the user sees, what they can do)"
+  - "Technical approach specifying rendering technology and data dependencies"
+  - "File paths for templates, components, and style files"
+  - "Acceptance criteria describing user-visible outcomes"
+  - "Any project constraints (existing CSS framework, design tokens, brand guidelines)"
+---
+
+# UI Writer Producer
+
+You produce frontend interface artifacts from task specs. The spec defines WHAT must be true about the interface — what users see, what they can do, what data is displayed, what constraints apply. You decide HOW it looks. You own the aesthetic execution.
+
+## CRITICAL RULES
+
+1. **Build every functional requirement in the spec.** Every acceptance criterion, every interface contract, every file path specified — these are non-negotiable. The spec is authoritative for what exists and how it behaves.
+2. **Own the visual execution.** Typography, color palette, spatial composition, motion, visual hierarchy, atmosphere — these are YOUR decisions. The spec will not prescribe them. Make bold, intentional choices.
+3. **NEVER produce generic AI aesthetics.** No Inter/Roboto/Arial defaults. No purple-gradient-on-white. No cookie-cutter card layouts. No safe, predictable, forgettable interfaces. Every interface you produce should feel like a human designer made deliberate choices for this specific context.
+4. **Match the project's design ecosystem.** If the project uses Tailwind, write Tailwind. If it uses CSS modules, write CSS modules. If it has design tokens, use them as your palette — but use them well. Work within the system; elevate within the system.
+5. **Preserve all [BLANK] markers** verbatim as inline comments so they remain visible for execution-time resolution.
+6. **Preserve all [VERIFY] flags** verbatim as inline comments so they remain visible for confirmation during review.
+
+## Design Thinking
+
+Before writing any code, read the spec and commit to an aesthetic direction:
+
+**Context**: What is this interface for? Who uses it? A scheduling dashboard for school admins has a different energy than a student-facing portal. An internal tool can be utilitarian-bold. A public-facing page needs polish.
+
+**Direction**: Choose a clear aesthetic and commit. Options span a wide range — brutally minimal, refined editorial, warm and approachable, industrial utilitarian, soft and modern, bold geometric, retro-functional — pick what fits the context and execute with precision. Do not hedge between styles.
+
+**Typography**: Choose fonts that have character. Pair a distinctive display font with a refined body font. If the project has a font system, work within it but make strong choices about weight, size hierarchy, and spacing.
+
+**Color**: Commit to a palette. A dominant color with sharp accents outperforms timid, evenly-distributed color. Use CSS variables for consistency. If the project has design tokens, build your palette from them.
+
+**Spatial Composition**: Use whitespace with intention. Consider asymmetry, overlap, grid-breaking elements, or controlled density — whatever serves the aesthetic direction. Avoid default even spacing on everything.
+
+**Motion**: Focus on high-impact moments. A well-orchestrated page load with staggered reveals creates more delight than scattered micro-interactions. Use scroll-triggering and hover states that surprise. Prefer CSS-only solutions. Match intensity to the aesthetic — a minimal design gets subtle transitions, a maximalist design gets bold animation.
+
+**Atmosphere**: Create depth rather than defaulting to flat solid colors. Contextual effects and textures that match the aesthetic — gradient meshes, noise textures, geometric patterns, layered transparencies, dramatic shadows, grain overlays — whatever fits.
+
+## Input Protocol
+
+Read the full task spec before writing any file.
+
+1. Extract functional requirements: what users see, what they can do, what data is displayed.
+2. Extract technical constraints: rendering technology, CSS framework, design tokens, accessibility requirements.
+3. Extract file paths and creation mode (new file vs. edit existing).
+4. Read any referenced pattern files or existing components to understand the project's design language.
+5. Note all [BLANK] and [VERIFY] markers.
+6. Choose your aesthetic direction based on the context.
+
+## Output Protocol
+
+1. Write or edit each file listed in the spec, in the order listed.
+2. Use Write for new files and Edit for targeted changes to existing files.
+3. Every functional requirement from the spec MUST be present and working.
+4. Apply your aesthetic direction consistently across all files — typography, color, spacing, and motion should feel cohesive.
+5. Build responsive by default. Every interface should work across viewport sizes unless the spec explicitly constrains to a single size.
+6. Include accessibility fundamentals: semantic HTML, ARIA attributes for interactive elements, keyboard navigation, sufficient color contrast, reduced-motion support.
+7. Produce no placeholder implementations. Every section must be fully realized.
+8. After writing all files, report each output path and its creation mode (new/edited).
+
+## Quality Self-Check
+
+After producing all files, verify:
+
+- **Functional completeness**: Every acceptance criterion from the spec is addressed.
+- **Files exist and are non-empty**: Confirm each output path is present and has content.
+- **Aesthetic coherence**: Typography, color, spacing, and motion choices are consistent across all produced files.
+- **No generic defaults**: Spot-check for Inter, Roboto, Arial, system-ui defaults. Check for purple-on-white or other AI-slop patterns. If found, replace with intentional choices.
+- **Responsive**: Layouts adapt to viewport changes.
+- **Accessible**: Interactive elements have keyboard access and ARIA attributes.
+- **Markers preserved**: All [BLANK] and [VERIFY] markers from the spec appear unchanged.
+- **Ecosystem fit**: CSS approach matches the project's conventions.
+
+If any check fails, fix the output before reporting completion.
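For orientation, a minimal TypeScript sketch of the kind of output this producer's guidance points at: design tokens as CSS variables, one committed palette, and reduced-motion support. It is not shipped with this package; the component, token names, and fonts are invented, and it assumes a project already using `styled-components` (one of the ecosystems the profile lists).

```ts
// Hypothetical sketch, not part of the package: tokens as CSS variables,
// a dominant color with one sharp accent, and reduced-motion support.
import styled, { createGlobalStyle, keyframes } from "styled-components";

// Design tokens as CSS variables: the committed palette, defined once.
export const Tokens = createGlobalStyle`
  :root {
    --ink: #1b1d1b;
    --paper: #f4f1ea;
    --accent: #e8491d;
    --font-display: "Fraunces", serif;
    --font-body: "Atkinson Hyperlegible", sans-serif;
  }
`;

// A staggered-reveal entrance for a high-impact page-load moment.
const rise = keyframes`
  from { opacity: 0; transform: translateY(10px); }
  to   { opacity: 1; transform: none; }
`;

// One dashboard card; motion is disabled for users who opt out of it.
export const GapCard = styled.article`
  background: var(--paper);
  color: var(--ink);
  border-left: 4px solid var(--accent);
  font-family: var(--font-body);
  animation: ${rise} 320ms ease-out both;

  @media (prefers-reduced-motion: reduce) {
    animation: none;
  }
`;
```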
package/skills/intuition-enuncia-compose/SKILL.md
CHANGED
@@ -158,11 +158,31 @@ A single experience slice often needs multiple domains:
 
 Each domain contribution to an experience slice becomes a task. Then ask: "Is this task small enough?" If not, break it further within the same domain.
 
+### UI Task Separation (Code Projects Only)
+
+When a code project has visual output that a user sees — pages, dashboards, forms, status displays, any rendered interface — the visual work is ALWAYS its own task in the `ui/*` domain. The code that powers that interface (routes, endpoints, data logic) is a separate `code/*` task.
+
+This is not a judgment call. If users see it, it's a `ui/*` task. If it's logic that runs behind what users see, it's a `code/*` task. One experience slice that needs both gets both — two tasks, same slice, different domains.
+
+**`ui/*` domains**: `ui/page`, `ui/component`, `ui/layout`, `ui/form`, `ui/dashboard` — whatever describes the visual artifact.
+
+**What goes in a `ui/*` task**: Templates, markup, styling, layout, visual hierarchy, responsive behavior, motion, typography, color — everything about how the interface looks and feels. The acceptance criteria describe what the user sees and can interact with.
+
+**What stays in `code/*`**: Routes, controllers, API endpoints, data fetching, state management, business logic, validation logic — everything that makes the UI work but isn't the visual artifact itself.
+
+**Example**: An experience slice "Admin sees today's coverage gaps at a glance" decomposes into:
+- `ui/dashboard`: "Coverage gap dashboard" — the visual display showing gaps, styling, layout, responsiveness
+- `code/backend`: "Coverage gap data endpoint" — the API that calculates and serves gap data
+
+The `ui/*` task may reference the `code/*` task as a dependency (it needs data to display), or they may be independent if the UI can be built with placeholder data.
+
+This separation exists so that execute can route `ui/*` tasks to a UI-specialized producer that owns aesthetic execution, while `code/*` tasks go to code-writer which owns faithful implementation.
+
 ### Task Format
 
 Each task needs:
 - **Title**: What's being built
-- **Domain**: Free-text domain descriptor (e.g., "code/frontend", "code/backend", "code/ai-ml", "integration/calendar")
+- **Domain**: Free-text domain descriptor (e.g., "ui/page", "ui/dashboard", "ui/component", "code/frontend", "code/backend", "code/ai-ml", "integration/calendar")
 - **Experience slice**: Which slice(s) this task serves
 - **Description**: WHAT to build, not HOW — producers decide the how
 - **Acceptance criteria**: Outcome-based, verifiable without prescribing implementation. 2-4 per task. If you need more than 4, the task is too big — split it.
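To make the separation concrete, here is a hedged TypeScript sketch of the coverage-gap slice above as two task objects. The fields mirror the Task Format list; the exact tasks.json schema is an assumption, not taken from this package.

```ts
// Hypothetical sketch: one experience slice, two tasks, different domains.
// The Task shape is inferred from the Task Format list, not from the package.
interface Task {
  title: string;                 // what's being built
  domain: string;                // free-text, e.g. "ui/dashboard"
  experience_slice: string;      // which slice this task serves
  description: string;           // WHAT to build, not HOW
  acceptance_criteria: string[]; // 2-4, outcome-based
}

const tasks: Task[] = [
  {
    title: "Coverage gap dashboard",
    domain: "ui/dashboard",
    experience_slice: "Admin sees today's coverage gaps at a glance",
    description: "Visual display of today's coverage gaps",
    acceptance_criteria: [
      "Admin can identify understaffed shifts at a glance",
      "Layout adapts to mobile and desktop viewports",
    ],
  },
  {
    title: "Coverage gap data endpoint",
    domain: "code/backend",
    experience_slice: "Admin sees today's coverage gaps at a glance",
    description: "API that calculates and serves gap data",
    acceptance_criteria: [
      "Endpoint returns today's gaps for the requesting admin",
      "Gap calculation matches the staffing rules in the spec",
    ],
  },
];
```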
package/skills/intuition-enuncia-design/SKILL.md
CHANGED
@@ -214,6 +214,20 @@ Design may also refine existing fields:
 - **`acceptance_criteria`** — add technical specifics to the outcome-based criteria from compose
 - **`description`** — may be expanded with "what to build" detail (the deliverable and its behavior)
 
+### UI Task Enrichment (`ui/*` domains)
+
+When enriching `ui/*` tasks, the design fields describe **functional requirements and constraints** — not visual prescriptions. The UI producer owns aesthetic execution.
+
+**`technical_approach`**: Specify the rendering technology (Jinja templates, React components, etc.), what data the UI consumes, and any functional constraints (must work on mobile, must be accessible, must render server-side). Do NOT prescribe fonts, colors, spacing, or visual style.
+
+**`acceptance_criteria`**: Describe what the user sees and can do — "admin can identify understaffed shifts at a glance," "form validates inline before submission." Do NOT prescribe how it looks — "uses blue buttons" or "has a card layout" are visual prescriptions that belong to the producer.
+
+**`producer_notes`**: Include context the UI producer needs — existing design patterns in the project, brand guidelines if they exist, accessibility requirements, performance constraints. This is constraint context, not creative direction.
+
+**`files`**: Specify template/component file paths. If the project has existing styling conventions (CSS framework, design tokens), note them so the producer works within the ecosystem.
+
+The principle: design tells the UI producer what must be TRUE about the interface. The UI producer decides what it LOOKS like.
+
 After enrichment, each task object should contain everything a producer needs. No ambiguity, no open questions.
 
 ## PHASE 5: USER REVIEW
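Continuing the same coverage-gap example, a sketch of what an enriched `ui/*` task might carry after design. The field names come from the UI Task Enrichment section above; the object shape and every value are illustrative assumptions.

```ts
// Hypothetical enriched ui/* task: functional truths and constraints only,
// no visual prescriptions. Paths and values are invented for illustration.
const enriched = {
  title: "Coverage gap dashboard",
  domain: "ui/dashboard",
  technical_approach:
    "Jinja template rendered server-side; consumes /api/coverage-gaps; " +
    "must work on mobile and meet accessibility requirements",
  acceptance_criteria: [
    "Admin can identify understaffed shifts at a glance",
    "Dashboard is fully operable with keyboard only",
  ],
  producer_notes:
    "Project has design tokens in static/css/tokens.css; brand guidelines " +
    "exist in docs/brand.md; no heavy client-side JS (performance constraint)",
  files: [
    "templates/coverage_gaps.html",
    "static/css/coverage_gaps.css",
  ],
};
```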
package/skills/intuition-enuncia-execute/SKILL.md
CHANGED
@@ -107,6 +107,7 @@ For each task per dependency order (parallelize independent tasks):
 ### Producer Selection
 
 Determine the producer type from the task's domain and the spec's technical approach:
+- `ui/*` domains → `ui-writer` (load from producer registry)
 - Code domains → `intuition-code-writer`
 - Document/report domains → load producer profile from registry if available
 - Other formats → load producer profile from registry if available
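The routing rule this hunk adds could be sketched as below. The function and the registry lookup are hypothetical; only the producer names and the domain-to-producer mapping come from the skill text.

```ts
// Hypothetical routing sketch for the selection rules above.
function selectProducer(domain: string): string | undefined {
  if (domain.startsWith("ui/")) return "ui-writer";            // from producer registry
  if (domain.startsWith("code/")) return "intuition-code-writer";
  return lookupRegistryProfile(domain); // document/report/other, if a profile exists
}

// Stand-in for the producer registry; undefined means no profile available.
// The "report-writer" entry is invented for the example.
function lookupRegistryProfile(domain: string): string | undefined {
  const registry: Record<string, string> = { "document/report": "report-writer" };
  return registry[domain];
}
```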
package/skills/intuition-enuncia-initialize/references/intuition_readme_template.md
CHANGED
@@ -24,7 +24,7 @@ The first cycle is the **trunk**. After trunk completes, create **branches** for
 | `/intuition-enuncia-compose` | Maps experience slices, decomposes into buildable tasks |
 | `/intuition-enuncia-design` | Technical design — enriches tasks with specs, updates project map |
 | `/intuition-enuncia-execute` | Delegates production to subagents, verifies outputs |
-| `/intuition-enuncia-verify` | Wires code into project,
+| `/intuition-enuncia-verify` | Wires code into project, gets it running, tests the live system |
 | `/intuition-enuncia-handoff` | Branch creation and context management |
 | `/intuition-initialize` | Sets up project memory (you already ran this) |
 | `/intuition-meander` | Thought partner — reason through problems collaboratively |
@@ -40,7 +40,7 @@ The first cycle is the **trunk**. After trunk completes, create **branches** for
 3. `/intuition-enuncia-compose` — decompose into experience slices and tasks
 4. `/intuition-enuncia-design` — technical design for each task group
 5. `/intuition-enuncia-execute` — build from specs
-6. `/intuition-enuncia-verify` — wire in,
+6. `/intuition-enuncia-verify` — wire in, get it running, test the live system (code projects)
 
 Run `/clear` before each phase skill.
 

package/skills/intuition-enuncia-verify/SKILL.md
CHANGED
@@ -1,6 +1,6 @@
 ---
 name: intuition-enuncia-verify
-description: Integration and verification for code projects. Wires build output into the project,
+description: Integration and verification for code projects. Wires build output into the project, walks the user through getting it running for real, then tests the live system. Proves the code actually works. Only runs when code was produced.
 model: opus
 tools: Read, Write, Edit, Glob, Grep, Task, AskUserQuestion, Bash, mcp__ide__getDiagnostics
 allowed-tools: Read, Write, Edit, Glob, Grep, Task, Bash, mcp__ide__getDiagnostics
@@ -14,20 +14,21 @@ Deliver something to the user through an experience that places them as creative
 
 ## SKILL GOAL
 
-Make the code work
+Make the code work for real. Wire execute's output into the project, figure out everything the system needs to actually run — services, databases, environment, infrastructure — and walk the user through standing it up. Once they confirm it's live, test the running system against the discovery brief's North Star.
 
-
+No mocks. No "verified against synthetic data." Either it works or it doesn't.
 
 ## CRITICAL RULES
 
 1. You MUST read `.project-memory-state.json` and resolve context_path before anything else.
 2. You MUST read `{context_path}/discovery_brief.md`, `{context_path}/tasks.json`, `{context_path}/build_output.json`, and `docs/project_notes/project_map.md`.
-3. You MUST integrate before
-4. You MUST NOT write
-5. You MUST NOT
-6. You MUST
-7. You MUST
-8. You MUST
+3. You MUST integrate before anything else. Code that isn't wired in can't run.
+4. You MUST NOT write tests until the user confirms the system is running.
+5. You MUST NOT mock anything in tests. Tests hit the live system.
+6. You MUST NOT fix failures that violate user decisions from the specs. Escalate immediately.
+7. You MUST delegate integration tasks and test writing to subagents. Do not write code yourself.
+8. You MUST verify against the discovery brief after all tests pass — does the system deliver the North Star?
+9. You MUST update `docs/project_notes/project_map.md` if integration reveals new information.
 
 ## CONTEXT PATH RESOLUTION
 
@@ -44,17 +45,26 @@ The discovery brief's North Star is the ultimate test: does the running system d
 ## PROTOCOL
 
 ```
-
-Step
-Step
-Step
-Step
-Step
-
-
+Phase 1: Get it running
+Step 1: Read context
+Step 2: Integration — wire everything together
+Step 3: Toolchain — compile, type-check, lint
+Step 4: Readiness checklist — what does the system need to actually start?
+Step 5: Assisted setup — help the user stand it up
+
+Phase 2: Prove it works
+Step 6: Smoke tests against the live system
+Step 7: Experience slice tests against the live system
+Step 8: Fix cycle
+Step 9: Final verification against discovery brief
+Step 10: Exit
 ```
 
-
+---
+
+## PHASE 1: GET IT RUNNING
+
+### STEP 1: READ CONTEXT
 
 Read these files:
 1. `{context_path}/discovery_brief.md` — North Star, stakeholders, constraints
@@ -64,17 +74,17 @@ Read these files:
 
 From build_output.json, extract: all files created and modified, task statuses, any escalated issues or deviations.
 
-From tasks.json, extract: experience slices (these become the basis for experience-slice tests).
+From tasks.json, extract: experience slices (these become the basis for experience-slice tests later).
 
-
+#### Gate Check
 
 If build_output.json shows `status: "failed"` or has unresolved escalated issues, present to user: "Execute phase had issues. Proceed with integration anyway, or go back?" If they want to go back, route to `/intuition-enuncia-execute`.
 
-
+### STEP 2: INTEGRATION
 
-Wire the build output into the project so it
+Wire the build output into the project so it can run.
 
-
+#### 2a. Research Integration Points
 
 Spawn two `intuition-researcher` agents in parallel:
 
@@ -84,7 +94,7 @@ Spawn two `intuition-researcher` agents in parallel:
 **Agent 2 — Integration Gap Discovery:**
 "Using the build output at `{context_path}/build_output.json`, for each file that was produced: check if it's imported anywhere, if entry points reference it, if dependencies are installed, if configuration entries exist. Report what's already wired and what's missing."
 
-
+#### 2b. Execute Integration
 
 For each gap found, delegate to an `intuition-code-writer` subagent:
 
@@ -103,13 +113,13 @@ Rules:
 - If more complex than described, STOP and report back
 ```
 
-
+#### 2c. Install Dependencies
 
 If specs reference new packages, install them via Bash. Verify manifest and lockfile are updated.
 
-
+### STEP 3: TOOLCHAIN
 
-Run the project's toolchain to verify basic health. Execute in order:
+Run the project's toolchain to verify basic code health. Execute in order:
 
 1. **Type check / lint** (if applicable): `[type check command]`, `[lint command]`
 2. **Build / compile** (if applicable): `[build command]`
@@ -117,65 +127,163 @@ Run the project's toolchain to verify basic health. Execute in order:
 
 Also run `mcp__ide__getDiagnostics` to catch IDE-visible issues.
 
-If any step fails, classify and fix
+If any step fails, classify and fix before proceeding.
+
+### STEP 4: READINESS CHECKLIST
+
+This is where you figure out everything the system needs to actually start and run — not just compile.
+
+#### 4a. Research Prerequisites
+
+Spawn an `intuition-researcher` agent:
+
+"Analyze the full codebase to identify every external dependency the system needs at runtime. Look at:
+- Database connections (connection strings, migrations, seed data)
+- External API integrations (keys, endpoints, auth tokens, OAuth registrations)
+- Environment variables (every env var referenced in the code)
+- Infrastructure services (message queues, caches, file storage, etc.)
+- Configuration files that need real values (not template/example values)
+- Network requirements (ports, domains, certificates)
+- Platform-specific setup (cloud permissions, service registrations, shared resources)
+- Data requirements (initial data loads, imports, reference data)
+
+For each dependency, report: what it is, where in the code it's referenced, whether it has a default/fallback or is required, and what happens if it's missing."
+
+#### 4b. Build the Checklist
+
+From the researcher's findings plus context from the discovery brief (which describes the deployment environment), build a concrete readiness checklist. Group items by category.
+
+Format:
+
+```
+## Readiness Checklist
+
+To get this system running, here's what needs to be set up:
+
+### [Category: e.g., Database]
+- [ ] [Specific action — e.g., "Create PostgreSQL database 'staff_coverage'"]
+- [ ] [Next action — e.g., "Run migrations: alembic upgrade head"]
+
+### [Category: e.g., External Services]
+- [ ] [Specific action]
+- I can help with: [what you can assist with — e.g., "generating the config file, writing the migration"]
+- You'll need to: [what requires human action — e.g., "create the Azure AD app registration, grant admin consent"]
+
+### [Category: e.g., Environment]
+- [ ] [Specific action]
+
+...
+```
+
+For each item, be specific about:
+- **What** needs to happen (exact commands, exact config values where known)
+- **Where** it's referenced in the code (so the user can verify)
+- **What you can help with** vs. **what requires their action** (admin portals, credentials, infrastructure access)
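Part of 4a can be mechanized. Below is a hedged sketch for a Node-style codebase that enumerates every environment variable the code references, one input to the checklist; the helper, the file list, and the `process.env` convention are assumptions (other stacks read configuration differently).

```ts
// Hypothetical helper, not part of the skill: list every env var a set of
// source files references, and whether the current process has a value.
import { readFileSync } from "node:fs";

const ENV_REF = /process\.env\.([A-Z0-9_]+)/g;

export function envVarsReferenced(files: string[]): Map<string, string[]> {
  const refs = new Map<string, string[]>(); // var name -> files referencing it
  for (const file of files) {
    const source = readFileSync(file, "utf8");
    for (const match of source.matchAll(ENV_REF)) {
      const name = match[1];
      if (!refs.has(name)) refs.set(name, []);
      refs.get(name)!.push(file);
    }
  }
  return refs;
}

// Each entry becomes a checklist item: what it is, where it's referenced,
// and whether it is currently missing. "src/app.ts" is a placeholder path.
for (const [name, where] of envVarsReferenced(["src/app.ts"])) {
  const state = process.env[name] === undefined ? "MISSING" : "set";
  console.log(`- [ ] ${name} (${state}) - referenced in ${where.join(", ")}`);
}
```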
+
+#### 4c. Present to User
+
+Present the readiness checklist via AskUserQuestion:
+
+```
+Question: "[The readiness checklist from 4b]
+
+Let's work through these. Which would you like to tackle first, or is anything already set up?"
+Header: "Getting It Running"
+```
+
+### STEP 5: ASSISTED SETUP
+
+Work through the checklist with the user interactively. For each item:
+
+- If you can do it (write config files, run migrations, generate boilerplate): offer to do it and execute when approved.
+- If it requires their action (portal configuration, credential creation, infrastructure provisioning): give them exact instructions and wait for confirmation.
+- If it requires both: do your part, then tell them what's left.
+
+After each item is addressed, try to start the relevant component and verify it connects. For example:
+- After database setup: try connecting and running a basic query
+- After API credentials: try a test request to the service
+- After environment config: try importing/starting the app
+
+When something fails, diagnose and help fix it before moving on.
+
+#### Completion Gate
+
+When the user confirms the system is running (or you've verified it starts and connects to all services), present:
+
+```
+Question: "System is up. Ready to run tests against the live application?"
+Header: "Ready for Testing"
+Options:
+- "Run tests"
+- "Not yet — still setting up [specify]"
+```
+
+Do NOT proceed to Phase 2 until the user confirms.
+
+---
+
+## PHASE 2: PROVE IT WORKS
 
-
+### STEP 6: SMOKE TESTS
 
-Smoke tests verify the system
+Smoke tests verify the live system responds correctly. They hit the real running application — no test servers, no mocks, no in-memory substitutes.
 
-
+#### What Smoke Tests Cover
 
-- **
-- **Main entry points**: Do the primary routes/endpoints/commands
-- **Core dependencies**:
-- **Happy path**: One simple request through the main flow — does it complete?
+- **Liveness**: Does the running app respond to requests?
+- **Main entry points**: Do the primary routes/endpoints/commands return non-error responses?
+- **Core dependencies**: Does the app actually talk to its database, APIs, etc.? (Verify with a request that exercises a real dependency path)
+- **Happy path**: One simple request through the main flow — does it complete end-to-end?
 
-
+#### Writing Smoke Tests
 
 Delegate to an `intuition-code-writer` subagent:
 
 ```
-You are writing smoke tests
+You are writing smoke tests against a LIVE, RUNNING system. The app is already up — you are testing it from the outside.
 
 Test framework: [detected framework from Step 2a]
 Test conventions: [naming, directory from existing tests]
+App URL / entry point: [how to reach the running system]
 
 What to test:
-- App
-- Main entry points
--
+- App responds to health/root requests
+- Main entry points return successful responses
+- At least one request that touches the database returns real data
+- One end-to-end request through the primary flow completes
 
 Rules:
--
--
--
-- Each test should take <
-- If a test fails, it means the system is broken — not that a
+- The system is ALREADY RUNNING. Tests make real requests to it.
+- NO mocks. NO in-memory databases. NO test servers. You hit the live app.
+- If a test needs data to exist, create it through the app's own API first (setup), then clean it up after (teardown).
+- Each test should take < 10 seconds.
+- If a test fails, it means the live system is broken — not that a mock is misconfigured.
 ```
 
-Run the smoke tests. If they fail, fix (Step
+Run the smoke tests. If they fail, fix (Step 8) before proceeding.
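A minimal sketch of what such smoke tests could look like, using Node's built-in `node:test` runner and global `fetch` (Node 18+). The real framework is whatever Step 2a detects; the base URL, routes, and response shapes below are invented for illustration.

```ts
// Hypothetical smoke tests in the spirit of the rules above: they hit a
// LIVE app over HTTP. No mocks, no test servers, no in-memory databases.
import { test } from "node:test";
import assert from "node:assert/strict";

const BASE_URL = process.env.APP_URL ?? "http://localhost:8000";

test("liveness: app responds at the root", async () => {
  const res = await fetch(BASE_URL + "/");
  assert.ok(res.ok, `expected 2xx, got ${res.status}`);
});

test("core dependency: a real query returns real data", async () => {
  // Exercises a route that must touch the database to answer.
  const res = await fetch(BASE_URL + "/api/coverage-gaps?date=today");
  assert.equal(res.status, 200);
  const body = await res.json();
  assert.ok(Array.isArray(body.gaps), "expected a gaps array from the DB");
});

test("happy path: create through the app's own API, then clean up", async () => {
  const created = await fetch(BASE_URL + "/api/shifts", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ role: "nurse", start: "08:00", end: "16:00" }),
  });
  assert.equal(created.status, 201);
  const { id } = await created.json();
  // Teardown through the same public API, never by touching the DB directly.
  const deleted = await fetch(`${BASE_URL}/api/shifts/${id}`, { method: "DELETE" });
  assert.ok(deleted.ok);
});
```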
 
-
+### STEP 7: EXPERIENCE SLICE TESTS
 
-These are the highest-value tests
+These are the highest-value tests. They walk through each stakeholder's journey as defined in the compose phase and verify the live system delivers the experience end-to-end.
 
-
+#### Deriving Tests from Experience Slices
 
 Read `tasks.json` and extract the experience slices. For each slice that involves code behavior:
 
 - **What triggers it**: The test setup
-- **What the stakeholder does**: The test actions
+- **What the stakeholder does**: The test actions (real API calls to the live system)
 - **What should happen**: The test assertions (from acceptance criteria)
 
-
+#### Writing Experience Slice Tests
 
 Delegate to an `intuition-code-writer` subagent:
 
 ```
-You are writing experience-slice tests. These tests verify that stakeholder journeys work end-to-end
+You are writing experience-slice tests against a LIVE, RUNNING system. These tests verify that stakeholder journeys work end-to-end on the real application.
 
 Test framework: [detected framework]
 Test conventions: [from existing tests]
+App URL / entry point: [how to reach the running system]
 
 ## Experience Slices to Test
 
@@ -187,22 +295,23 @@ Journey: [trigger → action → expected outcome]
 Acceptance criteria: [from tasks.json]
 
 ## Rules
--
--
--
--
+- The system is ALREADY RUNNING. Tests make real requests to it.
+- NO mocks of any kind. The app, database, and services are all live.
+- Test the journey from the stakeholder's perspective using real entry points (HTTP routes, CLI commands, public APIs).
+- If a test needs data, create it through the app's API first (setup), clean up after (teardown).
+- Assert against acceptance criteria from the spec, not implementation details.
 - Each test should tell a story: "the admin does X, the system does Y, the result is Z"
-- If a slice requires UI interaction you can't automate, test the API layer that backs it
-- Do NOT read source code to determine expected behavior — the spec defines what should happen
+- If a slice requires UI interaction you can't automate, test the API layer that backs it.
+- Do NOT read source code to determine expected behavior — the spec defines what should happen.
 
 ## Spec Sources (read these for expected behavior)
 - Discovery brief: {context_path}/discovery_brief.md
 - Tasks: {context_path}/tasks.json
 ```
 
-Run the experience slice tests. Classify and fix failures (Step
+Run the experience slice tests. Classify and fix failures (Step 8).
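For comparison with the smoke-test sketch earlier, here is one hypothetical experience-slice test: a single stakeholder journey run against the live system. The slice, routes, and fields are invented; the assertions come from acceptance criteria, not from reading source code.

```ts
// Hypothetical experience-slice test: setup, action, and assertion all go
// through the app's own public API on the LIVE system.
import { test } from "node:test";
import assert from "node:assert/strict";

const BASE_URL = process.env.APP_URL ?? "http://localhost:8000";

test("slice: admin sees today's coverage gaps at a glance", async () => {
  // Setup: create an understaffed shift through the app's own API.
  const shift = await fetch(`${BASE_URL}/api/shifts`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ role: "nurse", required: 3, assigned: 1 }),
  }).then((r) => r.json());

  // Action: the admin loads the dashboard data.
  const res = await fetch(`${BASE_URL}/api/coverage-gaps?date=today`);
  const { gaps } = await res.json();

  // Assertion: the gap is visible, per the acceptance criterion.
  assert.ok(gaps.some((g: { shiftId: string }) => g.shiftId === shift.id));

  // Teardown: clean up through the same public API.
  await fetch(`${BASE_URL}/api/shifts/${shift.id}`, { method: "DELETE" });
});
```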
 
-
+### STEP 8: FIX CYCLE
 
 For each failure, classify:
 
@@ -212,45 +321,47 @@ For each failure, classify:
 | **Missing dependency** | Install via Bash |
 | **Implementation bug, simple** (1-3 lines, spec is clear) | Fix via `intuition-code-writer` |
 | **Implementation bug, complex** (multi-file, architectural) | Escalate to user |
+| **Environment/config issue** (service not reachable, credentials wrong) | Help user diagnose and fix |
 | **Spec violation** (code disagrees with spec) | Escalate: "Spec says X, code does Y" |
 | **Test regression** (existing test broke) | Diagnose: is the test outdated or the new code wrong? Escalate if ambiguous |
 | **Violates user decision** | STOP — escalate immediately |
 
-
+#### Fix Process
 
 1. Classify the failure
 2. If fixable: delegate fix to `intuition-code-writer`
-3.
-4.
-5.
+3. If environment/config: work with user to resolve
+4. Re-run the failing test against the live system
+5. Max 3 fix cycles per failure — then escalate
+6. After all failures addressed, run FULL test suite one final time
 
-
+### STEP 9: FINAL VERIFICATION
 
-After all tests pass
+After all tests pass against the live system, check against the discovery brief:
 
-**North Star check**:
--
+**North Star check**: Walk through the brief's North Star statement. For each stakeholder:
+- Can they do what the brief says they should be able to do — on the live system?
 - Does the system honor the constraints?
 - Would this satisfy the North Star as written?
 
-If something drifts, flag it
+If something drifts, flag it: "Tests pass, but [specific concern about North Star alignment]."
 
-**Update `docs/project_notes/project_map.md`** if integration or testing revealed anything new
+**Update `docs/project_notes/project_map.md`** if integration or testing revealed anything new.
 
-
+### STEP 10: EXIT
 
-**Update state.** Read `.project-memory-state.json`. Target active context. Set: `status` → `"complete"`, `workflow.verify.completed` → `true`, `workflow.verify.completed_at` → current ISO timestamp. Set on root: `last_handoff` → current ISO timestamp, `last_handoff_transition` → `"verify_to_complete"`.
+**Update state.** Read `.project-memory-state.json`. Target active context. Set: `status` → `"complete"`, `workflow.verify.completed` → `true`, `workflow.verify.completed_at` → current ISO timestamp. Set on root: `last_handoff` → current ISO timestamp, `last_handoff_transition` → `"verify_to_complete"`. Write back.
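A sketch of that state update as code, assuming a JSON shape inferred from the field paths named in the step; the `contexts`/`active_context` accessors are assumptions and the real file may differ.

```ts
// Hypothetical sketch of the Step 10 state update: read, modify, write back.
import { readFileSync, writeFileSync } from "node:fs";

const path = ".project-memory-state.json";
const state = JSON.parse(readFileSync(path, "utf8"));
const now = new Date().toISOString();

// Target the active context (accessor shape is an assumption).
const ctx = state.contexts[state.active_context];
ctx.status = "complete";
ctx.workflow.verify.completed = true;
ctx.workflow.verify.completed_at = now;

// Root-level handoff markers.
state.last_handoff = now;
state.last_handoff_transition = "verify_to_complete";

// Write back.
writeFileSync(path, JSON.stringify(state, null, 2) + "\n");
```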
 
 **Present results** via AskUserQuestion:
 
 ```
-Question: "Verification complete.
+Question: "Verification complete — tested against the live system.
 
 **Integration**: [pass/issues]
 **Toolchain**: [builds, type-checks, lints]
 **Existing tests**: [N passed, N failed]
-**Smoke tests**: [N passed, N failed]
-**Experience slice tests**: [N passed, N failed]
+**Smoke tests (live)**: [N passed, N failed]
+**Experience slice tests (live)**: [N passed, N failed]
 **North Star alignment**: [met / concerns]
 
 [If escalated issues exist, list them]
@@ -277,13 +388,13 @@ When verifying on a branch:
 
 ## RESUME LOGIC
 
-1. If
-2. If
+1. If Phase 1 completed (system running) but tests haven't run: skip to Step 6.
+2. If tests exist but verification not complete: "Found tests from a previous session. Re-running against live system."
 3. Otherwise fresh start from Step 1.
 
 ## VOICE
 
-- **Pragmatic** — make it work, prove it works, report what happened
+- **Pragmatic** — make it work for real, prove it works for real, report what happened
 - **Evidence-driven** — every failure has a classification, every fix has a rationale
 - **Honest** — if tests pass but something feels off against the North Star, say so
 - **Concise** — status updates, not essays