@tgoodington/intuition 10.7.0 → 10.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@tgoodington/intuition",
- "version": "10.7.0",
+ "version": "10.9.0",
  "description": "Domain-adaptive workflow system for Claude Code: prompt, outline, assemble specialist teams, detail with domain experts, build with format producers, test code output. Supports v8 compat (design, engineer, build) and v9 specialist workflows with 14 domain specialists and 6 format producers.",
  "keywords": [
  "claude-code",
@@ -256,14 +256,33 @@ If no large data files are detected, skip this entirely.
  For each major decision domain identified from the prompt brief, orientation research, and dialogue:
 
  1. **Identify** the decision needed. State it clearly.
- 2. **Research** (when needed): Launch 1-2 targeted research agents via Task tool.
- - Use `intuition-researcher` for straightforward fact-gathering.
- - Use `intuition-researcher` (model override: sonnet) for trade-off analysis against the existing codebase.
- - Each agent prompt MUST reference the specific decision domain, return under 400 words.
- - Write results to `{context_path}/.outline_research/decision_[domain].md` (snake_case).
- - NEVER launch more than 2 agents simultaneously.
+ 2. **Research** (when needed): Launch research agents via Task tool. Agent count scales with the selected depth tier:
+
+ **Lightweight (1 agent):** Launch a single `intuition-researcher` agent for neutral fact-gathering.
+ "Research [decision domain] in the context of [project]. What are the viable approaches, key trade-offs, and relevant patterns in the codebase? Under 400 words."
+ Write results to `{context_path}/.outline_research/decision_[domain].md`.
+
+ **Standard (2 agents — Advocate/Challenger):** Launch 2 agents in parallel using divergent framing to combat confirmation bias.
+
+ **Agent A — Advocate** (subagent_type: `intuition-researcher`):
+ "Research [decision domain] in the context of [project]. Make the strongest case for the most natural approach given the existing codebase and constraints. What patterns already exist that support it? What makes it the obvious choice? Under 400 words."
+
+ **Agent B — Challenger** (subagent_type: `intuition-researcher`, model override: sonnet):
+ "Research [decision domain] in the context of [project]. Identify the strongest reasons NOT to take the obvious approach. What are the risks, overlooked alternatives, and counterarguments? What has gone wrong when similar projects made the default choice? Under 400 words."
+
+ Both agents MUST be launched in parallel in a single response. Write combined results to `{context_path}/.outline_research/decision_[domain].md`, clearly labeling the advocate and challenger findings.
+
+ **Comprehensive (3 agents — Advocate/Challenger/Lateral):** Launch all Standard agents plus:
+
+ **Agent C — Lateral** (subagent_type: `intuition-researcher`):
+ "Research how [decision domain] has been solved in different contexts — different frameworks, industries, or scales. What non-obvious approaches exist that the team might not have considered? Under 400 words."
+
+ - NEVER launch more than 3 agents simultaneously.
  - WAIT for all research agents to return and read their results before proceeding to step 3.
- 3. **Present** 2-3 options with trade-offs. Include your recommendation and why. Incorporate the research findings.
+
+ 3. **Synthesize and present** 2-3 options with trade-offs. Synthesis rules scale with tier:
+ - **Lightweight**: Present findings directly with your recommendation.
+ - **Standard/Comprehensive**: You MUST incorporate findings from BOTH the advocate and challenger before forming your recommendation. If advocate and challenger agree, note that convergence — it strengthens confidence. If they disagree, the tension between them should directly shape the options you present. Do not simply pick the advocate's position.
  4. **Ask** the user to select via AskUserQuestion.
  5. **Record** the resolved decision to `{context_path}/.outline_research/decisions_log.md`:
 
@@ -658,14 +677,26 @@ Launch 2 `intuition-researcher` agents in parallel via Task tool. See Phase 1, S
 
  ## Tier 2: Decision Research (launched on demand in Phase 3)
 
- Launch 1-2 agents per decision domain when dialogue reveals unknowns needing investigation.
+ Agent count scales with depth tier to balance token cost against confirmation bias risk.
+
+ **Lightweight (1 agent):**
+ - Single `intuition-researcher` for neutral fact-gathering. Low-stakes decisions don't justify the overhead of divergent framing.
+
+ **Standard (2 agents — Advocate/Challenger):**
+ - **Agent A — Advocate** (`intuition-researcher`): Makes the strongest case FOR the natural/obvious approach. Looks for supporting patterns in the codebase.
+ - **Agent B — Challenger** (`intuition-researcher`, model: sonnet): Makes the strongest case AGAINST the obvious approach. Surfaces risks, alternatives, counterarguments.
+
+ **Comprehensive (3 agents — Advocate/Challenger/Lateral):**
+ - Advocate and Challenger as above, plus:
+ - **Agent C — Lateral** (`intuition-researcher`): Explores how the problem has been solved in different contexts, frameworks, or industries.
 
- - Use `intuition-researcher` agents for fact-gathering (e.g., "What testing framework does this project use?").
- - Use `intuition-researcher` agents (model override: sonnet) for trade-off analysis (e.g., "Compare approaches X and Y given the current architecture").
+ Rules:
+ - Standard/Comprehensive: advocate and challenger MUST be launched in parallel in a single response.
  - Each prompt MUST specify the decision domain and a 400-word limit.
  - Reference specific files or directories when possible.
- - Write results to `{context_path}/.outline_research/decision_[domain].md`.
- - NEVER launch more than 2 simultaneously.
+ - Write results to `{context_path}/.outline_research/decision_[domain].md`. Standard/Comprehensive: label advocate and challenger sections.
+ - NEVER launch more than 3 simultaneously.
+ - Standard/Comprehensive synthesis MUST incorporate tension between advocate and challenger findings. If you find yourself ignoring the challenger, stop and re-read its findings.
 
  # CONTEXT MANAGEMENT
 
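Both tiers write to the same `decision_[domain].md` path, and the earlier version of this rule called out snake_case explicitly. A tiny illustrative helper (the function name is invented, and the slug rule is an assumption beyond "snake_case"):

```python
import re


# Illustrative helper for the decision_[domain].md naming convention above.
# decision_path is a hypothetical name, not part of the package.
def decision_path(context_path: str, domain: str) -> str:
    """Build the snake_case per-domain research file path used by both tiers."""
    slug = re.sub(r"[^a-z0-9]+", "_", domain.lower()).strip("_")
    return f"{context_path}/.outline_research/decision_{slug}.md"
```

For example, a domain named "Auth Strategy" would land at `.outline_research/decision_auth_strategy.md`.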
@@ -92,17 +92,22 @@ This is the core of the skill. Each turn targets ONE gap using a dependency-orde
  ### Refinement Order
 
  ```
- 1. SCOPE → What is IN and what is OUT?
- 2. INTENT → What does the end-user experience when this is done and working?
- 3. SUCCESS → How do you know it worked? What's observable/testable?
- 4. CONSTRAINTS → What can't change? Technology, team, timeline, budget?
- 5. ASSUMPTIONS → What are we taking as given? How confident are we?
+ 1. SCOPE → What is IN and what is OUT? Broad boundaries, not itemized feature lists.
+ 2. INTENT → What does "done" look and feel like? The experiential outcome.
+ 3. BOUNDARIES → What's fixed and what's flexible? Hard constraints and key givens.
  ```
 
+ The prompt phase paints in broad strokes. Detailed success criteria, testable outcomes, assumption confidence ratings, and architectural constraints belong in outline — not here.
+
+ **SCOPE sets the playing field** — enough to know what's in-bounds and out-of-bounds, not a requirements list.
+
  **INTENT captures the experiential outcome** — not metrics, but feel:
  - What the end-user experiences when interacting with the finished product
  - What the output/interface looks like and feels like in practice
  - Non-negotiable experiential qualities (fast, simple, invisible, delightful, etc.)
+ - What "done" looks like at a high level — outline will sharpen this into testable criteria
+
+ **BOUNDARIES merge constraints and assumptions into one pass** — what can't change, what we're taking as given, what's flexible. No confidence ratings or detailed analysis — just the lay of the land.
 
  INTENT grounds the brief in what success *feels like*, which downstream phases use to distinguish user-facing decisions from technical internals.
 
@@ -111,33 +116,46 @@ INTENT grounds the brief in what success *feels like*, which downstream phases u
  Before each question, run this internal check:
 
  ```
- Is SCOPE clear enough to plan against?
- NO → Ask a scope question
- YES → Is INTENT defined (experiential outcome, look/feel, non-negotiable qualities)?
+ Is SCOPE clear enough to know what's in-bounds and out-of-bounds?
+ NO → Ask a scope question (broad boundaries, not feature lists)
+ YES → Is INTENT defined (experiential outcome, look/feel, what "done" looks like)?
  NO → Ask an intent question
- YES → Is SUCCESS defined with observable criteria?
- NO → Ask a success criteria question
- YES → Are binding CONSTRAINTS surfaced?
- NO → Ask a constraints question
- YES → Are key ASSUMPTIONS identified?
- NO → Ask an assumptions question
- YES → Move to REFLECT
+ YES → Are BOUNDARIES clear (hard constraints, key givens, what's flexible)?
+ NO → Ask a boundaries question
+ YES → Move to REFLECT
  ```
 
  If the user's initial CAPTURE response already covers some dimensions, skip them. Do not ask about what's already clear.
 
+ **Stay at vision altitude.** If you catch yourself asking about specific metrics, implementation details, or testable acceptance criteria — stop. That's outline's job. The prompt phase asks "what are we building and why does it matter?" not "how will we verify it works?"
+
  ### Question Crafting Rules
 
  Every question in REFINE follows these principles:
 
  **Derive from their words.** Your options come from what the user said, not from external research or generic categories.
 
- **ANTI-PATTERN: Always presenting 3 options.** This is the most common failure mode. You must count the actual distinct possibilities the user's words imply — sometimes that's 2, sometimes 5, sometimes 6. Three is not a safe default; it's a lazy one. Before presenting options, explicitly list the distinct possibilities you've identified and use ALL of them.
+ **MANDATORY OPTION ENUMERATION — execute this before EVERY AskUserQuestion call:**
+
+ ```
+ STEP 1: In your thinking, brainstorm EVERY distinct possibility the user's
+ words and context imply. Write them all out. Do not stop at 3.
+ STEP 2: Count them. If you have exactly 3, you are almost certainly anchored.
+ Go back to Step 1 and ask: "What am I collapsing together that is
+ actually two distinct things? What possibility am I forgetting?"
+ STEP 3: Only proceed when your count genuinely reflects the decision space.
+ 2 is fine. 4 is fine. 7 is fine. 3 is fine ONLY if you passed Step 2
+ and confirmed the space truly has exactly 3 distinct options.
+ STEP 4: Use ALL the possibilities from your enumeration as AskUserQuestion
+ options. Do not trim to fit a round number.
+ ```
+
+ If you skip this procedure or present 3 options without passing Step 2, the protocol has failed.
 
  Examples at different scales:
 
  - 2 options: "You said 'handle transfers' — does that mean (a) bulk migration when someone leaves, or (b) real-time co-ownership?"
- - 3 options: "You mentioned 'fast' — is that (a) sub-second response times, (b) same-day turnaround, or (c) perceived speed through progressive loading?"
+ - 3 options (verified): "You mentioned 'fast' — is that (a) sub-second response times, (b) same-day turnaround, or (c) perceived speed through progressive loading?"
  - 4 options: "The notification system could be (a) email-only, (b) in-app real-time, (c) digest-based batching, or (d) user-configured per event type."
  - 5 options: "For auth you've got (a) email/password, (b) SSO via existing provider, (c) magic links, (d) OAuth social login, or (e) passkeys."
  - 6+ options: "The dashboard layout could be (a) single KPI grid, (b) tabbed by department, (c) role-based views, (d) customizable drag-and-drop, (e) narrative/report style, or (f) a combined feed with filters."
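The enumeration procedure above is essentially a guard against anchoring on three options. A minimal sketch of that check, under the assumption that the Step 2 re-check can be represented as a boolean (`check_enumeration` is an invented name; the real procedure happens in the model's reasoning, not in code):

```python
# Illustrative sketch of the Mandatory Option Enumeration guard above.
def check_enumeration(options: list[str], verified_three: bool = False) -> list[str]:
    """Pass options through only when the count reflects the decision space.

    Exactly 3 options is allowed only after the Step 2 re-check confirmed
    the space truly has three distinct possibilities (verified_three=True).
    """
    if len(options) == 3 and not verified_three:
        raise ValueError("3 options without Step 2 verification: likely anchored")
    return options
```

Counts of 2, 4, or 7 pass without ceremony; only an unverified 3 trips the guard, mirroring the protocol's "3 is fine ONLY if you passed Step 2" rule.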
@@ -155,9 +173,9 @@ Always include a trailing "or something else entirely?" when the space might be
  **Aggressive skip rule:** After CAPTURE, check each dimension. If the user's initial response provides a clear, actionable answer for a dimension, mark it satisfied and skip it entirely. Do not ask confirmatory questions for dimensions that are already clear — that's ceremony, not refinement.
 
  **Convergence triggers — move to REFLECT when ANY of these are true:**
- - All 5 dimensions are satisfied (even if that's after 1 turn of REFINE)
- - You've asked 3+ REFINE questions and remaining gaps are minor enough to flag as open questions
- - By turn 3-4 you should be asking about what the solution DOES, not what the problem IS. If you're still gathering background context after turn 4, flag remaining unknowns as open questions and move to REFLECT.
+ - All 3 dimensions are satisfied (even if that's after 1 turn of REFINE)
+ - You've asked 2+ REFINE questions and remaining gaps are minor enough to flag as open questions for outline
+ - By turn 3 you should have a clear enough picture to move toward REFLECT. If you're still gathering broad context after turn 3, flag remaining unknowns as open questions and move on.
 
  The goal is precision, not thoroughness. A 4-turn prompt session that nails the brief is better than a 9-turn session that asks about things the user already told you.
 
@@ -175,18 +193,14 @@ Question: "Here's what I've captured from our conversation:
  **Commander's Intent:**
  - Desired end state: [What success feels/looks like to the end user — experiential, not metrics]
  - Non-negotiables: [The 2-3 experiential qualities that would make the user reject the result]
- - Boundaries: [Constraints on the solution space, not prescribed solutions]
-
- **Success looks like:** [bullet list of observable outcomes]
-
- **In scope:** [list]
- **Out of scope:** [list]
 
- **Constraints:** [list]
+ **In scope:** [list — broad boundaries]
+ **Out of scope:** [list — broad boundaries]
 
- **Assumptions:** [list with confidence notes]
+ **What's fixed:** [hard constraints and key givens — brief list]
+ **What's flexible:** [areas where outline has room to explore]
 
- **Open questions for outlining:** [list]
+ **Open questions for outlining:** [list — things outline should investigate]
 
  What needs adjusting?"
 
@@ -242,35 +256,29 @@ Write the output files and route to handoff.
  ## Commander's Intent
  **Desired end state:** [What success feels/looks like to the end user — experiential, not metrics]
  **Non-negotiables:** [The 2-3 experiential qualities that would make the user reject the result]
- **Boundaries:** [Constraints on the solution space, not prescribed solutions]
-
- ## Success Criteria
- - [Observable, testable outcome 1]
- - [Observable, testable outcome 2]
- - [Observable, testable outcome 3]
 
  ## Scope
  **In scope:**
- - [Item 1]
- - [Item 2]
+ - [Broad boundary 1]
+ - [Broad boundary 2]
 
  **Out of scope:**
- - [Item 1]
- - [Item 2]
+ - [Broad boundary 1]
+ - [Broad boundary 2]
 
- ## Constraints
- - [Non-negotiable limit 1]
- - [Non-negotiable limit 2]
+ ## Boundaries
+ **What's fixed:**
+ - [Hard constraint or key given 1]
+ - [Hard constraint or key given 2]
 
- ## Key Assumptions
- | Assumption | Confidence | Basis |
- |-----------|-----------|-------|
- | [statement] | High/Med/Low | [why we believe this] |
+ **What's flexible:**
+ - [Area where outline has room to explore 1]
+ - [Area where outline has room to explore 2]
 
  ## Open Questions for Planning
- - [Build decision the outline phase should investigate]
- - [Technical unknown that affects architecture]
+ - [Thing outline should investigate or decide]
  - [Assumption that needs validation]
+ - [Area where the user was uncertain]
 
  ## Decision Posture
  | Area | Posture | Notes |
@@ -285,26 +293,23 @@ Write the output files and route to handoff.
  "summary": {
  "title": "...",
  "one_liner": "...",
- "problem_statement": "...",
- "success_criteria": "..."
+ "problem_statement": "..."
  },
  "commander_intent": {
  "desired_end_state": "...",
- "non_negotiables": ["..."],
- "boundaries": ["..."]
+ "non_negotiables": ["..."]
  },
  "scope": {
  "in": ["..."],
  "out": ["..."]
  },
- "constraints": ["..."],
- "assumptions": [
- { "assumption": "...", "confidence": "high|medium|low", "basis": "..." }
- ],
+ "boundaries": {
+ "fixed": ["..."],
+ "flexible": ["..."]
+ },
  "decision_posture": [
  { "area": "...", "posture": "i_decide|show_options|team_handles", "notes": "..." }
  ],
- "research_performed": [],
  "open_questions": ["..."]
  }
  ```
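The reshaped brief JSON can be sanity-checked structurally. A sketch, where the key names come from the diff but the guard itself (`check_brief`, the required/removed key sets) is a hypothetical helper, not part of the package:

```python
# Sketch of a structural check for the reshaped prompt-brief JSON above.
# Key names mirror the diff; the required/removed sets are inferred, not official.
REQUIRED_KEYS = {"summary", "commander_intent", "scope", "boundaries",
                 "decision_posture", "open_questions"}
REMOVED_KEYS = {"constraints", "assumptions", "research_performed"}


def check_brief(brief: dict) -> bool:
    """True when the brief matches the 10.9.0 shape shown in the diff."""
    if not REQUIRED_KEYS <= brief.keys():
        return False
    if REMOVED_KEYS & brief.keys():
        return False  # old 10.7.0 fields should be gone
    return {"fixed", "flexible"} <= brief["boundaries"].keys()


brief = {
    "summary": {"title": "...", "one_liner": "...", "problem_statement": "..."},
    "commander_intent": {"desired_end_state": "...", "non_negotiables": ["..."]},
    "scope": {"in": ["..."], "out": ["..."]},
    "boundaries": {"fixed": ["..."], "flexible": ["..."]},
    "decision_posture": [{"area": "...", "posture": "i_decide", "notes": "..."}],
    "open_questions": ["..."],
}
```

A brief that still carries the removed 10.7.0 fields (`constraints`, `assumptions`, `research_performed`) fails the check.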
@@ -366,6 +371,8 @@ These are banned. If you catch yourself doing any of these, stop and correct cou
  - Asking questions you could have asked in turn one (generic background)
  - Staying on the same sub-topic for more than 2 follow-ups when the user is uncertain — flag it as an open question and move on
  - Producing a brief with sections the outline phase doesn't consume
+ - **Drilling into implementation-level detail** — observable/testable criteria, confidence-rated assumptions, architectural specifics. The prompt phase captures vision and boundaries; outline sharpens into specifications
+ - **Presenting exactly 3 options without running the Mandatory Option Enumeration procedure** — this is the single most persistent failure mode. If you have 3 options, you MUST have verified via Step 2 that you aren't collapsing or omitting possibilities
 
  ## RESUME LOGIC