npm - mindsystem-cc - Versions diffs - 3.10.1 → 3.11.0 - Mend

mindsystem-cc 3.10.1 → 3.11.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (30)

package/README.md +1 -1
package/agents/ms-designer.md +8 -8
package/agents/ms-executor.md +14 -163
package/agents/ms-plan-checker.md +2 -3
package/agents/ms-plan-writer.md +5 -11
package/agents/ms-verify-fixer.md +1 -1
package/commands/ms/design-phase.md +12 -10
package/commands/ms/execute-phase.md +0 -9
package/commands/ms/help.md +1 -3
package/commands/ms/review-design.md +9 -5
package/commands/ms/verify-work.md +1 -1
package/mindsystem/references/design-directions.md +1 -1
package/mindsystem/references/mock-patterns.md +48 -0
package/mindsystem/references/plan-format.md +2 -129
package/mindsystem/references/scope-estimation.md +3 -3
package/mindsystem/templates/config.json +0 -2
package/mindsystem/templates/design.md +1 -1
package/mindsystem/templates/phase-prompt.md +6 -142
package/mindsystem/templates/summary.md +24 -0
package/mindsystem/workflows/execute-phase.md +4 -98
package/mindsystem/workflows/execute-plan.md +12 -517
package/mindsystem/workflows/generate-mocks.md +74 -0
package/mindsystem/workflows/mockup-generation.md +1 -1
package/mindsystem/workflows/plan-phase.md +7 -24
package/mindsystem/workflows/verify-work.md +97 -17
package/package.json +1 -1
package/scripts/__pycache__/compare_mockups.cpython-314.pyc +0 -0
package/scripts/compare_mockups.py +219 -0
package/mindsystem/references/checkpoint-detection.md +0 -50
package/mindsystem/references/checkpoints.md +0 -788

package/mindsystem/workflows/plan-phase.md CHANGED

@@ -18,7 +18,7 @@ Decimal phases enable urgent work insertion without renumbering:
 1. .planning/ROADMAP.md
 2. .planning/PROJECT.md
-**Note:** Heavy references (phase-prompt.md, plan-format.md, scope-estimation.md, checkpoints.md, goal-backward.md, plan-risk-assessment.md) are loaded by the ms-plan-writer subagent, not main context. Lighter references (checkpoint-detection.md, tdd.md) are loaded on demand during task breakdown.
+**Note:** Heavy references (phase-prompt.md, plan-format.md, scope-estimation.md, goal-backward.md, plan-risk-assessment.md) are loaded by the ms-plan-writer subagent, not main context. Lighter references (tdd.md) are loaded on demand during task breakdown.
 </required_reading>
 <purpose>
@@ -184,7 +184,6 @@ type: execute
 wave: 1               # Gap closures typically single wave
 depends_on: []        # Usually independent of each other
 files_modified: [...]
-autonomous: true
 gap_closure: true     # Flag for tracking
 ---
 ```
@@ -450,9 +449,9 @@ For each potential task, ask:
 3. **Can this run independently?** (no dependencies = Wave 1 candidate)
 **Standard tasks need:**
-- **Type**: auto, checkpoint:human-verify, checkpoint:decision (human-action rarely needed)
+- **Type**: auto
 - **Task name**: Clear, action-oriented
-- **Files**: Which files created/modified (for auto tasks)
+- **Files**: Which files created/modified
 - **Action hint**: Brief implementation guidance
 - **Verify hint**: How to prove it worked
 - **Done hint**: Acceptance criteria
@@ -480,11 +479,9 @@ Standard tasks (remain in standard plans):
 Read `~/.claude/mindsystem/references/tdd.md` now for TDD criteria and plan structure.
-**Checkpoints:** Visual/functional verification → checkpoint:human-verify. Implementation choices → checkpoint:decision. Manual action (email, 2FA) → checkpoint:human-action (rare).
+**Decisions:** If you identify a task that requires choosing between approaches (which auth provider, which database, etc.), use AskUserQuestion to resolve it now. Don't defer decisions to execution. For purely technical choices where the user hasn't expressed preference, make the decision and document it in the plan's objective.
-**Critical:** If external resource has CLI/API (Vercel, Stripe, etc.), use type="auto" to automate. Only checkpoint for verification AFTER automation.
-Read `~/.claude/mindsystem/references/checkpoint-detection.md` now for detection rules.
+**Critical:** If external resource has CLI/API (Vercel, Stripe, etc.), use type="auto" to automate.
 **User setup detection:** For tasks involving external services, identify human-required configuration:
@@ -506,7 +503,6 @@ Note external services for risk scoring.
     <type>auto</type>
     <needs>nothing</needs>
     <creates>src/models/user.ts</creates>
-    <checkpoint>false</checkpoint>
     <tdd_candidate>false</tdd_candidate>
     <action_hint>Define User type with id, email, createdAt</action_hint>
     <verify_hint>tsc --noEmit passes</verify_hint>
@@ -517,33 +513,20 @@ Note external services for risk scoring.
     <type>auto</type>
     <needs>src/models/user.ts</needs>
     <creates>src/app/api/auth/login/route.ts</creates>
-    <checkpoint>false</checkpoint>
     <tdd_candidate>true</tdd_candidate>
     <action_hint>POST endpoint with bcrypt validation</action_hint>
     <verify_hint>curl returns 200 with valid credentials</verify_hint>
     <done_hint>Login works with valid credentials</done_hint>
   </task>
-  <task id="3">
-    <name>Verify login flow</name>
-    <type>checkpoint:human-verify</type>
-    <needs>src/app/api/auth/login/route.ts</needs>
-    <creates>nothing</creates>
-    <checkpoint>true</checkpoint>
-    <tdd_candidate>false</tdd_candidate>
-    <action_hint>N/A</action_hint>
-    <verify_hint>User tests login manually</verify_hint>
-    <done_hint>User approves login flow</done_hint>
-  </task>
 </task_list>
 ```
 Each task captures:
 - `id`: Sequential identifier
 - `name`: Action-oriented task name
-- `type`: auto, checkpoint:human-verify, checkpoint:decision, checkpoint:human-action
+- `type`: auto
 - `needs`: Files/types this task requires (or "nothing")
 - `creates`: Files/types this task produces (or "nothing")
-- `checkpoint`: true if requires user interaction
 - `tdd_candidate`: true if should be TDD plan
 - `action_hint`: Brief implementation guidance (subagent expands)
 - `verify_hint`: How to verify completion
@@ -790,7 +773,7 @@ Tasks are instructions for Claude, not Jira tickets.
 - [ ] Tasks identified with needs/creates dependencies
 - [ ] Task list handed off to ms-plan-writer
 - [ ] PLAN file(s) created by subagent with XML structure
-- [ ] Each plan: depends_on, files_modified, autonomous in frontmatter
+- [ ] Each plan: depends_on, files_modified in frontmatter
 - [ ] Each plan: must_haves derived (truths, artifacts, key_links)
 - [ ] Each plan: 2-3 tasks (~50% context)
 - [ ] Wave structure maximizes parallelism
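
The task list in this file records `needs` and `creates` for each task, and the workflow treats a task with no dependencies as a Wave 1 candidate. The following is a minimal sketch of how waves could be derived from those two fields. It is not code from the package: the `Task` class and `assign_waves` function are hypothetical names, and an empty list stands in for the `nothing` value used in the XML.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for one <task> entry in the task list."""
    id: str
    needs: list[str] = field(default_factory=list)    # [] plays the role of "nothing"
    creates: list[str] = field(default_factory=list)


def assign_waves(tasks: list[Task]) -> dict[str, int]:
    """Map each task id to the earliest wave in which all of its needs exist."""
    # Which task produces each file; files nobody produces are assumed to exist already.
    producer = {path: task.id for task in tasks for path in task.creates}
    waves: dict[str, int] = {}
    pending = list(tasks)
    wave = 1
    while pending:
        # A task is ready when every needed file is pre-existing, self-produced,
        # or produced by a task placed in an earlier wave.
        ready = [
            t for t in pending
            if all(producer.get(n) in (None, t.id) or producer[n] in waves
                   for n in t.needs)
        ]
        if not ready:
            stuck = ", ".join(t.id for t in pending)
            raise ValueError(f"dependency cycle among tasks: {stuck}")
        for t in ready:
            waves[t.id] = wave
        pending = [t for t in pending if t.id not in waves]
        wave += 1
    return waves


# The two tasks that remain in the example task list after this change.
example = [
    Task("1", needs=[], creates=["src/models/user.ts"]),
    Task("2", needs=["src/models/user.ts"],
         creates=["src/app/api/auth/login/route.ts"]),
]
print(assign_waves(example))  # {'1': 1, '2': 2}
```

Placing every ready task in the earliest possible wave is one way to read "Wave structure maximizes parallelism"; the actual ms-plan-writer subagent may group tasks differently.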

package/mindsystem/workflows/verify-work.md CHANGED

@@ -139,25 +139,63 @@ Skip internal/non-observable items (refactors, type changes, etc.).
 </step>
 <step name="classify_tests">
-**Classify all tests by mock requirements:**
+**Classify all tests by mock requirements using two-tier analysis:**
-For each test, analyze the expected behavior to determine:
+For each test, determine:
 1. **mock_required**: Does this need special backend state?
-2. **mock_type**: Freeform string describing needed state (e.g., "error_state", "premium_user", "empty_response")
+2. **mock_type**: Classification (e.g., "transient_state", "external_data", "error_state", "premium_user", "empty_response")
 3. **dependencies**: Other tests this depends on (infer from descriptions)
-**Classification heuristics:**
+**Tier 1: SUMMARY.md mock_hints (primary)**
+Check if SUMMARY.md files contain mock_hints frontmatter:
+```bash
+grep -l "mock_hints:" "$PHASE_DIR"/*-SUMMARY.md 2>/dev/null
+```
+If found, check the value:
+- **`mock_hints: none`** → Short-circuit: all tests start as `mock_required: false`. Apply keyword heuristics only as safety net (scan test descriptions for obvious mock signals like "error", "loading", "empty").
+- **`mock_hints` with content** → Parse and match tests against hints:
+  - Test relates to a `transient_states` entry → `mock_type: "transient_state"`, `mock_reason: "[from hint]"`
+  - Test relates to an `external_data` entry → `mock_type: "external_data"`, `needs_user_confirmation: true`
+  - Test doesn't match any hint → apply keyword heuristics below
+**Tier 2: Inline classification (fallback for legacy summaries)**
+When no `mock_hints` key found in any SUMMARY.md (legacy summaries written before executor populated this field):
+Classify in main context using the two-question framework:
+1. **Is the observable state transient?** — Does it appear briefly during async operations? (loading skeleton, spinner, transition animation). If YES → `mock_type: "transient_state"`
+2. **Does the test depend on external data?** — Does the feature fetch from an API, database, or external service? Would the test fail without specific data existing? If YES → `mock_type: "external_data"`, `needs_user_confirmation: true`
+Reason over SUMMARY.md content (accomplishments, files created/modified, decisions) to answer these questions. Supplement with keyword heuristics:
 | Expected behavior contains | Likely mock_type |
 |---------------------------|------------------|
-| "error", "fails", "invalid" | error_state |
+| "error", "fails", "invalid", "retry" | error_state |
 | "premium", "pro", "paid", "subscription" | premium_user |
 | "empty", "no results", "placeholder" | empty_response |
-| "loading", "spinner", "skeleton" | loading_state |
+| "loading", "spinner", "skeleton" | transient_state |
 | "offline", "no connection" | offline_state |
 | Normal happy path | no mock needed |
-**Dependency inference:**
+For tests that remain genuinely uncertain after both the two-question framework and keyword heuristics, present them via AskUserQuestion grouped by uncertainty:
+```
+questions:
+  - question: "Does [test name] require mock data or a special app state to test?"
+    header: "Mock needed?"
+    options:
+      - label: "No mock needed"
+        description: "Can test with real/local data"
+      - label: "Needs mock"
+        description: "Requires simulated state or data"
+```
+**Dependency inference (both tiers):**
 - "Reply to comment" depends on "View comments"
 - "Delete account" depends on "Login"
 - Tests mentioning prior state depend on tests that create that state
@@ -173,12 +211,21 @@ tests:
   - name: "Login error message"
     mock_required: true
     mock_type: "error_state"
+    mock_reason: "error response from auth endpoint"
     dependencies: ["login_flow"]
-  - name: "Premium badge display"
+  - name: "Recipe list loading skeleton"
     mock_required: true
-    mock_type: "premium_user"
-    dependencies: ["login_flow"]
+    mock_type: "transient_state"
+    mock_reason: "loading skeleton during recipe fetch — async, resolves in <1s"
+    dependencies: []
+  - name: "View recipe list"
+    mock_required: true
+    mock_type: "external_data"
+    mock_reason: "recipe items from /api/recipes"
+    needs_user_confirmation: true
+    dependencies: []
 ```
 </step>
@@ -189,10 +236,33 @@ tests:
 **Rules:**
 1. Group by mock_type (tests needing same mock state go together)
-2. Respect dependencies (if B depends on A, A must be in same or earlier batch)
-3. Max 4 tests per batch (AskUserQuestion limit)
-4. No-mock tests first (run before any mock setup)
-5. Order mock states logically: success → error → empty → loading
+2. **User confirmation for external_data tests:** Before batching, collect all tests with `needs_user_confirmation: true`, grouped by data source. Present via AskUserQuestion:
+```
+questions:
+  - question: "Do you have [data_type] data from [source] locally?"
+    header: "[data_type]"
+    options:
+      - label: "Yes, data exists"
+        description: "I have [data_type] in my local environment"
+      - label: "No, needs mock"
+        description: "I need this data mocked for testing"
+      - label: "Skip these tests"
+        description: "Log as assumptions and move on"
+    multiSelect: false
+```
+Handle responses:
+- "Yes, data exists" → reclassify affected tests as `mock_required: false`
+- "No, needs mock" → keep as `mock_required: true`, `mock_type: "external_data"`
+- "Skip these tests" → mark all affected tests as `skipped`
+Group by data source (not per-test) to stay within AskUserQuestion's 4-question limit.
+3. **Separate transient_state batch:** Transient states use a different mock strategy (delay/force) than data mocks. Give them their own batch.
+4. Respect dependencies (if B depends on A, A must be in same or earlier batch)
+5. Max 4 tests per batch (AskUserQuestion limit)
+6. Batch ordering: no-mock → external_data → error_state → empty_response → transient_state → premium_user → offline_state
 **Batch structure:**
 ```yaml
@@ -203,14 +273,24 @@ batches:
     tests: [1, 2, 3]
   - batch: 2
+    name: "External Data"
+    mock_type: "external_data"
+    tests: [4, 5]
+  - batch: 3
     name: "Error States"
     mock_type: "error_state"
-    tests: [4, 5, 6, 7]
+    tests: [6, 7, 8]
-  - batch: 3
+  - batch: 4
+    name: "Transient States"
+    mock_type: "transient_state"
+    tests: [9, 10]
+  - batch: 5
     name: "Premium Features"
     mock_type: "premium_user"
-    tests: [8, 9, 10]
+    tests: [11, 12]
 ```
 </step>
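
The batching rules above combine three mechanical constraints: group by `mock_type`, cap each batch at 4 tests, and follow the stated batch ordering. A minimal sketch of those three rules follows. It is an assumption-laden illustration, not the package's implementation: `build_batches` is a hypothetical name, `None` stands for no-mock tests, and dependency ordering and the `external_data` user confirmation step are deliberately left out.

```python
# Ordering taken from rule 6; None represents tests that need no mock.
BATCH_ORDER = [
    None,
    "external_data",
    "error_state",
    "empty_response",
    "transient_state",
    "premium_user",
    "offline_state",
]
MAX_PER_BATCH = 4  # AskUserQuestion limit stated in rule 5


def build_batches(tests: list[dict]) -> list[dict]:
    """Group tests by mock_type in BATCH_ORDER and split groups larger than 4."""
    # Mock types not named in rule 6 are appended at the end (a choice made for this sketch).
    unknown = sorted({t.get("mock_type") for t in tests} - set(BATCH_ORDER))
    batches: list[dict] = []
    for mock_type in BATCH_ORDER + unknown:
        group = [t["name"] for t in tests if t.get("mock_type") == mock_type]
        for start in range(0, len(group), MAX_PER_BATCH):
            batches.append({
                "batch": len(batches) + 1,
                "mock_type": mock_type,
                "tests": group[start:start + MAX_PER_BATCH],
            })
    return batches


sample = [{"name": f"happy_{i}"} for i in range(5)] + [
    {"name": "skeleton", "mock_type": "transient_state"},
    {"name": "login_error", "mock_type": "error_state"},
]
for batch in build_batches(sample):
    print(batch["batch"], batch["mock_type"], batch["tests"])
```

With this sample, the five no-mock tests split into batches of four and one, followed by the error-state batch and then the transient-state batch.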

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "mindsystem-cc",
-  "version": "3.10.1",
+  "version": "3.11.0",
   "description": "A meta-prompting, context engineering and spec-driven development system for Claude Code by TÂCHES.",
   "bin": {
     "mindsystem-cc": "bin/install.js"

package/scripts/__pycache__/compare_mockups.cpython-314.pyc ADDED

Binary file

package/scripts/compare_mockups.py ADDED

@@ -0,0 +1,219 @@
+#!/usr/bin/env python3
+# /// script
+# requires-python = ">=3.10"
+# ///
+"""Combine variant-*.html mockup files into a single side-by-side comparison page."""
+import re
+import sys
+from pathlib import Path
+def extract_title(html: str) -> str:
+    """Extract the content of the <title> tag from an HTML string."""
+    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE)
+    return match.group(1) if match else "Untitled"
+def is_device_mockup(html: str) -> bool:
+    """Detect if the HTML is a phone-frame mockup (has a fixed-size .device element)."""
+    return bool(re.search(r"\.device\s*\{", html))
+def build_device_html(panels: list[tuple[str, str]]) -> str:
+    """Side-by-side layout for phone-frame mockups."""
+    panels_html = ""
+    for label, filename in panels:
+        panels_html += f"""
+      <div class="panel">
+        <h2>{label}</h2>
+        <iframe src="{filename}"></iframe>
+      </div>"""
+    return f"""\
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="UTF-8">
+<title>Mockup Comparison</title>
+<style>
+  * {{ margin: 0; padding: 0; box-sizing: border-box; }}
+  body {{
+    background: #1a1a1a;
+    color: #fff;
+    font-family: -apple-system, BlinkMacSystemFont, "Helvetica Neue", sans-serif;
+    padding: 24px;
+  }}
+  h1 {{
+    text-align: center;
+    margin-bottom: 24px;
+    font-size: 24px;
+    font-weight: 600;
+  }}
+  .container {{
+    display: flex;
+    justify-content: center;
+    gap: 32px;
+    flex-wrap: wrap;
+  }}
+  .panel {{
+    display: flex;
+    flex-direction: column;
+    align-items: center;
+    gap: 12px;
+  }}
+  .panel h2 {{
+    font-size: 16px;
+    font-weight: 500;
+    color: #ccc;
+  }}
+  iframe {{
+    border: none;
+    border-radius: 8px;
+    background: #e5e5e5;
+    width: 450px;
+    height: 920px;
+  }}
+</style>
+</head>
+<body>
+  <h1>Mockup Comparison</h1>
+  <div class="container">{panels_html}
+  </div>
+</body>
+</html>
+"""
+def build_fluid_html(panels: list[tuple[str, str]]) -> str:
+    """Tabbed layout for full web page mockups."""
+    tabs_html = ""
+    panes_html = ""
+    for i, (label, filename) in enumerate(panels):
+        active = " active" if i == 0 else ""
+        tabs_html += f"""
+        <button class="tab{active}" data-index="{i}">{label}</button>"""
+        panes_html += f"""
+      <iframe class="pane{active}" data-index="{i}" src="{filename}"></iframe>"""
+    return f"""\
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="UTF-8">
+<title>Mockup Comparison</title>
+<style>
+  * {{ margin: 0; padding: 0; box-sizing: border-box; }}
+  html, body {{ height: 100%; }}
+  body {{
+    background: #1a1a1a;
+    color: #fff;
+    font-family: -apple-system, BlinkMacSystemFont, "Helvetica Neue", sans-serif;
+    display: flex;
+    flex-direction: column;
+  }}
+  .tab-bar {{
+    display: flex;
+    gap: 4px;
+    padding: 16px 24px 0;
+    flex-shrink: 0;
+  }}
+  .tab {{
+    padding: 10px 24px;
+    border: none;
+    border-radius: 8px 8px 0 0;
+    background: #2a2a2a;
+    color: #888;
+    font-size: 14px;
+    font-weight: 500;
+    font-family: inherit;
+    cursor: pointer;
+    transition: background 0.15s, color 0.15s;
+  }}
+  .tab:hover {{
+    background: #333;
+    color: #bbb;
+  }}
+  .tab.active {{
+    background: #e5e5e5;
+    color: #111;
+  }}
+  .pane-container {{
+    flex: 1;
+    padding: 0 24px 24px;
+    min-height: 0;
+  }}
+  .pane {{
+    display: none;
+    width: 100%;
+    height: 100%;
+    border: none;
+    border-radius: 0 8px 8px 8px;
+    background: #e5e5e5;
+  }}
+  .pane.active {{
+    display: block;
+  }}
+</style>
+</head>
+<body>
+  <div class="tab-bar">{tabs_html}
+  </div>
+  <div class="pane-container">{panes_html}
+  </div>
+  <script>
+    document.querySelectorAll('.tab').forEach(tab => {{
+      tab.addEventListener('click', () => {{
+        const idx = tab.dataset.index;
+        document.querySelectorAll('.tab').forEach(t => t.classList.remove('active'));
+        document.querySelectorAll('.pane').forEach(p => p.classList.remove('active'));
+        tab.classList.add('active');
+        document.querySelector('.pane[data-index="' + idx + '"]').classList.add('active');
+      }});
+    }});
+  </script>
+</body>
+</html>
+"""
+def main() -> None:
+    if len(sys.argv) > 1:
+        mockups_dir = Path(sys.argv[1])
+    else:
+        mockups_dir = Path.cwd()
+    variants = sorted(mockups_dir.glob("variant-*.html"))
+    if not variants:
+        print(f"No variant-*.html files found in {mockups_dir}", file=sys.stderr)
+        sys.exit(1)
+    print(f"Found {len(variants)} variants: {', '.join(v.name for v in variants)}")
+    # Read variants and detect layout mode
+    panels: list[tuple[str, str]] = []
+    all_device = True
+    for path in variants:
+        html = path.read_text()
+        letter = path.stem.split("-")[-1].upper()
+        title = extract_title(html)
+        label = f"Variant {letter}: {title}"
+        if not is_device_mockup(html):
+            all_device = False
+        panels.append((label, path.name))
+    mode = "device" if all_device else "fluid"
+    print(f"Layout mode: {mode}")
+    if mode == "device":
+        comparison = build_device_html(panels)
+    else:
+        comparison = build_fluid_html(panels)
+    output = mockups_dir / "comparison.html"
+    output.write_text(comparison)
+    print(f"Generated: {output}")
+if __name__ == "__main__":
+    main()
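
To see how the script's helpers behave on small inputs, the sketch below copies `extract_title` and `is_device_mockup` from the added file and wraps the label logic from `main()` in a helper. The name `label_for` is hypothetical and exists only for this illustration, as are the sample HTML strings.

```python
import re
from pathlib import Path


def extract_title(html: str) -> str:
    """Same regex as the added script: first <title>...</title> on a single line."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE)
    return match.group(1) if match else "Untitled"


def is_device_mockup(html: str) -> bool:
    """Same check as the added script: a `.device {` CSS rule is present."""
    return bool(re.search(r"\.device\s*\{", html))


def label_for(path: Path, html: str) -> str:
    """Hypothetical helper mirroring how main() builds each panel label."""
    letter = path.stem.split("-")[-1].upper()
    return f"Variant {letter}: {extract_title(html)}"


print(label_for(Path("variant-a.html"), "<title>Warm minimal</title>"))
# Variant A: Warm minimal
print(is_device_mockup("<style>.device { width: 390px; }</style>"))  # True
print(is_device_mockup("<style>body { margin: 0; }</style>"))        # False
# "." does not match newlines without re.DOTALL, so a title split
# across lines falls back to the default.
print(extract_title("<title>\nHome\n</title>"))  # Untitled
```

Because `main()` selects the device layout only when every variant passes `is_device_mockup`, a single variant without a `.device` rule switches the whole comparison page to the tabbed fluid layout.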

package/mindsystem/references/checkpoint-detection.md DELETED

@@ -1,50 +0,0 @@
-<checkpoint_detection>
-Lite reference for identifying checkpoint types during task breakdown. Full checkpoint templates and examples are in the ms-plan-writer subagent.
-<checkpoint_types>
-| Type | Use When | Frequency |
-|------|----------|-----------|
-| `checkpoint:human-verify` | Claude automated work, human confirms visual/functional correctness | 90% |
-| `checkpoint:decision` | Human must choose between options affecting implementation | 9% |
-| `checkpoint:human-action` | Truly unavoidable manual step with no CLI/API (rare) | 1% |
-</checkpoint_types>
-<detection_rules>
-**Mark as `checkpoint:human-verify` when:**
-- Visual UI checks needed (layout, styling, responsiveness)
-- Interactive flows require human testing (click through wizard, test user flows)
-- Functional verification beyond automated tests (feature works as expected)
-- Audio/video playback, animation smoothness, accessibility
-**Mark as `checkpoint:decision` when:**
-- Technology selection (auth provider, database, library)
-- Architecture choices (monorepo vs separate, API patterns)
-- Design decisions (color scheme, layout approach)
-- Feature prioritization between variants
-**Mark as `checkpoint:human-action` when (rare):**
-- Email verification links (account creation)
-- SMS 2FA codes (phone verification)
-- Manual account approvals
-- Credit card 3D Secure flows
-- OAuth app approvals requiring browser
-</detection_rules>
-<automation_first_principle>
-**If it has CLI/API, Claude automates it. No exceptions.**
-Never create `checkpoint:human-action` for:
-- Deployments (use `vercel`, `railway`, `fly` CLI)
-- Database operations (use provider CLI)
-- Webhook setup (use APIs)
-- Environment files (use Write tool)
-- Running builds/tests (use Bash tool)
-The rule: Claude does everything automatable. Checkpoints verify AFTER automation, not replace it.
-</automation_first_principle>
-</checkpoint_detection>