npm - mindsystem-cc - Versions diffs - 3.17.1 → 3.18.1 - Mend

mindsystem-cc 3.17.1 → 3.18.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (37) hide show

package/agents/ms-consolidator.md +4 -4
package/agents/ms-debugger.md +3 -3
package/agents/ms-designer.md +33 -70
package/agents/ms-executor.md +7 -6
package/agents/ms-plan-writer.md +52 -26
package/agents/ms-researcher.md +13 -13
package/commands/ms/check-phase.md +1 -1
package/commands/ms/complete-milestone.md +47 -54
package/commands/ms/design-phase.md +33 -30
package/commands/ms/review-design.md +106 -395
package/mindsystem/references/principles.md +3 -3
package/mindsystem/references/routing/next-phase-routing.md +1 -1
package/mindsystem/references/scope-estimation.md +22 -35
package/mindsystem/templates/design-iteration.md +13 -13
package/mindsystem/templates/design.md +145 -327
package/mindsystem/templates/knowledge.md +1 -1
package/mindsystem/templates/milestone-archive.md +3 -3
package/mindsystem/templates/phase-prompt.md +6 -7
package/mindsystem/templates/research-subagent-prompt.md +2 -2
package/mindsystem/templates/research.md +7 -7
package/mindsystem/templates/roadmap.md +1 -1
package/mindsystem/templates/verification-report.md +1 -1
package/mindsystem/workflows/complete-milestone.md +52 -227
package/mindsystem/workflows/discuss-phase.md +3 -3
package/mindsystem/workflows/execute-plan.md +1 -1
package/mindsystem/workflows/plan-phase.md +22 -50
package/mindsystem/workflows/verify-phase.md +1 -1
package/package.json +1 -1
package/scripts/archive-milestone-files.sh +68 -0
package/scripts/archive-milestone-phases.sh +138 -0
package/scripts/gather-milestone-stats.sh +179 -0
package/scripts/ms-lookup/ms_lookup/backends/context7.py +17 -5
package/scripts/ms-lookup/ms_lookup/backends/perplexity.py +17 -3
package/scripts/ms-lookup-wrapper.sh +1 -1
package/scripts/scan-planning-context.py +186 -36
package/scripts/validate-execution-order.sh +4 -5
package/scripts/cleanup-phase-artifacts.sh +0 -68

package/agents/ms-consolidator.md CHANGED Viewed

@@ -67,10 +67,10 @@ You are a Mindsystem knowledge consolidator spawned by execute-phase after plan
 | DESIGN Content | Target Knowledge Section |
 |----------------|------------------------|
-| Component specs, layout choices | Design |
-| Design tokens, color values, spacing | Design |
-| UX flows, interaction patterns | Design |
-| Component states, responsive behavior | Design |
+| Wireframe layouts, inline annotations | Design |
+| Design tokens (colors, spacing, typography) | Design |
+| Behavior notes, interaction patterns | Design |
+| States tables, platform hints | Design |
 ### RESEARCH.md (distribute across affected subsystems)

package/agents/ms-debugger.md CHANGED Viewed

@@ -615,7 +615,7 @@ The cost of insufficient verification: bug returns, user frustration, emergency
 **2. Library/framework behavior doesn't match expectations**
 - Using library correctly but it's not working
 - Documentation contradicts behavior
-- **Action:** Check official docs (Context7), GitHub issues
+- **Action:** Check official docs (`ms-lookup docs`), GitHub issues
 **3. Domain knowledge gaps**
 - Debugging auth: need to understand OAuth flow
@@ -657,7 +657,7 @@ The cost of insufficient verification: bug returns, user frustration, emergency
 - Include version: `"react 18 useEffect behavior"`
 - Add "github issue" for known bugs
-**Context7 MCP:**
+**`ms-lookup docs`:**
 - For API reference, library concepts, function signatures
 **GitHub Issues:**
@@ -687,7 +687,7 @@ Is this an error message I don't recognize?
 └─ NO ↓
 Is this library/framework behavior I don't understand?
-├─ YES → Check docs (Context7 or official docs)
+├─ YES → Check docs (`ms-lookup docs` or official docs)
 └─ NO ↓
 Is this code I/my team wrote?

package/agents/ms-designer.md CHANGED Viewed

@@ -17,10 +17,9 @@ Your job: Transform user vision into concrete, implementable design specificatio
 **Core responsibilities:**
 - Analyze existing project aesthetic (project UI skill, codebase patterns)
 - Apply quality-forcing patterns (commercial benchmark, pre-emptive criticism, self-review)
-- Create ASCII wireframes with precise spacing and component placement
-- Specify component behaviors, states, and platform-specific requirements
-- Document UX flows with clear user journey steps
-- Provide verification criteria that prove design was implemented correctly
+- Create ASCII wireframes with inline annotations (token refs, spacing, dimensions, component names)
+- Specify component states, non-obvious behaviors, and implementation hints per screen
+- Produce a compact Design Tokens table where every value appears exactly once
 </role>
 <upstream_input>
@@ -196,7 +195,8 @@ After completing design but BEFORE returning:
 2. **Verify color pairs** — Primary text/background, secondary text/background
 3. **Check all interactive elements** — Buttons, links, inputs, cards with actions
 4. **Fix violations** — Adjust specs until all checks pass
-5. **Document in Verification Criteria** — Note which validations were verified
+Append `Validation: passed` to the end of DESIGN.md after all checks pass.
 **This validation is not optional.** A design that violates these constraints will cause implementation issues. Fix now, not later.
 </validation_rules>
@@ -242,54 +242,19 @@ Based on context chain, determine:
 - **Color direction:** warm, cool, monochromatic, vibrant (with specific values)
 - **Density:** tight, comfortable, spacious
-Document these decisions in the Visual Identity section.
+Capture these in the Design Direction section and populate the Design Tokens table.
-## Step 5: Design Screens/Layouts
+## Step 5: Design Screens
 For each screen in the phase:
-1. Create ASCII wireframe with component placement
-2. Specify spacing (edge padding, component gaps) with exact values
-3. List components used (reference existing or define new)
-4. Note responsive behavior (breakpoints, layout changes)
+1. Create ASCII wireframe with inline annotations (token refs, spacing values, dimensions, component names)
+2. Create States table (element, state, change, trigger)
+3. Write Behavior notes (non-obvious interactions only — skip obvious actions)
+4. Write Hints (framework-specific reuse, gotchas, responsive behavior)
 Apply quality-forcing patterns — check for generic output after each screen.
-## Step 6: Specify Components
-For each new or modified component:
-1. Visual description (colors, borders, shadows with exact values)
-2. States (default, hover, active, disabled, loading)
-3. Size constraints (min/max dimensions)
-4. Platform-specific notes (if cross-platform)
-## Step 7: Document UX Flows
-For each user journey in this phase:
-1. Entry point (what triggers the flow)
-2. Steps (numbered sequence of user actions and system responses)
-3. Decision points (branches in the flow)
-4. Success state (what user sees on completion)
-5. Error handling (what happens when things go wrong)
-## Step 8: Capture Design Decisions
-Create decision table with rationale:
-| Category | Decision | Values | Rationale |
-|----------|----------|--------|-----------|
-| Colors | Primary | `#[hex]` | [why this specific color] |
-| Typography | Headings | [font] | [why chosen] |
-| Spacing | Base unit | [value]px | [why this scale] |
-## Step 9: Write Verification Criteria
-Observable behaviors that prove correct implementation:
-- Visual verification (colors, typography, spacing match spec)
-- Functional verification (interactions work as designed)
-- Platform verification (responsive, touch targets, safe areas)
-- Accessibility verification (contrast, screen reader, focus states)
-## Step 10: Self-Review and Refine
+## Step 6: Self-Review and Refine
 Run through the quality-forcing checklist:
 - [ ] Does the color palette have character or is it generic?
@@ -299,7 +264,7 @@ Run through the quality-forcing checklist:
 If any answer is "generic/arbitrary/default/no" → refine before returning.
-## Step 11: Mathematical Validation (BLOCKING)
+## Step 7: Mathematical Validation (BLOCKING)
 Run through validation rules from `<validation_rules>` section:
@@ -313,10 +278,7 @@ Run through validation rules from `<validation_rules>` section:
 - Re-run validation
 - Do NOT proceed until all checks pass
-**Document in Verification Criteria:**
-- "Validation passed: bounds, touch targets, spacing, contrast"
-## Step 12: Write DESIGN.md
+## Step 8: Write DESIGN.md
 Write to: `.planning/phases/{phase}-{slug}/{phase}-DESIGN.md`
@@ -328,20 +290,22 @@ Return brief confirmation to orchestrator.
 <output_format>
 ## DESIGN.md Structure
-Write the design specification following the 7-section template:
+Write the design specification following the 3-section template:
-1. **Visual Identity** — Philosophy, direction, inspiration (2-3 sentences)
-2. **Screen Layouts** — ASCII wireframes with dimensions, spacing, components
-3. **Component Specifications** — Visual, states, content, platform notes
-4. **UX Flows** — Entry, steps, decisions, success, error handling
-5. **Design System Decisions** — Colors, typography, spacing with rationale table
-6. **Platform-Specific Notes** — Responsive, touch targets, accessibility
-7. **Verification Criteria** — Observable behaviors proving correct implementation
+1. **Design Direction** — 1-2 sentences: feel, inspiration, what sets this apart
+2. **Design Tokens** — Compact table: token, value, note. Every value appears once.
+3. **Screens** — Each screen is self-contained:
+   - ASCII wireframe with inline annotations (token refs, spacing, dimensions)
+   - States table (element, state, change, trigger)
+   - Behavior notes (non-obvious interactions only)
+   - Hints (framework-specific reuse, gotchas, responsive)
 All values must be specific:
 - Colors: `#hex` values, not "blue" or "primary"
 - Spacing: `16px`, not "some padding"
 - Typography: `14px/500`, not "medium text"
+End with `Validation: passed` after mathematical validation completes.
 </output_format>
 <structured_returns>
@@ -362,9 +326,9 @@ When design finishes successfully:
 ### Screens Designed
-| Screen | Components | Notes |
-|--------|------------|-------|
-| [name] | [count] | [key feature] |
+| Screen | States | Hints | Notes |
+|--------|--------|-------|-------|
+| [name] | [count] | [count] | [key feature] |
 ### Files Created
@@ -408,20 +372,19 @@ User decision to continue design.
 <success_criteria>
 Design is complete when:
-- [ ] All screens have ASCII layouts with spacing specified
-- [ ] All new components have state definitions
+- [ ] All screens have ASCII wireframes with inline annotations (token refs, spacing, dimensions)
+- [ ] All component states defined in per-screen States tables
 - [ ] Color palette has character (not generic dark gray + blue)
 - [ ] Spacing values follow consistent scale (4/8/12/16/24/32)
 - [ ] Typography hierarchy is explicit (sizes, weights, colors)
-- [ ] UX flows document all user journeys
-- [ ] Verification criteria are observable and testable
+- [ ] Non-obvious behaviors documented in Behavior notes
+- [ ] Implementation hints provided per screen (reuse, gotchas, responsive)
 - [ ] Self-review checklist passed (no generic/arbitrary answers)
-- [ ] **Mathematical validation passed (bounds, touch targets, spacing, contrast)**
-- [ ] DESIGN.md written to phase directory
+- [ ] **Mathematical validation passed (bounds, touch targets, spacing, contrast) and DESIGN.md written to phase directory**
 Quality indicators:
 - **Specific:** Hex codes, pixel values, font weights — not vague descriptions
 - **Intentional:** Every decision has rationale — not arbitrary choices
 - **Professional:** Passes accountability check — would show to a designer
-- **Verifiable:** Criteria are observable — can prove implementation matches
+- **Compact:** Values appear exactly once — in tokens table or inline annotation
 </success_criteria>

package/agents/ms-executor.md CHANGED Viewed

@@ -17,12 +17,13 @@ Follow the workflow exactly:
 </role>
 <design_context>
-**If plan references DESIGN.md:** The DESIGN.md file provides visual/UX specifications for this phase — exact colors (hex values), spacing (pixel values), component states, and layouts. When implementing UI:
-- Use exact color values from the design spec, not approximations
-- Follow the specified spacing scale (e.g., 4/8/12/16/24/32)
-- Implement all component states (default, hover, active, disabled, loading)
-- Match ASCII wireframe layouts for component placement
-- Include verification criteria from DESIGN.md in your task verification
+**If plan references DESIGN.md:** The DESIGN.md file provides visual/UX specifications for this phase. When implementing UI:
+- Use exact values from the Design Tokens table (hex colors, px spacing, font weights)
+- Follow inline wireframe annotations for layout, spacing, and component placement
+- Implement all states from per-screen States tables (default, hover, active, disabled, loading)
+- Follow Behavior notes for non-obvious interactions
+- Apply Hints for framework-specific reuse and gotchas
+- Derive verification criteria from token values, states, and behavior specs
 </design_context>
 <completion_format>

package/agents/ms-plan-writer.md CHANGED Viewed

@@ -151,24 +151,43 @@ Wave assignments are written to EXECUTION-ORDER.md, not to individual plans.
 </step>
 <step name="group_into_plans">
-**Group tasks into plans based on wave and feature affinity.**
+**Group tasks into plans using budget-based packing per wave.**
-Rules:
-1. **Same-wave tasks with no file conflicts → parallel plans**
-2. **Tasks with shared files → same plan**
-3. **TDD candidates → dedicated plans (one feature per TDD plan)**
-4. **2-3 tasks per plan, ~50% context target**
-5. **Default to 3 tasks for simple-medium work, 2 for complex**
+**1. Classify task weight** (L/M/H) from action description, file count, domain complexity using the weight table in scope-estimation.md.
-Grouping algorithm:
+**2. Greedy budget packing per wave:**
 ```
-1. Start with Wave 1 tasks (no dependencies)
-2. Group by feature affinity (vertical slice)
-3. Check file ownership (no conflicts)
-4. Move to Wave 2, repeat
-5. Continue until all tasks assigned
+For each wave:
+  Sort tasks by feature affinity (vertical slice)
+  current_plan = new plan
+  current_budget = 0
+  For each task:
+    If task.tdd_candidate:
+      Create dedicated plan for this task (TDD plans always standalone)
+      Continue
+    task_cost = marginal_cost(task.weight)
+    If current_budget + task_cost > 45% AND current_plan not empty:
+      Finalize current_plan
+      current_plan = new plan
+      current_budget = 0
+    If no file conflicts with current_plan:
+      Add task to current_plan
+      current_budget += task_cost
+    Else:
+      Finalize current_plan
+      current_plan = new plan with this task
+      current_budget = task_cost
 ```
+**3. Minimum threshold check:** Plans under ~10% → merge with adjacent same-wave plan if no file conflicts and combined budget <= 45%.
+**4. File ownership constraint:** Tasks sharing files must be in the same plan.
+**5. Record grouping rationale** for each plan (weight classification, budget total, merge decisions).
 **Plan assignment:**
 - Each plan gets a sequential number (01, 02, 03...)
 </step>
@@ -197,16 +216,14 @@ The verifier derives artifacts and key_links from the plan's ## Changes section.
 **Verify each plan fits context budget.**
 Per plan:
-- Count tasks (target: 2-3, max: 3)
-- Count files modified (target: 5-8, max: 10)
-- Estimate context usage (target: ~50%)
+- Sum weights (target: 25-45%)
+- Check minimum threshold (plans under ~10% → consolidate)
+- Count files modified (max: 10)
 If any plan exceeds:
-- 4+ tasks: Split by feature
+- Budget > 45%: Split by feature affinity
 - 10+ files: Split by subsystem
-- Complex domain (auth, payments): Consider extra split
-Default to 3 tasks for simple-medium work, 2 for complex. Executor overhead reduction creates headroom for the third task.
+- Under 10%: Merge with related same-wave plan
 </step>
 <step name="write_plan_files">
@@ -334,11 +351,11 @@ Capture commit hash for return.
 score = 0
 factors = []
-# Task count (4+ tasks in any plan)
-max_tasks = max(task_count for each plan)
-if max_tasks >= 4:
+# Budget per plan (>45%)
+max_budget = max(budget_sum for each plan)
+if max_budget > 45:
   score += 15
-  factors.append(f"Plan has {max_tasks} tasks")
+  factors.append(f"Plan exceeds 45% budget ({max_budget}%)")
 # Plan count (5+ plans in phase)
 if plan_count >= 5:
@@ -400,6 +417,13 @@ Return structured markdown to orchestrator:
 | 1 | 01, 02 | None (parallel) |
 | 2 | 03 | Waits for 01, 02 |
+### Grouping Rationale
+| Plan | Tasks | Est. Marginal | Notes |
+|------|-------|---------------|-------|
+| 01 | 4 (L+L+L+M) | ~25% | Merged light fixes with medium refactor |
+| 02 | 3 (M+M+M) | ~30% | Vertical slice: model + API + tests |
 ### Risk Assessment
 **Score:** {score}/100 ({tier})
@@ -416,6 +440,8 @@ Return structured markdown to orchestrator:
 - ...
 ```
+**Include Grouping Rationale** only when consolidation occurred or grouping is non-obvious. Omit for straightforward groupings.
 The orchestrator parses this to present risk via AskUserQuestion and offer next steps.
 </output_format>
@@ -433,7 +459,7 @@ Bad: Plan 01 = all models, Plan 02 = all APIs
 Good: Plan 01 = User (model + API), Plan 02 = Product (model + API)
 **DO NOT exceed scope limits.**
-4+ tasks per plan → split. Complex domains → split. Files > 10 → split.
+Budget > 45% → split. Single light task alone → consolidate. Files > 10 → split.
 **DO NOT write implementation-focused truths.**
 Bad: "bcrypt library installed"
@@ -451,7 +477,7 @@ Plan writing complete when:
 - [ ] References loaded (phase-prompt, plan-format, scope-estimation, + tdd if needed)
 - [ ] Dependency graph built from needs/creates
 - [ ] Waves assigned (all roots wave 1, dependents correct)
-- [ ] Tasks grouped into plans (2-3 tasks, ~50% context)
+- [ ] Tasks grouped into plans (budget-based, target 25-45%, consolidate under 10%)
 - [ ] Must-haves derived as markdown checklists
 - [ ] PLAN.md files written with pure markdown format
 - [ ] EXECUTION-ORDER.md generated with wave groups

package/agents/ms-researcher.md CHANGED Viewed

@@ -59,9 +59,9 @@ Claude's training data is 6-18 months stale. Treat pre-existing knowledge as hyp
 - Wrong (Claude misremembered or hallucinated)
 **The discipline:**
-1. **Verify before asserting** - Don't state library capabilities without checking Context7 or official docs
+1. **Verify before asserting** - Don't state library capabilities without checking `ms-lookup docs` or official docs
 2. **Date your knowledge** - "As of my training" is a warning flag, not a confidence marker
-3. **Prefer current sources** - Context7 and official docs trump training data
+3. **Prefer current sources** - `ms-lookup docs` and official docs trump training data
 4. **Flag uncertainty** - LOW confidence when only training data supports a claim
 ## Honest Reporting
@@ -197,7 +197,7 @@ When researching "best library for X":
 The CLI is at `~/.claude/mindsystem/scripts/ms-lookup-wrapper.sh`.
-### Library Documentation (Context7)
+### Library Documentation
 ```bash
 ~/.claude/mindsystem/scripts/ms-lookup-wrapper.sh docs <library> "<query>"
@@ -213,7 +213,7 @@ Example:
 **Response format:** JSON with results array containing title, content, source_url, tokens.
-### Deep Research (Perplexity)
+### Deep Research
 ```bash
 ~/.claude/mindsystem/scripts/ms-lookup-wrapper.sh deep "<query>"
@@ -255,7 +255,7 @@ WebSearch("[technology] vs [alternative] {current_year}")
 **Always include current year** in queries for freshness.
-**Why WebSearch over Perplexity search:** Free with Max subscription. Perplexity search costs $5/1k queries with marginal quality improvement for discovery tasks.
+**Why WebSearch over `ms-lookup deep`:** WebSearch is free. `ms-lookup deep` costs money per query — reserve it for high-value questions, not discovery.
 ## Token Limit Strategy (for ms-lookup)
@@ -264,7 +264,7 @@ WebSearch("[technology] vs [alternative] {current_year}")
 **Rationale:**
 - The 50% rule: Research must complete before hitting 100k tokens
 - At 2000 tokens/query, you can make ~50 queries
-- Context7 returns results ranked by relevance — first 3-4 snippets are most important
+- `ms-lookup docs` returns results ranked by relevance — first 3-4 snippets are most important
 - Query flexibility > per-query comprehensiveness
 **When to increase (`--max-tokens 4000-6000`):**
@@ -296,15 +296,15 @@ WebSearch("[technology] vs [alternative] {current_year}")
 | Level | Sources | Use |
 |-------|---------|-----|
-| HIGH | Context7, official documentation, official releases | State as fact |
+| HIGH | `ms-lookup docs`, official documentation, official releases | State as fact |
 | MEDIUM | WebSearch verified with official source, multiple credible sources agree | State with attribution |
 | LOW | WebSearch only, single source, unverified | Flag as needing validation |
 ## Source Prioritization
-**1. Context7 (highest priority)**
-- Current, authoritative documentation
-- Library-specific, version-aware
+**1. `ms-lookup docs` (highest priority)**
+- Current, authoritative library documentation
+- Version-aware, sourced from official docs
 - Trust completely for API/feature questions
 **2. Official Documentation**
@@ -526,7 +526,7 @@ Based on research question, identify what needs investigating:
 For each domain, follow tool strategy in order:
-1. **Context7 First** - Resolve library, query topics
+1. **`ms-lookup docs` First** - Query library documentation
 2. **Official Docs** - WebFetch for gaps
 3. **WebSearch** - Ecosystem discovery with year
 4. **Verification** - Cross-reference all findings
@@ -626,7 +626,7 @@ When research cannot proceed:
 Research is complete when:
 - [ ] Research question answered with actionable findings
-- [ ] Source hierarchy followed (Context7 → Official → WebSearch)
+- [ ] Source hierarchy followed (`ms-lookup docs` → Official → WebSearch)
 - [ ] All findings have confidence levels
 - [ ] Verification protocol checklist passed
 - [ ] Output file(s) created in correct format
@@ -636,7 +636,7 @@ Research is complete when:
 Research quality indicators:
 - **Specific, not vague:** "Three.js r160 with @react-three/fiber 8.15" not "use Three.js"
-- **Verified, not assumed:** Findings cite Context7 or official docs
+- **Verified, not assumed:** Findings cite `ms-lookup docs` or official docs
 - **Honest about gaps:** LOW confidence items flagged, unknowns admitted
 - **Actionable:** Developer could start work based on this research
 - **Current:** Year included in searches, publication dates checked

package/commands/ms/check-phase.md CHANGED Viewed

@@ -21,7 +21,7 @@ This spawns ms-plan-checker to analyze your PLAN.md files against the phase goal
 2. **Task Completeness** — Does every change have Files, implementation details, and verification?
 3. **Dependency Correctness** — Are plan dependencies valid and acyclic?
 4. **Key Links Planned** — Are artifacts wired together, not just created in isolation?
-5. **Scope Sanity** — Will plans complete within context budget (2-3 tasks per plan)?
+5. **Scope Sanity** — Will plans complete within context budget (target 25-45%)?
 6. **Verification Derivation** — Are Must-Haves user-observable, not implementation-focused?
 7. **Context Compliance** — Do plans honor decisions from CONTEXT.md?
 </what_it_checks>

package/commands/ms/complete-milestone.md CHANGED Viewed

@@ -62,96 +62,89 @@ Output: Milestone archived (roadmap + requirements), PROJECT.md evolved, git tag
    ✓ Milestone audit passed. Proceeding with completion.
    ```
-1. **Verify readiness:**
-   - Check all phases in milestone have completed plans (SUMMARY.md exists)
-   - Present milestone scope and stats
-   - Wait for confirmation
-1.5. **Clean up raw artifacts:**
-   Delete remaining raw artifacts from phase directories. Knowledge files are already current from phase-level consolidation in execute-phase.
+1. **Verify readiness and gather stats:**
    ```bash
-   ~/.claude/mindsystem/scripts/cleanup-phase-artifacts.sh $PHASE_START $PHASE_END
+   ~/.claude/mindsystem/scripts/gather-milestone-stats.sh $PHASE_START $PHASE_END
    ```
-2. **Gather stats:**
-   - Count phases, plans, tasks
-   - Calculate git range, file changes, LOC
-   - Extract timeline from git log
-   - Present summary, confirm
+   - If NOT READY: stop and report incomplete plans
+   - If READY: present readiness + stats summary, proceed
-3. **Extract accomplishments:**
+2. **Extract accomplishments:**
    - Read all phase SUMMARY.md files in milestone range
    - Extract 4-6 key accomplishments
-   - Present for approval
-4. **Archive milestone:**
+3. **Create MILESTONES.md entry:**
+   - Create or update `.planning/MILESTONES.md` using `templates/milestone.md`
+   - Prepend new entry (reverse chronological order)
+4. **Update PROJECT.md:**
+   - Full evolution review (What This Is, Core Value, Requirements audit, Key Decisions, Context)
+   - Move all shipped requirements to Validated
+   - Update "Last updated" footer
+5. **Archive milestone:**
    - Create `.planning/milestones/v{{version}}/` directory
-   - Create `.planning/milestones/v{{version}}/ROADMAP.md`
-   - Extract full phase details from ROADMAP.md
-   - Fill milestone-archive.md template
-   - Update ROADMAP.md to one-line summary with link
+   - Create `.planning/milestones/v{{version}}/ROADMAP.md` from template
+   - Delete ROADMAP.md
-5. **Archive requirements:**
+6. **Archive requirements:**
    - Create `.planning/milestones/v{{version}}/REQUIREMENTS.md`
-   - Mark all v1 requirements as complete (checkboxes checked)
-   - Note requirement outcomes (validated, adjusted, dropped)
-   - Delete `.planning/REQUIREMENTS.md` (fresh one created for next milestone)
+   - Mark all requirements as complete with outcomes
+   - Delete `.planning/REQUIREMENTS.md`
+7. **Archive milestone files:**
-5.5. **Archive research:**
+   ```bash
+   ~/.claude/mindsystem/scripts/archive-milestone-files.sh v{{version}}
+   ```
-   - If `.planning/research/` exists, move to `.planning/milestones/v{{version}}/research/`
-   - Skip silently if no research directory exists
+8. **Archive and cleanup phases:**
+   ```bash
+   ~/.claude/mindsystem/scripts/archive-milestone-phases.sh $PHASE_START $PHASE_END v{{version}}
+   ```
-6. **Update PROJECT.md:**
+9. **Update STATE.md:**
-   - Add "Current State" section with shipped version
-   - Add "Next Milestone Goals" section
-   - Archive previous content in `<details>` (if v1.1+)
+    - Update project reference with current core value and next focus
+    - Reset current position for next milestone
-7. **Commit and tag:**
+10. **Commit and tag:**
-   - Stage: MILESTONES.md, PROJECT.md, ROADMAP.md, STATE.md, archive files
-   - Commit: `chore: archive v{{version}} milestone`
-   - Tag: `git tag -a v{{version}} -m "[milestone summary]"`
-   - Ask about pushing tag
+    - Stage: MILESTONES.md, PROJECT.md, STATE.md, archive files, deletions
+    - Commit: `chore: archive v{{version}} milestone`
+    - Tag: `git tag -a v{{version}} -m "[milestone summary]"`
+    - Ask about pushing tag
-8. **Offer next steps:**
-   - `/ms:new-milestone` — discover goals and update PROJECT.md (includes optional discovery mode)
+11. **Offer next steps:**
+    - `/ms:new-milestone` — discover goals and update PROJECT.md
-9. **Update last command**
-   - Update `.planning/STATE.md` Last Command field
-   - Format: `Last Command: ms:complete-milestone $ARGUMENTS | YYYY-MM-DD HH:MM`
+12. **Update last command:**
+    - Format: `Last Command: ms:complete-milestone $ARGUMENTS | YYYY-MM-DD HH:MM`
 </process>
 <success_criteria>
-- Raw artifacts cleaned from phase directories (CONTEXT, DESIGN, RESEARCH, SUMMARY, UAT, VERIFICATION, EXECUTION-ORDER)
+- PROJECT.md full evolution review completed (What This Is, Core Value, Requirements, Key Decisions, Context)
+- All shipped requirements moved to Validated in PROJECT.md
+- Key Decisions updated with outcomes
 - Milestone archived to `.planning/milestones/v{{version}}/ROADMAP.md`
 - Requirements archived to `.planning/milestones/v{{version}}/REQUIREMENTS.md`
-- Research archived to `.planning/milestones/v{{version}}/research/` (if existed)
 - `.planning/REQUIREMENTS.md` deleted (fresh for next milestone)
-- ROADMAP.md collapsed to one-line entry
-- PROJECT.md updated with current state
 - Git tag v{{version}} created
-- Commit successful
-- User knows next steps (/ms:new-milestone)
   </success_criteria>
 <critical_rules>
-- **Load workflow first:** Read complete-milestone.md before executing
 - **Verify completion:** All phases must have SUMMARY.md files
-- **User confirmation:** Wait for approval at verification gates
 - **Archive before deleting:** Always create archive files (ROADMAP, REQUIREMENTS) before updating/deleting originals
 - **One-line summary:** Collapsed milestone in ROADMAP.md should be single line with link
-- **Context efficiency:** Archive keeps ROADMAP.md and REQUIREMENTS.md constant size per milestone
-- **Fresh requirements:** Next milestone starts with `/ms:create-roadmap`, not reusing old file
   </critical_rules>