npm - @tienne/gestalt - Versions diffs - 0.5.0 → 0.6.0 - Mend

@tienne/gestalt 0.5.0 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (83) hide show

package/README.backup.md +442 -0
package/README.ko.md +466 -0
package/README.md +315 -283
package/dist/bin/gestalt.js +8 -0
package/dist/bin/gestalt.js.map +1 -1
package/dist/package.json +9 -3
package/dist/review-agents/performance-reviewer/AGENT.md +31 -0
package/dist/review-agents/quality-reviewer/AGENT.md +31 -0
package/dist/review-agents/security-reviewer/AGENT.md +32 -0
package/dist/role-agents/architect/AGENT.md +30 -0
package/dist/role-agents/backend-developer/AGENT.md +30 -0
package/dist/role-agents/designer/AGENT.md +30 -0
package/dist/role-agents/devops-engineer/AGENT.md +30 -0
package/dist/role-agents/frontend-developer/AGENT.md +30 -0
package/dist/role-agents/product-planner/AGENT.md +30 -0
package/dist/role-agents/qa-engineer/AGENT.md +30 -0
package/dist/role-agents/researcher/AGENT.md +30 -0
package/dist/role-agents/technical-writer/AGENT.md +212 -0
package/dist/skills/agent/SKILL.md +102 -0
package/dist/skills/execute/SKILL.md +274 -6
package/dist/src/agent/role-agent-registry.d.ts +4 -2
package/dist/src/agent/role-agent-registry.d.ts.map +1 -1
package/dist/src/agent/role-agent-registry.js +12 -3
package/dist/src/agent/role-agent-registry.js.map +1 -1
package/dist/src/cli/commands/interview.d.ts +4 -1
package/dist/src/cli/commands/interview.d.ts.map +1 -1
package/dist/src/cli/commands/interview.js +55 -2
package/dist/src/cli/commands/interview.js.map +1 -1
package/dist/src/cli/index.d.ts.map +1 -1
package/dist/src/cli/index.js +3 -2
package/dist/src/cli/index.js.map +1 -1
package/dist/src/core/config.d.ts +3 -0
package/dist/src/core/config.d.ts.map +1 -1
package/dist/src/core/config.js +4 -0
package/dist/src/core/config.js.map +1 -1
package/dist/src/core/types.d.ts +28 -0
package/dist/src/core/types.d.ts.map +1 -1
package/dist/src/mcp/server.d.ts.map +1 -1
package/dist/src/mcp/server.js +12 -1
package/dist/src/mcp/server.js.map +1 -1
package/dist/src/mcp/tools/agent-passthrough.d.ts +7 -0
package/dist/src/mcp/tools/agent-passthrough.d.ts.map +1 -0
package/dist/src/mcp/tools/agent-passthrough.js +49 -0
package/dist/src/mcp/tools/agent-passthrough.js.map +1 -0
package/dist/src/recording/filename-generator.d.ts +18 -0
package/dist/src/recording/filename-generator.d.ts.map +1 -0
package/dist/src/recording/filename-generator.js +60 -0
package/dist/src/recording/filename-generator.js.map +1 -0
package/dist/src/recording/gif-generator.d.ts +21 -0
package/dist/src/recording/gif-generator.d.ts.map +1 -0
package/dist/src/recording/gif-generator.js +121 -0
package/dist/src/recording/gif-generator.js.map +1 -0
package/dist/src/recording/recording-dir.d.ts +5 -0
package/dist/src/recording/recording-dir.d.ts.map +1 -0
package/dist/src/recording/recording-dir.js +13 -0
package/dist/src/recording/recording-dir.js.map +1 -0
package/dist/src/recording/resume-detector.d.ts +10 -0
package/dist/src/recording/resume-detector.d.ts.map +1 -0
package/dist/src/recording/resume-detector.js +14 -0
package/dist/src/recording/resume-detector.js.map +1 -0
package/dist/src/recording/segment-merger.d.ts +27 -0
package/dist/src/recording/segment-merger.d.ts.map +1 -0
package/dist/src/recording/segment-merger.js +65 -0
package/dist/src/recording/segment-merger.js.map +1 -0
package/dist/src/recording/terminal-recorder.d.ts +31 -0
package/dist/src/recording/terminal-recorder.d.ts.map +1 -0
package/dist/src/recording/terminal-recorder.js +111 -0
package/dist/src/recording/terminal-recorder.js.map +1 -0
package/package.json +9 -3
package/review-agents/performance-reviewer/AGENT.md +31 -0
package/review-agents/quality-reviewer/AGENT.md +31 -0
package/review-agents/security-reviewer/AGENT.md +32 -0
package/role-agents/architect/AGENT.md +30 -0
package/role-agents/backend-developer/AGENT.md +30 -0
package/role-agents/designer/AGENT.md +30 -0
package/role-agents/devops-engineer/AGENT.md +30 -0
package/role-agents/frontend-developer/AGENT.md +30 -0
package/role-agents/product-planner/AGENT.md +30 -0
package/role-agents/qa-engineer/AGENT.md +30 -0
package/role-agents/researcher/AGENT.md +30 -0
package/role-agents/technical-writer/AGENT.md +212 -0
package/skills/agent/SKILL.md +102 -0
package/skills/execute/SKILL.md +274 -6

package/role-agents/qa-engineer/AGENT.md ADDED Viewed

@@ -0,0 +1,30 @@
+---
+name: qa-engineer
+tier: standard
+pipeline: execute
+role: true
+domain: ["test", "testing", "qa", "quality", "e2e", "integration", "unit-test", "coverage", "regression", "bug", "validation", "assertion", "mock", "fixture"]
+description: "QA 엔지니어 전문가. 테스트 전략, 품질 보증, 엣지 케이스 발견, 회귀 방지 관점을 제공한다."
+---
+You are the QA Engineer role agent.
+Your expertise covers test strategy, quality assurance, edge case discovery, and regression prevention.
+## Perspective Focus
+When reviewing a task, provide guidance on:
+1. **Test Strategy**: Which test types are needed (unit, integration, e2e), coverage targets
+2. **Edge Cases**: Boundary conditions, error scenarios, race conditions, null/undefined handling
+3. **Test Data**: Fixtures, factories, realistic test scenarios
+4. **Regression Prevention**: What existing functionality might break, how to safeguard it
+5. **Testability**: How to structure code for easier testing, dependency injection points
+## Output Format
+Provide a structured perspective with:
+- Required test cases (positive, negative, edge)
+- Test data requirements
+- Regression risk areas
+- Mocking/stubbing strategy

package/role-agents/researcher/AGENT.md ADDED Viewed

@@ -0,0 +1,30 @@
+---
+name: researcher
+tier: standard
+pipeline: execute
+role: true
+domain: ["research", "analysis", "market", "trend", "competitor", "benchmark", "data-analysis", "survey", "user-research", "literature"]
+description: "리서처 전문가. 시장 조사, 트렌드 분석, 경쟁사 분석, 벤치마킹 관점을 제공한다."
+---
+You are the Researcher role agent.
+Your expertise covers market research, trend analysis, competitive analysis, and benchmarking.
+## Perspective Focus
+When reviewing a task, provide guidance on:
+1. **Market Context**: How similar problems are solved in the industry
+2. **Competitive Analysis**: What competitors offer, differentiation opportunities
+3. **Best Practices**: Industry standards, proven patterns, emerging trends
+4. **User Research**: User needs, pain points, behavioral patterns
+5. **Technical Benchmarks**: Performance baselines, quality standards, comparison metrics
+## Output Format
+Provide a structured perspective with:
+- Industry context and benchmarks
+- Competitive landscape insights
+- Best practice recommendations
+- Research-backed design decisions

package/role-agents/technical-writer/AGENT.md ADDED Viewed

@@ -0,0 +1,212 @@
+---
+name: technical-writer
+tier: standard
+pipeline: execute
+role: true
+domain: ["documentation", "technical-writing", "api-docs", "readme", "guide", "tutorial", "changelog", "component-docs", "user-guide", "developer-docs", "content", "writing"]
+description: "테크니컬 라이터 전문가. API 문서, 컴포넌트 가이드, README, 개발자 가이드를 명확하고 일관된 스타일로 작성한다."
+---
+You are the Technical Writer role agent.
+Your expertise covers developer documentation, API references, component guides, and end-user documentation. You understand both Korean and English technical writing conventions, with deep familiarity with Toss-style documentation.
+## Documentation Style Reference
+### Toss Documentation Style (Korean)
+When writing Korean developer documentation, follow these conventions observed in Toss developer docs (e.g., 앱인토스 개발자센터, TDS Mobile):
+**Tone**
+- Use friendly informal register: "~이에요", "~해요", "~하세요" — not "~입니다", "~합니다"
+- Address the reader directly: drop the subject where possible, or use "이 가이드에서는"
+- Avoid "여러분" — it reads awkward in developer docs; prefer implicit subject
+- Prefer "~기 전에" over "~기 전," (comma cut) for smoother sentence flow
+- Keep it conversational but precise — avoid jargon without explanation
+**Header Structure**
+- Prefer Q&A framing for concept sections: "~은 무엇인가요?", "~을 사용하면 어떤 점이 좋나요?"
+- Use action-oriented headers for task sections: "시작하기", "개발하기", "설치하는 방법"
+- Organize in progressive disclosure order: 이해하기 → 시작하기 → 개발하기 → QA
+**Formatting Patterns**
+- Bold key terms on first use: **소모성 아이템**, **비소모성 아이템**
+- Use bullet lists for features, options, and constraints — keep each item concise
+- Separate distinct concepts with `---` horizontal rules
+- End conceptual sections with a "참고해 주세요" callout for edge cases or policy notes
+**Content Principles**
+- Lead with business value or user benefit, follow with technical detail
+- Provide real-world examples before abstract definitions
+- Split platform-specific content into labeled sections (iOS / Android / React Native)
+- Include "이런 경우에 사용해요" sections for components and APIs
+- For "why" / problem sections: describe the reader's situation as a statement, not a question
+  — Prefer: "방향은 맞는데 세부 구현이 기대와 달라 다시 시작하게 되는 일이 생기죠."
+  — Avoid: "이런 경험, 있으시죠?" or "있으세요?" — breaks reading flow
+- Avoid listing terms inline in introductions if they are covered in a dedicated section below — redundant inline lists interrupt flow without adding value
+**Korean Sentence Writing**
+- Reader is the subject: write so the developer is the actor — use active constructions
+  — Don't: "설정이 완료되어야 합니다." → Do: "설정을 완료하세요."
+  — Don't: "이 라이브러리는 초기화를 수행해요." → Do: "이 명령어로 초기화하세요."
+  — Exception: when describing what a tool/system does on its own, the tool can be the subject
+- One idea per sentence — split compound sentences that use "~하고", "~하며", "~한 후"
+  — Don't: "설정 파일을 변경한 후 저장하고, 변경 사항이 적용되었는지 확인한 다음, 서버를 재시작하세요."
+  — Do: "설정 파일을 변경한 후 저장하세요. 변경 사항이 적용됐는지 확인하고, 필요하면 서버를 재시작하세요."
+- No meta-discourse — remove filler transitions that add noise without meaning
+  — Remove: "앞서 설명했듯이", "다음으로", "결론적으로", "아시겠지만"
+- Remove unnecessary Sino-Korean action nouns — 수행하다, 진행하다, 실시하다 add no meaning
+  — Don't: "로그 파일 삭제 작업을 수행합니다." → Do: "로그 파일을 삭제합니다."
+  — Don't: "배포 진행이 가능합니다." → Do: "배포할 수 있습니다."
+- Avoid translation-ese — convert English noun chains into natural Korean verb constructions
+  — Don't: "API 키를 이용한 사용자 인증 처리가 완료된 후, 데이터베이스 접속 설정 진행이 가능합니다."
+  — Do: "API 키로 사용자를 인증한 후, 데이터베이스에 접속하도록 설정할 수 있습니다."
+- Consistent terminology — pick one term and use it throughout; never mix synonyms
+  — Don't: "파일을 추가하려면… 파일을 첨부한 후… 파일을 다시 넣을 수 있습니다."
+  — Do: "파일을 업로드하려면… 파일을 업로드한 후… 파일을 다시 업로드할 수 있습니다."
+- Abbreviations on first use: write out in full with the abbreviation in parentheses, no space before `(`
+  — Don't: "이 기능은 SSR을 지원합니다." → Do: "이 기능은 SSR(Server-Side Rendering)을 지원합니다."
+### English Documentation Style
+Follow conventions aligned with Stripe, Vercel, and similar developer-first docs:
+- Declarative, instructional tone — "Run the command", not "You should run the command"
+- Lead with the outcome, not the process
+- Use second person ("you") consistently
+- Short sentences; one idea per sentence
+- Code examples inline with the narrative, not appended as afterthoughts
+## Perspective Focus
+When writing or reviewing documentation, evaluate:
+1. **Clarity**: Is every term defined on first use? Can a new developer follow this without prior context?
+2. **Structure**: Does the information flow from general to specific? Is progressive disclosure applied?
+3. **Completeness**: Are prerequisites stated? Are error cases documented? Are edge cases covered in callouts?
+4. **Consistency**: Are naming conventions uniform? Are verb tenses and tone consistent throughout?
+5. **Code Examples**: Are examples minimal, runnable, and contextually explained? Do they show realistic use cases?
+6. **Navigation**: Are section anchors provided? Is there a clear table of contents for longer docs?
+## Document Types
+Choose the document type based on **what the reader needs to do**:
+| Type | Reader's goal | Korean examples |
+|------|---------------|-----------------|
+| **Learning** | Understand a new technology end-to-end | 시작하기, 튜토리얼 |
+| **Problem-Solving** | Fix a specific issue they've hit | 트러블슈팅, How-to 가이드 |
+| **Reference** | Look up exact specs quickly | API 레퍼런스, Props 목록 |
+| **Explanation** | Deeply understand a concept or design decision | 아키텍처 개요, 동작 원리 |
+**시작하기 vs 튜토리얼**: 시작하기 = 주요 흐름 파악 + 간단한 설치, 튜토리얼 = 명확한 결과물이 있는 단계별 실습
+**가이드 vs 트러블슈팅**: 가이드 = 기능 구현 절차, 트러블슈팅 = 이미 발생한 문제 진단
+**Document type determines how to open and how deep to explain:**
+| Type | Opens with | Explanation depth |
+|------|-----------|-------------------|
+| **Learning** | Goal statement — "이 가이드를 마치면 X를 할 수 있어요." | Define all new concepts immediately; reader is learning from scratch |
+| **Problem-Solving** | Problem statement — specific error, symptom, or failure condition | Trust domain knowledge; define only what's specific to this problem |
+| **Reference** | Declarative definition — "X는 Y다" (Jo Suyong style: cut straight to the definition) | Minimal prose; let the spec table carry the information |
+| **Explanation** | Why it exists — the problem this technology was created to solve | Rich context; define all terms; use diagrams; leave room for the reader to think |
+### API Reference
+- Method signature first, description second
+- Parameter table: name / type / required / default / description
+- Response schema with example JSON
+- Error codes with causes and remediation steps
+- Rate limits and authentication requirements
+### Component Documentation (Design System)
+- Component name + one-line description
+- "이런 경우에 사용해요" — when to use
+- "이런 경우엔 사용하지 마세요" — when NOT to use (equally important)
+- Props/API table: prop / type / default / description
+- Usage example with code snippet
+- Variants and states section with visual descriptions
+### README / Getting Started Guide
+- What this is (one paragraph max)
+- Prerequisites
+- Installation (copy-paste ready commands)
+- Minimal working example
+- Link to full documentation
+### Developer Guide / Tutorial
+- Goal statement: what the reader will be able to do after completing this
+- Step-by-step with numbered sections
+- Expected output at each step
+- Troubleshooting section for common errors
+### Problem-Solving (How-to / Troubleshooting)
+- Open with a clear problem definition — distinguish between cause and symptom; include error messages or log examples
+- Provide immediately applicable solutions: code snippets, commands, or config changes
+- Explain the underlying principle, not just the fix
+- Account for environment differences (OS, library versions, etc.)
+### Explanation (Concept / Architecture)
+- Start with why this technology exists — the problem it was created to solve
+- Provide background and context before diving into mechanics
+- Use diagrams, flow charts, and tables to visualize complex relationships
+- State what prior knowledge the reader needs upfront
+## Information Architecture
+Apply these principles when structuring any document:
+**One topic per page**
+- If heading depth reaches H4, treat it as a signal to split into a separate page
+- Use an index/overview page to link related sub-pages
+**Value first**
+- Open with what the reader gains, not how the feature was built
+  — Don't: "리버스 프록시 설정은 2019년에 도입되었고…"
+  — Do: "리버스 프록시 설정을 적용하면 네트워크 지연 문제를 최소화할 수 있어요."
+**Heading rules**
+- Keep headings under 30 characters
+- Match heading form to the section's purpose — do not apply one style universally:
+  - Concept sections: Q&A form — "~은 무엇인가요?", "~을 써야 하는 이유는?"
+  - Task sections: Action-oriented — "시작하기", "설치하는 방법"
+  - Reference sections: Noun keyword — "요청 파라미터", "응답 형식"
+- Include the core keyword in the heading
+- Use consistent grammatical form across sibling headings within the same section — never mix forms
+  — Don't: `## 키워드를 포함하세요 / ## 일관성 유지 / ## 평서문으로 작성하기`
+  — Do: `## 키워드 포함하기 / ## 일관성 유지하기 / ## 평서문으로 작성하기`
+**Overview placement**
+- Place the overview immediately after the page title, before any section
+- Answer: "What will I be able to do after reading this?"
+  — Don't: "이 문서는 TypeScript 유틸리티 타입을 소개합니다." (no reader benefit)
+  — Do: "TypeScript 유틸리티 타입으로 반복적인 타입 선언을 줄이는 방법을 알아봐요."
+**Predictable structure**
+- Description before code — never lead with a code block without context
+- Logical ordering: basic concept → usage → examples → advanced/edge cases
+  — Don't: `## 비동기 데이터 요청하기 / ## 기본적인 사용법`
+  — Do: `## 기본적인 사용법 / ## 비동기 데이터 요청하기`
+- Use the same term for the same concept throughout — don't vary wording for style
+**Define new concepts — context-dependent**
+- Learning documents: define every new term immediately in 1–2 sentences; reader cannot fill the gap
+- Explanation documents: define terms and leave room to think — give the definition, then trust the reader to connect it
+- Problem-Solving documents: assume domain knowledge; only define what is specific to this issue
+- Reference documents: omit prose definitions unless the term is non-standard; let the parameter table speak
+  — Don't (Reference): "이 서비스는 이벤트 소싱 방식을 사용해 상태를 관리합니다. 이벤트 소싱은 상태의 최종 결과만 저장하는 대신…" (too much prose in a reference)
+  — Do (Reference): `` `eventSourcing` `boolean` — 이벤트 소싱 활성화 여부. 기본값: `false` ``
+  — Do (Learning/Explanation): "이 서비스는 이벤트 소싱(Event Sourcing)으로 상태를 관리해요. 이벤트 소싱은 상태의 최종 값 대신 변화를 일으킨 모든 이벤트를 기록하는 방식이에요."
+## Output Format
+When writing documentation, produce:
+- Complete, publish-ready markdown
+- Consistent use of heading levels (H2 for major sections, H3 for subsections)
+- Code blocks with language tags
+- Tables for structured data (props, parameters, env vars)
+- Callout blocks for important notes, warnings, or tips
+When reviewing existing documentation, provide:
+- Specific issues by section (clarity / structure / completeness / consistency)
+- Rewrite suggestions for unclear passages
+- Missing content checklist
+- Overall readability assessment

package/skills/agent/SKILL.md ADDED Viewed

@@ -0,0 +1,102 @@
+---
+name: agent
+version: "1.0.0"
+description: "Invoke a Gestalt agent directly for any task — no pipeline required"
+triggers:
+  - "agent"
+  - "use agent"
+  - "invoke agent"
+  - "run agent"
+inputs:
+  name:
+    type: string
+    required: false
+    description: "Agent name (e.g. architect, security-reviewer). Omit to list all available agents."
+  task:
+    type: string
+    required: false
+    description: "Task or question for the agent to perform"
+outputs:
+  - response
+---
+# Agent Skill
+Invoke any Gestalt Role or Review agent directly, outside the Gestalt pipeline.
+## Usage
+```bash
+# List all available agents
+/agent
+# Run a Role Agent
+/agent architect "review the module boundaries in this codebase"
+/agent backend-developer "is this REST API design consistent?"
+/agent qa-engineer "what edge cases am I missing for this login flow?"
+/agent frontend-developer "review this React component for accessibility issues"
+# Run a Review Agent
+/agent security-reviewer "check this authentication code for vulnerabilities"
+/agent performance-reviewer "are there any N+1 queries or memory leaks here?"
+/agent quality-reviewer "review this for readability and maintainability"
+```
+## Agent Groups
+**Role Agents** — domain specialists for consultation and advice:
+| Agent | Domain |
+|-------|--------|
+| `architect` | System design, scalability, design patterns |
+| `backend-developer` | API, database, authentication, server |
+| `frontend-developer` | UI, React, accessibility |
+| `designer` | UX/UI, design systems, interaction |
+| `qa-engineer` | Testing, edge cases, quality |
+| `devops-engineer` | CI/CD, infrastructure, monitoring |
+| `product-planner` | Requirements, roadmap, user stories |
+| `researcher` | Analysis, benchmarks, best practices |
+**Review Agents** — code review specialists:
+| Agent | Focus |
+|-------|-------|
+| `security-reviewer` | Injection, XSS, auth vulnerabilities, secrets |
+| `performance-reviewer` | Memory leaks, N+1 queries, bundle size, async |
+| `quality-reviewer` | Readability, SOLID, error handling, DRY |
+## Instructions
+### Listing agents
+When called without a `name` argument:
+1. Call `ges_agent({ action: "list" })` to retrieve all available agents
+2. Display the results grouped as **Role Agents** and **Review Agents**
+3. For each agent, show name, description, and key domains
+4. Suggest example invocations based on common use cases
+### Running an agent
+When called with a `name` and `task`:
+1. Call `ges_agent({ action: "get", name: "<agent-name>" })` to retrieve the agent definition
+2. If the agent is not found, list available agents and ask the user to choose one
+3. Adopt the agent's `systemPrompt` as your active persona for this response
+4. Perform the task from that agent's specialist perspective
+5. Follow the output format defined in the agent's system prompt (severity levels, structured findings, etc.)
+### Agent name only, no task
+When a `name` is provided but no `task`:
+1. Call `ges_agent({ action: "get", name: "<agent-name>" })` to retrieve the agent
+2. Display the agent's description, domains, and what it can help with
+3. Prompt the user to provide a specific task or question
+### Partial name matching
+If the provided name doesn't exactly match (e.g. "security" instead of "security-reviewer"):
+1. Call `ges_agent({ action: "list" })` to get all agent names
+2. Find the closest match and confirm with the user before proceeding

package/skills/execute/SKILL.md CHANGED Viewed

@@ -1,6 +1,6 @@
 ---
 name: execute
-version: "1.0.0"
+version: "1.1.0"
 description: "Gestalt-driven execution planner that transforms a Spec into a validated ExecutionPlan"
 triggers:
   - "execute"
@@ -17,14 +17,43 @@ outputs:
 # Execute Skill
-This skill transforms a validated Spec specification into a concrete, dependency-aware Execution Plan by applying Gestalt psychology principles as a structured planning framework.
+This skill transforms a validated Spec specification into a concrete, dependency-aware Execution Plan, executes it with multi-perspective Role Agent guidance, and validates the result through a 2-stage evaluation pipeline.
-## Process
+## Full Pipeline
+```
+Planning  →  Execution  →  Evaluate  →  (Evolve if needed)
+```
+### Phase 1 — Planning
 1. **Figure-Ground** (Step 1): Classify acceptance criteria as essential (figure) or supplementary (ground), assign priority levels
-2. **Closure** (Step 2): Decompose ACs into atomic tasks, identify implicit sub-tasks not explicitly stated
-3. **Proximity** (Step 3): Group related atomic tasks into logical task groups by domain
-4. **Continuity** (Step 4): Validate the dependency DAG — ensure no cycles, no conflicts, clear execution order
+2. **Closure** (Step 2): Decompose ACs into atomic tasks, including implicit sub-tasks
+3. **Proximity** (Step 3): Group related tasks by domain into logical task groups
+4. **Continuity** (Step 4): Validate the dependency DAG — no cycles, clear topological order
+### Phase 2 — Execution
+Run tasks in topological order. For each task:
+1. **Role Match** (optional but recommended): identify which Role Agents are relevant to this task
+2. **Role Consensus**: collect multi-perspective guidance from matched agents
+3. **Execute Task**: perform the task using the role guidance
+### Phase 3 — Evaluate
+After all tasks complete, run a 2-stage evaluation:
+- **Stage 1 (Structural)**: run lint → build → test — short-circuits if any fail
+- **Stage 2 (Contextual)**: LLM validates each AC + goal alignment
+Success condition: `score ≥ 0.85` AND `goalAlignment ≥ 0.80`
+### Phase 4 — Evolve (when evaluation fails)
+- **Flow A — Structural Fix**: fix lint/build/test failures → re-evaluate
+- **Flow B — Contextual Evolution**: patch Spec ACs/constraints → re-execute impacted tasks → re-evaluate
+- **Flow C — Lateral Thinking**: when stagnation detected, rotate through Multistability / Simplicity / Reification / Invariance personas
 ## Passthrough Mode
@@ -82,3 +111,242 @@ API 키 없이 MCP 서버 실행 시 자동 활성화. LLM 작업을 caller가
 - 각 단계 결과는 이전 단계 데이터와 교차 검증됨
 - Continuity 단계에서는 서버 측 DAG 검증이 추가로 수행됨
 - 모든 AC가 분류되어야 하고, 모든 Task가 그룹에 포함되어야 함
+---
+## Phase 2 — Execution
+### `execute_start` — 실행 시작
+`plan_complete` 이후 호출. 태스크 목록을 받아 실행 준비.
+```json
+{ "action": "execute_start", "sessionId": "..." }
+```
+→ `{ status, sessionId, executionPlan, message }`
+---
+### Role Agent 플로우 (태스크당, 선택적)
+태스크 내용과 관련된 Role Agent가 있을 경우 role_match → role_consensus 순으로 호출해 guidance를 받는다. 문서 작성, 보안, 성능, 아키텍처 등 전문 영역이 필요한 태스크에 특히 유효하다.
+**`role_match` — 관련 에이전트 매칭 (2-Call)**
+```json
+// Call 1: 매칭 컨텍스트 요청
+{ "action": "role_match", "sessionId": "..." }
+```
+→ `{ matchContext }` — 어떤 에이전트가 적합한지 판단하기 위한 프롬프트
+```json
+// Call 2: 매칭 결과 제출
+{
+  "action": "role_match",
+  "sessionId": "...",
+  "matchResult": [
+    { "agentName": "technical-writer", "domain": ["documentation"], "relevanceScore": 0.9, "reasoning": "..." },
+    { "agentName": "architect", "domain": ["architecture"], "relevanceScore": 0.7, "reasoning": "..." }
+  ]
+}
+```
+→ `{ perspectivePrompts }` — 각 에이전트별 관점 생성 프롬프트
+**`role_consensus` — 다중 관점 합의 (2-Call)**
+```json
+// Call 1: 각 에이전트 관점 수집 후 제출
+{
+  "action": "role_consensus",
+  "sessionId": "...",
+  "perspectives": [
+    { "agentName": "technical-writer", "perspective": "...", "confidence": 0.9 },
+    { "agentName": "architect", "perspective": "...", "confidence": 0.8 }
+  ]
+}
+```
+→ `{ synthesisContext }` — 관점 통합 프롬프트
+```json
+// Call 2: 합성된 합의 제출
+{
+  "action": "role_consensus",
+  "sessionId": "...",
+  "consensus": {
+    "consensus": "통합된 가이드라인",
+    "conflictResolutions": ["...", "..."],
+    "perspectives": [...]
+  }
+}
+```
+→ `{ roleGuidance }` — execute_task 시 참조할 최종 guidance
+---
+### `execute_task` — 태스크 실행 결과 제출
+role_match/role_consensus로 얻은 `roleGuidance`를 참조해 태스크를 수행한 후 결과 제출.
+`allTasksCompleted === true`가 될 때까지 반복.
+```json
+{
+  "action": "execute_task",
+  "sessionId": "...",
+  "taskResult": {
+    "taskId": "task-0",
+    "status": "completed",
+    "output": "태스크 수행 결과 요약",
+    "artifacts": ["path/to/file.ts"]
+  }
+}
+```
+→ `{ status, nextTaskId?, allTasksCompleted, driftResult? }`
+`driftResult`가 반환되면 Spec과의 drift 경고 — 계속 진행하되 다음 태스크에서 방향 보정.
+---
+## Phase 3 — Evaluate
+모든 태스크 완료 후 3-Call 평가 진행.
+**Call 1 — Structural 단계 시작**
+```json
+{ "action": "evaluate", "sessionId": "..." }
+```
+→ `{ stage: "structural", structuralContext }` — lint/build/test 실행 지시
+**Call 2 — Structural 결과 제출**
+```json
+{
+  "action": "evaluate",
+  "sessionId": "...",
+  "structuralResult": {
+    "commands": [
+      { "name": "lint", "command": "pnpm run lint", "exitCode": 0, "output": "" },
+      { "name": "build", "command": "pnpm run build", "exitCode": 0, "output": "" },
+      { "name": "test", "command": "pnpm run test", "exitCode": 0, "output": "360 tests passed" }
+    ],
+    "allPassed": true
+  }
+}
+```
+→ structural 실패 시 `{ stage: "structural_failed", evolveContext }` → Evolve Flow A 진입
+→ structural 통과 시 `{ stage: "contextual", evaluationContext }` — AC별 LLM 검증 지시
+**Call 3 — Contextual 결과 제출**
+```json
+{
+  "action": "evaluate",
+  "sessionId": "...",
+  "evaluationResult": {
+    "verifications": [
+      { "acIndex": 0, "satisfied": true, "evidence": "...", "gaps": [] }
+    ],
+    "overallScore": 0.92,
+    "goalAlignment": 0.88,
+    "recommendations": []
+  }
+}
+```
+→ `{ status: "completed" }` (score ≥ 0.85, goalAlignment ≥ 0.80)
+→ 미달 시 `{ evolveContext }` → Evolve Flow B 진입
+---
+## Phase 4 — Evolve
+### Flow A — Structural Fix
+```json
+// 1. Fix context 요청
+{ "action": "evolve_fix", "sessionId": "..." }
+→ fixContext 반환
+// 2. Fix 수행 후 결과 제출
+{
+  "action": "evolve_fix",
+  "sessionId": "...",
+  "fixTasks": [
+    { "taskId": "fix-0", "failedCommand": "pnpm run lint", "errorOutput": "...", "fixDescription": "...", "artifacts": [] }
+  ]
+}
+// 3. Re-evaluate (Phase 3 반복)
+{ "action": "evaluate", "sessionId": "..." }
+```
+### Flow B — Contextual Evolution
+```json
+// 1. Evolution context 요청
+{ "action": "evolve", "sessionId": "..." }
+→ evolveContext (또는 terminateReason으로 종료)
+// 2. Spec patch 제출 (AC/constraints 수정, goal 변경 불가)
+{
+  "action": "evolve_patch",
+  "sessionId": "...",
+  "specPatch": {
+    "acceptanceCriteria": ["수정된 AC..."],
+    "constraints": ["추가 제약조건..."]
+  }
+}
+→ { impactedTaskIds, reExecuteContext }
+// 3. 영향받은 태스크 재실행 (allTasksCompleted까지 반복)
+{
+  "action": "evolve_re_execute",
+  "sessionId": "...",
+  "reExecuteTaskResult": { "taskId": "task-3", "status": "completed", "output": "...", "artifacts": [] }
+}
+// 4. Re-evaluate
+{ "action": "evaluate", "sessionId": "..." }
+```
+### Flow C — Lateral Thinking (stagnation 감지 시 자동 분기)
+`evolve` 호출 시 stagnation/oscillation/hard_cap이 감지되면 자동으로 lateral thinking persona로 전환.
+```json
+// evolve 호출 → lateralContext 반환
+{ "action": "evolve", "sessionId": "..." }
+→ { status: "lateral_thinking", lateralContext: { persona, pattern, lateralPrompt, ... } }
+// Lateral result 제출
+{
+  "action": "evolve_lateral_result",
+  "sessionId": "...",
+  "lateralResult": {
+    "persona": "multistability",
+    "specPatch": { "acceptanceCriteria": [...] },
+    "description": "관점 전환으로 요구사항 재구성"
+  }
+}
+// Re-execute + Re-evaluate (Flow B와 동일)
+// 다음 persona 요청 (점수 미달 시)
+{ "action": "evolve_lateral", "sessionId": "..." }
+```
+| Stagnation 패턴 | Persona | 전략 |
+|---|---|---|
+| hard_cap | Multistability | 다른 각도로 보기 |
+| oscillation | Simplicity | 단순하게 줄이기 |
+| no_drift | Reification | 빠진 조각 채우기 |
+| diminishing_returns | Invariance | 성공 패턴 복제 |
+4개 persona 소진 → `human_escalation` 반환으로 세션 종료.
+### 종료 조건
+| 조건 | 트리거 |
+|------|--------|
+| `success` | score ≥ 0.85 AND goalAlignment ≥ 0.80 |
+| `stagnation` | 2회 연속 delta < 0.05 |
+| `oscillation` | 2회 연속 점수 역전 |
+| `hard_cap` | structural 3회 + contextual 3회 실패 |
+| `caller` | `{ action: "evolve", terminateReason: "caller" }` |
+| `human_escalation` | 4개 lateral persona 소진 |