npm - agestra - Versions diffs - 4.1.1 → 4.3.0 - Mend

agestra 4.1.1 → 4.3.0

This diff shows the contents of publicly available package versions released to a supported registry. It is provided for informational purposes only and reflects the changes between versions as they appear in their public registries.

Files changed (26)

package/.claude-plugin/marketplace.json +2 -5
package/.claude-plugin/plugin.json +13 -11
package/README.ko.md +80 -24
package/README.md +80 -24
package/agents/agestra-designer.md +122 -0
package/agents/{ideator.md → agestra-ideator.md} +43 -9
package/agents/agestra-moderator.md +253 -0
package/agents/{qa.md → agestra-qa.md} +34 -6
package/agents/{reviewer.md → agestra-reviewer.md} +46 -3
package/agents/agestra-team-lead.md +331 -0
package/commands/design.md +46 -32
package/commands/idea.md +45 -31
package/commands/review.md +45 -31
package/dist/bundle.js +235 -26776
package/hooks/user-prompt-submit.md +11 -0
package/package.json +2 -1
package/skills/build-fix.md +76 -0
package/skills/cancel.md +68 -0
package/skills/design.md +115 -0
package/skills/idea.md +144 -0
package/skills/provider-guide.md +105 -19
package/skills/trace.md +61 -0
package/skills/worker-manage.md +75 -0
package/agents/designer.md +0 -78
package/agents/moderator.md +0 -84
package/agents/team-lead.md +0 -167

package/hooks/user-prompt-submit.md ADDED

@@ -0,0 +1,11 @@
+---
+event: UserPromptSubmit
+---
+If the user's message involves code review, design discussion, idea exploration, comparing options, validation, or parallel work — output exactly:
+```
+AGESTRA_SUGGESTION
+```
+Otherwise output nothing.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "agestra",
-  "version": "4.1.1",
+  "version": "4.3.0",
   "description": "Claude Code plugin — orchestrate Ollama, Gemini, and Codex for multi-AI debates, cross-validation, and GraphRAG memory",
   "type": "module",
   "packageManager": "npm@11.11.0",
@@ -18,6 +18,7 @@
     ".claude-plugin/",
     "agents/",
     "commands/",
+    "hooks/",
     "skills/"
   ],
   "scripts": {

package/skills/build-fix.md ADDED

@@ -0,0 +1,76 @@
+---
+name: build-fix
+description: >
+  Use when build fails, TypeScript type errors occur, lint errors need fixing,
+  or compilation errors block progress. Triggers on: build failure output,
+  "tsc" errors, "build failed", "fix build", "type error", "lint error",
+  "compilation error".
+---
+## Purpose
+Automatically diagnoses and fixes build/typecheck/lint errors with minimal scope changes, one error at a time, highest-impact first.
+## Strategy: One-at-a-Time
+Many build errors are cascading — one root cause produces multiple error messages. Fixing all at once risks unnecessary changes. Instead:
+1. Fix the **first** (or most impactful) error
+2. Rebuild to see which errors remain
+3. Repeat until clean
+## Workflow
+### Step 1: Identify Errors
+Run the appropriate build command for the project:
+| Detected Project Type | Command |
+|---|---|
+| `tsconfig.json` present | `npx tsc --noEmit 2>&1` |
+| `turbo.json` present | `npx turbo build 2>&1` |
+| `package.json` with `build` script | `npm run build 2>&1` |
+| ESLint configured | `npx eslint . 2>&1` |
+If the user provided specific error output, use that instead of re-running.
+### Step 2: Triage
+Parse errors and rank by impact:
+1. **Syntax errors** — block all downstream compilation
+2. **Missing imports/exports** — cascade to many dependents
+3. **Type mismatches** — usually isolated
+4. **Lint warnings** — lowest priority
+### Step 3: Fix Loop (max 5 cycles)
+For each cycle:
+1. Read the file containing the highest-priority error
+2. Diagnose the root cause (not the symptom)
+3. Apply the **minimal** fix — do not refactor surrounding code
+4. Re-run the build command
+5. If errors remain, continue to next cycle
+6. If no errors remain, or the same error persists 3 times, stop
+### Step 4: Report
+Present results to the user:
+```
+Build Fix Summary
+- Cycles: {n}
+- Errors fixed: {count}
+- Remaining errors: {count or "none"}
+- Files modified: {list}
+```
+If errors remain after 5 cycles, list them and suggest manual investigation.
+## Constraints
+- **Minimal changes only** — fix the error, nothing else
+- **No refactoring** — do not "improve" code while fixing
+- **No new dependencies** — do not add packages to fix type errors
+- **Preserve behavior** — fixes must not change runtime behavior
+- **Read before edit** — always read the full file before modifying
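The triage order and the stop conditions of the fix loop in `build-fix.md` can be sketched as follows. This is only an illustration of the control flow, not code from the package: the `build` and `fix` callables, the `(kind, message)` error shape, and the kind names are assumptions made for the example.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Priority order from the Triage step; a lower number is fixed first.
# The kind names are hypothetical labels for the four categories in the skill.
PRIORITY = {"syntax": 1, "import": 2, "type": 3, "lint": 4}

MAX_CYCLES = 5      # "Fix Loop (max 5 cycles)"
MAX_SAME_ERROR = 3  # stop once the same error has persisted 3 times

Error = Tuple[str, str]  # (kind, message)


def pick_next(errors: List[Error]) -> Error:
    """Return the highest-priority error; ties keep build-output order."""
    return min(errors, key=lambda e: PRIORITY[e[0]])


def fix_loop(build: Callable[[], List[Error]],
             fix: Callable[[Error], None]) -> dict:
    """Fix one error per cycle, rebuilding after each fix."""
    attempts: Counter = Counter()
    cycles = 0
    errors = build()
    while errors and cycles < MAX_CYCLES:
        target = pick_next(errors)
        if attempts[target] >= MAX_SAME_ERROR:
            break  # the same error survived three fixes: give up
        fix(target)
        attempts[target] += 1
        cycles += 1
        errors = build()  # rebuild to see which errors remain
    return {"cycles": cycles, "remaining": errors}
```

Rebuilding after every single fix is what lets cascading errors disappear on their own instead of being patched individually.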

package/skills/cancel.md ADDED

@@ -0,0 +1,68 @@
+---
+name: cancel
+description: >
+  Use when the user wants to stop, cancel, or abort a running operation.
+  Triggers on: "cancel", "stop", "abort", "enough", "quit", "중단", "취소",
+  "그만". Performs graceful shutdown with state cleanup.
+---
+## Purpose
+Gracefully cancels running Agestra operations and cleans up associated state. Detects which operation is active and performs appropriate cleanup.
+## Detection
+Check for active operations in this order:
+1. **CLI Workers** — Call `cli_worker_status` to check for workers in RUNNING or SPAWNING state
+2. **Debate** — Call `agent_debate_status` to check for active debates
+3. **Task Chain** — Call `agent_task_chain_status` to check for running chains
+4. **Task** — Call `agent_task_status` for individual running tasks
+5. **Background agents** — Check for any spawned background agents still running
+If nothing is detected as active, inform the user: "No active Agestra operations found."
+If multiple types are active, list them all and ask the user which to cancel (or all).
+## Cleanup by Operation Type
+### CLI Workers
+1. List all workers in RUNNING/SPAWNING state with their provider, elapsed time, and task description.
+2. Ask the user which to stop (or all):
+   - Single worker: call `cli_worker_stop` with the worker ID.
+   - All workers: call `cli_worker_stop` for each.
+3. Workers receive SIGTERM, then SIGKILL after 5 seconds.
+4. Worktrees are cleaned up automatically.
+5. Report: which workers were stopped, any partial results available via `cli_worker_collect`.
+### Debate
+1. Call `agent_debate_conclude` with a summary noting early termination
+2. Inform the user which providers participated and how many rounds completed
+### Task Chain
+1. Note the current step and remaining steps
+2. Let the current step finish if nearly complete, otherwise stop
+3. Report: completed steps, skipped steps, any partial results
+### Individual Task
+1. Wait for current provider response if in-flight (do not interrupt mid-response)
+2. Report the task status and any partial results
+### Background Agents
+1. List running background agents with their descriptions
+2. Ask the user which to cancel (or all)
+3. Stop selected agents
+## Post-Cleanup
+After cancellation:
+- Summarize what was stopped and what completed
+- Note any artifacts produced (debate documents, partial results, worker diffs)
+- If CLI workers produced changes before stopping, mention that partial diffs may be available via `cli_worker_collect`
+- If the operation produced useful partial work, mention it so the user can resume later
+## Constraints
+- **Never discard results silently** — always report what was produced before cancellation
+- **Prefer graceful over forced** — let in-flight operations finish when possible
+- **Ask before bulk cancel** — if multiple operations are running, confirm which to stop
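`cancel.md` states that workers receive SIGTERM, then SIGKILL after 5 seconds. How agestra implements this is not visible in the diff; the sketch below shows the general pattern using Python's standard `subprocess` module. The function names and the sleeping child process are assumptions made for the example.

```python
import subprocess
import sys

GRACE_SECONDS = 5.0  # from the skill: SIGKILL follows SIGTERM after 5 seconds


def spawn_sleeper() -> subprocess.Popen:
    """Demo helper (hypothetical): a child process that sleeps for a minute."""
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])


def stop_process(proc: subprocess.Popen, grace: float = GRACE_SECONDS) -> str:
    """Ask the process to exit, then force-kill it if the grace period expires."""
    if proc.poll() is not None:
        return "already-exited"
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        proc.wait()
        return "killed"
```

Returning the outcome as a string makes it easy to report which workers exited gracefully and which had to be killed, in line with the "never discard results silently" constraint.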

package/skills/design.md ADDED

@@ -0,0 +1,115 @@
+---
+name: agestra-design
+description: >
+  Use when exploring architecture, discussing design trade-offs, planning implementation approaches,
+  or structuring a feature before writing code. Triggers on: "design this", "how should I architect",
+  "what's the best approach", "explore approaches", "design trade-offs", "before implementing",
+  "설계", "아키텍처", "구조 잡아줘", "어떻게 만들지", "방향 잡아줘",
+  "設計", "アーキテクチャ", "架构", "设计"
+---
+## Purpose
+Pre-implementation design exploration. Understand intent through targeted questions, explore the codebase for existing patterns, propose multiple approaches with trade-offs, and produce a design document.
+## Scope
+Design features and systems **for the current project**. If the request is outside this project's scope (a new product idea, a business question, or something unrelated to this codebase), say so and suggest `/agestra idea` for open exploration instead.
+## Workflow
+### Phase 1: Clarity Gate
+Before asking questions, check if the request is already clear. If it includes specific file paths, function names, or concrete acceptance criteria, skip the interview.
+**Clarity Dimensions:**
+| Dimension | Weight (greenfield) | Weight (brownfield) |
+|-----------|-------------------|-------------------|
+| Goal | 40% | 35% |
+| Constraints | 30% | 25% |
+| Success Criteria | 30% | 25% |
+| Context | N/A | 15% |
+Greenfield: no relevant source code exists for the feature.
+Brownfield: modifying or extending existing code.
+**After each user answer:**
+1. Score all dimensions 0.0–1.0
+2. Calculate: `ambiguity = 1 - weighted_sum`
+3. Display progress:
+   ```
+   Round {n} | Ambiguity: {score}% | Targeting: {weakest dimension}
+   ```
+4. If ambiguity <= 20% → proceed to Phase 2
+5. If ambiguity > 20% → ask the next question targeting the WEAKEST dimension
+**Question targeting:** Always target the dimension with the lowest score. Ask ONE question at a time. Expose assumptions, not feature lists.
+| Dimension | Question Style |
+|-----------|---------------|
+| Goal | "What exactly happens when...?" / "What specific action does a user take first?" |
+| Constraints | "What are the boundaries?" / "Should this work offline?" |
+| Success Criteria | "How do we know it works?" / "What would make you say 'yes, that's it'?" |
+| Context (brownfield) | "How does this fit with existing...?" / "Extend or replace?" |
+**Challenge modes** (each used once, then return to normal):
+- Round 4+: **Contrarian** — "What if the opposite were true? What if this constraint doesn't actually exist?"
+- Round 6+: **Simplifier** — "What's the simplest version that would still be valuable?"
+- Round 8+: **Ontologist** (if ambiguity still > 30%) — "What IS this, really? One sentence."
+**Soft limits:**
+- Round 3+: allow early exit if user says "enough" — show ambiguity warning
+- Round 10: soft warning — "We're at 10 rounds. Current ambiguity: {score}%. Continue or proceed?"
+- Round 20: hard cap — proceed with current clarity, note the risk
+### Phase 2: Explore
+Search the codebase for relevant existing patterns:
+- Use Glob to find related files by name
+- Use Grep to find similar implementations
+- Use Read to understand existing architecture
+- Note conventions: naming, file organization, patterns used
+### Phase 3: Propose
+Present 2-3 distinct approaches. For each:
+- **Approach name** — one-line summary
+- **How it works** — architecture overview
+- **Fits with** — which existing patterns it aligns with
+- **Trade-offs** — pros and cons
+- **Effort** — relative complexity (low/medium/high)
+### Phase 4: Refine
+Based on user feedback:
+- Deep-dive into the selected approach
+- Address concerns raised
+- Detail component boundaries and data flow
+- Identify risks and mitigation
+### Phase 5: Document
+Write a design document to `docs/plans/` with this structure:
+```markdown
+# [Feature/System Name] Design
+## Problem
+## Approach
+## Architecture
+## Components
+## Data Flow
+## Trade-offs & Decisions
+## Open Questions
+## Implementation Steps
+```
+## Constraints
+- Ask one question at a time. Do not dump multiple questions.
+- Present approaches before solutions. Let the user choose direction.
+- Always explore the codebase before proposing — do not design in a vacuum.
+- Document all decisions made during the conversation in the final design document.
+- Do not write implementation code. Design documents only.
+- Communicate in the user's language.
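The ambiguity score in the Clarity Gate of `design.md` is a weighted sum, so it can be worked through numerically. The sketch below uses the weights and the 20% threshold from the skill; the dictionary keys and function names are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

# Weights from the Clarity Dimensions table.
WEIGHTS: Dict[str, Dict[str, float]] = {
    "greenfield": {"goal": 0.40, "constraints": 0.30, "success_criteria": 0.30},
    "brownfield": {"goal": 0.35, "constraints": 0.25,
                   "success_criteria": 0.25, "context": 0.15},
}
THRESHOLD = 0.20  # proceed to Phase 2 when ambiguity <= 20%


def ambiguity(scores: Dict[str, float], project_type: str) -> float:
    """ambiguity = 1 - weighted_sum, with every score in the range 0.0 to 1.0."""
    weights = WEIGHTS[project_type]
    weighted_sum = sum(weights[dim] * scores[dim] for dim in weights)
    return 1.0 - weighted_sum


def next_step(scores: Dict[str, float],
              project_type: str) -> Tuple[str, Optional[str]]:
    """Return ("proceed", None) or ("ask", weakest_dimension)."""
    if ambiguity(scores, project_type) <= THRESHOLD:
        return ("proceed", None)
    weights = WEIGHTS[project_type]
    weakest = min(weights, key=lambda dim: scores[dim])
    return ("ask", weakest)
```

For example, greenfield scores of 0.9 (goal), 0.8 (constraints), and 0.7 (success criteria) give a weighted sum of 0.81 and an ambiguity of 19%, which is just under the threshold.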

package/skills/idea.md ADDED

@@ -0,0 +1,144 @@
+---
+name: agestra-idea
+description: >
+  Use when discovering improvements, comparing with similar projects, collecting user feedback,
+  exploring new features, researching what to build, or validating ideas. Triggers on:
+  "find improvements", "what should I add", "compare with competitors", "what are users asking for",
+  "explore ideas", "feature ideas", "what's missing", "is this worth building", "what do users want",
+  "what problem does this solve", "who would use this", "what should I focus on next",
+  "개선점", "뭐 추가하면 좋을까", "아이디어", "유사 프로젝트", "뭐가 부족해",
+  "이거 만들 가치가 있어?", "다음에 뭘 해야 할까", "비슷한 도구",
+  "改善", "アイデア", "改进", "想法"
+---
+## Purpose
+Idea and improvement discovery. Research similar projects, collect user complaints and feature requests, compare capabilities, and generate actionable suggestions.
+## Scope
+**Mode A: Existing project** — The codebase has a README or meaningful code.
+Research improvements, missing features, and competitive gaps for this project.
+**Mode B: New project** — The codebase is empty/new, but the user has a seed idea (e.g., "I want to build a writing tool").
+Research the landscape: what already exists, what users complain about, what gaps remain.
+**Out of scope:** Requests with no seed idea at all (e.g., "what should I build?"). You need at least a domain or concept to anchor research. Ask for one:
+> "I need at least a rough idea — a domain, a tool type, or a problem you want to solve. For example: 'a writing tool', 'a CLI for deployment', 'something for managing bookmarks'."
+## Workflow
+### Phase 1: Clarity Gate
+Before researching, understand what the user needs through targeted questions. Ask ONE question at a time. Communicate in the user's language.
+**Step 1: Determine mode.**
+- If the codebase has a README or meaningful code → Mode A (existing project)
+- If the codebase is empty/new but user has a seed idea → Mode B (new project)
+**Step 2: Mode-specific interview.**
+**Mode A — Existing project:**
+| Dimension | Question | Purpose |
+|-----------|----------|---------|
+| Direction | "What aspect are you looking to improve? (features, UX, performance, integrations, DX)" | Narrow the research scope |
+| Audience | "Who are your current users? What do they use it for most?" | Target the right competitors |
+| Feedback | "Have you received any complaints or feature requests?" | Direct pain point input |
+| Competition | "Are there specific competitors or similar tools you're aware of?" | Seed the research |
+| Strength | "What do you consider your project's unique strength?" | Avoid suggesting what already works |
+| Constraints | "Any areas you don't want to change or can't change?" | Set research boundaries |
+After gathering context:
+- Read the project's README and key files to understand what it does
+- Use Glob and Grep to map the current feature set
+- Identify the project's category and target audience
+**Mode B — New project:**
+| Dimension | Question | Purpose |
+|-----------|----------|---------|
+| Problem | "What problem are you trying to solve?" | Core motivation |
+| Audience | "Who would use this? What's the target audience?" | Market focus |
+| Form | "How do you envision it? (CLI, web app, library, service, plugin)" | Shape the research |
+| Inspiration | "What inspired this? Have you seen something similar?" | Seed the research |
+| Core | "What's the single most important thing it must do well?" | Prioritization anchor |
+| Boundary | "What should it NOT be? Where do you draw the line?" | Scope limits |
+**Early exit:** If the user provides enough context upfront (specific competitors, clear scope, concrete goals), skip remaining questions and proceed to Phase 2. Do not force unnecessary rounds.
+### Phase 2: Research Similar Projects
+- Use WebSearch to find similar tools, libraries, and projects
+- Look for: direct competitors, adjacent tools, inspirational projects
+- Collect names, URLs, and key differentiators
+### Phase 3: Collect Pain Points
+- WebSearch for complaints about similar tools (GitHub issues, forums, discussions)
+- WebFetch relevant issue pages and discussion threads
+- Identify recurring themes in user feedback
+- Note what users wish existed but doesn't
+### Phase 4: Feature Comparison
+Build a comparison table:
+| Feature | This Project | Competitor A | Competitor B |
+|---------|-------------|-------------|-------------|
+| Feature 1 | Yes/No | Yes/No | Yes/No |
+### Phase 5: Generate Suggestions
+For each suggestion:
+- **Title** — clear, actionable name
+- **Category** — UX, Performance, Feature, Integration, DX
+- **Source** — where this idea came from (competitor, user complaint, own analysis)
+- **Priority** — HIGH / MEDIUM / LOW with rationale
+- **Effort** — estimated complexity
+- **Description** — what it does and why it matters
+### Phase 6: Prioritized Recommendations
+Present a ranked list:
+1. **Quick wins** — high impact, low effort
+2. **Strategic investments** — high impact, high effort
+3. **Nice-to-haves** — low impact, low effort
+## Output Format
+```markdown
+## Research Summary
+### Similar Projects
+(list with URLs and key features)
+### User Pain Points
+(categorized complaints from research)
+### Feature Comparison
+(table)
+### Recommendations
+#### Quick Wins
+1. ...
+#### Strategic Investments
+1. ...
+#### Nice-to-Haves
+1. ...
+### Sources
+- [Source 1](url)
+- [Source 2](url)
+```
+## Constraints
+- Always include source URLs for claims about other projects.
+- Do not fabricate features of competitors — verify via web research.
+- Prioritize actionable suggestions over theoretical improvements.
+- Communicate in the user's language.
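Phase 6 of `idea.md` groups suggestions by impact and effort. A minimal sketch of that grouping follows, assuming each suggestion carries `impact` and `effort` fields with the values `"high"` or `"low"`; those field names are hypothetical. The skill does not name a bucket for low-impact, high-effort items, so the sketch leaves them out rather than inventing one.

```python
from typing import Dict, List, Optional, Tuple

# (impact, effort) -> bucket name, as listed in Phase 6.
BUCKETS: Dict[Tuple[str, str], str] = {
    ("high", "low"): "Quick wins",
    ("high", "high"): "Strategic investments",
    ("low", "low"): "Nice-to-haves",
}
ORDER = ["Quick wins", "Strategic investments", "Nice-to-haves"]


def bucket_for(impact: str, effort: str) -> Optional[str]:
    """Return the bucket name, or None for combinations the skill does not list."""
    return BUCKETS.get((impact, effort))


def rank(suggestions: List[dict]) -> Dict[str, List[str]]:
    """Group suggestion titles into the three buckets, in presentation order."""
    ranked: Dict[str, List[str]] = {name: [] for name in ORDER}
    for item in suggestions:
        name = bucket_for(item["impact"], item["effort"])
        if name is not None:
            ranked[name].append(item["title"])
    return ranked
```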

package/skills/provider-guide.md CHANGED

@@ -3,17 +3,26 @@ name: provider-guide
 description: >
   Use when routing tasks to AI providers, using any agestra MCP tool,
   reviewing code with multiple providers, starting debates, dispatching
-  parallel tasks, or cross-validating work. Also triggers on mentions of
-  Ollama, Gemini, or Codex providers.
+  parallel tasks, cross-validating work, or managing CLI workers. Also
+  triggers on mentions of Ollama, Gemini, or Codex providers.
 ---
 ## Available Providers
 - **Ollama** — Local models. Detected at runtime via `ollama_models`.
-- **Gemini** — Cloud agent. Full capability.
-- **Codex** — Cloud agent. Full capability.
+- **Gemini** — Cloud agent. Full capability. Can run as autonomous CLI worker.
+- **Codex** — Cloud agent. Full capability. Can run as autonomous CLI worker.
-All providers are detected at runtime. Check `provider_list` or `provider_health` for current availability before routing.
+All providers are detected at runtime. Call `environment_check` for a full capability map, or `provider_list` / `provider_health` for provider availability.
+## Environment Check
+At session start or on demand, `environment_check` provides:
+- CLI tool availability (codex, gemini, tmux)
+- Ollama models with size-based tier classification
+- Git worktree support
+- Available modes: `claude_only`, `independent`, `debate`, `team`
+- Whether autonomous CLI workers can be spawned
 ## Provider Capability Guidelines
@@ -32,7 +41,46 @@ Models change frequently. Always call `ollama_models` before assigning tasks.
 ### Gemini / Codex (Cloud)
-Full-capability agents. Use for complex tasks, parallel work, and as validators.
+Full-capability agents. Use for:
+- Complex tasks via `ai_chat` or `agent_assign_task` (text response)
+- Autonomous coding via `cli_worker_spawn` (file modifications in worktree)
+- Parallel work and as validators
+## Work Modes
+### Text Work (review/design/idea)
+Three modes available via `/agestra review`, `/agestra design`, `/agestra idea`:
+| Mode | Description | When to Use |
+|------|-------------|-------------|
+| **Claude only** | Specialist agent works alone | Quick analysis, no external AI needed |
+| **각자 독립** (independent) | Each AI works independently → moderator aggregates | Want multiple perspectives, fast |
+| **끝장토론** (debate to the finish) | Independent work + document review rounds until consensus | Need thorough, agreed-upon analysis |
+### Implementation Work (actual implementation)
+Two modes available via team-lead orchestration:
+| Mode | Description | When to Use |
+|------|-------------|-------------|
+| **Claude만으로** (Claude alone) | Claude directly implements with project/global agents | Simple tasks, 1-2 files |
+| **다른 AI도 함께** (together with other AIs) | CLI workers do autonomous coding, Claude supervises | Complex tasks, 3+ files, parallelizable |
+## CLI Workers
+CLI workers spawn Codex or Gemini in `--full-auto` mode within isolated git worktrees.
+| Tool | Purpose |
+|------|---------|
+| `cli_worker_spawn` | Spawn autonomous CLI worker with task manifest |
+| `cli_worker_status` | Check worker FSM state, output, heartbeat |
+| `cli_worker_collect` | Collect completed worker results (diff, output) |
+| `cli_worker_stop` | Stop worker (SIGTERM → SIGKILL) + cleanup |
+Worker lifecycle: SPAWNING → RUNNING → COLLECTING → COMPLETED (or FAILED/CANCELLED/TIMEOUT)
+Use the `worker-manage` skill for user-friendly worker operations.
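The worker lifecycle line above reads as a small state machine. The sketch below encodes it under one assumption the diff does not confirm: any non-terminal state may move to FAILED, CANCELLED, or TIMEOUT. The class and function names are hypothetical.

```python
from enum import Enum


class WorkerState(Enum):
    SPAWNING = "SPAWNING"
    RUNNING = "RUNNING"
    COLLECTING = "COLLECTING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"
    TIMEOUT = "TIMEOUT"


# SPAWNING -> RUNNING -> COLLECTING -> COMPLETED, as stated in the guide.
HAPPY_PATH = [WorkerState.SPAWNING, WorkerState.RUNNING,
              WorkerState.COLLECTING, WorkerState.COMPLETED]
TERMINAL = {WorkerState.COMPLETED, WorkerState.FAILED,
            WorkerState.CANCELLED, WorkerState.TIMEOUT}


def can_transition(current: WorkerState, target: WorkerState) -> bool:
    """Check whether a transition is allowed under the assumptions above."""
    if current in TERMINAL:
        return False  # terminal states never change
    if target in TERMINAL and target is not WorkerState.COMPLETED:
        return True   # assumption: any live worker can fail, be cancelled, or time out
    # Otherwise only the next step on the happy path is allowed.
    return HAPPY_PATH[HAPPY_PATH.index(current) + 1] is target
```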
 ## Auto-Routing Guidelines
@@ -40,7 +88,8 @@ Full-capability agents. Use for complex tasks, parallel work, and as validators.
 |---|---|
 | Simple (formatting, pattern matching) | Ollama local model preferred |
 | Moderate (code review, summarization) | Ollama >= 3 GB or cloud |
-| Complex (architecture, refactoring) | Cloud providers (Gemini, Codex) |
+| Complex implementation (multi-file, multi-step) | CLI worker (Codex/Gemini) |
+| Complex analysis (architecture, refactoring) | Cloud providers (Gemini, Codex) via ai_chat |
 | No providers available | Handle directly — do not suggest agestra tools |
 ## When to Suggest Agestra Tools
@@ -51,37 +100,48 @@ Match by **semantic intent**, not literal keywords. These triggers apply in any
 | Intent | Tool | When |
 |---|---|---|
-| Code review, review request | `agent_debate_start` or `workspace_create_review` | User asks to review code, PR, or implementation |
-| Second opinion, other perspectives | `ai_compare` or `agent_debate_start` | User wants multiple viewpoints on a decision |
+| Code review, review request | `/agestra review` or `workspace_create_review` | User asks to review code, PR, or implementation |
+| Second opinion, other perspectives | `ai_compare` or `/agestra review` (각자 독립, independent mode) | User wants multiple viewpoints on a decision |
 | Validation, verification, cross-check | `agent_cross_validate` | User wants to confirm correctness of work output |
-| Speed up, parallelize, split work | `agent_dispatch` | User wants faster execution or has independent tasks |
+| Speed up, parallelize, split work | `agent_dispatch` or CLI workers | User wants faster execution or has independent tasks |
 | Past experience, history, previous attempts | `memory_search` or `memory_dead_ends` | User asks about prior work or known issues |
 | Remember this, save for later | `memory_store` | User wants to persist knowledge across sessions |
 | Mention a provider by name (Gemini, Codex, Ollama) | `ai_chat` or `agent_assign_task` | Route directly to the named provider |
-| Architecture review, design discussion | `agent_debate_start` | Structured multi-AI discussion on design choices |
+| Architecture review, design discussion | `/agestra design` | Structured multi-AI architecture exploration |
 | Compare options, which is better | `ai_compare` | Side-by-side comparison from multiple providers |
-| Large refactoring, many files to change | `agent_dispatch` | Split by file/module for parallel processing |
+| Large refactoring, many files to change | CLI workers or `agent_dispatch` | Split by file/module for parallel processing |
 | About to commit, create PR, finalize work | `agent_cross_validate` | Pre-commit validation by other AI providers |
+| Check worker status, manage workers | `worker-manage` skill | User asks about running workers |
 ### Commands and Agents
 | Command | Specialist Agent | Purpose |
 |---------|-----------------|---------|
-| `/agestra review` | `reviewer` | Post-implementation quality verification |
-| `/agestra idea` | `ideator` | Improvement discovery and competitive analysis |
-| `/agestra design` | `designer` | Pre-implementation architecture exploration |
+| `/agestra review` | `agestra-reviewer` | Post-implementation quality verification |
+| `/agestra idea` | `agestra-ideator` | Improvement discovery and competitive analysis |
+| `/agestra design` | `agestra-designer` | Pre-implementation architecture exploration |
-When "Debate" is selected, `moderator` facilitates while the specialist provides Claude's perspective.
+### Utility Skills
-Commands and hook-triggered suggestions share the same 4-choice pattern. Commands are explicit entry points; hooks detect intent from natural language.
+| Skill | Purpose |
+|-------|---------|
+| `trace` | View agent execution timeline, summary stats, and flow visualization |
+| `build-fix` | Auto-diagnose and fix build/typecheck/lint errors one at a time |
+| `cancel` | Gracefully stop running operations (including CLI workers) with state cleanup |
+| `worker-manage` | List, check, collect, and stop CLI workers |
+When "각자 독립" (independent) is selected, each AI works independently and `agestra-moderator` aggregates results.
+When "끝장토론" (debate to the finish) is selected, `agestra-moderator` facilitates document review rounds after independent aggregation.
+Commands and hook-triggered suggestions share the same 3-choice pattern (Claude only / 각자 독립 [independent] / 끝장토론 [debate to the finish]). Commands are explicit entry points; hooks detect intent from natural language.
 ### Hook-Triggered Choice
 When an `AGESTRA_SUGGESTION` marker appears from the UserPromptSubmit hook, present these choices:
 1. **Claude only** — Claude Code handles it alone
-2. **Compare** — Send the same prompt to multiple AIs, compare responses (`ai_compare`)
-3. **Debate** — AIs discuss until consensus is reached (`agent_debate_start`)
+2. **각자 독립** (independent) — Each AI works independently, moderator aggregates
+3. **끝장토론** (debate to the finish) — Independent work + document review rounds until consensus
 4. **Other** — User specifies the approach
 Present choices in the user's language. If no providers are available, skip and proceed directly.
@@ -101,6 +161,32 @@ Do NOT wait for rate limit reset.
 - Call `memory_dead_ends` before starting work to avoid repeating failed strategies.
 - Call `memory_store` to save findings for future sessions.
+## Orchestration Pipeline
+When team-lead orchestrates multi-AI work, the full pipeline is:
+```
+Phase 0: Clarity Gate (designer — ambiguity scoring, skip if request is clear)
+Phase 1: Situation Assessment (team-lead — environment_check, providers, design doc)
+Phase 2: Task Design (team-lead — work mode selection, decompose, route by AI capability)
+Phase 3: Parallel Execution (team-lead — Claude + CLI workers + Ollama, monitor loop)
+Phase 4: Result Inspection (team-lead — review diffs, check consistency, merge)
+Phase 5: QA Cycle (qa — verify, classify failures → team-lead auto-fixes, max 5 cycles)
+Phase 6: Quality Gate (reviewer — TRUST 5: Tested/Readable/Unified/Secured/Trackable)
+Phase 7: Report
+```
+**Execution modes:**
+- `supervised` (default): user approves task plan, decides on QA failures
+- `autonomous` ("알아서 해줘", i.e. "handle it yourself"): auto-proceeds, escalates only on 3x same failure or Secured FAIL
+**Work modes:**
+- `Claude만으로` (Claude alone): Claude directly implements, no external workers
+- `다른 AI도 함께` (together with other AIs): CLI workers + Ollama for parallelized execution, Claude supervises
+**QA Fix Loop — provider escalation:**
+On failure, immediately assign to a DIFFERENT provider with full context (original task, previous AI, diagnosis, fix instruction, scope boundary). Never retry the same provider for the same failure.
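The escalation rule above (hand the failure to a different provider, with full context, and never retry the same provider) can be sketched as follows. The `Handoff` fields mirror the context items listed in the guide; the class, the function, and the lowercase provider names are assumptions made for the example, not part of the package.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Handoff:
    """Context passed to the next provider, per the QA Fix Loop description."""
    original_task: str
    previous_provider: str
    diagnosis: str
    fix_instruction: str
    scope_boundary: str


def next_provider(available: List[str], tried: Set[str]) -> Optional[str]:
    """Pick the first available provider that has not handled this failure yet."""
    for name in available:
        if name not in tried:
            return name
    return None  # every provider has been tried: escalate to the user instead
```

A caller would add the failing provider to `tried` before calling `next_provider`, so the same provider is never picked twice for the same failure.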
 ## Completion Verification
 Before marking work complete, verify all four:

package/skills/trace.md ADDED

@@ -0,0 +1,61 @@
+---
+name: trace
+description: >
+  Use when the user wants to see agent execution flow, debug agent interactions,
+  view timeline of recent operations, or understand what happened during a
+  multi-AI session. Triggers on: "trace", "what happened", "show flow",
+  "agent timeline", "execution history", "bottleneck", "performance".
+---
+## Purpose
+Wraps the `trace_query`, `trace_summary`, and `trace_visualize` MCP tools into a single cohesive workflow for inspecting agent execution history.
+## Usage
+When this skill activates, determine what the user wants and route accordingly:
+### Timeline View (default)
+Show chronological agent execution flow:
+1. Call `trace_query` with appropriate filters (recent by default, or user-specified time range / event type)
+2. Present results as a formatted timeline:
+   ```
+   [timestamp] event_type — provider — duration — status
+   ```
+3. Highlight anomalies: failed events, unusually long durations, repeated retries
+### Summary View
+Aggregate statistics for a session or time range:
+1. Call `trace_summary` to get aggregate data
+2. Present:
+   - Total events by type (debate turns, dispatches, comparisons, memory ops)
+   - Provider usage breakdown (which providers were called, how often)
+   - Success/failure rates per provider
+   - Average response times
+   - Bottleneck identification (slowest operations)
+### Visual Flow
+For complex multi-agent interactions:
+1. Call `trace_visualize` to get a flow diagram
+2. Present the visualization to the user
+3. Annotate key decision points and branching
+## Routing Logic
+| User Intent | Action |
+|---|---|
+| "what happened" / "show trace" / no specific request | Timeline View (last 20 events) |
+| "summary" / "stats" / "how did it go" | Summary View |
+| "flow" / "diagram" / "visualize" | Visual Flow |
+| Specific event ID or time range | Timeline View with filters |
+## Error Handling
+- If no trace data exists: inform the user that no agent activity has been recorded yet
+- If trace tools are unavailable: suggest the user check that the Agestra MCP server is running
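The routing table in `trace.md` maps user intent to one of three views. The skill presumably leaves intent detection to the model, so the substring matching below is only a crude stand-in that makes the table's default and precedence explicit. The function name and the return shape are hypothetical.

```python
from typing import Dict, List, Tuple, Union

DEFAULT_LIMIT = 20  # "Timeline View (last 20 events)"

# Keyword groups copied from the Routing Logic table.
ROUTES: List[Tuple[Tuple[str, ...], str]] = [
    (("summary", "stats", "how did it go"), "summary"),
    (("flow", "diagram", "visualize"), "visual"),
]


def route(request: str) -> Dict[str, Union[str, int]]:
    """Pick a view for the request; fall back to the timeline view."""
    text = request.lower()
    for keywords, view in ROUTES:
        if any(keyword in text for keyword in keywords):
            return {"view": view}
    # "what happened", "show trace", or no specific request
    return {"view": "timeline", "limit": DEFAULT_LIMIT}
```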