agestra 4.1.1 → 4.3.0

@@ -0,0 +1,11 @@
1
+ ---
2
+ event: UserPromptSubmit
3
+ ---
4
+
5
+ If the user's message involves code review, design discussion, idea exploration, comparing options, validation, or parallel work — output exactly:
6
+
7
+ ```
8
+ AGESTRA_SUGGESTION
9
+ ```
10
+
11
+ Otherwise output nothing.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "agestra",
3
- "version": "4.1.1",
3
+ "version": "4.3.0",
4
4
  "description": "Claude Code plugin — orchestrate Ollama, Gemini, and Codex for multi-AI debates, cross-validation, and GraphRAG memory",
5
5
  "type": "module",
6
6
  "packageManager": "npm@11.11.0",
@@ -18,6 +18,7 @@
18
18
  ".claude-plugin/",
19
19
  "agents/",
20
20
  "commands/",
21
+ "hooks/",
21
22
  "skills/"
22
23
  ],
23
24
  "scripts": {
@@ -0,0 +1,76 @@
1
+ ---
2
+ name: build-fix
3
+ description: >
4
+ Use when build fails, TypeScript type errors occur, lint errors need fixing,
5
+ or compilation errors block progress. Triggers on: build failure output,
6
+ "tsc" errors, "build failed", "fix build", "type error", "lint error",
7
+ "compilation error".
8
+ ---
9
+
10
+ ## Purpose
11
+
12
+ Automatically diagnoses and fixes build/typecheck/lint errors with minimal scope changes, one error at a time, highest-impact first.
13
+
14
+ ## Strategy: One-at-a-Time
15
+
16
+ Many build errors are cascading — one root cause produces multiple error messages. Fixing all at once risks unnecessary changes. Instead:
17
+
18
+ 1. Fix the **first** (or most impactful) error
19
+ 2. Rebuild to see which errors remain
20
+ 3. Repeat until clean
21
+
22
+ ## Workflow
23
+
24
+ ### Step 1: Identify Errors
25
+
26
+ Run the appropriate build command for the project:
27
+
28
+ | Detected Project Type | Command |
29
+ |---|---|
30
+ | `tsconfig.json` present | `npx tsc --noEmit 2>&1` |
31
+ | `turbo.json` present | `npx turbo build 2>&1` |
32
+ | `package.json` with `build` script | `npm run build 2>&1` |
33
+ | ESLint configured | `npx eslint . 2>&1` |
34
+
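The detection table in Step 1 can be read as a simple lookup from project facts to commands. The sketch below is only an illustration: `ProjectFacts` and `pickBuildCommands` are hypothetical names, and the skill does not say whether one or several commands should run when more than one row matches, so this version assumes all matching commands are returned in table order.

```typescript
// Hypothetical input shape; the skill itself does not define one.
interface ProjectFacts {
  files: string[];           // file names found at the project root
  hasBuildScript: boolean;   // package.json defines a "build" script
  eslintConfigured: boolean; // some ESLint configuration was detected
}

// Returns every command from the Step 1 table whose condition matches,
// in the same order as the table (an assumption, see lead-in).
function pickBuildCommands(facts: ProjectFacts): string[] {
  const present = new Set(facts.files);
  const commands: string[] = [];
  if (present.has("tsconfig.json")) commands.push("npx tsc --noEmit 2>&1");
  if (present.has("turbo.json")) commands.push("npx turbo build 2>&1");
  if (present.has("package.json") && facts.hasBuildScript) {
    commands.push("npm run build 2>&1");
  }
  if (facts.eslintConfigured) commands.push("npx eslint . 2>&1");
  return commands;
}
```

Taking the facts as plain data keeps the lookup free of filesystem access, so the caller decides how detection is actually done.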
35
+ If the user provided specific error output, use that instead of re-running.
36
+
37
+ ### Step 2: Triage
38
+
39
+ Parse errors and rank by impact:
40
+ 1. **Syntax errors** — block all downstream compilation
41
+ 2. **Missing imports/exports** — cascade to many dependents
42
+ 3. **Type mismatches** — usually isolated
43
+ 4. **Lint warnings** — lowest priority
44
+
45
+ ### Step 3: Fix Loop (max 5 cycles)
46
+
47
+ For each cycle:
48
+
49
+ 1. Read the file containing the highest-priority error
50
+ 2. Diagnose the root cause (not the symptom)
51
+ 3. Apply the **minimal** fix — do not refactor surrounding code
52
+ 4. Re-run the build command
53
+ 5. If errors remain, continue to next cycle
54
+ 6. If no errors remain, or the same error has persisted for 3 cycles, stop
55
+
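The Step 3 loop can be sketched as a bounded retry with two stop conditions: the cycle cap and a repeated-error guard. This is a minimal sketch under stated assumptions: `runBuild` and `applyMinimalFix` are caller-supplied stand-ins (not Agestra APIs), errors are plain strings sorted highest priority first, and "persists 3 times" is interpreted as the same top error surviving three consecutive fix attempts.

```typescript
interface FixLoopResult {
  cycles: number;
  errorsFixed: number;
  remaining: string[];
}

function runFixLoop(
  runBuild: () => string[],                 // current errors, highest priority first
  applyMinimalFix: (error: string) => void, // attempts a minimal fix for one error
  maxCycles: number = 5,
  maxRepeats: number = 3,
): FixLoopResult {
  let errors = runBuild();
  const initialCount = errors.length;
  let cycles = 0;
  let repeats = 0;

  while (errors.length > 0 && cycles < maxCycles) {
    const target = errors[0];
    if (target === undefined) break;
    applyMinimalFix(target);
    cycles += 1;
    errors = runBuild(); // rebuild to see which errors remain
    // Count consecutive attempts after which the same error is still on top.
    repeats = errors[0] === target ? repeats + 1 : 0;
    if (repeats >= maxRepeats) break;
  }

  return {
    cycles,
    errorsFixed: Math.max(0, initialCount - errors.length),
    remaining: errors,
  };
}
```

The returned fields line up with the Build Fix Summary in Step 4. Because one fix can clear several cascading errors, `errorsFixed` is computed from the before and after counts, not from the number of cycles.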
56
+ ### Step 4: Report
57
+
58
+ Present results to the user:
59
+
60
+ ```
61
+ Build Fix Summary
62
+ - Cycles: {n}
63
+ - Errors fixed: {count}
64
+ - Remaining errors: {count or "none"}
65
+ - Files modified: {list}
66
+ ```
67
+
68
+ If errors remain after 5 cycles, list them and suggest manual investigation.
69
+
70
+ ## Constraints
71
+
72
+ - **Minimal changes only** — fix the error, nothing else
73
+ - **No refactoring** — do not "improve" code while fixing
74
+ - **No new dependencies** — do not add packages to fix type errors
75
+ - **Preserve behavior** — fixes must not change runtime behavior
76
+ - **Read before edit** — always read the full file before modifying
@@ -0,0 +1,68 @@
1
+ ---
2
+ name: cancel
3
+ description: >
4
+ Use when the user wants to stop, cancel, or abort a running operation.
5
+ Triggers on: "cancel", "stop", "abort", "enough", "quit", "중단", "취소",
6
+ "그만". Performs graceful shutdown with state cleanup.
7
+ ---
8
+
9
+ ## Purpose
10
+
11
+ Gracefully cancels running Agestra operations and cleans up associated state. Detects which operation is active and performs appropriate cleanup.
12
+
13
+ ## Detection
14
+
15
+ Check for active operations in this order:
16
+
17
+ 1. **CLI Workers** — Call `cli_worker_status` to check for workers in RUNNING or SPAWNING state
18
+ 2. **Debate** — Call `agent_debate_status` to check for active debates
19
+ 3. **Task Chain** — Call `agent_task_chain_status` to check for running chains
20
+ 4. **Task** — Call `agent_task_status` for individual running tasks
21
+ 5. **Background agents** — Check for any spawned background agents still running
22
+
23
+ If nothing is detected as active, inform the user: "No active Agestra operations found."
24
+
25
+ If multiple types are active, list them all and ask the user which to cancel (or all).
26
+
27
+ ## Cleanup by Operation Type
28
+
29
+ ### CLI Workers
30
+ 1. List all workers in RUNNING/SPAWNING state with their provider, elapsed time, and task description.
31
+ 2. Ask the user which to stop (or all):
32
+ - Single worker: call `cli_worker_stop` with the worker ID.
33
+ - All workers: call `cli_worker_stop` for each.
34
+ 3. Workers receive SIGTERM, then SIGKILL after 5 seconds.
35
+ 4. Worktrees are cleaned up automatically.
36
+ 5. Report which workers were stopped and whether any partial results are available via `cli_worker_collect`.
37
+
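The SIGTERM then SIGKILL sequence in step 3 above is a common graceful-shutdown pattern. The sketch below shows only its shape; `WorkerHandle` and `stopWorker` are hypothetical names, and how `cli_worker_stop` actually implements this is not shown in the diff. The timer is injected so the caller can pass a wrapper around `setTimeout` or a fake in tests.

```typescript
// Hypothetical stand-in for a worker process handle; not an Agestra type.
interface WorkerHandle {
  isAlive(): boolean;
  kill(signal: "SIGTERM" | "SIGKILL"): void;
}

function stopWorker(
  worker: WorkerHandle,
  schedule: (callback: () => void, delayMs: number) => void,
  graceMs: number = 5000,
): void {
  if (!worker.isAlive()) return;
  worker.kill("SIGTERM"); // ask the worker to exit cleanly
  schedule(() => {
    // Escalate only if the worker is still running after the grace period.
    if (worker.isAlive()) worker.kill("SIGKILL");
  }, graceMs);
}
```

A caller in Node.js could pass `(cb, ms) => { setTimeout(cb, ms); }` as `schedule`.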
38
+ ### Debate
39
+ 1. Call `agent_debate_conclude` with a summary noting early termination
40
+ 2. Inform the user which providers participated and how many rounds completed
41
+
42
+ ### Task Chain
43
+ 1. Note the current step and remaining steps
44
+ 2. Let the current step finish if nearly complete, otherwise stop
45
+ 3. Report: completed steps, skipped steps, any partial results
46
+
47
+ ### Individual Task
48
+ 1. Wait for current provider response if in-flight (do not interrupt mid-response)
49
+ 2. Report the task status and any partial results
50
+
51
+ ### Background Agents
52
+ 1. List running background agents with their descriptions
53
+ 2. Ask the user which to cancel (or all)
54
+ 3. Stop selected agents
55
+
56
+ ## Post-Cleanup
57
+
58
+ After cancellation:
59
+ - Summarize what was stopped and what completed
60
+ - Note any artifacts produced (debate documents, partial results, worker diffs)
61
+ - If CLI workers produced changes before stopping, mention that partial diffs may be available via `cli_worker_collect`
62
+ - If the operation produced useful partial work, mention it so the user can resume later
63
+
64
+ ## Constraints
65
+
66
+ - **Never discard results silently** — always report what was produced before cancellation
67
+ - **Prefer graceful over forced** — let in-flight operations finish when possible
68
+ - **Ask before bulk cancel** — if multiple operations are running, confirm which to stop
@@ -0,0 +1,115 @@
1
+ ---
2
+ name: agestra-design
3
+ description: >
4
+ Use when exploring architecture, discussing design trade-offs, planning implementation approaches,
5
+ or structuring a feature before writing code. Triggers on: "design this", "how should I architect",
6
+ "what's the best approach", "explore approaches", "design trade-offs", "before implementing",
7
+ "설계", "아키텍처", "구조 잡아줘", "어떻게 만들지", "방향 잡아줘",
8
+ "設計", "アーキテクチャ", "架构", "设计"
9
+ ---
10
+
11
+ ## Purpose
12
+
13
+ Pre-implementation design exploration. Understand intent through targeted questions, explore the codebase for existing patterns, propose multiple approaches with trade-offs, and produce a design document.
14
+
15
+ ## Scope
16
+
17
+ Design features and systems **for the current project**. If the request is outside this project's scope (a new product idea, a business question, or something unrelated to this codebase), say so and suggest `/agestra idea` for open exploration instead.
18
+
19
+ ## Workflow
20
+
21
+ ### Phase 1: Clarity Gate
22
+
23
+ Before asking questions, check if the request is already clear. If it includes specific file paths, function names, or concrete acceptance criteria, skip the interview.
24
+
25
+ **Clarity Dimensions:**
26
+
27
+ | Dimension | Weight (greenfield) | Weight (brownfield) |
28
+ |-----------|-------------------|-------------------|
29
+ | Goal | 40% | 35% |
30
+ | Constraints | 30% | 25% |
31
+ | Success Criteria | 30% | 25% |
32
+ | Context | N/A | 15% |
33
+
34
+ Greenfield: no relevant source code exists for the feature.
35
+ Brownfield: modifying or extending existing code.
36
+
37
+ **After each user answer:**
38
+ 1. Score all dimensions 0.0–1.0
39
+ 2. Calculate: `ambiguity = 1 - weighted_sum`
40
+ 3. Display progress:
41
+ ```
42
+ Round {n} | Ambiguity: {score}% | Targeting: {weakest dimension}
43
+ ```
44
+ 4. If ambiguity <= 20% → proceed to Phase 2
45
+ 5. If ambiguity > 20% → ask the next question targeting the WEAKEST dimension
46
+
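The scoring steps above amount to a weighted sum over the clarity dimensions. The following sketch uses the weights from the table; the names `assessClarity`, `WEIGHTS`, and the camel-case dimension keys are illustrative assumptions, and a dimension with no score is treated as 0.

```typescript
type Dimension = "goal" | "constraints" | "successCriteria" | "context";
type ProjectKind = "greenfield" | "brownfield";

// Weights copied from the Clarity Dimensions table (Context is N/A for greenfield).
const WEIGHTS: Record<ProjectKind, Partial<Record<Dimension, number>>> = {
  greenfield: { goal: 0.4, constraints: 0.3, successCriteria: 0.3 },
  brownfield: { goal: 0.35, constraints: 0.25, successCriteria: 0.25, context: 0.15 },
};

interface ClarityResult {
  ambiguity: number;  // 1 - weighted_sum, in the range 0..1
  weakest: Dimension; // lowest-scoring dimension, the next question's target
  proceed: boolean;   // true when ambiguity <= 20%
}

function assessClarity(
  kind: ProjectKind,
  scores: Partial<Record<Dimension, number>>, // each score in 0.0..1.0
): ClarityResult {
  const weights = WEIGHTS[kind];
  let weightedSum = 0;
  let weakest: Dimension = "goal";
  let weakestScore = Infinity;
  for (const dim of Object.keys(weights) as Dimension[]) {
    const score = scores[dim] ?? 0;
    weightedSum += (weights[dim] ?? 0) * score;
    if (score < weakestScore) {
      weakest = dim;
      weakestScore = score;
    }
  }
  const ambiguity = 1 - weightedSum;
  return { ambiguity, weakest, proceed: ambiguity <= 0.2 };
}
```

For example, greenfield scores of 0.9 (goal), 0.8 (constraints), and 0.7 (success criteria) give a weighted sum of 0.36 + 0.24 + 0.21 = 0.81, so ambiguity is about 19% and the gate opens.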
47
+ **Question targeting:** Always target the dimension with the lowest score. Ask ONE question at a time. Expose assumptions, not feature lists.
48
+
49
+ | Dimension | Question Style |
50
+ |-----------|---------------|
51
+ | Goal | "What exactly happens when...?" / "What specific action does a user take first?" |
52
+ | Constraints | "What are the boundaries?" / "Should this work offline?" |
53
+ | Success Criteria | "How do we know it works?" / "What would make you say 'yes, that's it'?" |
54
+ | Context (brownfield) | "How does this fit with existing...?" / "Extend or replace?" |
55
+
56
+ **Challenge modes** (each used once, then return to normal):
57
+ - Round 4+: **Contrarian** — "What if the opposite were true? What if this constraint doesn't actually exist?"
58
+ - Round 6+: **Simplifier** — "What's the simplest version that would still be valuable?"
59
+ - Round 8+: **Ontologist** (if ambiguity still > 30%) — "What IS this, really? One sentence."
60
+
61
+ **Soft limits:**
62
+ - Round 3+: allow early exit if user says "enough" — show ambiguity warning
63
+ - Round 10: soft warning — "We're at 10 rounds. Current ambiguity: {score}%. Continue or proceed?"
64
+ - Round 20: hard cap — proceed with current clarity, note the risk
65
+
66
+ ### Phase 2: Explore
67
+
68
+ Search the codebase for relevant existing patterns:
69
+ - Use Glob to find related files by name
70
+ - Use Grep to find similar implementations
71
+ - Use Read to understand existing architecture
72
+ - Note conventions: naming, file organization, patterns used
73
+
74
+ ### Phase 3: Propose
75
+
76
+ Present 2-3 distinct approaches. For each:
77
+ - **Approach name** — one-line summary
78
+ - **How it works** — architecture overview
79
+ - **Fits with** — which existing patterns it aligns with
80
+ - **Trade-offs** — pros and cons
81
+ - **Effort** — relative complexity (low/medium/high)
82
+
83
+ ### Phase 4: Refine
84
+
85
+ Based on user feedback:
86
+ - Deep-dive into the selected approach
87
+ - Address concerns raised
88
+ - Detail component boundaries and data flow
89
+ - Identify risks and mitigation
90
+
91
+ ### Phase 5: Document
92
+
93
+ Write a design document to `docs/plans/` with this structure:
94
+
95
+ ```markdown
96
+ # [Feature/System Name] Design
97
+
98
+ ## Problem
99
+ ## Approach
100
+ ## Architecture
101
+ ## Components
102
+ ## Data Flow
103
+ ## Trade-offs & Decisions
104
+ ## Open Questions
105
+ ## Implementation Steps
106
+ ```
107
+
108
+ ## Constraints
109
+
110
+ - Ask one question at a time. Do not dump multiple questions.
111
+ - Present approaches before solutions. Let the user choose direction.
112
+ - Always explore the codebase before proposing — do not design in a vacuum.
113
+ - Document all decisions made during the conversation in the final design document.
114
+ - Do not write implementation code. Design documents only.
115
+ - Communicate in the user's language.
package/skills/idea.md ADDED
@@ -0,0 +1,144 @@
1
+ ---
2
+ name: agestra-idea
3
+ description: >
4
+ Use when discovering improvements, comparing with similar projects, collecting user feedback,
5
+ exploring new features, researching what to build, or validating ideas. Triggers on:
6
+ "find improvements", "what should I add", "compare with competitors", "what are users asking for",
7
+ "explore ideas", "feature ideas", "what's missing", "is this worth building", "what do users want",
8
+ "what problem does this solve", "who would use this", "what should I focus on next",
9
+ "개선점", "뭐 추가하면 좋을까", "아이디어", "유사 프로젝트", "뭐가 부족해",
10
+ "이거 만들 가치가 있어?", "다음에 뭘 해야 할까", "비슷한 도구",
11
+ "改善", "アイデア", "改进", "想法"
12
+ ---
13
+
14
+ ## Purpose
15
+
16
+ Idea and improvement discovery. Research similar projects, collect user complaints and feature requests, compare capabilities, and generate actionable suggestions.
17
+
18
+ ## Scope
19
+
20
+ **Mode A: Existing project** — The codebase has a README or meaningful code.
21
+ Research improvements, missing features, and competitive gaps for this project.
22
+
23
+ **Mode B: New project** — The codebase is empty/new, but the user has a seed idea (e.g., "I want to build a writing tool").
24
+ Research the landscape: what already exists, what users complain about, what gaps remain.
25
+
26
+ **Out of scope:** Requests with no seed idea at all (e.g., "what should I build?"). You need at least a domain or concept to anchor research. Ask for one:
27
+
28
+ > "I need at least a rough idea — a domain, a tool type, or a problem you want to solve. For example: 'a writing tool', 'a CLI for deployment', 'something for managing bookmarks'."
29
+
30
+ ## Workflow
31
+
32
+ ### Phase 1: Clarity Gate
33
+
34
+ Before researching, understand what the user needs through targeted questions. Ask ONE question at a time. Communicate in the user's language.
35
+
36
+ **Step 1: Determine mode.**
37
+ - If the codebase has a README or meaningful code → Mode A (existing project)
38
+ - If the codebase is empty/new but user has a seed idea → Mode B (new project)
39
+
40
+ **Step 2: Mode-specific interview.**
41
+
42
+ **Mode A — Existing project:**
43
+
44
+ | Dimension | Question | Purpose |
45
+ |-----------|----------|---------|
46
+ | Direction | "What aspect are you looking to improve? (features, UX, performance, integrations, DX)" | Narrow the research scope |
47
+ | Audience | "Who are your current users? What do they use it for most?" | Target the right competitors |
48
+ | Feedback | "Have you received any complaints or feature requests?" | Direct pain point input |
49
+ | Competition | "Are there specific competitors or similar tools you're aware of?" | Seed the research |
50
+ | Strength | "What do you consider your project's unique strength?" | Avoid suggesting what already works |
51
+ | Constraints | "Any areas you don't want to change or can't change?" | Set research boundaries |
52
+
53
+ After gathering context:
54
+ - Read the project's README and key files to understand what it does
55
+ - Use Glob and Grep to map the current feature set
56
+ - Identify the project's category and target audience
57
+
58
+ **Mode B — New project:**
59
+
60
+ | Dimension | Question | Purpose |
61
+ |-----------|----------|---------|
62
+ | Problem | "What problem are you trying to solve?" | Core motivation |
63
+ | Audience | "Who would use this? What's the target audience?" | Market focus |
64
+ | Form | "How do you envision it? (CLI, web app, library, service, plugin)" | Shape the research |
65
+ | Inspiration | "What inspired this? Have you seen something similar?" | Seed the research |
66
+ | Core | "What's the single most important thing it must do well?" | Prioritization anchor |
67
+ | Boundary | "What should it NOT be? Where do you draw the line?" | Scope limits |
68
+
69
+ **Early exit:** If the user provides enough context upfront (specific competitors, clear scope, concrete goals), skip remaining questions and proceed to Phase 2. Do not force unnecessary rounds.
70
+
71
+ ### Phase 2: Research Similar Projects
72
+
73
+ - Use WebSearch to find similar tools, libraries, and projects
74
+ - Look for: direct competitors, adjacent tools, inspirational projects
75
+ - Collect names, URLs, and key differentiators
76
+
77
+ ### Phase 3: Collect Pain Points
78
+
79
+ - WebSearch for complaints about similar tools (GitHub issues, forums, discussions)
80
+ - WebFetch relevant issue pages and discussion threads
81
+ - Identify recurring themes in user feedback
82
+ - Note what users wish existed but doesn't
83
+
84
+ ### Phase 4: Feature Comparison
85
+
86
+ Build a comparison table:
87
+
88
+ | Feature | This Project | Competitor A | Competitor B |
89
+ |---------|-------------|-------------|-------------|
90
+ | Feature 1 | Yes/No | Yes/No | Yes/No |
91
+
92
+ ### Phase 5: Generate Suggestions
93
+
94
+ For each suggestion:
95
+ - **Title** — clear, actionable name
96
+ - **Category** — UX, Performance, Feature, Integration, DX
97
+ - **Source** — where this idea came from (competitor, user complaint, own analysis)
98
+ - **Priority** — HIGH / MEDIUM / LOW with rationale
99
+ - **Effort** — estimated complexity
100
+ - **Description** — what it does and why it matters
101
+
102
+ ### Phase 6: Prioritized Recommendations
103
+
104
+ Present a ranked list:
105
+ 1. **Quick wins** — high impact, low effort
106
+ 2. **Strategic investments** — high impact, high effort
107
+ 3. **Nice-to-haves** — low impact, low effort
108
+
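The three tiers in Phase 6 correspond to cells of an impact/effort grid. As a rough sketch, assuming each suggestion has already been reduced to a binary `impact` and `effort` (the skill itself uses HIGH / MEDIUM / LOW priority and an unspecified effort scale), the grouping could look like this. `Suggestion`, `bucketOf`, and `rankSuggestions` are hypothetical names, and low-impact, high-effort items are dropped because the skill lists no tier for them.

```typescript
type Level = "low" | "high";
type Bucket = "quickWins" | "strategicInvestments" | "niceToHaves";

interface Suggestion {
  title: string;
  impact: Level;
  effort: Level;
}

// Maps one suggestion to its Phase 6 tier, or undefined if no tier applies.
function bucketOf(s: Suggestion): Bucket | undefined {
  if (s.impact === "high") {
    return s.effort === "low" ? "quickWins" : "strategicInvestments";
  }
  return s.effort === "low" ? "niceToHaves" : undefined;
}

function rankSuggestions(suggestions: Suggestion[]): Record<Bucket, string[]> {
  const ranked: Record<Bucket, string[]> = {
    quickWins: [],
    strategicInvestments: [],
    niceToHaves: [],
  };
  for (const s of suggestions) {
    const bucket = bucketOf(s);
    if (bucket !== undefined) ranked[bucket].push(s.title);
  }
  return ranked;
}
```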
109
+ ## Output Format
110
+
111
+ ```markdown
112
+ ## Research Summary
113
+
114
+ ### Similar Projects
115
+ (list with URLs and key features)
116
+
117
+ ### User Pain Points
118
+ (categorized complaints from research)
119
+
120
+ ### Feature Comparison
121
+ (table)
122
+
123
+ ### Recommendations
124
+
125
+ #### Quick Wins
126
+ 1. ...
127
+
128
+ #### Strategic Investments
129
+ 1. ...
130
+
131
+ #### Nice-to-Haves
132
+ 1. ...
133
+
134
+ ### Sources
135
+ - [Source 1](url)
136
+ - [Source 2](url)
137
+ ```
138
+
139
+ ## Constraints
140
+
141
+ - Always include source URLs for claims about other projects.
142
+ - Do not fabricate features of competitors — verify via web research.
143
+ - Prioritize actionable suggestions over theoretical improvements.
144
+ - Communicate in the user's language.
@@ -3,17 +3,26 @@ name: provider-guide
3
3
  description: >
4
4
  Use when routing tasks to AI providers, using any agestra MCP tool,
5
5
  reviewing code with multiple providers, starting debates, dispatching
6
- parallel tasks, or cross-validating work. Also triggers on mentions of
7
- Ollama, Gemini, or Codex providers.
6
+ parallel tasks, cross-validating work, or managing CLI workers. Also
7
+ triggers on mentions of Ollama, Gemini, or Codex providers.
8
8
  ---
9
9
 
10
10
  ## Available Providers
11
11
 
12
12
  - **Ollama** — Local models. Detected at runtime via `ollama_models`.
13
- - **Gemini** — Cloud agent. Full capability.
14
- - **Codex** — Cloud agent. Full capability.
13
+ - **Gemini** — Cloud agent. Full capability. Can run as autonomous CLI worker.
14
+ - **Codex** — Cloud agent. Full capability. Can run as autonomous CLI worker.
15
15
 
16
- All providers are detected at runtime. Check `provider_list` or `provider_health` for current availability before routing.
16
+ All providers are detected at runtime. Call `environment_check` for a full capability map, or `provider_list` / `provider_health` for provider availability.
17
+
18
+ ## Environment Check
19
+
20
+ At session start or on demand, `environment_check` provides:
21
+ - CLI tool availability (codex, gemini, tmux)
22
+ - Ollama models with size-based tier classification
23
+ - Git worktree support
24
+ - Available modes: `claude_only`, `independent`, `debate`, `team`
25
+ - Whether autonomous CLI workers can be spawned
17
26
 
18
27
  ## Provider Capability Guidelines
19
28
 
@@ -32,7 +41,46 @@ Models change frequently. Always call `ollama_models` before assigning tasks.
32
41
 
33
42
  ### Gemini / Codex (Cloud)
34
43
 
35
- Full-capability agents. Use for complex tasks, parallel work, and as validators.
44
+ Full-capability agents. Use for:
45
+ - Complex tasks via `ai_chat` or `agent_assign_task` (text response)
46
+ - Autonomous coding via `cli_worker_spawn` (file modifications in worktree)
47
+ - Parallel work and as validators
48
+
49
+ ## Work Modes
50
+
51
+ ### Text Work (review/design/idea)
52
+
53
+ Three modes available via `/agestra review`, `/agestra design`, `/agestra idea`:
54
+
55
+ | Mode | Description | When to Use |
56
+ |------|-------------|-------------|
57
+ | **Claude only** | Specialist agent works alone | Quick analysis, no external AI needed |
58
+ | **각자 독립** (each works independently) | Each AI works independently → moderator aggregates | You want multiple perspectives quickly |
59
+ | **끝장토론** (debate to the finish) | Independent work + document review rounds until consensus | You need a thorough, agreed-upon analysis |
60
+
61
+ ### Implementation Work (actual implementation)
62
+
63
+ Two modes available via team-lead orchestration:
64
+
65
+ | Mode | Description | When to Use |
66
+ |------|-------------|-------------|
67
+ | **Claude만으로** (Claude only) | Claude directly implements with project/global agents | Simple tasks, 1-2 files |
68
+ | **다른 AI도 함께** (together with other AIs) | CLI workers do autonomous coding, Claude supervises | Complex tasks, 3+ files, parallelizable |
69
+
70
+ ## CLI Workers
71
+
72
+ CLI workers spawn Codex or Gemini in `--full-auto` mode within isolated git worktrees.
73
+
74
+ | Tool | Purpose |
75
+ |------|---------|
76
+ | `cli_worker_spawn` | Spawn autonomous CLI worker with task manifest |
77
+ | `cli_worker_status` | Check worker FSM state, output, heartbeat |
78
+ | `cli_worker_collect` | Collect completed worker results (diff, output) |
79
+ | `cli_worker_stop` | Stop worker (SIGTERM → SIGKILL) + cleanup |
80
+
81
+ Worker lifecycle: SPAWNING → RUNNING → COLLECTING → COMPLETED (or FAILED/CANCELLED/TIMEOUT)
82
+
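The worker lifecycle above can be modelled as a small state machine. The happy path (SPAWNING, RUNNING, COLLECTING, COMPLETED) comes from the text; which states may move to FAILED, CANCELLED, or TIMEOUT is not specified, so the sketch assumes any non-terminal state can. `canTransition` is an illustrative name, not an Agestra API.

```typescript
type WorkerState =
  | "SPAWNING"
  | "RUNNING"
  | "COLLECTING"
  | "COMPLETED"
  | "FAILED"
  | "CANCELLED"
  | "TIMEOUT";

// States from which no further transition is allowed.
const TERMINAL: ReadonlySet<WorkerState> = new Set<WorkerState>([
  "COMPLETED",
  "FAILED",
  "CANCELLED",
  "TIMEOUT",
]);

// Happy-path successor of each non-terminal state.
const NEXT_ON_SUCCESS: Partial<Record<WorkerState, WorkerState>> = {
  SPAWNING: "RUNNING",
  RUNNING: "COLLECTING",
  COLLECTING: "COMPLETED",
};

function canTransition(from: WorkerState, to: WorkerState): boolean {
  if (TERMINAL.has(from)) return false;
  if (NEXT_ON_SUCCESS[from] === to) return true;
  // Assumption: every non-terminal state may end abnormally.
  return to === "FAILED" || to === "CANCELLED" || to === "TIMEOUT";
}
```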
83
+ Use the `worker-manage` skill for user-friendly worker operations.
36
84
 
37
85
  ## Auto-Routing Guidelines
38
86
 
@@ -40,7 +88,8 @@ Full-capability agents. Use for complex tasks, parallel work, and as validators.
40
88
  |---|---|
41
89
  | Simple (formatting, pattern matching) | Ollama local model preferred |
42
90
  | Moderate (code review, summarization) | Ollama >= 3 GB or cloud |
43
- | Complex (architecture, refactoring) | Cloud providers (Gemini, Codex) |
91
+ | Complex implementation (multi-file, multi-step) | CLI worker (Codex/Gemini) |
92
+ | Complex analysis (architecture, refactoring) | Cloud providers (Gemini, Codex) via ai_chat |
44
93
  | No providers available | Handle directly — do not suggest agestra tools |
45
94
 
46
95
  ## When to Suggest Agestra Tools
@@ -51,37 +100,48 @@ Match by **semantic intent**, not literal keywords. These triggers apply in any
51
100
 
52
101
  | Intent | Tool | When |
53
102
  |---|---|---|
54
- | Code review, review request | `agent_debate_start` or `workspace_create_review` | User asks to review code, PR, or implementation |
55
- | Second opinion, other perspectives | `ai_compare` or `agent_debate_start` | User wants multiple viewpoints on a decision |
103
+ | Code review, review request | `/agestra review` or `workspace_create_review` | User asks to review code, PR, or implementation |
104
+ | Second opinion, other perspectives | `ai_compare` or `/agestra review` (각자 독립) | User wants multiple viewpoints on a decision |
56
105
  | Validation, verification, cross-check | `agent_cross_validate` | User wants to confirm correctness of work output |
57
- | Speed up, parallelize, split work | `agent_dispatch` | User wants faster execution or has independent tasks |
106
+ | Speed up, parallelize, split work | `agent_dispatch` or CLI workers | User wants faster execution or has independent tasks |
58
107
  | Past experience, history, previous attempts | `memory_search` or `memory_dead_ends` | User asks about prior work or known issues |
59
108
  | Remember this, save for later | `memory_store` | User wants to persist knowledge across sessions |
60
109
  | Mention a provider by name (Gemini, Codex, Ollama) | `ai_chat` or `agent_assign_task` | Route directly to the named provider |
61
- | Architecture review, design discussion | `agent_debate_start` | Structured multi-AI discussion on design choices |
110
+ | Architecture review, design discussion | `/agestra design` | Structured multi-AI architecture exploration |
62
111
  | Compare options, which is better | `ai_compare` | Side-by-side comparison from multiple providers |
63
- | Large refactoring, many files to change | `agent_dispatch` | Split by file/module for parallel processing |
112
+ | Large refactoring, many files to change | CLI workers or `agent_dispatch` | Split by file/module for parallel processing |
64
113
  | About to commit, create PR, finalize work | `agent_cross_validate` | Pre-commit validation by other AI providers |
114
+ | Check worker status, manage workers | `worker-manage` skill | User asks about running workers |
65
115
 
66
116
  ### Commands and Agents
67
117
 
68
118
  | Command | Specialist Agent | Purpose |
69
119
  |---------|-----------------|---------|
70
- | `/agestra review` | `reviewer` | Post-implementation quality verification |
71
- | `/agestra idea` | `ideator` | Improvement discovery and competitive analysis |
72
- | `/agestra design` | `designer` | Pre-implementation architecture exploration |
120
+ | `/agestra review` | `agestra-reviewer` | Post-implementation quality verification |
121
+ | `/agestra idea` | `agestra-ideator` | Improvement discovery and competitive analysis |
122
+ | `/agestra design` | `agestra-designer` | Pre-implementation architecture exploration |
73
123
 
74
- When "Debate" is selected, `moderator` facilitates while the specialist provides Claude's perspective.
124
+ ### Utility Skills
75
125
 
76
- Commands and hook-triggered suggestions share the same 4-choice pattern. Commands are explicit entry points; hooks detect intent from natural language.
126
+ | Skill | Purpose |
127
+ |-------|---------|
128
+ | `trace` | View agent execution timeline, summary stats, and flow visualization |
129
+ | `build-fix` | Auto-diagnose and fix build/typecheck/lint errors one at a time |
130
+ | `cancel` | Gracefully stop running operations (including CLI workers) with state cleanup |
131
+ | `worker-manage` | List, check, collect, and stop CLI workers |
132
+
133
+ When "각자 독립" is selected, each AI works independently and `agestra-moderator` aggregates results.
134
+ When "끝장토론" is selected, `agestra-moderator` facilitates document review rounds after independent aggregation.
135
+
136
+ Commands and hook-triggered suggestions share the same 3-choice pattern (Claude only / 각자 독립 / 끝장토론). Commands are explicit entry points; hooks detect intent from natural language.
77
137
 
78
138
  ### Hook-Triggered Choice
79
139
 
80
140
  When an `AGESTRA_SUGGESTION` marker appears from the UserPromptSubmit hook, present these choices:
81
141
 
82
142
  1. **Claude only** — Claude Code handles it alone
83
- 2. **Compare** - Send the same prompt to multiple AIs, compare responses (`ai_compare`)
84
- 3. **Debate** - AIs discuss until consensus is reached (`agent_debate_start`)
143
+ 2. **각자 독립** - Each AI works independently, moderator aggregates
144
+ 3. **끝장토론** - Independent work + document review rounds until consensus
85
145
  4. **Other** — User specifies the approach
86
146
 
87
147
  Present choices in the user's language. If no providers are available, skip and proceed directly.
@@ -101,6 +161,32 @@ Do NOT wait for rate limit reset.
101
161
  - Call `memory_dead_ends` before starting work to avoid repeating failed strategies.
102
162
  - Call `memory_store` to save findings for future sessions.
103
163
 
164
+ ## Orchestration Pipeline
165
+
166
+ When team-lead orchestrates multi-AI work, the full pipeline is:
167
+
168
+ ```
169
+ Phase 0: Clarity Gate (designer — ambiguity scoring, skip if request is clear)
170
+ Phase 1: Situation Assessment (team-lead — environment_check, providers, design doc)
171
+ Phase 2: Task Design (team-lead — work mode selection, decompose, route by AI capability)
172
+ Phase 3: Parallel Execution (team-lead — Claude + CLI workers + Ollama, monitor loop)
173
+ Phase 4: Result Inspection (team-lead — review diffs, check consistency, merge)
174
+ Phase 5: QA Cycle (qa — verify, classify failures → team-lead auto-fixes, max 5 cycles)
175
+ Phase 6: Quality Gate (reviewer — TRUST 5: Tested/Readable/Unified/Secured/Trackable)
176
+ Phase 7: Report
177
+ ```
178
+
179
+ **Execution modes:**
180
+ - `supervised` (default): user approves task plan, decides on QA failures
181
+ - `autonomous` ("알아서 해줘", roughly "just handle it"): proceeds automatically and escalates only when the same failure occurs 3 times or the Secured check fails
182
+
183
+ **Work modes:**
184
+ - `Claude만으로`: Claude directly implements, no external workers
185
+ - `다른 AI도 함께`: CLI workers + Ollama for parallelized execution, Claude supervises
186
+
187
+ **QA Fix Loop — provider escalation:**
188
+ On failure, immediately assign to a DIFFERENT provider with full context (original task, previous AI, diagnosis, fix instruction, scope boundary). Never retry the same provider for the same failure.
189
+
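The escalation rule above boils down to "pick a provider that has not yet seen this failure, and hand over the full context". A minimal sketch follows, assuming providers are identified by plain strings. `FailureContext`, `Escalation`, and `escalate` are hypothetical names whose fields mirror the context items listed in the rule.

```typescript
interface FailureContext {
  originalTask: string;
  previousProvider: string;
  diagnosis: string;
  fixInstruction: string;
  scopeBoundary: string;
}

interface Escalation extends FailureContext {
  assignedProvider: string;
}

// Returns the reassignment, or undefined when every available provider
// has already been tried for this failure.
function escalate(
  ctx: FailureContext,
  available: string[],
  alreadyTried: string[] = [],
): Escalation | undefined {
  const excluded = new Set([ctx.previousProvider, ...alreadyTried]);
  const next = available.find((provider) => !excluded.has(provider));
  return next === undefined ? undefined : { ...ctx, assignedProvider: next };
}
```

What should happen when `escalate` returns `undefined` is left to the caller; the diff does not describe that case.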
104
190
  ## Completion Verification
105
191
 
106
192
  Before marking work complete, verify all four:
@@ -0,0 +1,61 @@
1
+ ---
2
+ name: trace
3
+ description: >
4
+ Use when the user wants to see agent execution flow, debug agent interactions,
5
+ view timeline of recent operations, or understand what happened during a
6
+ multi-AI session. Triggers on: "trace", "what happened", "show flow",
7
+ "agent timeline", "execution history", "bottleneck", "performance".
8
+ ---
9
+
10
+ ## Purpose
11
+
12
+ Wraps the `trace_query`, `trace_summary`, and `trace_visualize` MCP tools into a single cohesive workflow for inspecting agent execution history.
13
+
14
+ ## Usage
15
+
16
+ When this skill activates, determine what the user wants and route accordingly:
17
+
18
+ ### Timeline View (default)
19
+
20
+ Show chronological agent execution flow:
21
+
22
+ 1. Call `trace_query` with appropriate filters (recent by default, or user-specified time range / event type)
23
+ 2. Present results as a formatted timeline:
24
+ ```
25
+ [timestamp] event_type — provider — duration — status
26
+ ```
27
+ 3. Highlight anomalies: failed events, unusually long durations, repeated retries
28
+
29
+ ### Summary View
30
+
31
+ Aggregate statistics for a session or time range:
32
+
33
+ 1. Call `trace_summary` to get aggregate data
34
+ 2. Present:
35
+ - Total events by type (debate turns, dispatches, comparisons, memory ops)
36
+ - Provider usage breakdown (which providers were called, how often)
37
+ - Success/failure rates per provider
38
+ - Average response times
39
+ - Bottleneck identification (slowest operations)
40
+
41
+ ### Visual Flow
42
+
43
+ For complex multi-agent interactions:
44
+
45
+ 1. Call `trace_visualize` to get a flow diagram
46
+ 2. Present the visualization to the user
47
+ 3. Annotate key decision points and branching
48
+
49
+ ## Routing Logic
50
+
51
+ | User Intent | Action |
52
+ |---|---|
53
+ | "what happened" / "show trace" / no specific request | Timeline View (last 20 events) |
54
+ | "summary" / "stats" / "how did it go" | Summary View |
55
+ | "flow" / "diagram" / "visualize" | Visual Flow |
56
+ | Specific event ID or time range | Timeline View with filters |
57
+
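The routing table above could be approximated with keyword checks, as in the sketch below. This is a deliberate simplification: the skill is meant to be applied by a model reading intent, and `routeTrace` with its keyword lists is a hypothetical illustration, not part of the package. Requests that match nothing fall through to the Timeline View, as in the first table row.

```typescript
type TraceView = "timeline" | "summary" | "flow";

function routeTrace(request: string): TraceView {
  const text = request.toLowerCase();
  // "flow" / "diagram" / "visualize" -> Visual Flow
  if (/flow|diagram|visuali[sz]e/.test(text)) return "flow";
  // "summary" / "stats" / "how did it go" -> Summary View
  if (/summary|stats|how did it go/.test(text)) return "summary";
  // Everything else, including "what happened", gets the default timeline.
  return "timeline";
}
```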
58
+ ## Error Handling
59
+
60
+ - If no trace data exists: inform the user that no agent activity has been recorded yet
61
+ - If trace tools are unavailable: suggest the user check that the Agestra MCP server is running