agestra 4.2.1 → 4.3.1

@@ -10,12 +10,9 @@
  "plugins": [
  {
  "name": "agestra",
- "source": {
- "source": "npm",
- "package": "agestra"
- },
+ "source": "./",
  "description": "Orchestrate Ollama, Gemini, and Codex for multi-AI debates, cross-validation, and GraphRAG memory",
- "version": "4.1.1",
+ "version": "4.3.1",
  "author": {
  "name": "mua-vtuber"
  },
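For readability, here is the plugin entry as it reads after this hunk is applied. Fields outside the hunk are omitted and the indentation is reconstructed, so treat this as a sketch of the resulting manifest rather than the full file:

```json
{
  "plugins": [
    {
      "name": "agestra",
      "source": "./",
      "description": "Orchestrate Ollama, Gemini, and Codex for multi-AI debates, cross-validation, and GraphRAG memory",
      "version": "4.3.1",
      "author": {
        "name": "mua-vtuber"
      }
    }
  ]
}
```

The `source` change from an npm descriptor object to `"./"` points the plugin at the package's own directory instead of a separate npm lookup.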
@@ -1,6 +1,6 @@
  {
  "name": "agestra",
- "version": "4.2.1",
+ "version": "4.3.1",
  "description": "Claude Code plugin — orchestrate Ollama, Gemini, and Codex for multi-AI debates, cross-validation, and GraphRAG memory",
  "mcpServers": {
  "agestra": {
package/README.ko.md CHANGED
@@ -7,7 +7,7 @@

  [English](README.md) | [한국어](README.ko.md)

- Agestra는 Ollama(로컬), Gemini CLI, Codex CLI를 Claude Code에 플러그형으로 연결합니다. 독립 취합, 합의 토론, 자율 CLI 워커, 병렬 작업 분배, 교차 검증, 지속적 GraphRAG 메모리 시스템을 48개 MCP 도구로 제공합니다.
+ Agestra는 Ollama(로컬), Gemini CLI, Codex CLI를 Claude Code에 플러그형으로 연결합니다. 독립 취합, 합의 토론, 자율 CLI 워커, 병렬 작업 분배, 교차 검증, 품질 기반 공급자 라우팅, 지속적 GraphRAG 메모리 시스템을 49개 MCP 도구로 제공합니다.

  ## 빠른 시작

@@ -59,11 +59,11 @@ Claude Code에서 실행:

  | 에이전트 | 모델 | 역할 |
  |----------|------|------|
- | `agestra-team-lead` | Sonnet | 풀 오케스트레이터 — 환경 체크, 작업 모드 선택, CLI 워커 감독, QA 루프 |
+ | `agestra-team-lead` | Sonnet | 풀 오케스트레이터 — 환경 체크, 품질 기반 공급자 라우팅, 작업 모드 선택, CLI 워커 감독, QA 루프 |
  | `agestra-reviewer` | Opus | 엄격한 품질 검증 — 보안, 고아 시스템, 스펙 이탈, 테스트 공백 |
  | `agestra-designer` | Opus | 아키텍처 탐색 — 소크라테스식 질문, 트레이드오프 분석 |
  | `agestra-ideator` | Sonnet | 개선점 발굴 — 웹 리서치, 경쟁 분석 |
- | `agestra-moderator` | Sonnet | 다목적 진행자 — 토론, 독립 취합, 문서 라운드 리뷰, 충돌 해결 |
+ | `agestra-moderator` | Sonnet | 다목적 진행자 — 합의 검출 토론, 독립 취합, 문서 라운드 리뷰, 충돌 해결 |
  | `agestra-qa` | Opus | QA 검증 — 설계 준수, PASS/FAIL 판정 |

  ## 스킬

@@ -84,14 +84,14 @@ Turborepo 모노레포, 8개 패키지:

  | 패키지 | 설명 |
  |--------|------|
- | `@agestra/core` | `AIProvider` 인터페이스, 레지스트리, 설정 로더, CLI 러너, 원자적 쓰기, 작업 큐, 시크릿 스캐너, 워크트리 관리자, 태스크 매니페스트, CLI 워커 관리자 |
+ | `@agestra/core` | `AIProvider` 인터페이스, 난이도 기반 라우팅 레지스트리, 설정 로더, CLI 러너, 원자적 쓰기, 작업 큐, 시크릿 스캐너, 워크트리 관리자, 태스크 매니페스트, CLI 워커 관리자 |
  | `@agestra/provider-ollama` | Ollama HTTP 어댑터 (모델 자동 감지) |
  | `@agestra/provider-gemini` | Google Gemini CLI 어댑터 |
  | `@agestra/provider-codex` | OpenAI Codex CLI 어댑터 |
- | `@agestra/agents` | 토론 엔진, 작업 분배기, 교차 검증기, 작업 체인, 자동 QA, 파일 변경 추적기, 세션 관리자 |
+ | `@agestra/agents` | 합의 검출 토론 엔진, 턴 품질 평가기, 작업 분배기, 교차 검증기, 작업 체인, 자동 QA, 파일 변경 추적기, 세션 관리자 |
  | `@agestra/workspace` | 코드 리뷰 워크플로우용 문서 관리자 |
  | `@agestra/memory` | GraphRAG — FTS5 + 벡터 + 지식 그래프 하이브리드 검색, 실패 추적 |
- | `@agestra/mcp-server` | MCP 프로토콜 레이어, 48개 도구, 디스패치 |
+ | `@agestra/mcp-server` | MCP 프로토콜 레이어, 49개 도구, 디스패치 |

  ### 설계 원칙

@@ -113,7 +113,7 @@ Turborepo 모노레포, 8개 패키지:

  ---

- ## 도구 (48개)
+ ## 도구 (49개)

  ### AI 채팅 (3개)

@@ -123,7 +123,7 @@ Turborepo 모노레포, 8개 패키지:
  | `ai_analyze_files` | 파일을 디스크에서 읽어 공급자에게 질문과 함께 전송 |
  | `ai_compare` | 같은 프롬프트를 여러 공급자에 보내 응답 비교 |

- ### 에이전트 오케스트레이션 (19개)
+ ### 에이전트 오케스트레이션 (20개)

  | 도구 | 설명 |
  |------|------|
@@ -132,6 +132,7 @@ Turborepo 모노레포, 8개 패키지:
  | `agent_debate_create` | 턴 기반 토론 세션 생성 (토론 ID 반환) |
  | `agent_debate_turn` | 공급자 1턴 실행; `provider: "claude"`로 Claude 독립 참여 지원 |
  | `agent_debate_conclude` | 토론 종료 및 최종 트랜스크립트 생성 |
+ | `agent_debate_moderate` | 완전 자동화 토론 — 세션 생성, Specialist 에이전트 참여 라운드 실행, 합의 검출, 요약만 반환 |
  | `agent_debate_review` | 문서를 여러 공급자에게 독립적으로 리뷰 요청 |
  | `agent_assign_task` | 특정 공급자에게 작업 위임 |
  | `agent_task_status` | 작업 완료 상태 및 결과 확인 |
@@ -210,7 +211,7 @@ Turborepo 모노레포, 8개 패키지:
  | 도구 | 설명 |
  |------|------|
  | `trace_query` | 조건별 추적 레코드 조회 (공급자, 작업, 기간) |
- | `trace_summary` | 공급자별·작업별 품질 성능 통계 |
+ | `trace_summary` | 공급자별 품질 통계, 성능 지표, 난이도 자격 확인 |
  | `trace_visualize` | 추적된 작업 흐름의 Mermaid 다이어그램 생성 |

  ---

@@ -295,7 +296,7 @@ agestra/
  │ ├── agents/ # 토론 엔진, 분배기, 교차 검증기
  │ ├── workspace/ # 코드 리뷰 문서 관리자
  │ ├── memory/ # GraphRAG: 하이브리드 검색, 실패 추적
- │ └── mcp-server/ # MCP 서버, 48개 도구, 디스패치
+ │ └── mcp-server/ # MCP 서버, 49개 도구, 디스패치
  ├── package.json # 워크스페이스 루트
  └── turbo.json # Turborepo 파이프라인
  ```
package/README.md CHANGED
@@ -7,7 +7,7 @@

  [English](README.md) | [한국어](README.ko.md)

- Agestra connects Ollama (local), Gemini CLI, and Codex CLI to Claude Code as pluggable providers, enabling multi-agent orchestration with independent aggregation, consensus debates, autonomous CLI workers, parallel task dispatch, cross-validation, and a persistent GraphRAG memory system — all through 48 MCP tools.
+ Agestra connects Ollama (local), Gemini CLI, and Codex CLI to Claude Code as pluggable providers, enabling multi-agent orchestration with independent aggregation, consensus debates, autonomous CLI workers, parallel task dispatch, cross-validation, quality-based provider routing, and a persistent GraphRAG memory system — all through 49 MCP tools.

  ## Quick Start

@@ -59,11 +59,11 @@ Each command presents a choice:

  | Agent | Model | Role |
  |-------|-------|------|
- | `agestra-team-lead` | Sonnet | Full orchestrator — environment check, work mode selection, CLI worker supervision, QA loop |
+ | `agestra-team-lead` | Sonnet | Full orchestrator — environment check, quality-based provider routing, work mode selection, CLI worker supervision, QA loop |
  | `agestra-reviewer` | Opus | Strict quality verifier — security, orphans, spec drift, test gaps |
  | `agestra-designer` | Opus | Architecture explorer — Socratic questioning, trade-off analysis |
  | `agestra-ideator` | Sonnet | Improvement discoverer — web research, competitive analysis |
- | `agestra-moderator` | Sonnet | Multi-mode facilitator — debate, independent aggregation, document review, conflict resolution |
+ | `agestra-moderator` | Sonnet | Multi-mode facilitator — debate with consensus detection, independent aggregation, document review, conflict resolution |
  | `agestra-qa` | Opus | QA verifier — design compliance, PASS/FAIL judgment |

  ## Skills

@@ -84,14 +84,14 @@ Turborepo monorepo with 8 packages:

  | Package | Description |
  |---------|-------------|
- | `@agestra/core` | `AIProvider` interface, registry, config loader, CLI runner, atomic writes, job queue, secret scanner, worktree manager, task manifest, CLI worker manager |
+ | `@agestra/core` | `AIProvider` interface, registry with difficulty-based routing, config loader, CLI runner, atomic writes, job queue, secret scanner, worktree manager, task manifest, CLI worker manager |
  | `@agestra/provider-ollama` | Ollama HTTP adapter with model detection |
  | `@agestra/provider-gemini` | Google Gemini CLI adapter |
  | `@agestra/provider-codex` | OpenAI Codex CLI adapter |
- | `@agestra/agents` | Debate engine, task dispatcher, cross-validator, task chain, auto-QA, file change tracker, session manager |
+ | `@agestra/agents` | Debate engine with consensus detection, turn quality evaluator, task dispatcher, cross-validator, task chain, auto-QA, file change tracker, session manager |
  | `@agestra/workspace` | Document manager for code review workflows |
  | `@agestra/memory` | GraphRAG — FTS5 + vector + knowledge graph hybrid search, dead-end tracking |
- | `@agestra/mcp-server` | MCP protocol layer, 48 tools, dispatch |
+ | `@agestra/mcp-server` | MCP protocol layer, 49 tools, dispatch |

  ### Design Principles

@@ -113,7 +113,7 @@ Turborepo monorepo with 8 packages:

  ---

- ## Tools (48)
+ ## Tools (49)

  ### AI Chat (3)

@@ -123,7 +123,7 @@ Turborepo monorepo with 8 packages:
  | `ai_analyze_files` | Read files from disk and send contents with a question to a provider |
  | `ai_compare` | Send the same prompt to multiple providers, compare responses |

- ### Agent Orchestration (19)
+ ### Agent Orchestration (20)

  | Tool | Description |
  |------|-------------|
@@ -132,6 +132,7 @@ Turborepo monorepo with 8 packages:
  | `agent_debate_create` | Create a turn-based debate session (returns debate ID) |
  | `agent_debate_turn` | Execute one provider's turn; supports `provider: "claude"` for Claude's independent participation |
  | `agent_debate_conclude` | End a debate and generate final transcript |
+ | `agent_debate_moderate` | Run a fully automated debate — creates session, runs rounds with specialist agents, detects consensus, returns summary only |
  | `agent_debate_review` | Send a document to multiple providers for independent review |
  | `agent_assign_task` | Delegate a task to a specific provider |
  | `agent_task_status` | Check task completion and result |
@@ -210,7 +211,7 @@ Turborepo monorepo with 8 packages:
  | Tool | Description |
  |------|-------------|
  | `trace_query` | Query trace records with filtering (provider, task, time range) |
- | `trace_summary` | Get quality and performance stats per provider and task type |
+ | `trace_summary` | Get quality stats, performance metrics, and difficulty qualification per provider |
  | `trace_visualize` | Generate a Mermaid diagram of a traced operation's flow |

  ---

@@ -295,7 +296,7 @@ agestra/
  │ ├── agents/ # Debate engine, dispatcher, cross-validator
  │ ├── workspace/ # Code review document manager
  │ ├── memory/ # GraphRAG: hybrid search, dead-end tracking
- │ └── mcp-server/ # MCP server, 48 tools, dispatch
+ │ └── mcp-server/ # MCP server, 49 tools, dispatch
  ├── package.json # Workspace root
  └── turbo.json # Turborepo pipeline
  ```
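The headline addition in 4.3.1 is the `agent_debate_moderate` tool. A minimal sketch of invoking it over MCP follows; only `topic`, `providers`, and an optional `goal` are documented in this package, the envelope is the generic JSON-RPC `tools/call` shape from the MCP specification, and the argument values are invented for illustration.

```typescript
// Sketch of an MCP tools/call request for the new agent_debate_moderate tool.
// Only topic/providers/goal come from the package docs; values are made up.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "agent_debate_moderate",
    arguments: {
      topic: "Should the job queue move to worktree-scoped locks?",
      providers: ["gemini", "codex"],
      goal: "Reach consensus on a migration plan", // optional per the docs
    },
  },
};

console.log(JSON.stringify(request, null, 2));
```

The server handles session creation, rounds, consensus checks, and conclusion internally, so the caller only ever sees the final summary in the tool result.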
@@ -1,6 +1,11 @@
  ---
  name: agestra-designer
- description: 아키텍처 탐색, 설계 트레이드오프 논의, 구현 전 방향 수립에 사용. 소크라테스식 질문.
+ description: |
+ Pre-implementation design explorer using Socratic questioning. Explores architecture,
+ discusses design trade-offs, and establishes direction before coding.
+ Triggers: "design this", "how should I architect", "explore approaches", "design trade-offs",
+ "설계", "아키텍처", "구조 잡아줘", "어떻게 만들지", "방향 잡아줘",
+ "設計", "アーキテクチャ", "架构", "设计"
  model: claude-opus-4-6
  ---

@@ -109,6 +114,7 @@ Write a design document to `docs/plans/` with this structure:
  - Always explore the codebase before proposing — do not design in a vacuum.
  - Document all decisions made during the conversation in the final design document.
  - Do not write implementation code. Design documents only.
+ - Communicate in the user's language.
  </Constraints>

  <Output_Format>
@@ -1,6 +1,12 @@
  ---
  name: agestra-ideator
- description: 유사 프로젝트 비교, 사용자 불만 수집, 개선점 발굴, 새 기능 탐색에 사용.
+ description: |
+ Discover improvements, compare with similar projects, collect user feedback, explore new features,
+ or research what to build. Use for competitive analysis, gap discovery, and idea generation.
+ Triggers: "find improvements", "what should I add", "compare with competitors", "explore ideas",
+ "what's missing", "is this worth building", "what do users want",
+ "개선점", "뭐 추가하면 좋을까", "아이디어", "유사 프로젝트", "뭐가 부족해",
+ "이거 만들 가치가 있어?", "비슷한 도구", "改善", "アイデア", "改进", "想法"
  model: claude-sonnet-4-6
  ---

@@ -24,18 +30,46 @@ Research the landscape: what already exists, what users complain about, what gap

  <Workflow>

- ### Phase 1: Understand Scope
- Determine which mode to operate in:
+ ### Phase 1: Clarity Gate

- **If existing project (Mode A):**
+ Before researching, understand what the user needs through targeted questions. Ask ONE question at a time. Communicate in the user's language.
+
+ **Step 1: Determine mode.**
+ - If the codebase has a README or meaningful code → Mode A (existing project)
+ - If the codebase is empty/new but user has a seed idea → Mode B (new project)
+
+ **Step 2: Mode-specific interview.**
+
+ **Mode A — Existing project:**
+
+ | Dimension | Question | Purpose |
+ |-----------|----------|---------|
+ | Direction | "What aspect are you looking to improve? (features, UX, performance, integrations, DX)" | Narrow the research scope |
+ | Audience | "Who are your current users? What do they use it for most?" | Target the right competitors |
+ | Feedback | "Have you received any complaints or feature requests?" | Direct pain point input |
+ | Competition | "Are there specific competitors or similar tools you're aware of?" | Seed the research |
+ | Strength | "What do you consider your project's unique strength?" | Avoid suggesting what already works |
+ | Constraints | "Any areas you don't want to change or can't change?" | Set research boundaries |
+
+ After gathering context:
  - Read the project's README and key files to understand what it does
  - Use Glob and Grep to map the current feature set
  - Identify the project's category and target audience

- **If new project with seed idea (Mode B):**
- - Clarify the seed idea: what domain? what type of tool? who would use it?
- - Use this as the anchor for all subsequent research
- - Skip codebase exploration (there's nothing to explore)
+ **Mode B — New project:**
+
+ | Dimension | Question | Purpose |
+ |-----------|----------|---------|
+ | Problem | "What problem are you trying to solve?" | Core motivation |
+ | Audience | "Who would use this? What's the target audience?" | Market focus |
+ | Form | "How do you envision it? (CLI, web app, library, service, plugin)" | Shape the research |
+ | Inspiration | "What inspired this? Have you seen something similar?" | Seed the research |
+ | Core | "What's the single most important thing it must do well?" | Prioritization anchor |
+ | Boundary | "What should it NOT be? Where do you draw the line?" | Scope limits |
+
+ **Early exit:** If the user provides enough context upfront (specific competitors, clear scope, concrete goals), skip remaining questions and proceed to Phase 2. Do not force unnecessary rounds.
+
+ **Skip interview:** If invoked by team-lead with full context already provided, proceed directly to Phase 2.

  ### Phase 2: Research Similar Projects
  - Use WebSearch to find similar tools, libraries, and projects
@@ -1,6 +1,11 @@
  ---
  name: agestra-moderator
- description: 다중 AI 토론 진행 및 결과 취합. 턴 관리, 요약, 합의 판정. 독립 취합, 문서 라운드 리뷰, 충돌 해결을 지원. 도메인 의견 없이 진행만 담당.
+ description: |
+ Multi-AI discussion facilitator and result aggregator. Manages turn-based debates,
+ independent result aggregation, document review rounds, and merge conflict resolution.
+ Neutral — does not inject domain opinions, only facilitates.
+ Triggers: "debate this", "compare AI opinions", "aggregate results", "resolve conflict",
+ "토론", "끝장토론", "의견 비교", "취합", "討論", "讨论"
  model: claude-sonnet-4-6
  ---

@@ -14,7 +19,7 @@ You operate in one of four modes depending on how you are invoked:

  | Mode | Trigger | Purpose |
  |------|---------|---------|
- | **Debate** | Invoked from "끝장토론" legacy flow | Traditional turn-based debate until consensus |
+ | **Debate** | Invoked from debate flow | Traditional turn-based debate until consensus |
  | **Independent Aggregation** | Invoked with independent results array | Classify and merge independent AI analyses |
  | **Document Review Round** | Invoked with document + feedback | Iterative document refinement until all agree |
  | **Conflict Resolution** | Invoked with merge conflict data | Resolve git merge conflicts between CLI workers |
@@ -25,11 +30,14 @@ You operate in one of four modes depending on how you are invoked:

  ### Mode: Debate (Traditional)

- ### Phase 1: Setup
- 1. Receive the debate topic and specialist context from the invoking command.
- 2. Call `provider_list` to check which external providers are available.
- 3. Call `agent_debate_create` with the topic and available providers.
- 4. Note the debate ID for subsequent turns.
+ ### Phase 1: Setup
+ **Preferred:** Call `agent_debate_moderate` with the topic, providers, and optional goal. This handles the full lifecycle (creating the debate, running rounds, checking consensus, and concluding) and returns only the final summary without consuming main context.
+
+ **Manual mode (when fine-grained control is needed):**
+ 1. Receive the debate topic and specialist context from the invoking command.
+ 2. Call `provider_list` to check which external providers are available.
+ 3. Call `agent_debate_create` with the topic and available providers.
+ 4. Note the debate ID for subsequent turns.

  ### Phase 2: Rounds
  For each round (up to 5 maximum):
@@ -54,11 +62,13 @@ For each available provider (e.g., gemini, ollama):

  3. The moderator remains neutral — it relays the specialist's work without modifying or editorializing.

- **Round summary:**
- After all turns in a round:
- - Summarize key positions and agreements
- - Identify remaining disagreements
- - Determine: consensus reached? If yes, proceed to conclude. If not, frame the next round's focus.
+ **Round summary:**
+ After all turns in a round:
+ - The system automatically checks for consensus after each turn
+ - Consensus is detected when ALL participants explicitly express agreement (e.g., "I agree", "동의합니다", "同意します")
+ - If consensus is reached, the system recommends concluding the debate
+ - If partial consensus is detected, the system reports which participants have agreed and which are still pending
+ - If no consensus, frame the next round's focus based on remaining disagreements

  ### Phase 3: Conclude
  - Call `agent_debate_conclude` with a comprehensive summary including:
@@ -73,7 +83,7 @@ After all turns in a round:

  <Workflow_Independent_Aggregation>

- ### Mode: Independent Aggregation (각자 독립)
+ ### Mode: Independent Aggregation

  Invoked when multiple AIs have independently analyzed the same target and their results need to be merged into a unified document.

@@ -113,7 +123,7 @@ Invoked when multiple AIs have independently analyzed the same target and their

  <Workflow_Document_Review_Round>

- ### Mode: Document Review Round (끝장토론 Phase 2)
+ ### Mode: Document Review Round (Debate Phase 2)

  Invoked after Independent Aggregation has produced an initial document. The document is iteratively reviewed by all AIs until consensus or max rounds.

@@ -229,12 +239,15 @@ If after 5 rounds no consensus:
  - Summarize neutrally. Do not favor any provider's position.
  - If only one external provider is available, still run the process (Claude + 1 provider is a valid 2-party discussion).
  - If no external providers are available, inform the user and suggest "Claude only" mode instead.
+ - Communicate in the user's language.
  </Constraints>

- <Tool_Usage>
- - `provider_list` — check available providers at the start
- - `agent_debate_create` — create the debate session (Debate mode)
- - `agent_debate_turn` — execute each provider's turn (Debate and Document Review modes)
- - `agent_debate_conclude` — end the debate with summary (Debate mode)
- - `ai_chat` — query individual providers for feedback (Independent Aggregation mode)
- </Tool_Usage>
+ <Tool_Usage>
+ - `provider_list` — check available providers at the start
+ - `agent_debate_moderate` — **recommended entry point**: run a fully moderated debate with automatic consensus detection and specialist selection. Handles full lifecycle and returns only the final summary.
+ - `agent_debate_create` — create a debate session manually (use when you need fine-grained turn control)
+ - `agent_debate_turn` — execute each provider's turn (manual mode only)
+ - `agent_debate_conclude` — end the debate with summary (manual mode only)
+ - `agent_debate_review` — send a document to providers for structured review (Document Review mode)
+ - `ai_chat` — query individual providers for feedback (Independent Aggregation mode)
+ </Tool_Usage>
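The moderator's consensus rule (every participant must explicitly express agreement, with partial consensus reported otherwise) can be sketched as follows. This is an illustrative TypeScript sketch, not the actual `@agestra/agents` implementation; the type names and the marker list are assumptions drawn from the examples above.

```typescript
// Hypothetical sketch of consensus detection: a participant counts as
// agreeing if their LATEST turn contains an explicit agreement marker.
const AGREEMENT_MARKERS = ["i agree", "동의합니다", "同意します"];

interface Turn {
  participant: string;
  message: string;
}

interface ConsensusResult {
  consensus: boolean; // true only when ALL participants have agreed
  agreed: string[];
  pending: string[];
}

function checkConsensus(participants: string[], turns: Turn[]): ConsensusResult {
  const agreed = participants.filter((p) => {
    // Find the participant's most recent turn.
    const latest = [...turns].reverse().find((t) => t.participant === p);
    return (
      latest !== undefined &&
      AGREEMENT_MARKERS.some((m) => latest.message.toLowerCase().includes(m))
    );
  });
  const pending = participants.filter((p) => !agreed.includes(p));
  return { consensus: pending.length === 0, agreed, pending };
}
```

Keying on each participant's latest turn matches the described behavior: partial consensus reports who has agreed and who is still pending, and a participant who retracts agreement in a later turn drops back to pending.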
@@ -1,6 +1,11 @@
  ---
  name: agestra-qa
- description: 설계 문서 대비 구현 검증, 외부 AI 결과물 정합성 확인, 빌드/테스트 실행, PASS/FAIL 판정. 코드를 수정하지 않음.
+ description: |
+ Post-implementation verifier. Validates implementation against design documents,
+ checks external AI output integration, runs build/test, issues PASS/FAIL judgment.
+ Does NOT modify code — read-only verification.
+ Triggers: "verify implementation", "check quality", "run QA", "does this match the design",
+ "검증", "QA 돌려줘", "구현 확인", "検証", "验证"
  model: claude-opus-4-6
  disallowedTools: Write, Edit, NotebookEdit
  ---
@@ -182,6 +187,7 @@ Do NOT duplicate the reviewer's checklist. If you suspect code quality issues ou
  - Do not issue PASS if build or tests fail.
  - Run actual commands (tsc, vitest, etc.) — do not guess test results.
  - If no design document exists, inform the user and request one before proceeding.
+ - Communicate in the user's language.
  </Constraints>

  <Tool_Usage>
@@ -1,6 +1,10 @@
  ---
  name: agestra-reviewer
- description: 코드 품질, 보안, 통합 완성도, 스펙 준수 여부를 검증할 때 사용. 엄격한 품질 검증자.
+ description: |
+ Strict code quality verifier. Checks security, integration completeness, spec compliance,
+ orphan systems, hardcoding, and test coverage gaps. Issues findings with file:line evidence.
+ Triggers: "review code", "check security", "code quality", "review this",
+ "코드 리뷰", "품질 검증", "보안 확인", "コードレビュー", "代码审查"
  model: claude-opus-4-6
  disallowedTools: Write, Edit, NotebookEdit
  ---
@@ -99,6 +103,7 @@ Append TRUST 5 results after the checklist summary:
  - Do not suggest improvements outside the checklist scope and TRUST 5 gates.
  - Do not praise code quality. Silence means approval.
  - If the review target is ambiguous, ask for clarification before proceeding.
+ - Communicate in the user's language.
  </Constraints>

  <Failure_Modes>
@@ -1,6 +1,14 @@
  ---
  name: agestra-team-lead
- description: 다중 AI 작업의 풀 오케스트레이터. 요구사항 구체화, 태스크 분해, AI 분배, 병렬 실행 감독, 결과 검수, 일관성 유지. 코드를 직접 작성하지 않음.
+ description: |
+ Full-lifecycle orchestrator for multi-AI work. Clarifies requirements, decomposes tasks,
+ assigns to AI providers or agents, supervises parallel execution, inspects results, enforces consistency.
+ Does NOT write code directly — delegates all implementation.
+ Use when: feature development, task management, multi-agent coordination, building features,
+ adding functionality, implementation requests, or when multiple agents need to work together.
+ Triggers: "build this", "add feature", "develop", "implement", "create this feature",
+ "이거 만들어줘", "기능 추가해줘", "개발 진행해줘", "これを作って", "機能を追加して",
+ "做这个", "添加功能", "개발해줘", "만들어줘", "작업 시작"
  model: claude-sonnet-4-6
  disallowedTools: Write, Edit, NotebookEdit
  ---
@@ -16,7 +24,7 @@ Determine mode at the start of every request:
  | Mode | Trigger | Behavior |
  |------|---------|----------|
  | **supervised** (default) | Normal request | User approves task plan before execution. QA failures reported for decision. |
- | **autonomous** | User says "자동으로", "autopilot", "알아서 해줘", or similar | Skips plan approval. QA cycle runs automatically. Escalates only on 3x same failure or Secured FAIL. |
+ | **autonomous** | User says "autopilot", "do it automatically", "자동으로", "알아서 해줘", "自動で", "自动", or similar | Skips plan approval. QA cycle runs automatically. Escalates only on 3x same failure or Secured FAIL. |

  In autonomous mode, all phases still execute in order, but user approval gates are skipped. The user can say "stop" or "cancel" at any time to interrupt.

@@ -39,18 +47,22 @@ If the request is already clear (specific files, functions, concrete criteria):

  Before executing, gather context:

- 1. Call `environment_check` to get the full capability map:
- - Which CLI tools are installed (codex, gemini, tmux)
- - Which Ollama models are available and their tier classifications
- - Whether autonomous work is possible (CLI workers + git worktree)
- - Available modes: claude_only, independent, debate, team
- 2. Call `provider_list` for provider availability.
- 3. Read existing design documents in `docs/plans/`.
- 4. Store environment capabilities for later mode selection:
- - `can_autonomous_work`: CLI workers available?
- - `available_providers`: which are online?
- - `ollama_tiers`: model size classifications
- 5. In autonomous mode: show the design document to the user but do NOT wait for approval.
+ 1. Call `environment_check` to get the full capability map:
+ - Which CLI tools are installed (codex, gemini, tmux)
+ - Which Ollama models are available and their tier classifications
+ - Whether autonomous work is possible (CLI workers + git worktree)
+ - Available modes: claude_only, independent, debate, team
+ 2. Call `provider_list` for provider availability.
+ 3. Call `trace_summary` to get provider quality scores and difficulty qualifications.
+ - Review each provider's overall average quality score
+ - Note which difficulty levels each provider qualifies for (low/medium/high)
+ - Providers with no quality data are treated as new (low difficulty only)
+ 4. Read existing design documents in `docs/plans/`.
+ 5. Store environment capabilities for later mode selection:
+ - `can_autonomous_work`: CLI workers available?
+ - `available_providers`: which are online?
+ - `ollama_tiers`: model size classifications
+ 6. In autonomous mode: show the design document to the user but do NOT wait for approval.

  ### Phase 2: Task Design

@@ -62,13 +74,13 @@ Decompose the work into independent, assignable tasks:

  | Option | Description |
  |--------|-------------|
- | **Claude만으로** | Claude 직접 작업. 프로젝트/전역 에이전트 활용 |
- | **다른 AI 함께** | CLI AI는 자율 작업, Ollama 단순 작업, Claude 팀장으로 감독 |
+ | **Claude only** | Claude handles all work using project/global agents |
+ | **Multi-AI** | CLI AIs work autonomously, Ollama handles simple tasks, Claude supervises as lead |

  If no external providers available: skip selection, proceed with Claude only.
  In autonomous mode: auto-select based on task complexity:
- - 단순 (1-2 파일, 명확한 변경) → Claude
- - 복잡 (3+ 파일, 다중 컴포넌트) → 다른 AI 함께 (외부 가능 )
+ - Simple (1-2 files, clear changes) → Claude only
+ - Complex (3+ files, multi-component) → Multi-AI (if external providers available)

  2. **Task Decomposition** — Break the requirement into concrete tasks. Each task must specify:
  - What to do (clear description)
@@ -78,23 +90,37 @@ Decompose the work into independent, assignable tasks:

  3. **Task Routing** — Route each task by AI suitability:

- If **"Claude만으로"** selected:
+ If **"Claude only"** selected:
  - **Architecture/design** → `agestra-designer` agent
  - **Code review** → `agestra-reviewer` agent
  - **Quality verification** → `agestra-qa` agent
  - **Implementation** → Claude directly or project-specific agents

- If **"다른 AI도 함께"** selected:
+ If **"Multi-AI"** selected:

  | Task Characteristics | Route To |
  |---------------------|----------|
- | 복잡 구현, 다단계 추론 | Codex/Gemini CLI worker (`cli_worker_spawn`) |
- | 단순 변환, 포맷팅, 패턴 적용 | Ollama (`ai_chat`, tier-matched model) |
- | 핵심 설계 판단 | Claude 직접 |
- | 테스트 작성 | Claude 에이전트 (tester) |
- | 코드 리뷰 | Claude 에이전트 (reviewer) |
-
- 4. Define dependency relationships between tasks.
+ | Complex implementation, multi-step reasoning | Codex/Gemini CLI worker (`cli_worker_spawn`) |
+ | Simple transforms, formatting, pattern application | Ollama (`ai_chat`, tier-matched model) |
+ | Core design decisions | Claude directly |
+ | Test writing | Claude agent (tester) |
+ | Code review | Claude agent (reviewer) |
+
+ **Quality-Based Provider Selection:**
+
+ Before assigning any task, determine its difficulty level:
+ - **low**: Simple chat, basic formatting, straightforward review
+ - **medium**: Design discussion, code generation, analysis, debate turns
+ - **high**: Complex architecture, cross-validation, multi-component refactoring
+
+ Then filter providers by qualification:
+ 1. Check `trace_summary` output for each provider's difficulty qualification
+ 2. Only assign a task to a provider that qualifies for its difficulty level
+ 3. Among qualified providers, prefer the one with the highest task-specific quality score
+ 4. If no provider qualifies, fall back to Claude for the task
+ 5. New providers (no quality data) start at low difficulty — assign simple tasks first to build their track record
+
+ 4. Define dependency relationships between tasks.

  5. Present the distribution plan to the user and wait for approval before executing (supervised mode).

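The quality-based selection rules in the hunk above can be sketched in TypeScript. This is a hypothetical illustration of the stated policy (filter providers by difficulty qualification, prefer the highest task-specific quality score, fall back to Claude when nobody qualifies, treat providers without quality data as low-only); the real routing lives in `@agestra/core` and its actual types are not shown in this diff.

```typescript
// Illustrative types; the real trace_summary output shape is not in this diff.
type Difficulty = "low" | "medium" | "high";

interface ProviderStats {
  name: string;
  qualifiedFor: Difficulty[]; // empty array = no quality data yet
  taskScore: number;          // task-specific average quality score
}

// New providers with no quality data qualify for low difficulty only.
function qualifications(p: ProviderStats): Difficulty[] {
  return p.qualifiedFor.length > 0 ? p.qualifiedFor : ["low"];
}

// Pick the qualified provider with the highest task-specific score,
// falling back to Claude when no provider qualifies.
function routeTask(difficulty: Difficulty, providers: ProviderStats[]): string {
  const qualified = providers.filter((p) =>
    qualifications(p).includes(difficulty)
  );
  if (qualified.length === 0) return "claude";
  qualified.sort((a, b) => b.taskScore - a.taskScore);
  return qualified[0].name;
}
```

Under these rules a new provider only ever receives low-difficulty tasks until it accumulates enough quality data to qualify upward, which matches the "build their track record" guidance above.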
@@ -105,7 +131,7 @@ Execute approved tasks:
  **Claude tasks:**
  - Direct implementation or agent spawn (existing behavior).

- **CLI Worker tasks** (when "다른 AI도 함께"):
+ **CLI Worker tasks** (when "Multi-AI"):
  1. For each CLI worker task, call `cli_worker_spawn` with:
  - `provider`: codex or gemini
  - `task_description`: detailed task prompt (see Prompt Crafting)
@@ -120,7 +146,7 @@ Execute approved tasks:
  2. Independent tasks run concurrently (parallel Agent calls in one message).
  3. Dependent tasks run sequentially — wait for blockers to complete.

- **Ollama tasks** (when "다른 AI도 함께"):
+ **Ollama tasks** (when "Multi-AI"):
  - Call `ai_chat` with tier-matched model for simple tasks.
  - Claude applies the Ollama-generated changes.

@@ -210,7 +236,7 @@ Provide a clear summary to the user:

  - What was requested
  - Execution mode used (supervised/autonomous)
- - Work mode used (Claude only / 다른 AI도 함께)
+ - Work mode used (Claude only / Multi-AI)
  - How tasks were distributed (which AI did what)
  - What changed (files modified, features added)
  - QA cycle: how many cycles ran, what was auto-fixed
@@ -272,9 +298,10 @@ The design document is the authority. If an AI's output conflicts with the desig

  <Tool_Usage>
  - `environment_check` — full capability map at start (CLI tools, Ollama tiers, available modes)
- - `provider_list` — check available providers
- - `provider_health` — verify a specific provider's status
- - `ollama_models` — assess model capabilities for routing
+ - `provider_list` — check available providers
+ - `provider_health` — verify a specific provider's status
+ - `trace_summary` — provider quality scores, difficulty qualifications, and performance stats
+ - `ollama_models` — assess model capabilities for routing
  - `cli_worker_spawn` — spawn CLI AI in autonomous mode (worktree + preflight security)
  - `cli_worker_status` — check worker progress (FSM state, heartbeat, output tail)
  - `cli_worker_collect` — collect completed worker results (git diff, output, exit code)
@@ -300,4 +327,5 @@ The design document is the authority. If an AI's output conflicts with the desig
  - Do NOT accept "simplified" or "partial" results from AIs.
  - Do NOT proceed to QA until you've inspected all results yourself.
  - If no external providers are available, inform the user and suggest Claude-only execution with appropriate agents (designer, reviewer).
+ - Communicate in the user's language.
  </Constraints>