npm - ltcai - Versions diffs - 0.1.16 → 0.1.17 - Mend

ltcai 0.1.16 → 0.1.17

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/README.md CHANGED

@@ -14,10 +14,10 @@ Lattice AI는 개인 개발자가 로컬 모델, 클라우드 모델, 에이전
 ### Currently released versions
-- `PyPI`: `ltcai==0.1.16`
-- `npm`: `ltcai@0.1.16`
-- `VS Code Marketplace`: `parktaesoo.ltcai@0.1.16`
-- `Open VSX`: `parktaesoo.ltcai@0.1.16`
+- `PyPI`: `ltcai==0.1.17`
+- `npm`: `ltcai@0.1.17`
+- `VS Code Marketplace`: `parktaesoo.ltcai@0.1.17`
+- `Open VSX`: `parktaesoo.ltcai@0.1.17`
 ### Why Lattice AI
@@ -248,7 +248,7 @@ docker run --rm -p 4825:4825 \
 ### Release check
-The `0.1.16` release brings the four channels below to the same version.
+The `0.1.17` release brings the four channels below to the same version.
 - `npm`
 - `PyPI`
@@ -359,7 +359,7 @@ launchctl load ~/Library/LaunchAgents/com.ltcai.plist
 ## Release notes
-Current version: **0.1.16**. See [docs/CHANGELOG.md](docs/CHANGELOG.md) for the full change history.
+Current version: **0.1.17**. See [docs/CHANGELOG.md](docs/CHANGELOG.md) for the full change history.
 ## License

package/docs/CHANGELOG.md CHANGED

@@ -1,5 +1,29 @@
 # Changelog
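+## [0.1.17] - 2026-05-22
+### Multi-LLM Pipeline
+- **Pipeline UI card**: added a PIPELINE card between the ACTIVE MODEL and PRIVATE VPC cards on the ops dashboard
+  - Pipeline inactive: shows "멀티 LLM 파이프라인 / Plan → Execute → Review 모델 설정" ("Multi-LLM pipeline / Plan → Execute → Review model settings")
+  - Pipeline active: shows the current configuration as "Pipeline ON / P:model-name E:model-name R:model-name"
+- **Multi-LLM agent pipeline**: a different LLM can be assigned to each of the three phases, Planning / Executing / Reviewing
+  - Each phase's model is selected in a modal (the list is built automatically from loaded local models and cloud providers)
+  - Using a single model for every phase also works
+- **Human-in-the-loop**: when the pipeline is active, the agent waits for user approval after Planning completes before moving on to the Execute phase
+  - Web UI: renders a plan approval card (`✅ 승인 / ❌ 취소`, approve / cancel)
+  - Telegram bot: approve or cancel the plan with inline buttons
+- **`/agent/resume` endpoint**: resumes or cancels a waiting agent using the `context_id` and `approved` fields
+- **`AgentRequest` extension**: added the `planning_model`, `executing_model`, `reviewing_model`, and `human_in_loop` parameters
+- **`LLMRouter.generate_as(model_id, ...)`**: helper that temporarily swaps the current model, generates with the specified model, then restores the original
+- **Telegram bot auth fix**: when calling the server, reads the admin session token from `~/.ltcai/sessions.json` and sends it as a cookie
+- **Telegram SSE parsing**: fixed so that the `/chat` streaming response (`text/event-stream`) is parsed correctly
+- **`_sessions_file()` bug fix**: resolved an issue where the global `DATA_DIR` was referenced before its definition (the path is now computed inside the function)
+### Release
+- Bumped the release version to `0.1.17`
+- Target channels: `npm`, `PyPI`, `VS Code Marketplace`, `Open VSX`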
 ## [0.1.16] - 2026-05-22
 ### First-user admin bootstrap
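
The changelog entry above describes a two-request flow: `/agent` pauses after planning when `human_in_loop` is set, and `/agent/resume` continues or cancels it. A minimal sketch of the two JSON bodies follows. The field names come from the `AgentRequest` and `AgentResumeRequest` models in this diff; the helper functions, the model ids, and the example message are hypothetical and exist only for illustration.

```python
import json
from typing import Optional


def build_agent_request(message: str,
                        planning: Optional[str] = None,
                        executing: Optional[str] = None,
                        reviewing: Optional[str] = None,
                        human_in_loop: bool = True) -> dict:
    """Body for POST /agent. None means 'use the currently loaded model'."""
    return {
        "message": message,
        "planning_model": planning,
        "executing_model": executing,
        "reviewing_model": reviewing,
        "human_in_loop": human_in_loop,
    }


def build_resume_request(context_id: str, approved: bool = True) -> dict:
    """Body for POST /agent/resume, answering a 'waiting_approval' response."""
    return {"context_id": context_id, "approved": approved}


# Hypothetical model ids and context id, for illustration only.
start = build_agent_request("Create a hello-world script",
                            planning="example/planner",
                            executing="example/executor")
resume = build_resume_request("ctx-from-server-response")

if __name__ == "__main__":
    print(json.dumps(start, indent=2))
    print(json.dumps(resume))
```

In the real flow the `context_id` would be taken from the `waiting_approval` response rather than supplied by hand.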

package/llm_router.py CHANGED

@@ -472,6 +472,19 @@ class LLMRouter:
             print(f"⚠️ VLM chat template fallback: {e}")
             return self._build_prompt(message, context, processor)
+    async def generate_as(self, model_id: str | None, message: str, context: Optional[str] = None, max_tokens: int = 4096, temperature: float = 0.2) -> str:
+        """Generate using a specific model, temporarily switching if needed. Uses the current model if model_id is None or already current; raises ValueError if model_id is not loaded."""
+        if not model_id or model_id == self._current:
+            return await self.generate(message, context, max_tokens, temperature)
+        if model_id not in self._cache:
+            raise ValueError(f"Model '{model_id}' is not loaded. Load it first via /models/load.")
+        prev = self._current
+        self._current = model_id
+        try:
+            return await self.generate(message, context, max_tokens, temperature)
+        finally:
+            self._current = prev
     async def generate(self, message: str, context: Optional[str] = None, max_tokens: int = 4096, temperature: float = 0.2, image_data: Optional[str] = None) -> str:
         if not self._current: return "No model."
         self._touch()
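
The swap-and-restore pattern used by `generate_as` can be exercised in isolation. The sketch below is not the package's `LLMRouter`: `MiniRouter` is a hypothetical stand-in that only models `_current`, `_cache`, and a fake `generate`, so the behaviour of the three branches (no override, unloaded model, temporary switch) can be observed.

```python
import asyncio
from typing import Optional


class MiniRouter:
    """Hypothetical stand-in for LLMRouter; only the switching logic is modelled."""

    def __init__(self, loaded, current: str) -> None:
        self._cache = set(loaded)   # ids of loaded models
        self._current = current     # id of the active model

    async def generate(self, message: str) -> str:
        await asyncio.sleep(0)      # yield to the event loop, as a real call would
        return f"[{self._current}] {message}"

    async def generate_as(self, model_id: Optional[str], message: str) -> str:
        # No override requested, or the override is already active.
        if not model_id or model_id == self._current:
            return await self.generate(message)
        if model_id not in self._cache:
            raise ValueError(f"Model '{model_id}' is not loaded.")
        prev = self._current
        self._current = model_id
        try:
            return await self.generate(message)
        finally:
            self._current = prev    # restored even if generate() raises


async def demo():
    router = MiniRouter({"model-a", "model-b"}, current="model-a")
    swapped = await router.generate_as("model-b", "plan")
    default = await router.generate_as(None, "plan")
    try:
        await router.generate_as("model-c", "plan")
        error = ""
    except ValueError as exc:
        error = str(exc)
    return swapped, default, router._current, error


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

One design point worth keeping in mind: `_current` is shared state that stays swapped across an `await`, so if `generate` yields to the event loop, another coroutine calling `generate` at that moment would see the temporary model. Whether that matters depends on how concurrent requests reach the router, which this diff does not show.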

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "ltcai",
-  "version": "0.1.16",
+  "version": "0.1.17",
   "description": "Lattice AI local MLX/cloud LLM workspace server",
   "homepage": "https://github.com/TaeSooPark-PTS/LatticeAI#readme",
   "repository": {

package/server.py CHANGED

@@ -243,7 +243,9 @@ _SESSION_REFRESH_THRESHOLD = 60 * 15  # only persist if >15 min since last bump
 _sessions_lock = threading.Lock()
 def _sessions_file() -> Path:
-    return DATA_DIR / "sessions.json"
+    data_dir = Path(os.getenv("LATTICEAI_DATA_DIR") or (Path.home() / ".ltcai"))
+    data_dir.mkdir(parents=True, exist_ok=True)
+    return data_dir / "sessions.json"
 def _load_sessions() -> Dict[str, tuple]:
     try:
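
The `_sessions_file()` change above replaces a module-level `DATA_DIR` lookup with a path computed at call time. A minimal sketch of that resolution order follows, assuming only what the hunk shows: the `LATTICEAI_DATA_DIR` environment variable wins, otherwise `~/.ltcai` is used. The function name `sessions_file` and the temporary directory are illustrative.

```python
import os
import tempfile
from pathlib import Path


def sessions_file() -> Path:
    """Resolve sessions.json at call time instead of reading a module global."""
    data_dir = Path(os.getenv("LATTICEAI_DATA_DIR") or (Path.home() / ".ltcai"))
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir / "sessions.json"


# Point the override at a throwaway directory so the demo leaves ~/.ltcai alone.
with tempfile.TemporaryDirectory() as tmp:
    os.environ["LATTICEAI_DATA_DIR"] = tmp
    resolved = sessions_file()
    del os.environ["LATTICEAI_DATA_DIR"]

if __name__ == "__main__":
    print(resolved.name, resolved.parent == Path(tmp))
```

Because the path is computed inside the function, it no longer matters whether a global such as `DATA_DIR` has been assigned by the time the function is first called.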
@@ -2033,6 +2035,20 @@ class AgentRequest(BaseModel):
     temperature: float = 0.1
     user_email: Optional[str] = None
     user_nickname: Optional[str] = None
+    # Multi-LLM pipeline: per-phase model override (None = use current loaded model)
+    planning_model: Optional[str] = None
+    executing_model: Optional[str] = None
+    reviewing_model: Optional[str] = None
+    # When True: pause after planning and wait for /agent/resume
+    human_in_loop: bool = False
+class AgentResumeRequest(BaseModel):
+    context_id: str
+    approved: bool = True
+    modified_plan: Optional[dict] = None
+    executing_model: Optional[str] = None
+    reviewing_model: Optional[str] = None
 class AgentEvalRequest(BaseModel):
@@ -2058,17 +2074,25 @@ AGENT_TERMINAL_STATES = frozenset({AgentState.DONE, AgentState.FAILED})
 class AgentRunContext:
     """Mutable state carrier passed through all agent phases."""
     __slots__ = ("state", "plan", "transcript", "retry_count",
-                 "state_history", "corrections", "final_message", "rollback_log")
+                 "state_history", "corrections", "final_message", "rollback_log",
+                 "executing_model", "reviewing_model")
     def __init__(self) -> None:
-        self.state:         AgentState = AgentState.IDLE
-        self.plan:          dict       = {}
-        self.transcript:    list       = []
-        self.retry_count:   int        = 0
-        self.state_history: list       = []
-        self.corrections:   list       = []
-        self.final_message: str        = ""
-        self.rollback_log:  list       = []
+        self.state:           AgentState   = AgentState.IDLE
+        self.plan:            dict         = {}
+        self.transcript:      list         = []
+        self.retry_count:     int          = 0
+        self.state_history:   list         = []
+        self.corrections:     list         = []
+        self.final_message:   str          = ""
+        self.rollback_log:    list         = []
+        self.executing_model: str | None   = None
+        self.reviewing_model: str | None   = None
+# Pending agent contexts waiting for human approval: context_id → (ctx, req, lang_hint, current_user)
+_pending_agents: dict[str, tuple] = {}
+_pending_agents_lock = threading.Lock()
 class ToolPathRequest(BaseModel):
@@ -4323,6 +4347,7 @@ def _extract_agent_action(raw: str) -> Dict:
 async def _phase_plan(
     ctx: AgentRunContext, req: AgentRequest, router, lang_hint: str, current_user: str,
+    model_id: str | None = None,
 ) -> None:
     """PLAN: Planner role produces a structured plan JSON."""
     context = (
@@ -4331,7 +4356,8 @@ async def _phase_plan(
         f"Workspace root: {AGENT_ROOT}\n\n"
         f"User request: {req.message}"
     )
-    raw = await router.generate(
+    raw = await router.generate_as(
+        model_id,
         message="Produce a JSON execution plan for this request.",
         context=context, max_tokens=1024, temperature=0.1,
     )
@@ -4377,7 +4403,7 @@ def _phase_approval(ctx: AgentRunContext, current_user: str) -> None:
 async def _phase_execute(
     ctx: AgentRunContext, req: AgentRequest, router, lang_hint: str,
-    current_user: str, max_steps: int,
+    current_user: str, max_steps: int, model_id: str | None = None,
 ) -> None:
     """EXECUTE: Executor role calls tools one at a time until final or budget exhausted."""
     exec_count = sum(1 for s in ctx.transcript if s.get("state") == AgentState.EXECUTING.value)
@@ -4398,7 +4424,8 @@ async def _phase_execute(
             f"User request: {req.message}{corrections_hint}\n\n"
             f"Execution transcript:\n{json.dumps(ctx.transcript, ensure_ascii=False, indent=2)}"
         )
-        raw = await router.generate(
+        raw = await router.generate_as(
+            model_id,
             message="Execute the next step.",
             context=context, max_tokens=4096, temperature=req.temperature,
         )
@@ -4492,7 +4519,7 @@ async def _phase_execute(
 async def _phase_verify(
     ctx: AgentRunContext, req: AgentRequest, router, lang_hint: str, current_user: str,
-    max_retry: int = 3,
+    max_retry: int = 3, model_id: str | None = None,
 ) -> None:
     """VERIFYING: Critic role evaluates transcript → DONE / EXECUTING (retry) / ROLLBACK / FAILED."""
     context = (
@@ -4502,7 +4529,8 @@ async def _phase_verify(
         f"Plan goal: {ctx.plan.get('goal', req.message)}\n\n"
         f"Full transcript:\n{json.dumps(ctx.transcript, ensure_ascii=False, indent=2)}"
     )
-    raw = await router.generate(
+    raw = await router.generate_as(
+        model_id,
         message="Review the execution transcript and return your verdict JSON.",
         context=context, max_tokens=512, temperature=0.1,
     )
@@ -4685,29 +4713,54 @@ async def agent(req: AgentRequest, request: Request):
     max_retry = 3
     ctx = AgentRunContext()
+    ctx.executing_model = req.executing_model
+    ctx.reviewing_model = req.reviewing_model
+    # PLANNING phase
+    ctx.state = AgentState.PLANNING
+    ctx.state_history.append(ctx.state.value)
+    await _phase_plan(ctx, req, router, lang_hint, current_user, model_id=req.planning_model)
+    # Human-in-the-loop: pause after planning, return plan to UI
+    if req.human_in_loop:
+        context_id = secrets.token_urlsafe(16)
+        with _pending_agents_lock:
+            _pending_agents[context_id] = (ctx, req, lang_hint, current_user)
+        return {
+            "status": "waiting_approval",
+            "context_id": context_id,
+            "plan": ctx.plan,
+            "steps": ctx.transcript,
+            "state_history": ctx.state_history,
+            "planning_model": req.planning_model or router.current_model_id,
+            "executing_model": req.executing_model or router.current_model_id,
+            "reviewing_model": req.reviewing_model or router.current_model_id,
+        }
+    # Auto-approve and run to completion (default behaviour)
+    _phase_approval(ctx, current_user)
+    return await _agent_run_to_completion(ctx, req, router, lang_hint, current_user, max_steps, max_retry)
+async def _agent_run_to_completion(
+    ctx: AgentRunContext, req: AgentRequest, router, lang_hint: str,
+    current_user: str, max_steps: int, max_retry: int,
+) -> dict:
+    """Run EXECUTING → VERIFYING loop until terminal state."""
     while ctx.state not in AGENT_TERMINAL_STATES:
         ctx.state_history.append(ctx.state.value)
-        # Hard guard against infinite state loops
         if len(ctx.state_history) > 200:
             ctx.final_message = "에이전트 상태 머신이 최대 반복(200)에 도달해 중단했습니다."
             ctx.state = AgentState.FAILED
             break
-        if ctx.state == AgentState.IDLE:
-            ctx.state = AgentState.PLANNING
-        elif ctx.state == AgentState.PLANNING:
-            await _phase_plan(ctx, req, router, lang_hint, current_user)
-        elif ctx.state == AgentState.WAITING_APPROVAL:
-            _phase_approval(ctx, current_user)
-        elif ctx.state == AgentState.EXECUTING:
-            await _phase_execute(ctx, req, router, lang_hint, current_user, max_steps)
+        if ctx.state == AgentState.EXECUTING:
+            await _phase_execute(ctx, req, router, lang_hint, current_user, max_steps,
+                                 model_id=ctx.executing_model)
         elif ctx.state == AgentState.VERIFYING:
-            await _phase_verify(ctx, req, router, lang_hint, current_user, max_retry)
+            await _phase_verify(ctx, req, router, lang_hint, current_user, max_retry,
+                                model_id=ctx.reviewing_model)
         elif ctx.state == AgentState.ROLLBACK:
             _phase_rollback(ctx, current_user)
@@ -4715,10 +4768,7 @@ async def agent(req: AgentRequest, request: Request):
         else:
             ctx.state = AgentState.FAILED
-    # Record terminal state in history for clients
     ctx.state_history.append(ctx.state.value)
-    # Fire-and-forget memory update — does not block the response
     asyncio.create_task(_phase_memory_update(ctx, req, router, current_user))
     message = ctx.final_message or "작업을 완료했습니다."
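
The refactored `_agent_run_to_completion` loop above alternates between executing and verifying until a terminal state is reached, with a hard cap on the state history. A simplified sketch of that control flow is shown below. It is an approximation under stated assumptions: the `State` enum, the `verdicts` iterator, and the omission of ROLLBACK and the retry budget are all simplifications that do not appear in the source.

```python
from enum import Enum


class State(Enum):
    EXECUTING = "executing"
    VERIFYING = "verifying"
    DONE = "done"
    FAILED = "failed"


TERMINAL = frozenset({State.DONE, State.FAILED})


def run_to_completion(verdicts, max_history: int = 200):
    """Drive EXECUTING -> VERIFYING until a terminal state or the history cap.

    `verdicts` is a hypothetical iterable of reviewer outcomes: True accepts
    the work, False sends the loop back to EXECUTING for another attempt.
    """
    state, history = State.EXECUTING, []
    verdicts = iter(verdicts)
    while state not in TERMINAL:
        history.append(state.value)
        if len(history) > max_history:   # guard against an endless loop
            state = State.FAILED
            break
        if state is State.EXECUTING:
            state = State.VERIFYING
        elif state is State.VERIFYING:
            state = State.DONE if next(verdicts, False) else State.EXECUTING
    history.append(state.value)          # record the terminal state as well
    return state, history


ok_state, ok_history = run_to_completion([False, True])
capped_state, capped_history = run_to_completion([], max_history=6)

if __name__ == "__main__":
    print(ok_state, ok_history)
    print(capped_state, len(capped_history))
```

The second call never receives an accepting verdict, so it only stops because of the history cap, which is the role the 200-entry limit plays in the diff.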
@@ -4736,6 +4786,36 @@ async def agent(req: AgentRequest, request: Request):
     }
+@app.post("/agent/resume")
+async def agent_resume(req: AgentResumeRequest, request: Request):
+    """Resume a paused agent after human approval of the plan."""
+    current_user = require_user(request)
+    with _pending_agents_lock:
+        entry = _pending_agents.pop(req.context_id, None)
+    if not entry:
+        raise HTTPException(status_code=404, detail="Agent context not found or expired. Start a new request.")
+    ctx, orig_req, lang_hint, _orig_user = entry
+    if not req.approved:
+        return {"status": "cancelled", "response": "사용자가 계획을 취소했습니다."}
+    if req.modified_plan:
+        ctx.plan = req.modified_plan
+        ctx.transcript[-1].update(ctx.plan)  # keep transcript in sync
+    # Apply model overrides from resume request (takes priority over original request)
+    ctx.executing_model = req.executing_model or ctx.executing_model
+    ctx.reviewing_model = req.reviewing_model or ctx.reviewing_model
+    _phase_approval(ctx, current_user)
+    max_steps = max(1, min(orig_req.max_steps, 50))
+    max_retry = 3
+    return await _agent_run_to_completion(ctx, orig_req, router, lang_hint, current_user, max_steps, max_retry)
 # ── Direct Tool API ───────────────────────────────────────────────────────────
 def _tool_response(fn, *args):

package/static/chat.html CHANGED

@@ -1048,6 +1048,47 @@
             background: rgba(34,211,160,0.07);
         }
+        /* ── 멀티 LLM 파이프라인 ── */
+        .pipeline-phase-row {
+            display: flex; flex-direction: column; gap: 8px;
+            background: rgba(255,255,255,0.02);
+            border: 1px solid var(--border);
+            border-radius: 10px; padding: 12px;
+        }
+        .pipeline-phase-label { display: flex; align-items: center; gap: 8px; flex-wrap: wrap; }
+        .pipeline-phase-badge {
+            font-size: 12px; font-weight: 600; padding: 3px 10px;
+            border-radius: 20px; white-space: nowrap;
+        }
+        .pipeline-select {
+            width: 100%; background: var(--surface); border: 1px solid var(--border);
+            color: var(--text); padding: 8px 10px; border-radius: 8px;
+            font-size: 13px; cursor: pointer; outline: none;
+        }
+        .pipeline-select:focus { border-color: var(--accent); }
+        .pipeline-active-badge {
+            display: inline-flex; align-items: center; gap: 4px;
+            font-size: 11px; background: rgba(34,211,160,0.1);
+            border: 1px solid rgba(34,211,160,0.25); color: var(--accent);
+            padding: 2px 8px; border-radius: 20px; margin-left: 6px; cursor: pointer;
+        }
+        .plan-approval-card {
+            background: rgba(99,102,241,0.07); border: 1px solid rgba(99,102,241,0.2);
+            border-radius: 12px; padding: 16px; margin-top: 4px;
+        }
+        .plan-approval-card h4 { margin: 0 0 10px; color: #818cf8; font-size: 14px; }
+        .plan-approval-card ol { margin: 0 0 12px; padding-left: 20px; color: var(--text); font-size: 13px; line-height: 1.8; }
+        .plan-approval-card .plan-meta { font-size: 11px; color: var(--muted); margin-bottom: 12px; }
+        .plan-approval-actions { display: flex; gap: 8px; }
+        .plan-approve-btn {
+            flex: 1; background: var(--accent); border: none; color: #05120d;
+            padding: 9px; border-radius: 8px; font-size: 13px; font-weight: 600; cursor: pointer;
+        }
+        .plan-cancel-btn {
+            background: rgba(248,113,113,0.1); border: 1px solid rgba(248,113,113,0.25);
+            color: var(--danger); padding: 9px 16px; border-radius: 8px; font-size: 13px; cursor: pointer;
+        }
         /* ── 첨부 파일 칩 ── */
         .attach-chip {
             display: inline-flex;
@@ -3134,6 +3175,14 @@
                     </div>
                     <div class="ops-icon"><i class="ti ti-cpu-2"></i></div>
                 </div>
+                <div class="ops-card interactive" id="pipeline-ops-card" onclick="openPipelineModal()">
+                    <div>
+                        <div class="ops-label">PIPELINE</div>
+                        <div id="ops-pipeline-value" class="ops-value">멀티 LLM 파이프라인</div>
+                        <div id="ops-pipeline-meta" class="ops-meta">Plan → Execute → Review 모델 설정</div>
+                    </div>
+                    <div class="ops-icon"><i class="ti ti-git-branch"></i></div>
+                </div>
                 <div class="ops-card interactive" onclick="openVpcPanel()">
                     <div>
                         <div class="ops-label">PRIVATE VPC</div>
@@ -3226,6 +3275,65 @@
         </div>
     </div>
+    <!-- ── 멀티 LLM 파이프라인 모달 ── -->
+    <div id="pipeline-overlay" class="admin-overlay" style="display:none" onclick="if(event.target===this)closePipelineModal()">
+        <section class="admin-panel" style="max-width:520px">
+            <div class="model-panel-header">
+                <div>
+                    <h2><i class="ti ti-git-branch" style="color:var(--accent)"></i> 멀티 LLM 파이프라인</h2>
+                    <p style="color:var(--muted);font-size:12px;margin-top:4px">각 단계에 사용할 LLM을 지정합니다. 동일 모델을 여러 단계에 써도 됩니다.</p>
+                </div>
+                <button class="admin-close" onclick="closePipelineModal()"><i class="ti ti-x"></i></button>
+            </div>
+            <div style="display:flex;flex-direction:column;gap:16px;padding:0 4px 4px">
+                <!-- Planning -->
+                <div class="pipeline-phase-row">
+                    <div class="pipeline-phase-label">
+                        <span class="pipeline-phase-badge" style="background:rgba(99,102,241,0.15);color:#818cf8">📋 Planning</span>
+                        <span style="color:var(--muted);font-size:11px">계획 수립 · 유저와 함께 검토</span>
+                    </div>
+                    <select id="pipeline-planning-select" class="pipeline-select">
+                        <option value="">현재 로드된 모델 (기본)</option>
+                    </select>
+                </div>
+                <!-- Executing -->
+                <div class="pipeline-phase-row">
+                    <div class="pipeline-phase-label">
+                        <span class="pipeline-phase-badge" style="background:rgba(34,197,94,0.12);color:#4ade80">⚙️ Executing</span>
+                        <span style="color:var(--muted);font-size:11px">코드 작성 · 파일 생성 · 툴 호출</span>
+                    </div>
+                    <select id="pipeline-executing-select" class="pipeline-select">
+                        <option value="">현재 로드된 모델 (기본)</option>
+                    </select>
+                </div>
+                <!-- Reviewing -->
+                <div class="pipeline-phase-row">
+                    <div class="pipeline-phase-label">
+                        <span class="pipeline-phase-badge" style="background:rgba(251,146,60,0.12);color:#fb923c">🔍 Reviewing</span>
+                        <span style="color:var(--muted);font-size:11px">결과 검증 · 최종 답변 생성</span>
+                    </div>
+                    <select id="pipeline-reviewing-select" class="pipeline-select">
+                        <option value="">현재 로드된 모델 (기본)</option>
+                    </select>
+                </div>
+                <div style="background:rgba(99,102,241,0.07);border:1px solid rgba(99,102,241,0.15);border-radius:10px;padding:12px;font-size:12px;color:var(--muted)">
+                    <i class="ti ti-info-circle" style="color:var(--accent)"></i>
+                    모달을 닫고 채팅창에 작업을 입력하면 <b style="color:var(--text)">Pipeline 모드</b>로 실행됩니다.<br>
+                    계획이 완성되면 UI에서 검토 후 <b style="color:var(--accent)">✅ Done</b>을 눌러야 실행이 시작됩니다.
+                </div>
+                <div style="display:flex;gap:8px;justify-content:flex-end">
+                    <button onclick="resetPipeline()" style="background:none;border:1px solid var(--border);color:var(--muted);padding:8px 16px;border-radius:8px;cursor:pointer;font-size:13px">초기화</button>
+                    <button onclick="savePipelineAndClose()" style="background:var(--accent);border:none;color:#05120d;padding:8px 20px;border-radius:8px;cursor:pointer;font-size:13px;font-weight:600"><i class="ti ti-check"></i> 저장</button>
+                </div>
+            </div>
+        </section>
+    </div>
     <!-- ── 파일 생성 패널 ── -->
     <div id="file-create-overlay" class="admin-overlay" style="display:none">
         <section class="admin-panel" style="max-width:480px">
@@ -3542,6 +3650,148 @@
         let currentUserNickname = "Guest";
         let currentUserEmail = "";
         let isAdmin = false;
+        // ── 멀티 LLM 파이프라인 상태 ──
+        let pipelineConfig = { planning: null, executing: null, reviewing: null };
+        let pipelineActive = false; // true이면 전송 시 pipeline 모드
+        function openPipelineModal() {
+            document.getElementById('pipeline-overlay').style.display = 'flex';
+            loadPipelineModelOptions();
+        }
+        function closePipelineModal() {
+            document.getElementById('pipeline-overlay').style.display = 'none';
+        }
+        function resetPipeline() {
+            pipelineConfig = { planning: null, executing: null, reviewing: null };
+            pipelineActive = false;
+            ['planning','executing','reviewing'].forEach(p =>
+                document.getElementById(`pipeline-${p}-select`).value = '');
+            updatePipelineBadge();
+        }
+        function savePipelineAndClose() {
+            pipelineConfig = {
+                planning:  document.getElementById('pipeline-planning-select').value  || null,
+                executing: document.getElementById('pipeline-executing-select').value || null,
+                reviewing: document.getElementById('pipeline-reviewing-select').value || null,
+            };
+            pipelineActive = true;
+            updatePipelineBadge();
+            closePipelineModal();
+        }
+        function updatePipelineBadge() {
+            const card = document.getElementById('pipeline-ops-card');
+            const val  = document.getElementById('ops-pipeline-value');
+            const meta = document.getElementById('ops-pipeline-meta');
+            if (!card) return;
+            if (pipelineActive) {
+                card.style.borderColor = 'rgba(34,211,160,0.35)';
+                card.style.boxShadow = '0 0 0 1px rgba(34,211,160,0.18)';
+                const pLabel = pipelineConfig.planning  ? pipelineConfig.planning.split('/').pop()  : '—';
+                const eLabel = pipelineConfig.executing ? pipelineConfig.executing.split('/').pop() : '—';
+                const rLabel = pipelineConfig.reviewing ? pipelineConfig.reviewing.split('/').pop() : '—';
+                if (val)  val.textContent  = 'Pipeline ON';
+                if (meta) meta.textContent = `P:${pLabel} E:${eLabel} R:${rLabel}`;
+            } else {
+                card.style.borderColor = '';
+                card.style.boxShadow   = '';
+                if (val)  val.textContent  = '멀티 LLM 파이프라인';
+                if (meta) meta.textContent = 'Plan → Execute → Review 모델 설정';
+            }
+        }
+        async function loadPipelineModelOptions() {
+            let models = [];
+            try {
+                const res = await apiFetch('/models');
+                const data = await res.json();
+                const loaded = data.loaded || [];
+                const cloud  = data.providers || [];
+                loaded.forEach(m => models.push({ id: m, label: `[로컬] ${m.split('/').pop()}` }));
+                cloud.forEach(m => {
+                    if (m.available !== false)
+                        models.push({ id: m.id || m.model_id, label: `[클라우드] ${m.name || m.id}` });
+                });
+            } catch(e) { /* silent */ }
+            ['planning','executing','reviewing'].forEach(phase => {
+                const sel = document.getElementById(`pipeline-${phase}-select`);
+                const cur = pipelineConfig[phase] || '';
+                sel.innerHTML = '<option value="">현재 로드된 모델 (기본)</option>';
+                models.forEach(m => {
+                    const opt = document.createElement('option');
+                    opt.value = m.id; opt.textContent = m.label;
+                    if (m.id === cur) opt.selected = true;
+                    sel.appendChild(opt);
+                });
+            });
+        }
+        // ── 플랜 승인 카드 렌더링 ──
+        async function renderPlanApprovalCard(bubble, data) {
+            const plan  = data.plan || {};
+            const steps = plan.steps || [];
+            const pM = data.planning_model  || '현재 모델';
+            const eM = data.executing_model || '현재 모델';
+            const rM = data.reviewing_model || '현재 모델';
+            const contextId = data.context_id;
+            let stepsHtml = steps.map((s,i) =>
+                `<li>${escapeHtml(s.description || s.action || JSON.stringify(s))}</li>`).join('');
+            bubble.innerHTML = `
+                <div class="plan-approval-card">
+                    <h4>📋 플래닝 완료 — 실행 전 확인해주세요</h4>
+                    ${plan.goal ? `<p style="color:var(--text);font-size:13px;margin:0 0 10px"><b>목표:</b> ${escapeHtml(plan.goal)}</p>` : ''}
+                    <ol>${stepsHtml || '<li>(단계 없음)</li>'}</ol>
+                    <div class="plan-meta">
+                        🧠 Planning: <b>${escapeHtml(compactModelName(pM))}</b> &nbsp;·&nbsp;
+                        ⚙️ Executing: <b>${escapeHtml(compactModelName(eM))}</b> &nbsp;·&nbsp;
+                        🔍 Reviewing: <b>${escapeHtml(compactModelName(rM))}</b>
+                    </div>
+                    <div class="plan-approval-actions">
+                        <button class="plan-approve-btn" onclick="resumeAgent('${contextId}', this)">
+                            <i class="ti ti-player-play"></i> ✅ Done — 실행 시작
+                        </button>
+                        <button class="plan-cancel-btn" onclick="cancelAgent('${contextId}', this)">❌ 취소</button>
+                    </div>
+                </div>`;
+        }
+        async function resumeAgent(contextId, btn) {
+            btn.disabled = true;
+            btn.textContent = '⚙️ 실행 중...';
+            const card = btn.closest('.plan-approval-card');
+            try {
+                const res = await apiFetch('/agent/resume', {
+                    method: 'POST',
+                    headers: { 'Content-Type': 'application/json' },
+                    body: JSON.stringify({
+                        context_id: contextId,
+                        approved: true,
+                        executing_model: pipelineConfig.executing || null,
+                        reviewing_model: pipelineConfig.reviewing || null,
+                    })
+                });
+                const data = await res.json();
+                if (!res.ok) throw new Error(data.detail || `서버 오류 (${res.status})`);
+                const bubble = btn.closest('.bubble');
+                renderAiBubble(bubble, data.response || '완료되었습니다.');
+                const files = data.created_files || [];
+                files.forEach(f => renderFileDownloadCard(f.filename, f.path, f.bytes || 0));
+            } catch(e) {
+                card.innerHTML += `<p style="color:var(--danger);font-size:12px;margin-top:8px">❌ ${escapeHtml(e.message)}</p>`;
+                btn.disabled = false;
+                btn.textContent = '다시 시도';
+            }
+        }
+        async function cancelAgent(contextId, btn) {
+            await apiFetch('/agent/resume', {
+                method: 'POST',
+                headers: { 'Content-Type': 'application/json' },
+                body: JSON.stringify({ context_id: contextId, approved: false })
+            }).catch(()=>{});
+            btn.closest('.bubble').innerHTML = '<span style="color:var(--muted)">작업이 취소되었습니다.</span>';
+        }
         let latestVpcConfig = null;
         const mirroredHistoryKeys = new Set();
         const API_BASE = window.location.protocol === 'file:' ? 'http://localhost:4825' : '';
@@ -5577,7 +5827,10 @@
             sendBtn.disabled = true;
             const aiMsgDiv = document.createElement('div');
             aiMsgDiv.className = 'message ai';
-            aiMsgDiv.innerHTML = `<div class="sender-label">Lattice AI <span style="color:var(--accent);font-size:11px">⚙ 에이전트 모드</span></div><div class="bubble">파일을 생성하고 있습니다...</div>`;
+            const modeLabel = pipelineActive
+                ? '⚙ 파이프라인 모드'
+                : '⚙ 에이전트 모드';
+            aiMsgDiv.innerHTML = `<div class="sender-label">Lattice AI <span style="color:var(--accent);font-size:11px">${modeLabel}</span></div><div class="bubble">${pipelineActive ? '📋 계획 수립 중입니다...' : '파일을 생성하고 있습니다...'}</div>`;
             chatViewport.appendChild(aiMsgDiv);
             chatViewport.scrollTop = chatViewport.scrollHeight;
             const bubble = aiMsgDiv.querySelector('.bubble');
@@ -5593,7 +5846,11 @@
                         max_steps: 4,
                         temperature: 0.1,
                         user_email: currentUserEmail,
-                        user_nickname: currentUserNickname
+                        user_nickname: currentUserNickname,
+                        human_in_loop: pipelineActive,
+                        planning_model:  pipelineActive ? (pipelineConfig.planning  || null) : null,
+                        executing_model: pipelineActive ? (pipelineConfig.executing || null) : null,
+                        reviewing_model: pipelineActive ? (pipelineConfig.reviewing || null) : null,
                     })
                 });
@@ -5602,6 +5859,13 @@
                 if (!res.ok) throw new Error(data.detail || `서버 오류 (${res.status})`);
+                // Pipeline mode: show plan for approval
+                if (data.status === 'waiting_approval') {
+                    await renderPlanApprovalCard(bubble, data);
+                    loadHistory();
+                    return;
+                }
                 renderAiBubble(bubble, data.response || '완료되었습니다.');
                 const files = data.created_files || [];

package/telegram_bot.py CHANGED

@@ -6,6 +6,7 @@ import os
 import socket
 import subprocess
 import tempfile
+import time
 import zipfile
 import json
 from pathlib import Path
@@ -38,7 +39,11 @@ MODELS_URL     = f"{BASE_URL}/models"
 GRAPH_STATS_URL = f"{BASE_URL}/knowledge-graph/stats"
 UPLOAD_DOC_URL  = f"{BASE_URL}/upload/document"
+AGENT_RESUME_URL      = f"{BASE_URL}/agent/resume"
 AGENT_WORKSPACE       = Path(env_value("LATTICEAI_AGENT_ROOT", "agent_workspace")).resolve()
+# Pending plan approvals: context_id → (chat_id, executing_model, reviewing_model)
+_bot_pending_plans: dict[str, dict] = {}
 MAX_TELEGRAM_FILE_BYTES = 45 * 1024 * 1024
 SERVER_PORT           = int(env_value("LATTICEAI_SERVER_PORT", "4825"))
 INVITE_CODE           = env_value("LATTICEAI_INVITE_CODE", "gemma-lattice-ai")
@@ -50,6 +55,41 @@ CHAT_IDS_FILE = Path(env_value("LATTICEAI_TELEGRAM_CHATS_FILE", str(DATA_DIR / "
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
+# ── Server session auth ───────────────────────────────────────────────────────
+def _get_server_session() -> str:
+    """Read the most recent valid admin session from sessions.json (web login)."""
+    sessions_file = DATA_DIR / "sessions.json"
+    users_file = DATA_DIR / "users.json"
+    try:
+        if not sessions_file.exists():
+            return ""
+        sessions = json.loads(sessions_file.read_text())
+        admin_emails: set[str] = set()
+        if users_file.exists():
+            users = json.loads(users_file.read_text())
+            admin_emails = {e for e, u in users.items() if u.get("role") == "admin" and not u.get("disabled")}
+        now = time.time()
+        # Pick the newest non-expired admin session
+        best_token, best_ts = "", 0.0
+        for token, entry in sessions.items():
+            email, created_at = entry[0], float(entry[1])
+            if admin_emails and email not in admin_emails:
+                continue
+            if now - created_at > 7 * 86400:
+                continue
+            if created_at > best_ts:
+                best_token, best_ts = token, created_at
+        return best_token
+    except Exception:
+        return ""
+def _server_client(**kwargs) -> httpx.AsyncClient:
+    """httpx client pre-loaded with the web session cookie."""
+    token = _get_server_session()
+    cookies = {"session_token": token} if token else {}
+    return httpx.AsyncClient(cookies=cookies, **kwargs)
 # ── Chat ID registry ─────────────────────────────────────────────────────────
 def load_chat_ids():
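
Reviewer sketch (not part of the package): the session-selection rule in `_get_server_session` above can be isolated as a pure function, which makes the 7-day expiry and the admin filter easier to check. The function name `pick_newest_admin_session` and the `now` parameter are hypothetical. The `token -> [email, created_at]` layout is only inferred from the `entry[0]` / `entry[1]` indexing in the hunk.

```python
import time

SESSION_TTL_SECONDS = 7 * 86400  # same 7-day window used in the hunk above


def pick_newest_admin_session(sessions, users, now=None):
    """Return the newest non-expired admin session token, or "" if none.

    Assumed shapes (inferred from the diff, not confirmed):
      sessions: {token: [email, created_at, ...]}
      users:    {email: {"role": ..., "disabled": ...}}
    """
    now = time.time() if now is None else now
    admin_emails = {
        email
        for email, user in users.items()
        if user.get("role") == "admin" and not user.get("disabled")
    }
    best_token, best_ts = "", 0.0
    for token, entry in sessions.items():
        email, created_at = entry[0], float(entry[1])
        # As in the hunk, an empty admin set switches the email filter off.
        if admin_emails and email not in admin_emails:
            continue
        if now - created_at > SESSION_TTL_SECONDS:
            continue
        if created_at > best_ts:
            best_token, best_ts = token, created_at
    return best_token
```

One behavior worth noting when reading the hunk: if `users.json` is missing or lists no enabled admin, the filter is skipped and the newest session of any user is returned.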
@@ -281,7 +321,7 @@ async def _mac_ram_used_gb() -> str:
 async def show_status(client, chat_id):
     await send_chat_action(client, chat_id, "typing")
     try:
-        async with httpx.AsyncClient() as lc:
+        async with _server_client() as lc:
             res = await lc.get(STATUS_URL, timeout=5.0)
             data = res.json() if res.status_code == 200 else {}
     except Exception:
@@ -306,7 +346,7 @@ async def show_status(client, chat_id):
 async def show_model_info(client, chat_id):
     await send_chat_action(client, chat_id, "typing")
     try:
-        async with httpx.AsyncClient() as lc:
+        async with _server_client() as lc:
             res = await lc.get(MODELS_URL, timeout=5.0)
             data = res.json() if res.status_code == 200 else {}
     except Exception:
@@ -330,7 +370,7 @@ async def show_model_info(client, chat_id):
 async def do_unload_model(client, chat_id, model_id: str = ""):
     await send_chat_action(client, chat_id, "typing")
     try:
-        async with httpx.AsyncClient() as lc:
+        async with _server_client() as lc:
             if model_id:
                 res = await lc.delete(f"{BASE_URL}/models/unload/{model_id}", timeout=15.0)
             else:
@@ -353,7 +393,7 @@ async def do_unload_model(client, chat_id, model_id: str = ""):
 async def show_graph_stats(client, chat_id):
     await send_chat_action(client, chat_id, "typing")
     try:
-        async with httpx.AsyncClient() as lc:
+        async with _server_client() as lc:
             res = await lc.get(GRAPH_STATS_URL, timeout=5.0)
             data = res.json() if res.status_code == 200 else {}
     except Exception:
@@ -414,7 +454,7 @@ async def take_screenshot(client, chat_id):
 async def show_history_summary(client, chat_id, n: int = 5):
     await send_chat_action(client, chat_id, "typing")
     try:
-        async with httpx.AsyncClient() as lc:
+        async with _server_client() as lc:
             res = await lc.get(HISTORY_URL, timeout=10.0)
             items = res.json() if res.status_code == 200 else []
     except Exception:
@@ -435,7 +475,7 @@ async def show_history_summary(client, chat_id, n: int = 5):
 async def clear_server_history(client, chat_id, keep_last=0):
     try:
-        async with httpx.AsyncClient() as lc:
+        async with _server_client() as lc:
             res = await lc.delete(HISTORY_URL, params={"keep_last": keep_last}, timeout=10.0)
             data = res.json() if res.headers.get("content-type", "").startswith("application/json") else {}
         if res.status_code == 200:
@@ -466,7 +506,7 @@ async def send_web_link(client, chat_id):
         },
     }
     try:
-        async with httpx.AsyncClient() as lc:
+        async with _server_client() as lc:
             await lc.post(f"{API_URL}/sendMessage", json=payload)
     except Exception as e:
         logger.error("웹 링크 전송 실패: %s", e)
@@ -475,7 +515,7 @@ async def send_web_link(client, chat_id):
 async def send_mcp_tools(client, chat_id):
     try:
-        async with httpx.AsyncClient() as lc:
+        async with _server_client() as lc:
             res = await lc.get(MCP_TOOLS_URL, timeout=10.0)
             if res.status_code != 200:
                 await send_message(client, chat_id, f"MCP 도구 목록을 가져오지 못했습니다: {res.status_code}")
@@ -506,7 +546,7 @@ async def process_document_file(client, chat_id, file_id: str, filename: str, ca
     tmp = Path(tempfile.mktemp(suffix=suffix))
     try:
         tmp.write_bytes(raw)
-        async with httpx.AsyncClient() as lc:
+        async with _server_client() as lc:
             with open(tmp, "rb") as f:
                 res = await lc.post(
                     UPLOAD_DOC_URL,
@@ -536,15 +576,37 @@ async def process_document_file(client, chat_id, file_id: str, filename: str, ca
 # ── AI chat ───────────────────────────────────────────────────────────────────
-async def ask_ai(client, message, image_data=None, agent_mode=True):
+async def ask_ai(client, message, image_data=None, agent_mode=False,
+                 planning_model=None, executing_model=None, reviewing_model=None):
     try:
-        url = CHAT_URL if image_data or not agent_mode else AGENT_URL
-        payload = {"message": message, "source": "telegram"}
-        if image_data:
-            payload["stream"] = False
-            payload["image_data"] = image_data
-        res = await client.post(url, json=payload, timeout=300.0)
+        if agent_mode and not image_data:
+            url = AGENT_URL
+            payload = {
+                "message": message, "source": "telegram",
+                "human_in_loop": True,
+                "planning_model": planning_model,
+                "executing_model": executing_model,
+                "reviewing_model": reviewing_model,
+            }
+        else:
+            url = CHAT_URL
+            payload = {"message": message, "source": "telegram", "stream": False}
+            if image_data:
+                payload["image_data"] = image_data
+        async with _server_client() as sc:
+            res = await sc.post(url, json=payload, timeout=300.0)
         if res.status_code == 200:
+            ct = res.headers.get("content-type", "")
+            if "text/event-stream" in ct:
+                text = ""
+                for line in res.text.splitlines():
+                    if line.startswith("data:"):
+                        try:
+                            chunk = json.loads(line[5:].strip()).get("chunk", "")
+                            text += chunk
+                        except Exception:
+                            pass
+                return {"response": text.strip() or "⚠️ 빈 응답"}
             return res.json()
         try:
             detail = res.json().get("detail", "")
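
Reviewer sketch (not part of the package): the `text/event-stream` branch added to `ask_ai` above joins the `chunk` field of each `data:` line. A standalone version of that loop might look like the following. The name `collect_sse_chunks` is hypothetical, and the `[DONE]` terminator mentioned in the comment is an assumption about what a non-JSON line could look like, not something the diff shows.

```python
import json


def collect_sse_chunks(body: str) -> str:
    """Concatenate the "chunk" field of every `data:` line in an SSE body."""
    parts = []
    for line in body.splitlines():
        if not line.startswith("data:"):
            continue  # ignore blank separators, `event:` lines, comments
        try:
            payload = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # skip non-JSON payloads (for example a "[DONE]" marker)
        if isinstance(payload, dict):
            parts.append(str(payload.get("chunk", "")))
    return "".join(parts).strip()
```

Unlike the hunk, which swallows every `Exception`, this variant only skips JSON decode errors and non-object payloads, so unrelated failures would still surface.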
@@ -625,12 +687,88 @@ async def send_generated_files(client, chat_id, generated_files):
     finally:
         zip_path.unlink(missing_ok=True)
+# ── Plan approval (Human-in-the-loop) ────────────────────────────────────────
+async def send_plan_for_approval(client, chat_id, data: dict) -> None:
+    """Show the agent plan to the user and present Done/Cancel buttons."""
+    context_id = data.get("context_id", "")
+    plan = data.get("plan", {})
+    goal = plan.get("goal", "")
+    steps = plan.get("steps", [])
+    p_model = data.get("planning_model", "current")
+    e_model = data.get("executing_model", "current")
+    r_model = data.get("reviewing_model", "current")
+    lines = [f"📋 *플래닝 완료* — 실행 전 확인해주세요\n"]
+    if goal:
+        lines.append(f"*목표:* {goal}\n")
+    for i, step in enumerate(steps, 1):
+        desc = step.get("description") or step.get("action") or str(step)
+        lines.append(f"{i}. {desc}")
+    lines.append(f"\n🧠 플래닝: `{p_model}`")
+    lines.append(f"⚙️ 실행: `{e_model}`")
+    lines.append(f"🔍 검토: `{r_model}`")
+    _bot_pending_plans[context_id] = {
+        "chat_id": chat_id,
+        "executing_model": data.get("executing_model"),
+        "reviewing_model": data.get("reviewing_model"),
+    }
+    keyboard = {"inline_keyboard": [[
+        {"text": "✅ Done — 실행 시작", "callback_data": f"plan:approve:{context_id}"},
+        {"text": "❌ 취소", "callback_data": f"plan:cancel:{context_id}"},
+    ]]}
+    await send_message(client, chat_id, "\n".join(lines), reply_markup=keyboard)
+async def handle_plan_callback(client, chat_id, data: str) -> None:
+    """Handle Done/Cancel callback from plan approval buttons."""
+    parts = data.split(":", 2)
+    if len(parts) != 3:
+        return
+    _, action, context_id = parts
+    pending = _bot_pending_plans.pop(context_id, None)
+    if action == "cancel" or not pending:
+        await send_message(client, chat_id, "❌ 작업이 취소되었습니다.")
+        return
+    await send_message(client, chat_id, "⚙️ 실행 중입니다. 잠시 기다려주세요...")
+    await send_chat_action(client, chat_id, "typing")
+    try:
+        async with _server_client() as sc:
+            res = await sc.post(AGENT_RESUME_URL, json={
+                "context_id": context_id,
+                "approved": True,
+                "executing_model": pending.get("executing_model"),
+                "reviewing_model": pending.get("reviewing_model"),
+            }, timeout=300.0)
+        data_r = res.json() if res.status_code == 200 else {}
+        ans = data_r.get("response", f"❌ 서버 에러 ({res.status_code})")
+        await send_message(client, chat_id, str(ans))
+        if isinstance(data_r, dict):
+            await send_generated_files(client, chat_id, collect_generated_files(data_r))
+            await send_preview_links(client, chat_id, collect_preview_urls(data_r))
+    except Exception as e:
+        await send_message(client, chat_id, f"❌ 실행 중 오류: {e}")
 # ── AI request task ───────────────────────────────────────────────────────────
 async def process_ai_request(client, chat_id, user_text, image_data=None):
     try:
         await send_chat_action(client, chat_id, "upload_photo" if image_data else "typing")
+        logger.info("ask_ai 호출 시작: chat_id=%s text=%r", chat_id, user_text[:30])
         data  = await ask_ai(client, user_text, image_data, agent_mode=not image_data)
+        logger.info("ask_ai 완료: chat_id=%s result_keys=%s", chat_id, list(data.keys()) if isinstance(data, dict) else type(data))
+        # Human-in-the-loop: show plan and wait for approval
+        if isinstance(data, dict) and data.get("status") == "waiting_approval":
+            await send_plan_for_approval(client, chat_id, data)
+            return
         ans   = data.get("response", str(data)) if isinstance(data, dict) else str(data)
         if not ans or not str(ans).strip():
             ans = "⚠️ AI가 답변을 생성하지 못했습니다."
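
Reviewer sketch (not part of the package): the approval buttons in the hunk above encode their payload as `plan:<action>:<context_id>`, and `handle_plan_callback` decodes it with `split(":", 2)`. The round trip can be sketched as below. Both function names are hypothetical. The 64-byte check reflects the Telegram Bot API limit on `callback_data`; whether real `context_id` values ever approach that limit is not visible in the diff.

```python
MAX_CALLBACK_BYTES = 64  # Telegram Bot API limit for callback_data


def build_plan_callback(action: str, context_id: str) -> str:
    """Encode an approval button payload as 'plan:<action>:<context_id>'."""
    data = f"plan:{action}:{context_id}"
    if len(data.encode("utf-8")) > MAX_CALLBACK_BYTES:
        raise ValueError("callback_data exceeds 64 bytes")
    return data


def parse_plan_callback(data: str):
    """Decode a payload into (action, context_id), or None if malformed."""
    # maxsplit=2 keeps any ':' inside the context id intact, as in the hunk.
    parts = data.split(":", 2)
    if len(parts) != 3 or parts[0] != "plan":
        return None
    _, action, context_id = parts
    if action not in {"approve", "cancel"}:
        return None  # stricter than the hunk, which treats any non-cancel action as approve
    return action, context_id
```

Because the pending-plan registry in the hunk lives in process memory, a bot restart drops it, and a later button press then takes the `not pending` branch and reports the task as cancelled.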
@@ -639,7 +777,7 @@ async def process_ai_request(client, chat_id, user_text, image_data=None):
             await send_generated_files(client, chat_id, collect_generated_files(data))
             await send_preview_links(client, chat_id, collect_preview_urls(data))
     except Exception as e:
-        logger.error("process_ai_request 실패 (chat_id=%s): %s", chat_id, e)
+        logger.error("process_ai_request 실패 (chat_id=%s): %s", chat_id, e, exc_info=True)
         try:
             await send_message(client, chat_id, "⚠️ 처리 중 오류가 발생했습니다. 잠시 후 다시 시도해 주세요.")
         except Exception:
@@ -662,6 +800,9 @@ HELP_TEXT = """\
 /mcp — MCP 도구 목록
 /help — 이 도움말
+/agent <작업> — 멀티 LLM 에이전트 (계획 확인 후 실행)
+/agent <작업> --exec <모델> --review <모델> — 실행/검토 LLM 지정
 일반 텍스트 → AI에게 질문
 사진 전송 → AI 이미지 분석
 문서 전송(PDF, DOCX, XLSX, PPTX, TXT, CSV) → Knowledge Graph 수집
@@ -697,6 +838,30 @@ async def handle_command(client, chat_id, command: str, args: str):
         await send_mcp_tools(client, chat_id)
     elif cmd in {"help", "h"}:
         await send_message(client, chat_id, HELP_TEXT)
+    elif cmd == "agent":
+        if not args:
+            await send_message(client, chat_id, "사용법: /agent <작업 내용>\n예: /agent 쇼핑몰 메인 페이지 HTML 만들어줘\n\n특정 LLM 지정:\n/agent <작업> --exec openai/gpt-4o --review deepseek/deepseek-chat")
+            return
+        # Parse optional --exec / --review flags
+        exec_model = reviewing_model = None
+        task_text = args
+        import re as _re
+        em = _re.search(r'--exec\s+(\S+)', args)
+        rm = _re.search(r'--review\s+(\S+)', args)
+        if em:
+            exec_model = em.group(1)
+            task_text = task_text.replace(em.group(0), "").strip()
+        if rm:
+            reviewing_model = rm.group(1)
+            task_text = task_text.replace(rm.group(0), "").strip()
+        await send_chat_action(client, chat_id, "typing")
+        data = await ask_ai(client, task_text, agent_mode=True,
+                            executing_model=exec_model, reviewing_model=reviewing_model)
+        if isinstance(data, dict) and data.get("status") == "waiting_approval":
+            await send_plan_for_approval(client, chat_id, data)
+        else:
+            ans = data.get("response", str(data)) if isinstance(data, dict) else str(data)
+            await send_message(client, chat_id, ans)
     else:
         await send_message(client, chat_id, f"알 수 없는 명령어: /{cmd}\n/help 로 명령어 목록을 확인하세요.")
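
Reviewer sketch (not part of the package): the `--exec` / `--review` parsing in the `/agent` branch above can be pulled into a small helper for testing. The name `parse_agent_args` is hypothetical. The regular expressions are the ones from the hunk. The whitespace collapse at the end is an addition, since removing a flag from the middle of the text would otherwise leave a double space.

```python
import re


def parse_agent_args(args: str):
    """Split '/agent' arguments into (task_text, exec_model, review_model)."""
    exec_model = review_model = None
    task_text = args
    em = re.search(r"--exec\s+(\S+)", args)
    rm = re.search(r"--review\s+(\S+)", args)
    if em:
        exec_model = em.group(1)
        task_text = task_text.replace(em.group(0), " ")
    if rm:
        review_model = rm.group(1)
        task_text = task_text.replace(rm.group(0), " ")
    # Collapse whitespace left behind when a flag sat mid-sentence.
    # Note this also normalizes any spacing the user typed on purpose.
    task_text = " ".join(task_text.split())
    return task_text, exec_model, review_model
```

For instance, `parse_agent_args("build a page --exec openai/gpt-4o")` yields the task text `"build a page"` with `openai/gpt-4o` as the executing model and no reviewing model.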
@@ -730,6 +895,9 @@ async def handle_callback_query(client, callback_query):
     elif data.startswith("model:unload:"):
         model_id = data[len("model:unload:"):]
         await do_unload_model(client, chat_id, model_id)
+    elif data.startswith("plan:"):
+        task = asyncio.create_task(handle_plan_callback(client, chat_id, data))
+        task.add_done_callback(_log_task_exception)
 # ── Main loop ─────────────────────────────────────────────────────────────────