ltcai 2.1.0 → 2.2.0

@@ -1,156 +1,148 @@
1
- # Lattice AI: Architecture
2
-
3
- ## Overall Structure
4
-
5
- ```
6
- ┌─────────────────────────────────────────────────────────┐
7
- │ Client layer │
8
- │ Web UI (chat.html) │ VS Code extension │ Telegram bot │
9
- └──────────────────────────┬──────────────────────────────┘
10
- HTTP / SSE
11
- ┌──────────────────────────▼──────────────────────────────┐
12
- │ server.py FastAPI (port 4825) │
13
- │ │
14
- │ /chat /agent /models /tools/* /mcp/* /garden │
15
- │ /account /admin /auth/sso /knowledge-graph /graph │
16
- └────┬──────────┬──────────┬──────────┬───────────────────┘
17
- │ │ │ │
18
- ▼ ▼ ▼ ▼
19
- llm_router tools.py knowledge_ p_reinforce
20
- .py graph.py .py
21
-
22
- ├── MLX (mlx_lm / mlx_vlm) ← Apple Silicon local
23
- ├── OpenAI SDK ← openai / groq / together / openrouter
24
- └── Ollama / vLLM REST ← local server integration
1
+ # Lattice AI Architecture
2
+
3
+ Lattice AI v2.2.0 is a local-first **AI Knowledge OS**. The architecture is
4
+ organized around one durable center: the Knowledge Graph. Models, tools,
5
+ agents, workflows, and UI modes are replaceable layers that operate on top of
6
+ the graph.
7
+
8
+ ## Architecture Goals
9
+
10
+ - Keep user knowledge local-first by default.
11
+ - Treat multimodal input as the normal path, not an add-on.
12
+ - Preserve evidence, decisions, files, artifacts, and work history.
13
+ - Keep models replaceable and policy-governed.
14
+ - Explain risk and source facts instead of hiding capability.
15
+ - Keep basic and advanced modes feature-equivalent.
16
+ - Keep admin-only capabilities explicit and auditable.
17
+
18
+ ## System View
19
+
20
+ ```mermaid
21
+ flowchart TD
22
+ User["User files, screenshots, chats, notes, code, work logs"]
23
+ Ingestion["Multimodal ingestion"]
24
+ Extract["Entity, relation, evidence extraction"]
25
+ Graph["Knowledge Graph"]
26
+ Context["Graph context builder"]
27
+ Models["Multimodal model runtime"]
28
+ Agents["Agent runtime and workflows"]
29
+ Outputs["Advice, analysis, documents, automation"]
30
+ Admin["Admin policy and audit"]
31
+
32
+ User --> Ingestion
33
+ Ingestion --> Extract
34
+ Extract --> Graph
35
+ Graph --> Context
36
+ Context --> Models
37
+ Models --> Agents
38
+ Agents --> Outputs
39
+ Admin --> Models
40
+ Admin --> Graph
25
41
  ```
26
42
 
27
- ## File Roles
28
-
29
- | File | Role |
30
- |------|------|
31
- | `server.py` | FastAPI app, all HTTP endpoints, auth/session/CORS/rate limit |
32
- | `ltcai_cli.py` | CLI entry point (`LTCAI` command), `doctor` subcommand, runs uvicorn |
33
- | `llm_router.py` | Local (MLX/Ollama) ↔ cloud (OpenAI/Groq/…) routing, SSE streaming |
34
- | `tools.py` | Agent tool implementations: read_file, edit_file, grep, run_command, todo_write/read, screenshots, etc. |
35
- | `knowledge_graph.py` | SQLite knowledge graph (nodes/edges/chunks), Graph RAG context injection |
36
- | `p_reinforce.py` | P-Reinforce knowledge garden engine, categorized storage under `~/.ltcai-brain/` |
37
- | `telegram_bot.py` | Local AI Telegram mirror bot |
38
- | `codex_telegram_bot.py` | Cloud Codex Telegram bot (GPT + GitHub issues) |
39
- | `vscode-extension/` | TypeScript VS Code extension |
40
- | `static/` | Web UI HTML (chat, account, admin, graph), PWA manifest/SW |
41
- | `bin/ltcai.js` | npm CLI entry point (auto-bootstraps the Python environment) |
42
-
43
- ## Data Flow
44
-
45
- ### Chat Request
43
+ ## Durable Core
46
44
 
47
- ```
48
- Browser → POST /chat
49
- → server.py: auth check, rate limit
50
- → llm_router.py: model selection (local/cloud)
51
- → knowledge_graph.py: Graph RAG context lookup + injection
52
- → LLM streaming response (SSE)
53
- → knowledge_graph.py: ingest message/response
54
- ```
45
+ The Knowledge Graph stores the durable user and organization memory:
55
46
 
56
- ### Agent Request
47
+ - files and document evidence
48
+ - images and screenshots
49
+ - conversations and notes
50
+ - user decisions
51
+ - work history
52
+ - generated artifacts
53
+ - agent and workflow events
57
54
 
58
- ```
59
- Browser/VS Code POST /agent
60
- → server.py: auth check, rate limit (6/min)
61
- → llm_router.py: Discover→Plan→Implement→Verify loop (max 25 steps)
62
- → tools.py: read_file / edit_file / grep / run_command / todo_*
63
- → stream the result of each step
64
- ```
55
+ The LLM is not the product core. It is an execution worker that can be replaced
56
+ when hardware, policy, or user preference changes.
65
57
 
66
- ### Document Upload
58
+ ## Multimodal Ingestion
67
59
 
68
- ```
69
- Browser POST /upload
70
- → server.py: magic-number validation, rate limit (12/min)
71
- → tools.py: PDF/DOCX/XLSX/PPTX parsing
72
- → knowledge_graph.py: ingest Chunk/Page/Sheet/Slide nodes
73
- → blob storage: ~/.ltcai/knowledge_graph_blobs/
74
- ```
60
+ Lattice AI assumes users will provide source material directly. The expected
61
+ input set includes:
75
62
 
76
- ## Data Stores
63
+ - PDF
64
+ - Word
65
+ - Excel
66
+ - PowerPoint
67
+ - images
68
+ - screenshots
69
+ - chat history
70
+ - notes
71
+ - web content
72
+ - code
73
+ - work logs
77
74
 
78
- ```
79
- ~/.ltcai/
80
- ├── users.json # user accounts (scrypt hashes)
81
- ├── sessions.json # session tokens (24h TTL)
82
- ├── chat_history.json # chat history
83
- ├── knowledge_graph.sqlite # Graph RAG SQLite DB
84
- ├── knowledge_graph_blobs/ # original uploaded files
85
- ├── mcp_installs.json # installed MCP servers
86
- └── todos.json # agent TODO list
87
-
88
- ~/.ltcai-brain/
89
- ├── INDEX.md
90
- ├── 00_Raw/
91
- ├── 10_Wiki/
92
- ├── 20_Skills/
93
- ├── 30_Projects/
94
- └── 40_Log/
95
- ```
96
-
97
- ## Authentication Flow
98
-
99
- ```
100
- POST /login (username + password)
101
- → scrypt verification
102
- → create session token (UUID, 24h TTL)
103
- → Set-Cookie: session=<token>; HttpOnly; SameSite=Lax
104
-
105
- All sensitive endpoints:
106
- → _require_auth(): validate cookie → return User or 401
107
- ```
108
-
109
- SSO (OIDC):
110
-
111
- ```
112
- GET /auth/sso/login → redirect (Entra ID / Okta)
113
- GET /auth/sso/callback?code=... → token exchange → session creation
114
- ```
75
+ The architecture must not ask users to convert these to plain text before AI can
76
+ work on them.
115
77
 
116
- ## MCP Integration
78
+ ## Model Runtime Policy
117
79
 
118
- `/mcp/tools` exposes the agent tool catalog in MCP format.
119
- Adding `http://localhost:4825/mcp` to the MCP settings of Claude Desktop / Cursor lets those clients use the tools directly.
80
+ Local recommended models must be multimodal. The v2.2 local runtime policy is:
120
81
 
121
- Details: [mcp-tools.md](mcp-tools.md)
82
+ - macOS Apple Silicon: MLX-VLM first
83
+ - Windows: llama.cpp multimodal path, with LM Studio as a user-friendly option
84
+ - Linux: llama.cpp or vLLM multimodal path depending on GPU support
85
+ - Ollama: kept as an option, not the default priority
122
86
 
123
- ---
87
+ The removed path is the old text-only MLX-LM recommendation route. Low-spec
88
+ machines use smaller or quantized multimodal models.
124
89
 
125
- ## Alignment with the PPT Spec (added 2026-05)
90
+ ## Model Source Disclosure
126
91
 
127
- Supplementary modules were added to match `lattice_ai_full_spec.pptx` (the UI specification).
128
- Which slide maps to which file, at a glance:
92
+ Model catalog entries carry source disclosure fields:
129
93
 
130
- | PPT slide | Meaning | Implementation file |
131
- |--------------|------|-----------|
132
- | 14 (three promises) | Cross-platform · Auto-setup · Graph principles | (entire architecture) |
133
- | 15·19 (cross-platform · design tokens) | Shared tokens = single source of truth | [`static/css/tokens.css`](../static/css/tokens.css) |
134
- | 16·17 (auto environment matrix · 5 steps) | OS/HW detection → model recommendation → install → verify → preset | [`auto_setup.py`](../auto_setup.py) |
135
- | 20·21·22 (KG nodes · edges · data model) | 10 NodeType / 12 EdgeType + embedding + confidence | [`kg_schema.py`](../kg_schema.py), [`docs/kg-schema.md`](kg-schema.md) |
136
- | 24 (integrated architecture) | 6 layers (UI / Logic / AI Core / KG / Storage / Auto-Setup) | this document + the files above |
94
+ 1. `source_country`
95
+ 2. `source_company`
96
+ 3. `execution_method`
97
+ 4. `internet_requirement`
98
+ 5. `model_name`
137
99
 
138
- ### New Module Quick Reference
100
+ These are first-class model facts, not advanced-only metadata.
139
101
 
140
- ```bash
141
- # Automatic environment setup in 5 steps
142
- python3 auto_setup.py probe # ① system detection
143
- python3 auto_setup.py recommend # ② model recommendation
144
- python3 auto_setup.py plan # ③ install plan (not executed)
145
- python3 auto_setup.py plan --apply # ③ actual install (risky)
146
- python3 auto_setup.py verify # ④ verification
147
- python3 auto_setup.py preset # ⑤ preset
148
- python3 auto_setup.py all # full flow
102
+ ## Recommendation Flow
149
103
 
150
- # KG v2 schema
151
- python3 kg_schema.py init ~/.ltcai/kg_v2.db
152
- python3 kg_schema.py migrate ~/.ltcai/knowledge_graph.db # legacy → v2
153
- python3 kg_schema.py stats ~/.ltcai/knowledge_graph.db
104
+ ```text
105
+ hardware scan
106
+ -> CPU/GPU/RAM/disk/OS analysis
107
+ -> multimodal model shortlist
108
+ -> same-family old generation removal
109
+ -> source disclosure
110
+ -> recommendation reason
111
+ -> download/install/load/verify
154
112
  ```
155
113
 
156
- For the full spec-to-implementation mapping, see [`spec-vs-impl.md`](spec-vs-impl.md).
114
+ The current default recommendation family is Gemma 4. Qwen3-VL and Llama 4
115
+ remain current multimodal alternatives.
116
+
117
+ ## Modes
118
+
119
+ Basic mode and advanced mode have the same feature access.
120
+
121
+ - Basic mode uses plain language and source facts.
122
+ - Advanced mode adds execution, memory, quantization, and load/unload detail.
123
+ - Admin mode adds actual authority: user management, permissions, audit logs,
124
+ organization policy, security policy, sensitive-data monitoring, model approval
125
+ policy, and Private VPC.
126
+
127
+ ## Main Modules
128
+
129
+ | Module | Responsibility |
130
+ | --- | --- |
131
+ | `latticeai/services/model_catalog.py` | Multimodal model catalog, source metadata, aliases |
132
+ | `latticeai/services/model_recommendation.py` | Hardware-aware multimodal recommendation |
133
+ | `latticeai/services/model_runtime.py` | Download, load, server, and runtime orchestration |
134
+ | `llm_router.py` | MLX-VLM and OpenAI-compatible model routing |
135
+ | `knowledge_graph.py` | Graph storage, extraction, local folder graph RAG |
136
+ | `latticeai/core/context_builder.py` | Graph context for generation |
137
+ | `latticeai/core/workspace_os.py` | Workspace state, timeline, snapshots, memory |
138
+ | `latticeai/core/multi_agent.py` | Planner/executor/reviewer/researcher orchestration |
139
+ | `latticeai/core/workflow_engine.py` | Workflow definitions and run history |
140
+ | `latticeai/core/plugins.py` | Plugin manifest, registry, permission boundary |
141
+ | `latticeai/core/security.py` | Local security primitives |
142
+
143
+ ## Compatibility
144
+
145
+ v2.2.0 preserves the additive Workspace OS and API compatibility posture from
146
+ v2.x. Existing graph/workspace data is migrated non-destructively. The release
147
+ does remove current recommendation entries for old or text-only model paths, but
148
+ it does not destructively mutate existing user graph data.
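The Recommendation Flow in the new architecture document reads as a small filtering pipeline. The sketch below illustrates only the shortlist and "same-family old generation removal" steps. `CatalogEntry`, its fields, the `shortlist` helper, and every entry name are hypothetical; nothing here is taken from `latticeai/services/model_recommendation.py`, whose contents this diff does not show.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CatalogEntry:
    # Hypothetical shape; the real catalog schema is not shown in this diff.
    model_name: str
    family: str
    generation: int
    multimodal: bool
    min_ram_gb: int


def shortlist(entries: List[CatalogEntry], ram_gb: int) -> List[CatalogEntry]:
    """Keep multimodal entries that fit in RAM, newest generation per family."""
    # Step 1: drop text-only entries and entries that do not fit the machine.
    fits = [e for e in entries if e.multimodal and e.min_ram_gb <= ram_gb]
    # Step 2: find the newest generation that survived, per family.
    newest: Dict[str, int] = {}
    for e in fits:
        newest[e.family] = max(newest.get(e.family, 0), e.generation)
    # Step 3: remove older generations of the same family.
    return [e for e in fits if e.generation == newest[e.family]]


catalog = [
    CatalogEntry("example-gemma-3", "gemma", 3, True, 8),
    CatalogEntry("example-gemma-4", "gemma", 4, True, 8),
    CatalogEntry("example-text-only", "llama", 3, False, 8),
    CatalogEntry("example-llama-4", "llama", 4, True, 64),
]
picked = [e.model_name for e in shortlist(catalog, ram_gb=16)]
print(picked)
```

Because the generation filter runs after the hardware filter in this sketch, a low-spec machine keeps an older entry when the newest one does not fit. Whether the real implementation orders the steps this way is an assumption.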
package/docs/kg-schema.md CHANGED
@@ -56,7 +56,7 @@ Edge {
56
56
  weight float [0..1] // 'strength' of the relation
57
57
  confidence float [0..1] // 'confidence' of the extraction/inference
58
58
  evidence string[] // evidence (list of message/chunk IDs)
59
- created_by string // extractor:llm-gemma-3-12b | rule:regex | user
59
+ created_by string // extractor:llm-gemma-4-12b | rule:regex | user
60
60
  created_at ISO8601 UTC
61
61
  }
62
62
  ```
@@ -106,7 +106,7 @@ Edge {
106
106
  "weight": 0.82,
107
107
  "confidence": 0.91,
108
108
  "evidence": ["chunk:01HX7K…#p3", "chunk:01HX7K…#p11"],
109
- "created_by": "extractor:llm-gemma-3-12b"
109
+ "created_by": "extractor:llm-gemma-4-12b"
110
110
  }
111
111
  }
112
112
  ```
@@ -197,7 +197,7 @@ store.upsert_edge(Edge(
197
197
  type=EdgeType.MENTIONS,
198
198
  weight=0.82, confidence=0.91,
199
199
  evidence=["chunk:01HX7K…#p3"],
200
- created_by="extractor:llm-gemma-3-12b",
200
+ created_by="extractor:llm-gemma-4-12b",
201
201
  ))
202
202
 
203
203
  # neighbor traversal
@@ -131,7 +131,6 @@ yourdomain.com {
131
131
  openai:gpt-4o-mini
132
132
  openai:gpt-4o
133
133
  openrouter:openai/gpt-4o-mini
134
- groq:llama-3.1-8b-instant
135
- groq:llama-3.3-70b-versatile
136
- together:meta-llama/Llama-3.3-70B-Instruct-Turbo
134
+ openrouter:qwen/qwen3-vl-235b-a22b-instruct
135
+ together:Qwen/Qwen3-VL-32B-Instruct
137
136
  ```
@@ -523,7 +523,7 @@ def _extract_concepts_rules(text: str, limit: int = 12) -> List[str]:
523
523
  2. Multi-word proper nouns (Lattice AI, GPT-4o, Claude Sonnet)
524
524
  3. Single capitalized proper nouns not at sentence start (Claude, Python, FastAPI)
525
525
  4. Korean compound technical terms (멀티모달, 에이전트, 그래프RAG)
526
- 5. Hyphenated / versioned identifiers (gpt-4o, mlx-lm, llama-3.3)
526
+ 5. Hyphenated / versioned identifiers (gpt-4o, mlx-vlm, gemma-4)
527
527
  """
528
528
  text = str(text or "")
529
529
  seen: dict = {} # concept_lower → original form
@@ -586,7 +586,7 @@ def _extract_concepts_rules(text: str, limit: int = 12) -> List[str]:
586
586
  if len(m) >= 3 or cnt >= 2:
587
587
  _add(m)
588
588
 
589
- # 6. Hyphenated / versioned identifiers (gpt-4o, llama-3.3, mlx-lm)
589
+ # 6. Hyphenated / versioned identifiers (gpt-4o, gemma-4, mlx-vlm)
590
590
  for m in re.findall(r'\b([a-zA-Z][a-zA-Z0-9]*(?:-[a-zA-Z0-9.]+)+)\b', text):
591
591
  if len(m) >= 4:
592
592
  _add(m)
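The rule touched by this hunk can be tried in isolation. The sketch below copies the regular expression and the `len(m) >= 4` filter from the context lines. The `hyphenated_identifiers` wrapper and its case-insensitive de-duplication are assumptions made for illustration, standing in for the `_add` helper, whose body is not part of this diff.

```python
import re
from typing import List

# Pattern copied from the hunk above: a letter-led token followed by one or
# more "-segment" parts, where a segment may contain letters, digits, or dots.
HYPHENATED = re.compile(r'\b([a-zA-Z][a-zA-Z0-9]*(?:-[a-zA-Z0-9.]+)+)\b')


def hyphenated_identifiers(text: str) -> List[str]:
    """Return hyphenated/versioned identifiers of length >= 4, first form wins."""
    seen = set()  # lower-cased forms already emitted (assumed behaviour of _add)
    out: List[str] = []
    for m in HYPHENATED.findall(text):
        key = m.lower()
        if len(m) >= 4 and key not in seen:
            seen.add(key)
            out.append(m)
    return out


found = hyphenated_identifiers(
    "Moved from llama-3.3 to gemma-4 on mlx-vlm; gpt-4o and GPT-4o stay."
)
print(found)
```

The trailing `\b` makes the match end on a word character, so punctuation after an identifier such as `mlx-vlm;` is not captured.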
@@ -1,3 +1,3 @@
1
1
  """Lattice AI - modular server package."""
2
2
 
3
- __version__ = "2.1.0"
3
+ __version__ = "2.2.0"
@@ -100,9 +100,17 @@ def create_models_router(
100
100
  base = {
101
101
  "id": item["id"],
102
102
  "name": item["name"],
103
+ "model_name": item.get("model_name") or item.get("name"),
103
104
  "tag": item["tag"],
104
105
  "size": item["size"],
105
106
  "display_name": item.get("name") or item.get("id"),
107
+ "modality": item.get("modality") or "multimodal",
108
+ "source_country": item.get("source_country"),
109
+ "source_company": item.get("source_company"),
110
+ "execution_method": item.get("execution_method"),
111
+ "run_location": item.get("run_location"),
112
+ "internet_requirement": item.get("internet_requirement"),
113
+ "source_display_order": item.get("source_display_order"),
106
114
  }
107
115
  short_id = str(item["id"]).lower()
108
116
  aliases = MODEL_ENGINE_ALIASES.get(short_id) or {}
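To see how the new disclosure fields behave for a catalog entry that predates v2.2.0, the fallbacks in this hunk can be reproduced on their own. The `disclosure_fields` helper and the `legacy` entry below are hypothetical; only the `item.get(...)` expressions are copied from the added lines.

```python
from typing import Any, Dict


def disclosure_fields(item: Dict[str, Any]) -> Dict[str, Any]:
    """Project one catalog item onto the disclosure fields added in this hunk."""
    return {
        # Falls back to the display name when no explicit model_name is set.
        "model_name": item.get("model_name") or item.get("name"),
        # Entries without a modality are reported as multimodal.
        "modality": item.get("modality") or "multimodal",
        # The remaining fields are passed through and may be None.
        "source_country": item.get("source_country"),
        "source_company": item.get("source_company"),
        "execution_method": item.get("execution_method"),
        "run_location": item.get("run_location"),
        "internet_requirement": item.get("internet_requirement"),
        "source_display_order": item.get("source_display_order"),
    }


# Hypothetical legacy entry with no disclosure metadata at all.
legacy = {"id": "example-model", "name": "Example Model", "tag": "4bit", "size": "7 GB"}
fields = disclosure_fields(legacy)
print(fields)
```

An entry with no metadata is therefore labeled `multimodal` while every source field is `None`, so a client that renders these fields should probably handle missing values explicitly.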
@@ -131,7 +131,7 @@ class Config:
131
131
  admin_emails = [item.strip().lower() for item in _value(env, "LATTICEAI_ADMIN_EMAILS", "").split(",") if item.strip()]
132
132
 
133
133
  public_model = _value(env, "LATTICEAI_PUBLIC_MODEL", _value(env, "LATTICEAI_DEFAULT_MODEL", "openai:gpt-4o-mini"))
134
- local_model = _value(env, "LATTICEAI_LOCAL_MODEL", "mlx-community/gemma-4-26b-a4b-it-4bit")
134
+ local_model = _value(env, "LATTICEAI_LOCAL_MODEL", "mlx-community/gemma-4-12b-it-4bit")
135
135
 
136
136
  data_dir = Path(_value(env, "LATTICEAI_DATA_DIR", str(Path.home() / ".ltcai")))
137
137
  static_dir = Path(_value(env, "LATTICEAI_STATIC_DIR", str(base_dir / "static")))
@@ -231,9 +231,9 @@ def extract_topic_candidates(
231
231
 
232
232
  DEFAULT_ALIAS_GROUPS: List[List[str]] = [
233
233
  ["lattice ai", "latticeai", "래티스 ai", "래티스ai", "내 앱", "내 ai"],
234
- ["gpt-oss", "gpt oss", "openai gpt-oss"],
234
+ ["gemma-4", "gemma 4", "google gemma"],
235
235
  ["gemma 4", "gemma4", "google gemma 4"],
236
- ["llama 3", "llama3", "meta llama 3"],
236
+ ["llama 4", "llama4", "meta llama 4", "llama scout"],
237
237
  ]
238
238
 
239
239
 
@@ -11,7 +11,7 @@ from copy import deepcopy
11
11
  from typing import Any, Dict, List, Optional
12
12
 
13
13
 
14
- MARKETPLACE_VERSION = "2.1.0"
14
+ MARKETPLACE_VERSION = "2.2.0"
15
15
  TEMPLATE_KINDS = ("plugin", "workflow", "agent")
16
16
 
17
17
 
@@ -33,7 +33,7 @@ BUILTIN_TEMPLATES: Dict[str, List[Dict[str, Any]]] = {
33
33
  "id": "plugin-review-action",
34
34
  "name": "Plugin Review Action",
35
35
  "version": "1.0.0",
36
- "lattice_version": ">=2.1.0",
36
+ "lattice_version": ">=2.2.0",
37
37
  "permissions": ["read_workspace", "run_skills"],
38
38
  "provides": {"skills": ["review_action"]},
39
39
  }
@@ -25,18 +25,13 @@ logger = logging.getLogger(__name__)
25
25
  # ── Model family detection ────────────────────────────────────────────────────
26
26
 
27
27
  FAMILY_PATTERNS: List[Tuple[str, re.Pattern]] = [
28
- ("gpt-oss", re.compile(r"gpt[-_]?oss", re.I)),
29
28
  ("gemma", re.compile(r"gemma", re.I)),
30
29
  ("qwen", re.compile(r"qwen", re.I)),
31
30
  ("llama", re.compile(r"\bllama|meta[-_]?llama", re.I)),
32
- ("mistral", re.compile(r"mistral|mixtral", re.I)),
33
- ("phi", re.compile(r"\bphi[-_]?\d", re.I)),
34
- ("deepseek", re.compile(r"deepseek", re.I)),
35
- ("yi", re.compile(r"\byi[-_]?\d", re.I)),
36
31
  ("claude", re.compile(r"claude", re.I)),
37
- ("gpt-4", re.compile(r"gpt[-_]?4", re.I)),
38
- ("gpt-3.5", re.compile(r"gpt[-_]?3\.?5", re.I)),
39
- ("o1", re.compile(r"\bo1[-_]?", re.I)),
32
+ ("gpt", re.compile(r"gpt[-_]?(?:4|5)|openai", re.I)),
33
+ ("gemini", re.compile(r"gemini", re.I)),
34
+ ("grok", re.compile(r"grok|x[-_]?ai", re.I)),
40
35
  ]
41
36
 
42
37
 
@@ -59,20 +54,6 @@ def detect_model_family(model_id: str) -> str:
59
54
  DEFAULT_STOP = ["<|im_end|>", "<|endoftext|>", "</s>", "<|user|>", "<|assistant|>"]
60
55
 
61
56
  FAMILY_PROFILES: Dict[str, Dict[str, Any]] = {
62
- "gpt-oss": {
63
- "family": "gpt-oss",
64
- "supports_system": True,
65
- "supports_vision": False,
66
- "chat_template": "gpt_oss",
67
- "preferred_engines": ["ollama", "llamacpp", "vllm", "local_mlx"],
68
- "temperature": 0.1,
69
- "top_p": 0.9,
70
- "max_tokens": 2048,
71
- "stop_sequences": ["<|im_end|>", "<|end|>", "</s>", "<|user|>", "<|assistant|>"],
72
- "disable_draft": True,
73
- # trim_after_user_marker needs <|user|> to still be present, so it runs before strip_role_tokens.
74
- "postprocess": ["trim_after_user_marker", "strip_role_tokens"],
75
- },
76
57
  "gemma": {
77
58
  "family": "gemma",
78
59
  "supports_system": True,
@@ -89,7 +70,7 @@ FAMILY_PROFILES: Dict[str, Dict[str, Any]] = {
89
70
  "qwen": {
90
71
  "family": "qwen",
91
72
  "supports_system": True,
92
- "supports_vision": False,
73
+ "supports_vision": True,
93
74
  "chat_template": "qwen_chatml",
94
75
  "preferred_engines": ["ollama", "local_mlx", "vllm"],
95
76
  "temperature": 0.2,
@@ -102,7 +83,7 @@ FAMILY_PROFILES: Dict[str, Dict[str, Any]] = {
102
83
  "llama": {
103
84
  "family": "llama",
104
85
  "supports_system": True,
105
- "supports_vision": False,
86
+ "supports_vision": True,
106
87
  "chat_template": "tokenizer_default",
107
88
  "preferred_engines": ["ollama", "local_mlx", "llamacpp", "vllm"],
108
89
  "temperature": 0.2,
@@ -112,45 +93,6 @@ FAMILY_PROFILES: Dict[str, Dict[str, Any]] = {
112
93
  "disable_draft": False,
113
94
  "postprocess": ["strip_role_tokens"],
114
95
  },
115
- "mistral": {
116
- "family": "mistral",
117
- "supports_system": False,
118
- "supports_vision": False,
119
- "chat_template": "tokenizer_default",
120
- "preferred_engines": ["ollama", "local_mlx", "llamacpp"],
121
- "temperature": 0.2,
122
- "top_p": 0.9,
123
- "max_tokens": 4096,
124
- "stop_sequences": ["</s>", "[INST]", "[/INST]"],
125
- "disable_draft": False,
126
- "postprocess": ["strip_role_tokens"],
127
- },
128
- "phi": {
129
- "family": "phi",
130
- "supports_system": True,
131
- "supports_vision": False,
132
- "chat_template": "tokenizer_default",
133
- "preferred_engines": ["ollama", "local_mlx"],
134
- "temperature": 0.2,
135
- "top_p": 0.9,
136
- "max_tokens": 2048,
137
- "stop_sequences": ["<|end|>", "<|endoftext|>"],
138
- "disable_draft": False,
139
- "postprocess": ["strip_role_tokens"],
140
- },
141
- "deepseek": {
142
- "family": "deepseek",
143
- "supports_system": True,
144
- "supports_vision": False,
145
- "chat_template": "tokenizer_default",
146
- "preferred_engines": ["ollama", "local_mlx", "vllm"],
147
- "temperature": 0.2,
148
- "top_p": 0.9,
149
- "max_tokens": 4096,
150
- "stop_sequences": ["<|EOT|>", "</s>"],
151
- "disable_draft": False,
152
- "postprocess": ["strip_role_tokens"],
153
- },
154
96
  "unknown": {
155
97
  "family": "unknown",
156
98
  "supports_system": True,
@@ -316,6 +258,7 @@ class CompatProfile:
316
258
  engine: Optional[str]
317
259
  family: str
318
260
  template: str
261
+ supports_vision: bool
319
262
  stop: List[str]
320
263
  temperature: float
321
264
  top_p: float
@@ -362,6 +305,7 @@ def ensure_profile(model_id: str, engine: Optional[str] = None) -> CompatProfile
362
305
  engine=(engine or "").strip().lower() or None,
363
306
  family=base["family"],
364
307
  template=base["chat_template"],
308
+ supports_vision=bool(base.get("supports_vision", False)),
365
309
  stop=list(base["stop_sequences"]),
366
310
  temperature=float(base["temperature"]),
367
311
  top_p=float(base["top_p"]),
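The family changes in this file can be checked with a small stand-alone sketch. The patterns below are copied from the v2.2.0 `FAMILY_PATTERNS` hunk. The first-match-wins loop, the `"unknown"` fallback, and the helper names `detect_family_sketch` and `vision_flag` are assumptions, since the body of `detect_model_family` is not shown. Only the `supports_vision` values visible in this diff (`qwen` and `llama`) are included.

```python
import re
from typing import List, Optional, Tuple

# Patterns copied from the v2.2.0 FAMILY_PATTERNS hunk.
FAMILY_PATTERNS: List[Tuple[str, re.Pattern]] = [
    ("gemma", re.compile(r"gemma", re.I)),
    ("qwen", re.compile(r"qwen", re.I)),
    ("llama", re.compile(r"\bllama|meta[-_]?llama", re.I)),
    ("claude", re.compile(r"claude", re.I)),
    ("gpt", re.compile(r"gpt[-_]?(?:4|5)|openai", re.I)),
    ("gemini", re.compile(r"gemini", re.I)),
    ("grok", re.compile(r"grok|x[-_]?ai", re.I)),
]

# Only the flags this diff actually shows; other families are left out on purpose.
VISIBLE_VISION_FLAGS = {"qwen": True, "llama": True}


def detect_family_sketch(model_id: str) -> str:
    """Return the first family whose pattern matches (assumed semantics)."""
    for family, pattern in FAMILY_PATTERNS:
        if pattern.search(model_id):
            return family
    return "unknown"


def vision_flag(model_id: str) -> Optional[bool]:
    """Return the supports_vision value if this diff shows it, else None."""
    return VISIBLE_VISION_FLAGS.get(detect_family_sketch(model_id))


for mid in ("mlx-community/gemma-4-12b-it-4bit",
            "together:Qwen/Qwen3-VL-32B-Instruct",
            "openai:gpt-4o-mini",
            "mistral-7b-instruct"):
    print(mid, "->", detect_family_sketch(mid), vision_flag(mid))
```

Under these assumptions, a model ID from a removed family such as `mistral` now falls through to `"unknown"` and would pick up the `unknown` profile instead of a family-specific one.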
@@ -120,7 +120,7 @@ class ModelResolution:
120
120
  if not provider:
121
121
  provider = engine_hint or "local_mlx"
122
122
 
123
- # alias table (e.g. {"gpt-oss-20b": {"local_mlx": "mlx-community/...","ollama":"gpt-oss:20b"}})
123
+ # alias table (e.g. {"gemma-4-12b-it-4bit": {"local_mlx": "mlx-community/...", "ollama": "hf.co/..."}})
124
124
  resolved_model = model_name
125
125
  if engine_aliases:
126
126
  aliases = engine_aliases.get(model_name.lower())
@@ -14,7 +14,7 @@ from datetime import datetime
14
14
  from typing import Any, Callable, Dict, List, Optional
15
15
 
16
16
 
17
- MULTI_AGENT_VERSION = "2.1.0"
17
+ MULTI_AGENT_VERSION = "2.2.0"
18
18
 
19
19
  AGENT_ROLES = ("researcher", "planner", "executor", "reviewer", "release")
20
20
  CORE_PIPELINE = ("planner", "executor", "reviewer")
@@ -30,7 +30,7 @@ from pathlib import Path
30
30
  from typing import Any, Callable, Dict, List, Optional, Tuple
31
31
 
32
32
 
33
- PLUGIN_SDK_VERSION = "2.1.0"
33
+ PLUGIN_SDK_VERSION = "2.2.0"
34
34
 
35
35
  # Capability-style permissions a plugin can request. Kept deliberately small so
36
36
  # the Enterprise seam can layer finer-grained policy on top without changing the
@@ -32,7 +32,7 @@ from datetime import datetime
32
32
  from typing import Any, AsyncIterator, Dict, List, Optional, Set
33
33
 
34
34
 
35
- REALTIME_VERSION = "2.1.0"
35
+ REALTIME_VERSION = "2.2.0"
36
36
  _FEED_LIMIT = 200
37
37
  _QUEUE_MAX = 100
38
38
 
@@ -28,7 +28,7 @@ from datetime import datetime
28
28
  from typing import Any, Callable, Dict, List, Optional
29
29
 
30
30
 
31
- WORKFLOW_ENGINE_VERSION = "2.1.0"
31
+ WORKFLOW_ENGINE_VERSION = "2.2.0"
32
32
 
33
33
  # The node vocabulary a workflow can be built from. ``trigger`` and ``output``
34
34
  # are structural; the rest dispatch to an injected runner of the same family.
@@ -18,7 +18,7 @@ from pathlib import Path
18
18
  from typing import Any, Callable, Dict, Iterable, List, Optional
19
19
 
20
20
 
21
- WORKSPACE_OS_VERSION = "2.1.0"
21
+ WORKSPACE_OS_VERSION = "2.2.0"
22
22
 
23
23
  # Workspace types separate single-user Personal workspaces from shared
24
24
  # Organization workspaces. Both keep the same local-first JSON store; the type
@@ -1,6 +1,6 @@
1
1
  """
2
2
  Lattice AI MLX — Local LLM Bridge Server
3
- Apple Silicon (M1-M5) only | built on mlx-lm
3
+ Apple Silicon (M1-M5) only | built on MLX-VLM
4
4
  """
5
5
 
6
6
  import asyncio