sqlseed-ai 0.2.0__tar.gz → 0.2.3__tar.gz

@@ -33,6 +33,7 @@ ENV/
  .idea/
  .trae/
  .claude/
+ .gemini/
  .sonarlint/
  *.swp
  *.swo
@@ -62,6 +63,9 @@ examples/notebooks/batch_config.yaml
  *.nbconvert.ipynb
  .ipynb_checkpoints/

+ # AI-generated config outputs
+ *_config.yaml
+
  # OS
  .DS_Store

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: sqlseed-ai
- Version: 0.2.0
+ Version: 0.2.3
  Summary: AI-powered data generation plugin for sqlseed
  Project-URL: Homepage, https://github.com/sunbos/sqlseed
  Project-URL: Repository, https://github.com/sunbos/sqlseed/tree/main/plugins/sqlseed-ai
@@ -14,7 +14,6 @@ Classifier: Programming Language :: Python :: 3.11
  Classifier: Programming Language :: Python :: 3.12
  Classifier: Programming Language :: Python :: 3.13
  Requires-Python: >=3.10
- Requires-Dist: google-generativeai>=0.8
  Requires-Dist: openai>=1.0
  Requires-Dist: sqlseed>=0.0.1
  Description-Content-Type: text/markdown
@@ -46,7 +45,7 @@ sqlseed ai-suggest app.db --table users --output users.yaml
  sqlseed ai-suggest app.db --table users --output users.yaml --verify

  # Specify model (defaults to Gemma 4 26B via Google AI Studio)
- sqlseed ai-suggest app.db --table users -o users.yaml --model gemma-4-26b-it
+ sqlseed ai-suggest app.db --table users -o users.yaml --model gemma-4-26b-a4b-it

  # Use local LM Studio
  sqlseed ai-suggest app.db --table users -o users.yaml --backend lm_studio --model google/gemma-4-e4b
@@ -73,7 +72,7 @@ sqlseed ai-suggest app.db --table users -o users.yaml --no-cache

  When using the `google_ai_studio` backend (default), the `GemmaModel` enum provides pre-configured Gemma 4 variants. The model is selected based on the backend:

- 1. **Google AI Studio**: Defaults to `gemma-4-26b-it` (recommended balance of quality and speed).
+ 1. **Google AI Studio**: Defaults to `gemma-4-26b-a4b-it` (recommended balance of quality and speed).
  2. **LM Studio / Ollama**: User must specify a loaded model via `--model` or `SQLSEED_AI_MODEL`.
  3. **OpenAI-compatible** (OpenRouter, DeepSeek, etc.): User must specify both `--model` and `--base-url`.

@@ -90,10 +89,11 @@ When using the `google_ai_studio` backend, the `GemmaModel` enum provides pre-co

  | Enum Value | Model ID | Description |
  |:-----------|:---------|:------------|
- | `GemmaModel.GEMMA_4_2B` | `gemma-4-2b` | Lightweight, fast inference |
- | `GemmaModel.GEMMA_4_4B` | `gemma-4-4b` | Balanced speed and quality |
- | `GemmaModel.GEMMA_4_26B` | `gemma-4-26b` | High quality, recommended |
- | `GemmaModel.GEMMA_4_31B` | `gemma-4-31b` | Best quality, largest model |
+ | `GemmaModel.GEMMA_4_E2B` | `gemma-4-e2b-it` | 2B Effective, Edge — Ultra-light edge deployment |
+ | `GemmaModel.GEMMA_4_E4B` | `gemma-4-e4b-it` | 4B Effective, Edge — Lightweight local inference |
+ | `GemmaModel.GEMMA_4_12B` | `gemma-4-12b-it` | 12B Unified, Laptop — Balanced quality and speed |
+ | `GemmaModel.GEMMA_4_26B_A4B` | `gemma-4-26b-a4b-it` | 26B A4B MoE — High quality, recommended |
+ | `GemmaModel.GEMMA_4_31B` | `gemma-4-31b-it` | 31B Dense — Best quality, largest model |

  The `AIBackend` enum selects the API backend:

@@ -102,6 +102,7 @@ The `AIBackend` enum selects the API backend:
  | `AIBackend.GOOGLE_AI_STUDIO` | Google AI Studio | `https://generativelanguage.googleapis.com/v1beta/openai/` |
  | `AIBackend.LM_STUDIO` | LM Studio | `http://localhost:1234/v1` |
  | `AIBackend.OLLAMA` | Ollama | `http://localhost:11434/v1` |
+ | `AIBackend.OPENAI_COMPAT` | OpenAI-compatible | (must set `SQLSEED_AI_BASE_URL`) |

  ### Template Pool

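The backend table in the hunk above pairs each `AIBackend` member with a default endpoint, and the new `OPENAI_COMPAT` row has none. A minimal sketch of that lookup follows. The `Backend` class, `DEFAULT_BASE_URLS`, and `resolve_base_url` are hypothetical stand-ins written for illustration, not the plugin's `sqlseed_ai.config` code; the enum string values are taken from the `SQLSEED_AI_BACKEND` options in the README, and letting an explicit URL override the defaults is an assumption.

```python
from enum import Enum
from typing import Optional


class Backend(str, Enum):
    """Hypothetical stand-in for the plugin's AIBackend enum."""

    GOOGLE_AI_STUDIO = "google_ai_studio"
    LM_STUDIO = "lm_studio"
    OLLAMA = "ollama"
    OPENAI_COMPAT = "openai_compat"


# Default endpoints copied from the README table; OPENAI_COMPAT has none.
DEFAULT_BASE_URLS = {
    Backend.GOOGLE_AI_STUDIO: "https://generativelanguage.googleapis.com/v1beta/openai/",
    Backend.LM_STUDIO: "http://localhost:1234/v1",
    Backend.OLLAMA: "http://localhost:11434/v1",
}


def resolve_base_url(backend: Backend, explicit_url: Optional[str] = None) -> str:
    """Return the endpoint for a backend (assumed rule: explicit URL wins)."""
    if explicit_url:
        return explicit_url
    if backend not in DEFAULT_BASE_URLS:
        # Mirrors the README note that OPENAI_COMPAT must set SQLSEED_AI_BASE_URL.
        raise ValueError(f"{backend.value} requires an explicit base URL")
    return DEFAULT_BASE_URLS[backend]
```

Calling `resolve_base_url(Backend.OPENAI_COMPAT)` without a URL raises `ValueError` in this sketch, which is one way to surface the missing-endpoint case early.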
@@ -119,7 +120,7 @@ AI configs cached in platform-specific cache directory (`~/Library/Caches/sqlsee
  |:---------|:---------|:--------|:------------|
  | `SQLSEED_AI_API_KEY` | `OPENAI_API_KEY` | — | API key (required) |
  | `SQLSEED_AI_BASE_URL` | `OPENAI_BASE_URL` | (auto by backend) | API endpoint |
- | `SQLSEED_AI_MODEL` | — | `gemma-4-26b-it` | Model name |
+ | `SQLSEED_AI_MODEL` | — | `gemma-4-26b-a4b-it` | Model name |
  | `SQLSEED_AI_TIMEOUT` | — | `60` | API timeout (seconds) |
  | `SQLSEED_AI_BACKEND` | — | `google_ai_studio` | AI backend: `google_ai_studio`, `lm_studio`, `ollama`, `openai_compat` |
  | `GOOGLE_API_KEY` | — | — | Google AI Studio API key (required when backend is `google_ai_studio`) |
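The environment-variable table in the hunk above can be read as a precedence rule: use the primary variable, then the fallback variable, then the default. The sketch below shows that reading under stated assumptions. The table's header row is not part of the hunk, so treating the second column as a fallback variable and the third as a default is an inference; `first_set` and `load_settings` are hypothetical helpers, not the plugin's `AIConfig.from_env()`. How `GOOGLE_API_KEY` interacts with the other keys is not stated, so it is left out.

```python
import os
from typing import Any, Dict, Mapping, Optional


def first_set(env: Mapping[str, str], *names: str) -> Optional[str]:
    """Return the first non-empty value among the given variable names."""
    for name in names:
        value = env.get(name)
        if value:
            return value
    return None


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Resolve settings as primary variable, then fallback, then default."""
    return {
        "api_key": first_set(env, "SQLSEED_AI_API_KEY", "OPENAI_API_KEY"),
        # None stands for "auto by backend" in this sketch.
        "base_url": first_set(env, "SQLSEED_AI_BASE_URL", "OPENAI_BASE_URL"),
        "model": first_set(env, "SQLSEED_AI_MODEL") or "gemma-4-26b-a4b-it",
        "timeout": float(first_set(env, "SQLSEED_AI_TIMEOUT") or 60),
        "backend": first_set(env, "SQLSEED_AI_BACKEND") or "google_ai_studio",
    }
```

Passing a plain dict instead of `os.environ` keeps the function easy to exercise in tests.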
@@ -152,7 +153,6 @@ This plugin registers via `[project.entry-points."sqlseed"]` and implements:
  - Python >= 3.10
  - `sqlseed >= 0.1.0`
  - `openai >= 1.0`
- - `google-generativeai >= 0.8`
  - An OpenAI-compatible API key or Google AI Studio API key

  ## Gemma 4 Integration
@@ -25,7 +25,7 @@ sqlseed ai-suggest app.db --table users --output users.yaml
  sqlseed ai-suggest app.db --table users --output users.yaml --verify

  # Specify model (defaults to Gemma 4 26B via Google AI Studio)
- sqlseed ai-suggest app.db --table users -o users.yaml --model gemma-4-26b-it
+ sqlseed ai-suggest app.db --table users -o users.yaml --model gemma-4-26b-a4b-it

  # Use local LM Studio
  sqlseed ai-suggest app.db --table users -o users.yaml --backend lm_studio --model google/gemma-4-e4b
@@ -52,7 +52,7 @@ sqlseed ai-suggest app.db --table users -o users.yaml --no-cache

  When using the `google_ai_studio` backend (default), the `GemmaModel` enum provides pre-configured Gemma 4 variants. The model is selected based on the backend:

- 1. **Google AI Studio**: Defaults to `gemma-4-26b-it` (recommended balance of quality and speed).
+ 1. **Google AI Studio**: Defaults to `gemma-4-26b-a4b-it` (recommended balance of quality and speed).
  2. **LM Studio / Ollama**: User must specify a loaded model via `--model` or `SQLSEED_AI_MODEL`.
  3. **OpenAI-compatible** (OpenRouter, DeepSeek, etc.): User must specify both `--model` and `--base-url`.

@@ -69,10 +69,11 @@ When using the `google_ai_studio` backend, the `GemmaModel` enum provides pre-co

  | Enum Value | Model ID | Description |
  |:-----------|:---------|:------------|
- | `GemmaModel.GEMMA_4_2B` | `gemma-4-2b` | Lightweight, fast inference |
- | `GemmaModel.GEMMA_4_4B` | `gemma-4-4b` | Balanced speed and quality |
- | `GemmaModel.GEMMA_4_26B` | `gemma-4-26b` | High quality, recommended |
- | `GemmaModel.GEMMA_4_31B` | `gemma-4-31b` | Best quality, largest model |
+ | `GemmaModel.GEMMA_4_E2B` | `gemma-4-e2b-it` | 2B Effective, Edge — Ultra-light edge deployment |
+ | `GemmaModel.GEMMA_4_E4B` | `gemma-4-e4b-it` | 4B Effective, Edge — Lightweight local inference |
+ | `GemmaModel.GEMMA_4_12B` | `gemma-4-12b-it` | 12B Unified, Laptop — Balanced quality and speed |
+ | `GemmaModel.GEMMA_4_26B_A4B` | `gemma-4-26b-a4b-it` | 26B A4B MoE — High quality, recommended |
+ | `GemmaModel.GEMMA_4_31B` | `gemma-4-31b-it` | 31B Dense — Best quality, largest model |

  The `AIBackend` enum selects the API backend:

@@ -81,6 +82,7 @@ The `AIBackend` enum selects the API backend:
  | `AIBackend.GOOGLE_AI_STUDIO` | Google AI Studio | `https://generativelanguage.googleapis.com/v1beta/openai/` |
  | `AIBackend.LM_STUDIO` | LM Studio | `http://localhost:1234/v1` |
  | `AIBackend.OLLAMA` | Ollama | `http://localhost:11434/v1` |
+ | `AIBackend.OPENAI_COMPAT` | OpenAI-compatible | (must set `SQLSEED_AI_BASE_URL`) |

  ### Template Pool

@@ -98,7 +100,7 @@ AI configs cached in platform-specific cache directory (`~/Library/Caches/sqlsee
  |:---------|:---------|:--------|:------------|
  | `SQLSEED_AI_API_KEY` | `OPENAI_API_KEY` | — | API key (required) |
  | `SQLSEED_AI_BASE_URL` | `OPENAI_BASE_URL` | (auto by backend) | API endpoint |
- | `SQLSEED_AI_MODEL` | — | `gemma-4-26b-it` | Model name |
+ | `SQLSEED_AI_MODEL` | — | `gemma-4-26b-a4b-it` | Model name |
  | `SQLSEED_AI_TIMEOUT` | — | `60` | API timeout (seconds) |
  | `SQLSEED_AI_BACKEND` | — | `google_ai_studio` | AI backend: `google_ai_studio`, `lm_studio`, `ollama`, `openai_compat` |
  | `GOOGLE_API_KEY` | — | — | Google AI Studio API key (required when backend is `google_ai_studio`) |
@@ -131,7 +133,6 @@ This plugin registers via `[project.entry-points."sqlseed"]` and implements:
  - Python >= 3.10
  - `sqlseed >= 0.1.0`
  - `openai >= 1.0`
- - `google-generativeai >= 0.8`
  - An OpenAI-compatible API key or Google AI Studio API key

  ## Gemma 4 Integration
@@ -25,7 +25,7 @@ sqlseed ai-suggest app.db --table users --output users.yaml
  sqlseed ai-suggest app.db --table users --output users.yaml --verify

  # Specify model (defaults to Gemma 4 26B via Google AI Studio)
- sqlseed ai-suggest app.db --table users -o users.yaml --model gemma-4-26b-it
+ sqlseed ai-suggest app.db --table users -o users.yaml --model gemma-4-26b-a4b-it

  # Use local LM Studio
  sqlseed ai-suggest app.db --table users -o users.yaml --backend lm_studio --model google/gemma-4-e4b
@@ -52,7 +52,7 @@ sqlseed ai-suggest app.db --table users -o users.yaml --no-cache

  When using the `google_ai_studio` backend (default), the `GemmaModel` enum provides pre-configured Gemma 4 variants. The model is selected automatically based on the backend:

- 1. **Google AI Studio**: Defaults to `gemma-4-26b-it` (recommended balance of quality and speed).
+ 1. **Google AI Studio**: Defaults to `gemma-4-26b-a4b-it` (recommended balance of quality and speed).
  2. **LM Studio / Ollama**: The user must specify a loaded model via `--model` or `SQLSEED_AI_MODEL`.
  3. **OpenAI-compatible** (OpenRouter, DeepSeek, etc.): The user must specify both `--model` and `--base-url`.

@@ -69,10 +69,11 @@ export SQLSEED_AI_MODEL=<free-model-name>

  | Enum Value | Model ID | Description |
  |:-------|:--------|:-----|
- | `GemmaModel.GEMMA_4_2B` | `gemma-4-2b` | Lightweight, fast inference |
- | `GemmaModel.GEMMA_4_4B` | `gemma-4-4b` | Balanced speed and quality |
- | `GemmaModel.GEMMA_4_26B` | `gemma-4-26b` | High quality, recommended |
- | `GemmaModel.GEMMA_4_31B` | `gemma-4-31b` | Best quality, largest model |
+ | `GemmaModel.GEMMA_4_E2B` | `gemma-4-e2b-it` | 2B Effective, Edge — Ultra-light edge deployment |
+ | `GemmaModel.GEMMA_4_E4B` | `gemma-4-e4b-it` | 4B Effective, Edge — Lightweight local inference |
+ | `GemmaModel.GEMMA_4_12B` | `gemma-4-12b-it` | 12B Unified, Laptop — Balanced speed and quality |
+ | `GemmaModel.GEMMA_4_26B_A4B` | `gemma-4-26b-a4b-it` | 26B A4B MoE — High quality, recommended |
+ | `GemmaModel.GEMMA_4_31B` | `gemma-4-31b-it` | 31B Dense — Best quality, largest model |

  The `AIBackend` enum selects the API backend:

@@ -81,6 +82,7 @@ export SQLSEED_AI_MODEL=<free-model-name>
  | `AIBackend.GOOGLE_AI_STUDIO` | Google AI Studio | `https://generativelanguage.googleapis.com/v1beta/openai/` |
  | `AIBackend.LM_STUDIO` | LM Studio | `http://localhost:1234/v1` |
  | `AIBackend.OLLAMA` | Ollama | `http://localhost:11434/v1` |
+ | `AIBackend.OPENAI_COMPAT` | OpenAI-compatible endpoint | (must set `SQLSEED_AI_BASE_URL`) |

  ### Template Pool

@@ -98,7 +100,7 @@ AI configs are cached in the platform-standard cache directory (macOS: `~/Library/Caches/sqlseed/ai
  |:-----|:-----|:-------|:-----|
  | `SQLSEED_AI_API_KEY` | `OPENAI_API_KEY` | — | API key (required) |
  | `SQLSEED_AI_BASE_URL` | `OPENAI_BASE_URL` | (set automatically by backend) | API endpoint |
- | `SQLSEED_AI_MODEL` | — | `gemma-4-26b-it` | Model name |
+ | `SQLSEED_AI_MODEL` | — | `gemma-4-26b-a4b-it` | Model name |
  | `SQLSEED_AI_TIMEOUT` | — | `60` | API timeout (seconds) |
  | `SQLSEED_AI_BACKEND` | — | `google_ai_studio` | AI backend: `google_ai_studio`, `lm_studio`, `ollama`, `openai_compat` |
  | `GOOGLE_API_KEY` | — | — | Google AI Studio API key (required when the backend is `google_ai_studio`) |
@@ -131,7 +133,6 @@ AI configs are cached in the platform-standard cache directory (macOS: `~/Library/Caches/sqlseed/ai
  - Python >= 3.10
  - `sqlseed >= 0.1.0`
  - `openai >= 1.0`
- - `google-generativeai >= 0.8`
  - An OpenAI-compatible API key or Google AI Studio API key

  ## Gemma 4 Integration
@@ -24,7 +24,6 @@ classifiers = [
  dependencies = [
      "sqlseed>=0.0.1",
      "openai>=1.0",
-     "google-generativeai>=0.8",
  ]

  [project.urls]
@@ -65,13 +65,5 @@ class AISqlseedPlugin:
          except (ValueError, RuntimeError, OSError):
              return None

-     @hookimpl
-     def sqlseed_register_providers(self, registry: Any) -> None:
-         _ = registry
-
-     @hookimpl
-     def sqlseed_register_column_mappers(self, mapper: Any) -> None:
-         _ = mapper
-

  plugin = AISqlseedPlugin()
@@ -0,0 +1,33 @@
+ from __future__ import annotations
+
+ from typing import Any
+
+ import httpx
+ from openai import OpenAI
+ from sqlseed_ai.config import AIBackend, AIConfig
+
+ from sqlseed._utils.logger import get_logger
+
+ logger = get_logger(__name__)
+
+
+ def get_openai_client(config: AIConfig | None = None) -> Any:
+     if config is None:
+         config = AIConfig.from_env()
+
+     kwargs = config.to_openai_kwargs()
+     # For local backends, use a shorter connection timeout but longer read timeout.
+     # This prevents hanging on connection while allowing slow inference.
+     if config.backend in (AIBackend.LM_STUDIO, AIBackend.OLLAMA):
+         kwargs["timeout"] = httpx_timeout(config.resolve_timeout())
+     logger.info("Creating OpenAI client", **{"backend": config.backend.value, "base_url": kwargs["base_url"]})
+     return OpenAI(**kwargs)
+
+
+ def httpx_timeout(total: float) -> Any:
+     """Build an httpx.Timeout with separate connect/read timeouts.
+
+     For local inference: fast connect (10s) but long read (total) to
+     accommodate slow GPU inference without hanging on dead connections.
+     """
+     return httpx.Timeout(connect=10.0, read=total, write=30.0, pool=10.0)
@@ -0,0 +1,80 @@
+ from __future__ import annotations
+
+ import json
+ import re
+ from typing import Any
+
+
+ def parse_json_response(content: str) -> dict[str, Any]:
+     """Parse JSON from LLM response using 3-strategy fallback."""
+     cleaned = content.strip()
+
+     return _try_direct_parse(cleaned) or _try_markdown_fence_parse(cleaned) or _try_raw_decode(cleaned) or {}
+
+
+ def _try_direct_parse(content: str) -> dict[str, Any] | None:
+     """Strategy 1: Direct parse (ideal case — model outputs raw JSON)."""
+     try:
+         result = json.loads(content)
+         if isinstance(result, dict):
+             _sanitize_names(result)
+             return result
+     except json.JSONDecodeError:
+         pass
+     return None
+
+
+ def _try_markdown_fence_parse(content: str) -> dict[str, Any] | None:
+     """Strategy 2: Strip markdown code fences (```json\\n{...}\\n```)."""
+     open_idx = content.find("```")
+     if open_idx < 0:
+         return None
+     after_open = content[open_idx + 3 :]
+     nl_pos = after_open.find("\n")
+     if nl_pos < 0:
+         return None
+     content_start = nl_pos + 1
+     close_idx = after_open.find("```", content_start)
+     if close_idx < 0:
+         return None
+     fence_content = after_open[content_start:close_idx].strip()
+     try:
+         result = json.loads(fence_content)
+         if isinstance(result, dict):
+             _sanitize_names(result)
+             return result
+     except json.JSONDecodeError:
+         pass
+     return None
+
+
+ def _try_raw_decode(content: str) -> dict[str, Any] | None:
+     """Strategy 3: Find first '{' and use json.JSONDecoder.raw_decode().
+
+     Handles explanatory text before/after JSON without code fences.
+     raw_decode() correctly handles braces inside JSON strings.
+     """
+     first_brace = content.find("{")
+     if first_brace < 0:
+         return None
+     try:
+         decoder = json.JSONDecoder()
+         result, _ = decoder.raw_decode(content, idx=first_brace)
+         if isinstance(result, dict):
+             _sanitize_names(result)
+             return result
+     except json.JSONDecodeError:
+         pass
+     return None
+
+
+ def _sanitize_names(data: dict[str, Any]) -> None:
+     name = data.get("name")
+     if isinstance(name, str):
+         data["name"] = re.sub(r"^[:.]+", "", name)
+
+     for col in data.get("columns", []):
+         if isinstance(col, dict):
+             col_name = col.get("name")
+             if isinstance(col_name, str):
+                 col["name"] = re.sub(r"^[:.]+", "", col_name)
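The three-strategy fallback in the file above (direct parse, fenced block, then `raw_decode` from the first brace) can be exercised in isolation. The following is a condensed, standalone sketch of the same idea, not the plugin's module: `extract_json_object`, `_as_dict`, and `FENCE` are names made up for this example, and the name-sanitizing step is left out.

```python
import json
from typing import Any, Dict, Optional

# Built indirectly so the example can sit inside a fenced block itself.
FENCE = "`" * 3


def _as_dict(value: Any) -> Optional[Dict[str, Any]]:
    """Accept only JSON objects; lists and scalars are rejected."""
    return value if isinstance(value, dict) else None


def extract_json_object(text: str) -> Dict[str, Any]:
    """Try direct parse, then a fenced block, then raw_decode from the first '{'."""
    cleaned = text.strip()

    # Strategy 1: the whole response is JSON.
    try:
        parsed = _as_dict(json.loads(cleaned))
        if parsed:
            return parsed
    except json.JSONDecodeError:
        pass

    # Strategy 2: JSON sits between an opening fence line and a closing fence.
    open_idx = cleaned.find(FENCE)
    if open_idx >= 0:
        body_start = cleaned.find("\n", open_idx)
        close_idx = cleaned.find(FENCE, body_start + 1) if body_start >= 0 else -1
        if close_idx >= 0:
            try:
                parsed = _as_dict(json.loads(cleaned[body_start + 1 : close_idx]))
                if parsed:
                    return parsed
            except json.JSONDecodeError:
                pass

    # Strategy 3: skip leading prose; raw_decode ignores trailing text.
    brace = cleaned.find("{")
    if brace >= 0:
        try:
            value, _end = json.JSONDecoder().raw_decode(cleaned, brace)
            parsed = _as_dict(value)
            if parsed:
                return parsed
        except json.JSONDecodeError:
            pass

    return {}
```

Because `raw_decode` is a real JSON parser rather than a brace counter, a `}` inside a string value does not end the object early, which is the property the original docstring calls out.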
@@ -0,0 +1,127 @@
+ from __future__ import annotations
+
+ import re
+
+ from sqlseed_ai.config import AIBackend, GemmaModel
+
+ from sqlseed._utils.logger import get_logger
+
+ logger = get_logger(__name__)
+
+
+ def _normalize_model_id(model_id: str) -> str:
+     """Normalize a model ID for comparison.
+
+     Strips platform-specific formatting so that model IDs from
+     different sources can be compared:
+
+         "google/gemma-4-e4b" → "gemma-4-e4b" (LM Studio)
+         "gemma-4-e4b-it" → "gemma-4-e4b" (Google AI Studio)
+         "gemma4:e4b" → "gemma-4-e4b" (Ollama)
+         "google/gemma-4-e4b-it" → "gemma-4-e4b" (OpenRouter)
+         "google/gemma-4-26b-a4b-it:free" → "gemma-4-26b-a4b" (OpenRouter free)
+     """
+     result = model_id.lower().strip()
+
+     # Strip OpenRouter free tier suffix (e.g., ":free")
+     result = re.sub(r":free$", "", result)
+
+     # Convert Ollama format: "gemma4:xxb" → "gemma-4-xxb"
+     # e.g., "gemma4:e4b" → "gemma-4-e4b", "gemma4:26b" → "gemma-4-26b"
+     ollama_match = re.match(r"^gemma4:(.+)$", result)
+     if ollama_match:
+         result = f"gemma-4-{ollama_match.group(1)}"
+
+     # Strip provider prefix (e.g., "google/" from LM Studio/OpenRouter IDs)
+     result = re.sub(r"^[a-z]+/", "", result)
+
+     # Strip "-it" suffix (instruction-tuned variant indicator)
+     return re.sub(r"-it$", "", result)
+
+
+ # ── Gemma 4 model selection priority ────────────────────────────────
+ # Ordered by capability: 26B A4B MoE (best balance) > 31B Dense > 12B Unified > E4B > E2B
+ _GEMMA_MODEL_PRIORITY: tuple[GemmaModel, ...] = (
+     GemmaModel.GEMMA_4_26B_A4B,
+     GemmaModel.GEMMA_4_31B,
+     GemmaModel.GEMMA_4_12B,
+     GemmaModel.GEMMA_4_E4B,
+     GemmaModel.GEMMA_4_E2B,
+ )
+
+ # Map backend to preferred model size
+ _BACKEND_DEFAULT_MODEL: dict[AIBackend, GemmaModel] = {
+     AIBackend.GOOGLE_AI_STUDIO: GemmaModel.GEMMA_4_26B_A4B,
+     AIBackend.LM_STUDIO: GemmaModel.GEMMA_4_E4B,  # local inference, prefer smaller
+     AIBackend.OLLAMA: GemmaModel.GEMMA_4_E4B,  # smaller for local inference
+     AIBackend.OPENAI_COMPAT: GemmaModel.GEMMA_4_26B_A4B,
+ }
+
+
+ def select_gemma_model(
+     backend: AIBackend = AIBackend.GOOGLE_AI_STUDIO,
+     prefer_small: bool = False,
+ ) -> str:
+     """Select the best Gemma 4 model for the given backend.
+
+     Returns the platform-specific model ID for the selected backend.
+
+     Args:
+         backend: The LLM backend provider.
+         prefer_small: If True, prefer smaller models (useful for Edge/local).
+
+     Returns:
+         The model identifier string in the backend's format.
+     """
+     if prefer_small or backend in (AIBackend.OLLAMA, AIBackend.LM_STUDIO):
+         # For local inference (Ollama/LM Studio), prefer smaller models
+         model = GemmaModel.GEMMA_4_E4B
+         logger.info("Selected compact Gemma 4 model for local inference", model=model.to_backend_id(backend))
+         return model.to_backend_id(backend)
+
+     model = _BACKEND_DEFAULT_MODEL.get(backend, GemmaModel.GEMMA_4_26B_A4B)
+     logger.info("Selected Gemma 4 model", model=model.to_backend_id(backend), backend=backend.value)
+     return model.to_backend_id(backend)
+
+
+ def select_next_gemma_model(failed_model: str, backend: AIBackend | None = None) -> str | None:
+     """Select the next smaller Gemma 4 model as fallback.
+
+     Skips models that are not available on the given backend
+     (e.g., 12B is local-only and not available on Google AI Studio/OpenRouter).
+
+     Args:
+         failed_model: The model that failed.
+         backend: The current backend (used to skip unavailable models).
+             If None, all models are considered available.
+
+     Returns:
+         The next model in the priority list (in backend-specific format), or None if all exhausted.
+     """
+     failed_norm = _normalize_model_id(failed_model)
+     for i, m in enumerate(_GEMMA_MODEL_PRIORITY):
+         if _normalize_model_id(m.value) == failed_norm:
+             # Walk down the priority list to find the next available model
+             for j in range(i + 1, len(_GEMMA_MODEL_PRIORITY)):
+                 next_model = _GEMMA_MODEL_PRIORITY[j]
+                 # Skip local-only models for cloud backends
+                 if next_model.is_local_only and backend not in (
+                     AIBackend.LM_STUDIO,
+                     AIBackend.OLLAMA,
+                     None,  # None means "don't filter"
+                 ):
+                     continue
+                 logger.info(
+                     "Falling back to smaller Gemma 4 model",
+                     from_model=failed_model,
+                     to_model=next_model.to_backend_id(backend) if backend else next_model.value,
+                 )
+                 return next_model.to_backend_id(backend) if backend else next_model.value
+
+     logger.warning("No more Gemma 4 models available for fallback", failed_model=failed_model)
+     return None
+
+
+ def get_available_gemma_models() -> list[dict[str, str]]:
+     """Return list of available Gemma 4 models with display info."""
+     return [{"id": m.value, "display_name": m.display_name} for m in _GEMMA_MODEL_PRIORITY]
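The normalization rules and the fallback walk in the file above can be checked against the docstring's own examples. The sketch below restates them without the plugin's imports. `PRIORITY`, `normalize_model_id`, and `next_model` are hypothetical names for this example; the model IDs are copied from the README table, and the `is_local_only` filtering and backend-specific ID formatting are deliberately left out.

```python
import re
from typing import Optional

# Priority order as listed in the module (26B A4B first), using README model IDs.
PRIORITY = (
    "gemma-4-26b-a4b-it",
    "gemma-4-31b-it",
    "gemma-4-12b-it",
    "gemma-4-e4b-it",
    "gemma-4-e2b-it",
)


def normalize_model_id(model_id: str) -> str:
    """Reduce platform-specific spellings to one comparable form."""
    result = model_id.lower().strip()
    result = re.sub(r":free$", "", result)        # OpenRouter free-tier suffix
    ollama = re.match(r"^gemma4:(.+)$", result)   # Ollama tag form, e.g. gemma4:e4b
    if ollama:
        result = f"gemma-4-{ollama.group(1)}"
    result = re.sub(r"^[a-z]+/", "", result)      # provider prefix, e.g. google/
    return re.sub(r"-it$", "", result)            # instruction-tuned suffix


def next_model(failed_model: str) -> Optional[str]:
    """Return the entry after the failed model in PRIORITY, or None."""
    failed_norm = normalize_model_id(failed_model)
    for i, candidate in enumerate(PRIORITY):
        if normalize_model_id(candidate) == failed_norm:
            return PRIORITY[i + 1] if i + 1 < len(PRIORITY) else None
    return None  # unknown model: nothing to fall back to
```

Note that with this ordering a failure of the 26B A4B model falls back to the 31B model, so "next in priority" is not always "next smaller".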