@vglu/tele-gpu-pilot-mcp 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Vitaliy Glushchenko

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,162 @@
# @vglu/tele-gpu-pilot-mcp

MCP server exposing **self-hosted [tele-gpu-pilot](https://tele-gpu-pilot.sims-service.com)** pod inference (chat / image / status) to Claude Code, Cursor, and any MCP-compatible client via stdio.

Single-tenant: you need an access token to someone's pod. If you don't have one, this package is of no use to you. This is **not a public inference SaaS**; it is a transport mechanism to Vitaliy's pod.

## Tools

| Tool | What it does | When to call it |
|---|---|---|
| `pod_image_generate` | POST `/v1/images/generations` → Flux schnell. Optional `save_path` writes a PNG to disk; without it the image comes back as inline base64 content (for Zed/Desktop/web, where inline rendering works). | "draw / generate an image" |
| `pod_chat` | POST `/v1/chat/completions` → text. Models: gemma-4-26b / gemma-4-e4b / qwen-2.5-32b-coder / qwen3-4b | Tech-writer drafts, rephrasing, classification — anywhere a cloud model is overkill |
| `pod_status` | GET `/v1/models` — is the pod alive? which models are in the token's whitelist | Debugging / discovery |

## Install + setup

### 1. Get a token

The pod owner creates it on the pod via the LiteLLM admin API:

```bash
ssh <pod-admin>@vps "MASTER=\$(grep LITELLM_MASTER_KEY pod-runtime/.env | cut -d= -f2)
curl -X POST http://127.0.0.1:8000/key/generate \
-H \"Authorization: Bearer \$MASTER\" \
-H 'Content-Type: application/json' \
-d '{\"key_alias\":\"<your-name>\",\"models\":[\"gemma-4-26b\",\"flux-schnell\"],\"rpm_limit\":60,\"max_budget\":5}'"
```

The token arrives in the `key` field of the response (`sk-...`).
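To sanity-check a freshly issued token, you can call the same `/v1/models` endpoint the `pod_status` tool uses. A minimal Python sketch (not part of the package; the `models_request` helper name is hypothetical, the base URL is the documented default):

```python
import json
import urllib.request

BASE_URL = "https://tele-gpu-pilot.sims-service.com/v1"

def models_request(token: str, base_url: str = BASE_URL) -> urllib.request.Request:
    # GET /v1/models, authenticated with the virtual key as a Bearer token.
    return urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {token}"},
    )

if __name__ == "__main__":
    req = models_request("sk-yourtoken")
    with urllib.request.urlopen(req, timeout=10) as res:
        # LiteLLM returns an OpenAI-style {"data": [{"id": ...}, ...]} list.
        print([m["id"] for m in json.load(res)["data"]])
```

If the token is valid, the printed list should match the `models` whitelist you passed to `/key/generate`.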

### 2. Store the token somewhere

The server looks for `TELE_GPU_PILOT_TOKEN` in this priority order:

1. **`process.env`** — set in the shell or in the `env: {}` block of `.mcp.json`
2. **`.env` in the repo** — local dev (cwd-based + relative to dist/)
3. **`~/.tele-gpu-pilot.env`** — global home-dir fallback (recommended for npx installs)
4. **`~/.config/tele-gpu-pilot/.env`** — XDG-style alternative

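The precedence boils down to "first file found wins, and keys already present in the environment are never overwritten". A Python sketch of that rule (illustration only; the real loader lives in `dist/client.js` and additionally strips quotes and validates key names):

```python
import os

def load_first_env(candidates, environ=os.environ):
    # Walk candidate paths in priority order; the first existing file wins.
    # Keys already set in the environment are kept as-is, so env vars from
    # .mcp.json's `env: {...}` block always take priority over file values.
    for path in candidates:
        if not os.path.exists(path):
            continue
        with open(path) as f:
            for line in f:
                key, sep, value = line.strip().partition("=")
                if sep and key and key not in environ:
                    environ[key] = value
        return path  # stop at the first file found
    return None
```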
The simplest option is the global config (works from any project):

```bash
# Linux/Mac
cat > ~/.tele-gpu-pilot.env << EOF
TELE_GPU_PILOT_TOKEN=sk-yourtoken
TELE_GPU_PILOT_URL=https://tele-gpu-pilot.sims-service.com/v1
EOF
chmod 600 ~/.tele-gpu-pilot.env

# Windows PowerShell
@'
TELE_GPU_PILOT_TOKEN=sk-yourtoken
TELE_GPU_PILOT_URL=https://tele-gpu-pilot.sims-service.com/v1
'@ | Set-Content -Path $HOME\.tele-gpu-pilot.env
```

### 3. Register in Claude Code

Add `.mcp.json` to the repo root of any project where you want to use the pod:

```json
{
  "mcpServers": {
    "tele-gpu-pilot": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@vglu/tele-gpu-pilot-mcp"]
    }
  }
}
```

`npx -y` downloads and caches the package on first run (~10s); subsequent starts are fast.

After restarting Claude Code:
- `/mcp` → should show `tele-gpu-pilot · connected · 3 tools`
- In chat: "generate an image: ... and save it as cat.png"

### 4. Register in Cursor

Cursor → Settings → MCP → Add server:

```
Type: stdio
Command: npx
Args: -y @vglu/tele-gpu-pilot-mcp
```

Alternatively, Cursor will pick up `.mcp.json` if project MCP discovery is enabled in Settings.

## Tool examples

### `pod_image_generate`

```json
{
  "prompt": "a tabby cat coding on a laptop, warm sunset",
  "size": "1024x1024",
  "n": 1,
  "model": "flux-schnell",
  "save_path": "out.png"
}
```

- With `save_path` → the PNG is written to disk and the tool returns the absolute path in `structuredContent.saved_paths`. Use this in terminal Claude Code (the host does not render inline images).
- Without `save_path` → inline base64 content (for Zed Agent Panel / Claude Desktop / claude.ai).
- `n` > 1 with `save_path` → suffixes `_0`, `_1`, … are inserted before the extension.

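The suffixing rule can be sketched in a few lines. This is a Python mirror of the logic in `dist/tools/image.js` (illustration only, not part of the package):

```python
import os

def build_save_path(save_path: str, index: int, n: int) -> str:
    # n == 1: use the resolved path as-is.
    # n > 1: insert _<index> before the extension ("out.png" -> "out_1.png"),
    # or append "_<index>.png" when the path has no extension.
    abs_path = os.path.abspath(save_path)
    if n == 1:
        return abs_path
    root, ext = os.path.splitext(abs_path)
    if ext:
        return f"{root}_{index}{ext}"
    return f"{abs_path}_{index}.png"
```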
### `pod_chat`

```json
{
  "prompt": "Rephrase more concisely: ...",
  "system": "You are a technical writer. Output only the rephrased text.",
  "model": "gemma-4-26b",
  "max_tokens": 500,
  "temperature": 0.3
}
```

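Under the hood the tool turns this into a standard OpenAI-style `/v1/chat/completions` request. A Python sketch of the equivalent direct call (the `chat_request` helper is hypothetical; the payload shape mirrors `dist/tools/chat.js`):

```python
import json
import urllib.request

def chat_request(base_url, token, prompt, system=None,
                 model="gemma-4-26b", max_tokens=2048, temperature=0.7):
    # Same message assembly as pod_chat: optional system message first,
    # then the user prompt.
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    body = json.dumps({"model": model, "messages": messages,
                       "max_tokens": max_tokens, "temperature": temperature})
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body.encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = chat_request("https://tele-gpu-pilot.sims-service.com/v1",
                       "sk-yourtoken", "Rephrase more concisely: ...")
    with urllib.request.urlopen(req, timeout=120) as res:
        print(json.load(res)["choices"][0]["message"]["content"])
```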
### `pod_status`

`{}` → alive flag + the list of models in the current token's whitelist.

## Error semantics

The tool returns `isError: true` plus a text message (it does not throw) on:

- `auth_error` 401/403 — token is invalid / model is not in the whitelist
- `rate_limit` 429 — RPM or daily budget exceeded
- `pod_offline` 5xx / network — pod is sleeping; the owner wakes it with `/wake` in the Telegram bot
- `client_error` 4xx — invalid params

This gives the MCP host a chance to cascade to a cloud model.
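The taxonomy maps one-to-one onto HTTP status ranges (status 0 meaning the fetch itself failed). A host-side classifier sketch, mirroring the mapping in `dist/client.js`:

```python
def classify_status(status: int) -> str:
    # Same precedence as the MCP server's error handling:
    # network failure first, then auth, rate limit, server, client errors.
    if status == 0:
        return "network_error"   # fetch threw: pod offline or no network
    if status in (401, 403):
        return "auth_error"      # bad token or model not whitelisted
    if status == 429:
        return "rate_limit"      # RPM or daily budget exceeded
    if status >= 500:
        return "pod_offline"     # pod or upstream service down
    if status >= 400:
        return "client_error"    # invalid params
    return "ok"
```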

## Timeouts

- Image generation: 240s (cold start — gemma + flux load takes up to 60s, then ~6s/image)
- Chat: 120s
- Status: 10s

## Logs

Written to **stderr** (stdout is reserved for MCP JSON-RPC frames). To debug:

```bash
npx -y @vglu/tele-gpu-pilot-mcp 2>/tmp/mcp.log
# tail /tmp/mcp.log in a second terminal
```

## What's inside

- `@modelcontextprotocol/sdk` — the official TS SDK
- `zod` — input/output schemas
- 3 tools (`image`, `chat`, `status`) + 1 client helper + entry point
- ~530 LoC, no runtime deps besides the SDK and zod

## Source

[github.com/vglu/tele-gpu-pilot/tree/master/tools/mcp](https://github.com/vglu/tele-gpu-pilot/tree/master/tools/mcp). The MCP server is part of the `tele-gpu-pilot` project (Telegram bot + RunPod scheduler + pod-runtime stack); the package itself is distributed separately.

License — MIT.
package/dist/client.js ADDED
@@ -0,0 +1,154 @@
/**
 * tele-gpu-pilot pod client — thin fetch wrapper with unified auth/error handling.
 *
 * URL:   TELE_GPU_PILOT_URL env (default https://tele-gpu-pilot.sims-service.com/v1)
 * Token: TELE_GPU_PILOT_TOKEN env (virtual key, claude-code)
 *
 * MCP-server-side errors are mapped to human-readable messages for the model
 * instead of being thrown. The model sees the cause and decides what to do
 * (retry, cascade, tell the user).
 */
import { readFileSync, existsSync } from "node:fs";
import { homedir } from "node:os";
import { dirname, resolve } from "node:path";
import { fileURLToPath } from "node:url";
const DEFAULT_URL = "https://tele-gpu-pilot.sims-service.com/v1";
/**
 * Look for .env in priority order:
 * 1. Repo-local: relative to dist/index.js (in-repo build, dev workflow)
 * 2. cwd + parents (when the MCP is started from a project with a local .env)
 * 3. Home-dir fallback (for npx-installed usage without per-project setup):
 *    - $HOME/.tele-gpu-pilot.env
 *    - $HOME/.config/tele-gpu-pilot/.env
 *    - $XDG_CONFIG_HOME/tele-gpu-pilot/.env
 *
 * When a file is found, process.env is populated only with fields that are
 * not already set (env vars from the `env: {...}` block in .mcp.json take
 * priority).
 */
function loadDotenv() {
    const here = dirname(fileURLToPath(import.meta.url));
    const home = homedir();
    const xdg = process.env["XDG_CONFIG_HOME"];
    const candidates = [
        // In-repo dev workflow
        resolve(here, "../../../.env"), // tools/mcp/dist → repo root
        resolve(here, "../../.env"), // tools/mcp/dist → tools/mcp/.env
        // cwd-based (project-local .env)
        resolve(process.cwd(), ".env"),
        resolve(process.cwd(), "../.env"),
        resolve(process.cwd(), "../../.env"),
        // Home-dir fallback (npx-installed / global usage)
        resolve(home, ".tele-gpu-pilot.env"),
        resolve(home, ".config/tele-gpu-pilot/.env"),
        ...(xdg ? [resolve(xdg, "tele-gpu-pilot/.env")] : []),
    ];
    for (const path of candidates) {
        if (!existsSync(path))
            continue;
        try {
            const content = readFileSync(path, "utf-8");
            for (const line of content.split(/\r?\n/)) {
                const m = /^([A-Z_][A-Z0-9_]*)=(.*)$/.exec(line);
                if (!m)
                    continue;
                const [, key, rawValue] = m;
                if (key === undefined || rawValue === undefined)
                    continue;
                if (process.env[key] !== undefined)
                    continue;
                const value = rawValue.replace(/^"(.*)"$/, "$1").replace(/^'(.*)'$/, "$1");
                process.env[key] = value;
            }
            process.stderr.write(`tele-gpu-pilot-mcp: loaded .env from ${path}\n`);
            return;
        }
        catch {
            // continue to next candidate
        }
    }
}
export function loadConfig() {
    loadDotenv();
    const url = (process.env["TELE_GPU_PILOT_URL"] ?? DEFAULT_URL).replace(/\/$/, "");
    const token = process.env["TELE_GPU_PILOT_TOKEN"] ?? "";
    if (token === "") {
        throw new Error("TELE_GPU_PILOT_TOKEN env not set, and no .env file containing it was found. See tools/mcp/README.md for setup.");
    }
    return { url, token };
}
export async function podFetch(cfg, path, init = {}) {
    const { timeoutMs = 120_000, ...rest } = init;
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    let res;
    try {
        res = await fetch(`${cfg.url}${path}`, {
            ...rest,
            signal: controller.signal,
            headers: {
                Authorization: `Bearer ${cfg.token}`,
                "Content-Type": "application/json",
                ...(rest.headers ?? {}),
            },
        });
    }
    catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return {
            ok: false,
            status: 0,
            kind: "network_error",
            message: `Pod unreachable: ${message}. The pod may be offline (needs /wake) or there is no network connectivity.`,
        };
    }
    finally {
        // Single cleanup point: `finally` runs on both paths.
        clearTimeout(timer);
    }
    if (res.status === 401 || res.status === 403) {
        return {
            ok: false,
            status: res.status,
            kind: "auth_error",
            message: `Pod returned ${res.status}. TELE_GPU_PILOT_TOKEN is invalid or the model is not in this key's whitelist.`,
        };
    }
    if (res.status === 429) {
        return {
            ok: false,
            status: 429,
            kind: "rate_limit",
            message: "Pod rate limit exceeded (RPM or daily budget). Wait, or use a different client token.",
        };
    }
    if (res.status >= 500) {
        const body = await res.text().catch(() => "");
        return {
            ok: false,
            status: res.status,
            kind: "pod_offline",
            message: `Pod returned ${res.status}. Pod offline or upstream service down. Body: ${body.slice(0, 200)}`,
        };
    }
    if (!res.ok) {
        const body = await res.text().catch(() => "");
        return {
            ok: false,
            status: res.status,
            kind: "client_error",
            message: `Pod returned ${res.status}: ${body.slice(0, 200)}`,
        };
    }
    try {
        const data = (await res.json());
        return { ok: true, data };
    }
    catch (err) {
        return {
            ok: false,
            status: res.status,
            kind: "invalid_json",
            message: `Pod returned 200 but the body is not JSON: ${err instanceof Error ? err.message : err}`,
        };
    }
}
package/dist/index.js ADDED
@@ -0,0 +1,64 @@
#!/usr/bin/env node
/**
 * tele-gpu-pilot MCP server — exposes self-hosted pod inference (chat /
 * image / status) to Claude Code & Cursor via stdio.
 *
 * The stable URL is defined by ADR-009 (`tele-gpu-pilot.sims-service.com`).
 *
 * Tools:
 *   pod_chat           — text completion on gemma/qwen
 *   pod_image_generate — Flux schnell image generation
 *   pod_status         — check pod alive + which models are available
 *
 * Config (env vars):
 *   TELE_GPU_PILOT_TOKEN — virtual key (claude-code recommended)
 *   TELE_GPU_PILOT_URL   — stable URL (default
 *                          https://tele-gpu-pilot.sims-service.com/v1)
 *
 * Transport: stdio (Claude Code default). Future: SSE / HTTP if a shared
 * remote MCP is ever needed.
 */
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { loadConfig } from "./client.js";
import { imageInputSchema, imageOutputSchema, makeImageHandler, } from "./tools/image.js";
import { chatInputSchema, chatOutputSchema, makeChatHandler, } from "./tools/chat.js";
import { statusInputSchema, statusOutputSchema, makeStatusHandler, } from "./tools/status.js";
async function main() {
    const cfg = loadConfig();
    // Logs go to stderr (stdout is reserved for MCP JSON-RPC frames).
    process.stderr.write(`tele-gpu-pilot-mcp: URL=${cfg.url} token=${cfg.token.slice(0, 8)}***\n`);
    const server = new McpServer({
        name: "tele-gpu-pilot",
        version: "0.1.0",
    });
    server.registerTool("pod_image_generate", {
        title: "Generate image via tele-gpu-pilot pod",
        description: "Generates image(s) using self-hosted Flux schnell on tele-gpu-pilot RunPod A40 GPU. ~6s per 1024x1024. Use when user asks to draw, generate, render, or visualize something.\n\n" +
            "IMPORTANT: when working in terminal Claude Code (the host does not render inline images) — ALWAYS pass `save_path` so the PNG is written to disk. Without save_path the image content goes to the host for inline rendering and you lose access to it (it cannot be saved retroactively).\n\n" +
            "With `save_path`: the PNG is written to disk and the tool returns the absolute path — it can be opened/shown via other tools.\n" +
            "Without `save_path`: base64 inline — works only in Zed Agent Panel / Claude Desktop / claude.ai, where image content is rendered.",
        inputSchema: imageInputSchema,
        outputSchema: imageOutputSchema,
    }, makeImageHandler(cfg));
    server.registerTool("pod_chat", {
        title: "Chat completion via tele-gpu-pilot pod (cheap-tier LLM)",
        description: "Sends a chat completion request to self-hosted Gemma-4 (general) / Qwen-2.5-Coder (code) / Qwen3-4b (ultra-cheap classification) on tele-gpu-pilot pod. Use for tech-writer drafts, rephrase, classification, bulk doc edits — basically anything where cloud Sonnet is overkill. Marginal cost is $0 while pod is in session window.",
        inputSchema: chatInputSchema,
        outputSchema: chatOutputSchema,
    }, makeChatHandler(cfg));
    server.registerTool("pod_status", {
        title: "Check tele-gpu-pilot pod status",
        description: "Pings pod via stable URL and lists models available to current token. Use to debug 'pod offline' situations or discover which models are in the current virtual key's whitelist.",
        inputSchema: statusInputSchema,
        outputSchema: statusOutputSchema,
    }, makeStatusHandler(cfg));
    const transport = new StdioServerTransport();
    await server.connect(transport);
    process.stderr.write("tele-gpu-pilot-mcp: ready (stdio)\n");
}
main().catch((err) => {
    const message = err instanceof Error ? err.stack ?? err.message : String(err);
    process.stderr.write(`tele-gpu-pilot-mcp fatal: ${message}\n`);
    process.exit(1);
});
package/dist/tools/chat.js ADDED
@@ -0,0 +1,75 @@
import { z } from "zod";
import { podFetch } from "../client.js";
export const chatInputSchema = {
    prompt: z.string().min(1).describe("User prompt / task for the self-hosted model"),
    system: z
        .string()
        .optional()
        .describe("Optional system prompt (role / style / constraints)"),
    model: z
        .enum(["gemma-4-26b", "gemma-4-e4b", "qwen-2.5-32b-coder", "qwen3-4b"])
        .default("gemma-4-26b")
        .describe("Model alias. gemma-4-26b = general STANDARD-tier; gemma-4-e4b = CHEAP fast classification; qwen-2.5-32b-coder = code-heavy; qwen3-4b = ultra-cheap classification"),
    max_tokens: z.number().int().min(1).max(8192).default(2048),
    temperature: z.number().min(0).max(2).default(0.7),
};
export const chatOutputSchema = {
    reply: z.string(),
    model_used: z.string(),
    finish_reason: z.string(),
    tokens_in: z.number(),
    tokens_out: z.number(),
};
export function makeChatHandler(cfg) {
    return async (input) => {
        const messages = [];
        if (input.system)
            messages.push({ role: "system", content: input.system });
        messages.push({ role: "user", content: input.prompt });
        const result = await podFetch(cfg, "/chat/completions", {
            method: "POST",
            timeoutMs: 120_000,
            body: JSON.stringify({
                model: input.model,
                messages,
                max_tokens: input.max_tokens,
                temperature: input.temperature,
            }),
        });
        if (!result.ok) {
            return {
                content: [
                    {
                        type: "text",
                        text: `❌ Pod chat failed: ${result.kind} — ${result.message}`,
                    },
                ],
                isError: true,
            };
        }
        const data = result.data;
        const choice = data.choices[0];
        if (!choice) {
            return {
                content: [
                    {
                        type: "text",
                        text: "❌ Pod returned 0 choices",
                    },
                ],
                isError: true,
            };
        }
        const reply = choice.message.content;
        return {
            content: [{ type: "text", text: reply }],
            structuredContent: {
                reply,
                model_used: data.model,
                finish_reason: choice.finish_reason,
                tokens_in: data.usage.prompt_tokens,
                tokens_out: data.usage.completion_tokens,
            },
        };
    };
}
package/dist/tools/image.js ADDED
@@ -0,0 +1,125 @@
import { writeFileSync, mkdirSync } from "node:fs";
import { dirname, isAbsolute, resolve } from "node:path";
import { z } from "zod";
import { podFetch } from "../client.js";
export const imageInputSchema = {
    prompt: z.string().min(1).describe("Image generation prompt (English or Russian — Flux schnell understands both)"),
    size: z
        .string()
        .regex(/^\d+x\d+$/, "format WxH, e.g. 1024x1024")
        .default("1024x1024")
        .describe("Image dimensions, 64-4096 per axis, multiple of 8 recommended"),
    n: z.number().int().min(1).max(4).default(1).describe("Number of images to generate (1-4)"),
    model: z
        .enum(["flux-schnell"])
        .default("flux-schnell")
        .describe("Model alias. Currently only flux-schnell is available; flux-dev will appear after pull-flux.sh --dev"),
    save_path: z
        .string()
        .optional()
        .describe("Optional: path to save the PNG to disk. Relative paths resolve from process.cwd() (= repo root when the MCP is started via .mcp.json). For n>1 — suffixes _0/_1/... are inserted before the extension. Use when the host (terminal Claude Code) does not render inline images. If omitted — content is returned inline (Zed/Desktop rendering)."),
};
export const imageOutputSchema = {
    images_count: z.number(),
    width: z.number(),
    height: z.number(),
    size_bytes_total: z.number(),
    model: z.string(),
    saved_paths: z.array(z.string()),
};
/**
 * Build saved path for N-th image. If `save_path` ends with .png — use as-is for index=0,
 * insert `_<i>` before extension for i>0. If no extension — append `_<i>.png`.
 */
function buildSavePath(savePath, index, n) {
    const abs = isAbsolute(savePath) ? savePath : resolve(process.cwd(), savePath);
    if (n === 1)
        return abs;
    const m = /^(.+)(\.[^.\\/]+)$/.exec(abs);
    if (m)
        return `${m[1]}_${index}${m[2]}`;
    return `${abs}_${index}.png`;
}
export function makeImageHandler(cfg) {
    return async (input) => {
        const result = await podFetch(cfg, "/images/generations", {
            method: "POST",
            timeoutMs: 240_000, // image-gen cold start can take up to 60s + 4 images × 6s
            body: JSON.stringify({
                model: input.model,
                prompt: input.prompt,
                size: input.size,
                n: input.n,
                response_format: "b64_json",
            }),
        });
        if (!result.ok) {
            return {
                content: [
                    {
                        type: "text",
                        text: `❌ Image generation failed: ${result.kind} — ${result.message}`,
                    },
                ],
                isError: true,
            };
        }
        const data = result.data;
        const [wStr, hStr] = input.size.split("x");
        const width = Number(wStr);
        const height = Number(hStr);
        let totalBytes = 0;
        const savedPaths = [];
        const content = [];
        for (let i = 0; i < data.data.length; i += 1) {
            const item = data.data[i];
            if (!item?.b64_json)
                continue;
            const buf = Buffer.from(item.b64_json, "base64");
            totalBytes += buf.length;
            if (input.save_path) {
                const outPath = buildSavePath(input.save_path, i, data.data.length);
                try {
                    mkdirSync(dirname(outPath), { recursive: true });
                    writeFileSync(outPath, buf);
                    savedPaths.push(outPath);
                }
                catch (err) {
                    return {
                        content: [
                            {
                                type: "text",
                                text: `❌ Save to disk failed: ${err instanceof Error ? err.message : err} (path=${outPath})`,
                            },
                        ],
                        isError: true,
                    };
                }
            }
            else {
                // Inline mode: return base64 as image content for the host to
                // render (Zed Agent Panel / Claude Desktop / claude.ai).
                content.push({
                    type: "image",
                    data: item.b64_json,
                    mimeType: "image/png",
                });
            }
        }
        const summary = input.save_path
            ? `Saved ${savedPaths.length} file(s) ${width}×${height} (${Math.round(totalBytes / 1024)} KB total) to disk:\n${savedPaths.map((p) => `  ${p}`).join("\n")}\n\nPrompt: ${input.prompt}`
            : `Generated ${data.data.length} image(s) ${width}×${height} via ${input.model} on tele-gpu-pilot pod. Total ~${Math.round(totalBytes / 1024)} KB.`;
        content.push({ type: "text", text: summary });
        return {
            content,
            structuredContent: {
                images_count: data.data.length,
                width,
                height,
                size_bytes_total: totalBytes,
                model: input.model,
                saved_paths: savedPaths,
            },
        };
    };
}
package/dist/tools/status.js ADDED
@@ -0,0 +1,50 @@
import { z } from "zod";
import { podFetch } from "../client.js";
export const statusInputSchema = {};
export const statusOutputSchema = {
    alive: z.boolean(),
    url: z.string(),
    models_available: z.array(z.string()),
    models_count: z.number(),
    error: z.string().optional(),
};
export function makeStatusHandler(cfg) {
    return async () => {
        const result = await podFetch(cfg, "/models", {
            method: "GET",
            timeoutMs: 10_000,
        });
        if (!result.ok) {
            return {
                content: [
                    {
                        type: "text",
                        text: `❌ Pod status: ${result.kind} — ${result.message}\n\nURL: ${cfg.url}`,
                    },
                ],
                structuredContent: {
                    alive: false,
                    url: cfg.url,
                    models_available: [],
                    models_count: 0,
                    error: result.message,
                },
            };
        }
        const models = result.data.data.map((m) => m.id);
        return {
            content: [
                {
                    type: "text",
                    text: `✅ Pod alive at ${cfg.url}\nAvailable models (per claude-code token whitelist):\n${models.map((m) => `  - ${m}`).join("\n")}`,
                },
            ],
            structuredContent: {
                alive: true,
                url: cfg.url,
                models_available: models,
                models_count: models.length,
            },
        };
    };
}
package/package.json ADDED
@@ -0,0 +1,54 @@
{
  "name": "@vglu/tele-gpu-pilot-mcp",
  "version": "0.1.0",
  "description": "MCP server exposing tele-gpu-pilot pod inference (chat / image / status) to Claude Code, Cursor, and any MCP-compatible client via stdio. Self-hosted Flux schnell + Gemma 4 + Qwen 2.5-Coder on RunPod A40.",
  "type": "module",
  "engines": {
    "node": ">=22"
  },
  "main": "dist/index.js",
  "bin": {
    "tele-gpu-pilot-mcp": "dist/index.js"
  },
  "files": [
    "dist/",
    "README.md",
    "LICENSE"
  ],
  "scripts": {
    "build": "tsc",
    "start": "node dist/index.js",
    "dev": "tsx watch src/index.ts",
    "typecheck": "tsc --noEmit",
    "prepublishOnly": "npm run build"
  },
  "keywords": [
    "mcp",
    "model-context-protocol",
    "claude",
    "claude-code",
    "cursor",
    "flux",
    "image-generation",
    "self-hosted",
    "runpod",
    "ollama",
    "gemma",
    "qwen"
  ],
  "author": "Vitaliy Glushchenko",
  "license": "MIT",
  "homepage": "https://tele-gpu-pilot.sims-service.com",
  "publishConfig": {
    "access": "public"
  },
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.0.4",
    "zod": "^3.24.0"
  },
  "devDependencies": {
    "@types/node": "^22.10.0",
    "tsx": "^4.19.0",
    "typescript": "^5.7.0"
  }
}