@vglu/tele-gpu-pilot-mcp 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Vitaliy Glushchenko

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,162 @@
# @vglu/tele-gpu-pilot-mcp

MCP server exposing **self-hosted [tele-gpu-pilot](https://tele-gpu-pilot.sims-service.com)** pod inference (chat / image / status) to Claude Code, Cursor, and any MCP-compatible client via stdio.

Single-tenant: you need an access token to someone's pod. If you don't have one, this package is of no use to you. This is **not a public inference SaaS**; it is a transport mechanism to Vitaliy's pod.

## Tools

| Tool | What it does | When to call it |
|---|---|---|
| `pod_image_generate` | POST `/v1/images/generations` → Flux schnell. Optional `save_path` writes a PNG to disk; without it the image comes back as inline base64 content (for Zed/Desktop/web, where inline rendering works). | "draw / generate an image" |
| `pod_chat` | POST `/v1/chat/completions` → text. Models: gemma-4-26b / gemma-4-e4b / qwen-2.5-32b-coder / qwen3-4b | Tech-writer drafts, rephrasing, classification — anywhere a cloud model is overkill |
| `pod_status` | GET `/v1/models` — is the pod alive? which models are in the token's whitelist | Debugging / discovery |

## Install + setup

### 1. Get a token

The pod owner creates it on the pod via the LiteLLM admin API:

```bash
ssh <pod-admin>@vps "MASTER=\$(grep LITELLM_MASTER_KEY pod-runtime/.env | cut -d= -f2)
curl -X POST http://127.0.0.1:8000/key/generate \
-H \"Authorization: Bearer \$MASTER\" \
-H 'Content-Type: application/json' \
-d '{\"key_alias\":\"<your-name>\",\"models\":[\"gemma-4-26b\",\"flux-schnell\"],\"rpm_limit\":60,\"max_budget\":5}'"
```

The token arrives in the `key` field of the response (`sk-...`).
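To sanity-check a freshly issued token, you can call the same `/v1/models` endpoint the `pod_status` tool uses. A minimal Python sketch (not part of the package; the `models_request` helper name is hypothetical, the base URL is the documented default):

```python
import json
import urllib.request

BASE_URL = "https://tele-gpu-pilot.sims-service.com/v1"

def models_request(token: str, base_url: str = BASE_URL) -> urllib.request.Request:
    # GET /v1/models, authenticated with the virtual key as a Bearer token.
    return urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {token}"},
    )

if __name__ == "__main__":
    req = models_request("sk-yourtoken")
    with urllib.request.urlopen(req, timeout=10) as res:
        # LiteLLM returns an OpenAI-style {"data": [{"id": ...}, ...]} list.
        print([m["id"] for m in json.load(res)["data"]])
```

If the token is valid, the printed list should match the `models` whitelist you passed to `/key/generate`.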

### 2. Store the token somewhere

The server looks for `TELE_GPU_PILOT_TOKEN` in this priority order:

1. **`process.env`** — set in the shell or in the `env: {}` block of `.mcp.json`
2. **`.env` in the repo** — local dev (cwd-based + relative to dist/)
3. **`~/.tele-gpu-pilot.env`** — global home-dir fallback (recommended for npx installs)
4. **`~/.config/tele-gpu-pilot/.env`** — XDG-style alternative

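The precedence boils down to "first file found wins, and keys already present in the environment are never overwritten". A Python sketch of that rule (illustration only; the real loader lives in `dist/client.js` and additionally strips quotes and validates key names):

```python
import os

def load_first_env(candidates, environ=os.environ):
    # Walk candidate paths in priority order; the first existing file wins.
    # Keys already set in the environment are kept as-is, so env vars from
    # .mcp.json's `env: {...}` block always take priority over file values.
    for path in candidates:
        if not os.path.exists(path):
            continue
        with open(path) as f:
            for line in f:
                key, sep, value = line.strip().partition("=")
                if sep and key and key not in environ:
                    environ[key] = value
        return path  # stop at the first file found
    return None
```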
The simplest option is the global config (works from any project):

```bash
# Linux/Mac
cat > ~/.tele-gpu-pilot.env << EOF
TELE_GPU_PILOT_TOKEN=sk-yourtoken
TELE_GPU_PILOT_URL=https://tele-gpu-pilot.sims-service.com/v1
EOF
chmod 600 ~/.tele-gpu-pilot.env

# Windows PowerShell
@'
TELE_GPU_PILOT_TOKEN=sk-yourtoken
TELE_GPU_PILOT_URL=https://tele-gpu-pilot.sims-service.com/v1
'@ | Set-Content -Path $HOME\.tele-gpu-pilot.env
```

### 3. Register in Claude Code

Add `.mcp.json` to the repo root of any project where you want to use the pod:

```json
{
  "mcpServers": {
    "tele-gpu-pilot": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@vglu/tele-gpu-pilot-mcp"]
    }
  }
}
```

`npx -y` downloads and caches the package on first run (~10s); subsequent starts are fast.

After restarting Claude Code:
- `/mcp` → should show `tele-gpu-pilot · connected · 3 tools`
- In chat: "generate an image: ... and save it as cat.png"

### 4. Register in Cursor

Cursor → Settings → MCP → Add server:

```
Type: stdio
Command: npx
Args: -y @vglu/tele-gpu-pilot-mcp
```

Alternatively, Cursor will pick up `.mcp.json` if project MCP discovery is enabled in Settings.

## Tool examples

### `pod_image_generate`

```json
{
  "prompt": "a tabby cat coding on a laptop, warm sunset",
  "size": "1024x1024",
  "n": 1,
  "model": "flux-schnell",
  "save_path": "out.png"
}
```

- With `save_path` → the PNG is written to disk and the tool returns the absolute path in `structuredContent.saved_paths`. Use this in terminal Claude Code (the host does not render inline images).
- Without `save_path` → inline base64 content (for Zed Agent Panel / Claude Desktop / claude.ai).
- `n` > 1 with `save_path` → suffixes `_0`, `_1`, … are inserted before the extension.

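The suffixing rule can be sketched in a few lines. This is a Python mirror of the logic in `dist/tools/image.js` (illustration only, not part of the package):

```python
import os

def build_save_path(save_path: str, index: int, n: int) -> str:
    # n == 1: use the resolved path as-is.
    # n > 1: insert _<index> before the extension ("out.png" -> "out_1.png"),
    # or append "_<index>.png" when the path has no extension.
    abs_path = os.path.abspath(save_path)
    if n == 1:
        return abs_path
    root, ext = os.path.splitext(abs_path)
    if ext:
        return f"{root}_{index}{ext}"
    return f"{abs_path}_{index}.png"
```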
### `pod_chat`

```json
{
  "prompt": "Rephrase more concisely: ...",
  "system": "You are a technical writer. Output only the rephrased text.",
  "model": "gemma-4-26b",
  "max_tokens": 500,
  "temperature": 0.3
}
```

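Under the hood the tool turns this into a standard OpenAI-style `/v1/chat/completions` request. A Python sketch of the equivalent direct call (the `chat_request` helper is hypothetical; the payload shape mirrors `dist/tools/chat.js`):

```python
import json
import urllib.request

def chat_request(base_url, token, prompt, system=None,
                 model="gemma-4-26b", max_tokens=2048, temperature=0.7):
    # Same message assembly as pod_chat: optional system message first,
    # then the user prompt.
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    body = json.dumps({"model": model, "messages": messages,
                       "max_tokens": max_tokens, "temperature": temperature})
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body.encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = chat_request("https://tele-gpu-pilot.sims-service.com/v1",
                       "sk-yourtoken", "Rephrase more concisely: ...")
    with urllib.request.urlopen(req, timeout=120) as res:
        print(json.load(res)["choices"][0]["message"]["content"])
```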
### `pod_status`

`{}` → alive flag + the list of models in the current token's whitelist.

## Error semantics

The tool returns `isError: true` plus a text message (it does not throw) on:

- `auth_error` 401/403 — token is invalid / model is not in the whitelist
- `rate_limit` 429 — RPM or daily budget exceeded
- `pod_offline` 5xx / network — pod is sleeping; the owner wakes it with `/wake` in the Telegram bot
- `client_error` 4xx — invalid params

This gives the MCP host a chance to cascade to a cloud model.
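The taxonomy maps one-to-one onto HTTP status ranges (status 0 meaning the fetch itself failed). A host-side classifier sketch, mirroring the mapping in `dist/client.js`:

```python
def classify_status(status: int) -> str:
    # Same precedence as the MCP server's error handling:
    # network failure first, then auth, rate limit, server, client errors.
    if status == 0:
        return "network_error"   # fetch threw: pod offline or no network
    if status in (401, 403):
        return "auth_error"      # bad token or model not whitelisted
    if status == 429:
        return "rate_limit"      # RPM or daily budget exceeded
    if status >= 500:
        return "pod_offline"     # pod or upstream service down
    if status >= 400:
        return "client_error"    # invalid params
    return "ok"
```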

## Timeouts

- Image generation: 240s (cold start — gemma + flux load takes up to 60s, then ~6s/image)
- Chat: 120s
- Status: 10s

## Logs

Written to **stderr** (stdout is reserved for MCP JSON-RPC frames). To debug:

```bash
npx -y @vglu/tele-gpu-pilot-mcp 2>/tmp/mcp.log
# tail /tmp/mcp.log in a second terminal
```

## What's inside

- `@modelcontextprotocol/sdk` — the official TS SDK
- `zod` — input/output schemas
- 3 tools (`image`, `chat`, `status`) + 1 client helper + entry point
- ~530 LoC, no runtime deps besides the SDK and zod

## Source

[github.com/vglu/tele-gpu-pilot/tree/master/tools/mcp](https://github.com/vglu/tele-gpu-pilot/tree/master/tools/mcp). The MCP server is part of the `tele-gpu-pilot` project (Telegram bot + RunPod scheduler + pod-runtime stack); the package itself is distributed separately.

License — MIT.
package/dist/client.js ADDED
@@ -0,0 +1,154 @@
/**
 * tele-gpu-pilot pod client — thin fetch wrapper with unified auth/error handling.
 *
 * URL:   TELE_GPU_PILOT_URL env (default https://tele-gpu-pilot.sims-service.com/v1)
 * Token: TELE_GPU_PILOT_TOKEN env (virtual key, claude-code)
 *
 * MCP-server-side errors are mapped to human-readable messages for the model
 * instead of being thrown. The model sees the cause and decides what to do
 * (retry, cascade, tell the user).
 */
import { readFileSync, existsSync } from "node:fs";
import { homedir } from "node:os";
import { dirname, resolve } from "node:path";
import { fileURLToPath } from "node:url";
const DEFAULT_URL = "https://tele-gpu-pilot.sims-service.com/v1";
/**
 * Look for .env in priority order:
 * 1. Repo-local: relative to dist/index.js (in-repo build, dev workflow)
 * 2. cwd + parents (when the MCP is started from a project with a local .env)
 * 3. Home-dir fallback (for npx-installed usage without per-project setup):
 *    - $HOME/.tele-gpu-pilot.env
 *    - $HOME/.config/tele-gpu-pilot/.env
 *    - $XDG_CONFIG_HOME/tele-gpu-pilot/.env
 *
 * When a file is found, process.env is populated only with fields that are
 * not already set (env vars from the `env: {...}` block in .mcp.json take
 * priority).
 */
function loadDotenv() {
    const here = dirname(fileURLToPath(import.meta.url));
    const home = homedir();
    const xdg = process.env["XDG_CONFIG_HOME"];
    const candidates = [
        // In-repo dev workflow
        resolve(here, "../../../.env"), // tools/mcp/dist → repo root
        resolve(here, "../../.env"), // tools/mcp/dist → tools/mcp/.env
        // cwd-based (project-local .env)
        resolve(process.cwd(), ".env"),
        resolve(process.cwd(), "../.env"),
        resolve(process.cwd(), "../../.env"),
        // Home-dir fallback (npx-installed / global usage)
        resolve(home, ".tele-gpu-pilot.env"),
        resolve(home, ".config/tele-gpu-pilot/.env"),
        ...(xdg ? [resolve(xdg, "tele-gpu-pilot/.env")] : []),
    ];
    for (const path of candidates) {
        if (!existsSync(path))
            continue;
        try {
            const content = readFileSync(path, "utf-8");
            for (const line of content.split(/\r?\n/)) {
                const m = /^([A-Z_][A-Z0-9_]*)=(.*)$/.exec(line);
                if (!m)
                    continue;
                const [, key, rawValue] = m;
                if (key === undefined || rawValue === undefined)
                    continue;
                if (process.env[key] !== undefined)
                    continue;
                const value = rawValue.replace(/^"(.*)"$/, "$1").replace(/^'(.*)'$/, "$1");
                process.env[key] = value;
            }
            process.stderr.write(`tele-gpu-pilot-mcp: loaded .env from ${path}\n`);
            return;
        }
        catch {
            // continue to next candidate
        }
    }
}
export function loadConfig() {
    loadDotenv();
    const url = (process.env["TELE_GPU_PILOT_URL"] ?? DEFAULT_URL).replace(/\/$/, "");
    const token = process.env["TELE_GPU_PILOT_TOKEN"] ?? "";
    if (token === "") {
        throw new Error("TELE_GPU_PILOT_TOKEN env not set, and no .env file containing it was found. See tools/mcp/README.md for setup.");
    }
    return { url, token };
}
export async function podFetch(cfg, path, init = {}) {
    const { timeoutMs = 120_000, ...rest } = init;
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    let res;
    try {
        res = await fetch(`${cfg.url}${path}`, {
            ...rest,
            signal: controller.signal,
            headers: {
                Authorization: `Bearer ${cfg.token}`,
                "Content-Type": "application/json",
                ...(rest.headers ?? {}),
            },
        });
    }
    catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return {
            ok: false,
            status: 0,
            kind: "network_error",
            message: `Pod unreachable: ${message}. The pod may be offline (needs /wake) or there is no network connectivity.`,
        };
    }
    finally {
        // Single cleanup point: `finally` runs on both paths.
        clearTimeout(timer);
    }
    if (res.status === 401 || res.status === 403) {
        return {
            ok: false,
            status: res.status,
            kind: "auth_error",
            message: `Pod returned ${res.status}. TELE_GPU_PILOT_TOKEN is invalid or the model is not in this key's whitelist.`,
        };
    }
    if (res.status === 429) {
        return {
            ok: false,
            status: 429,
            kind: "rate_limit",
            message: "Pod rate limit exceeded (RPM or daily budget). Wait, or use a different client token.",
        };
    }
    if (res.status >= 500) {
        const body = await res.text().catch(() => "");
        return {
            ok: false,
            status: res.status,
            kind: "pod_offline",
            message: `Pod returned ${res.status}. Pod offline or upstream service down. Body: ${body.slice(0, 200)}`,
        };
    }
    if (!res.ok) {
        const body = await res.text().catch(() => "");
        return {
            ok: false,
            status: res.status,
            kind: "client_error",
            message: `Pod returned ${res.status}: ${body.slice(0, 200)}`,
        };
    }
    try {
        const data = (await res.json());
        return { ok: true, data };
    }
    catch (err) {
        return {
            ok: false,
            status: res.status,
            kind: "invalid_json",
            message: `Pod returned 200 but the body is not JSON: ${err instanceof Error ? err.message : err}`,
        };
    }
}
package/dist/index.js ADDED
@@ -0,0 +1,64 @@
#!/usr/bin/env node
/**
 * tele-gpu-pilot MCP server — exposes self-hosted pod inference (chat /
 * image / status) to Claude Code & Cursor via stdio.
 *
 * The stable URL is defined by ADR-009 (`tele-gpu-pilot.sims-service.com`).
 *
 * Tools:
 *   pod_chat           — text completion on gemma/qwen
 *   pod_image_generate — Flux schnell image generation
 *   pod_status         — check pod alive + which models are available
 *
 * Config (env vars):
 *   TELE_GPU_PILOT_TOKEN — virtual key (claude-code recommended)
 *   TELE_GPU_PILOT_URL   — stable URL (default
 *                          https://tele-gpu-pilot.sims-service.com/v1)
 *
 * Transport: stdio (Claude Code default). Future: SSE / HTTP if a shared
 * remote MCP is ever needed.
 */
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { loadConfig } from "./client.js";
import { imageInputSchema, imageOutputSchema, makeImageHandler, } from "./tools/image.js";
import { chatInputSchema, chatOutputSchema, makeChatHandler, } from "./tools/chat.js";
import { statusInputSchema, statusOutputSchema, makeStatusHandler, } from "./tools/status.js";
async function main() {
    const cfg = loadConfig();
    // Logs go to stderr (stdout is reserved for MCP JSON-RPC frames).
    process.stderr.write(`tele-gpu-pilot-mcp: URL=${cfg.url} token=${cfg.token.slice(0, 8)}***\n`);
    const server = new McpServer({
        name: "tele-gpu-pilot",
        version: "0.1.0",
    });
    server.registerTool("pod_image_generate", {
        title: "Generate image via tele-gpu-pilot pod",
        description: "Generates image(s) using self-hosted Flux schnell on tele-gpu-pilot RunPod A40 GPU. ~6s per 1024x1024. Use when user asks to draw, generate, render, or visualize something.\n\n" +
            "IMPORTANT: when working in terminal Claude Code (the host does not render inline images) — ALWAYS pass `save_path` so the PNG is written to disk. Without save_path the image content goes to the host for inline rendering and you lose access to it (it cannot be saved retroactively).\n\n" +
            "With `save_path`: the PNG is written to disk and the tool returns the absolute path — it can be opened/shown via other tools.\n" +
            "Without `save_path`: base64 inline — works only in Zed Agent Panel / Claude Desktop / claude.ai, where image content is rendered.",
        inputSchema: imageInputSchema,
        outputSchema: imageOutputSchema,
    }, makeImageHandler(cfg));
    server.registerTool("pod_chat", {
        title: "Chat completion via tele-gpu-pilot pod (cheap-tier LLM)",
        description: "Sends a chat completion request to self-hosted Gemma-4 (general) / Qwen-2.5-Coder (code) / Qwen3-4b (ultra-cheap classification) on tele-gpu-pilot pod. Use for tech-writer drafts, rephrase, classification, bulk doc edits — basically anything where cloud Sonnet is overkill. Marginal cost is $0 while pod is in session window.",
        inputSchema: chatInputSchema,
        outputSchema: chatOutputSchema,
    }, makeChatHandler(cfg));
    server.registerTool("pod_status", {
        title: "Check tele-gpu-pilot pod status",
        description: "Pings pod via stable URL and lists models available to current token. Use to debug 'pod offline' situations or discover which models are in the current virtual key's whitelist.",
        inputSchema: statusInputSchema,
        outputSchema: statusOutputSchema,
    }, makeStatusHandler(cfg));
    const transport = new StdioServerTransport();
    await server.connect(transport);
    process.stderr.write("tele-gpu-pilot-mcp: ready (stdio)\n");
}
main().catch((err) => {
    const message = err instanceof Error ? err.stack ?? err.message : String(err);
    process.stderr.write(`tele-gpu-pilot-mcp fatal: ${message}\n`);
    process.exit(1);
});
package/dist/tools/chat.js ADDED
@@ -0,0 +1,75 @@
import { z } from "zod";
import { podFetch } from "../client.js";
export const chatInputSchema = {
    prompt: z.string().min(1).describe("User prompt / task for the self-hosted model"),
    system: z
        .string()
        .optional()
        .describe("Optional system prompt (role / style / constraints)"),
    model: z
        .enum(["gemma-4-26b", "gemma-4-e4b", "qwen-2.5-32b-coder", "qwen3-4b"])
        .default("gemma-4-26b")
        .describe("Model alias. gemma-4-26b = general STANDARD-tier; gemma-4-e4b = CHEAP fast classification; qwen-2.5-32b-coder = code-heavy; qwen3-4b = ultra-cheap classification"),
    max_tokens: z.number().int().min(1).max(8192).default(2048),
    temperature: z.number().min(0).max(2).default(0.7),
};
export const chatOutputSchema = {
    reply: z.string(),
    model_used: z.string(),
    finish_reason: z.string(),
    tokens_in: z.number(),
    tokens_out: z.number(),
};
export function makeChatHandler(cfg) {
    return async (input) => {
        const messages = [];
        if (input.system)
            messages.push({ role: "system", content: input.system });
        messages.push({ role: "user", content: input.prompt });
        const result = await podFetch(cfg, "/chat/completions", {
            method: "POST",
            timeoutMs: 120_000,
            body: JSON.stringify({
                model: input.model,
                messages,
                max_tokens: input.max_tokens,
                temperature: input.temperature,
            }),
        });
        if (!result.ok) {
            return {
                content: [
                    {
                        type: "text",
                        text: `❌ Pod chat failed: ${result.kind} — ${result.message}`,
                    },
                ],
                isError: true,
            };
        }
        const data = result.data;
        const choice = data.choices[0];
        if (!choice) {
            return {
                content: [
                    {
                        type: "text",
                        text: "❌ Pod returned 0 choices",
                    },
                ],
                isError: true,
            };
        }
        const reply = choice.message.content;
        return {
            content: [{ type: "text", text: reply }],
            structuredContent: {
                reply,
                model_used: data.model,
                finish_reason: choice.finish_reason,
                tokens_in: data.usage.prompt_tokens,
                tokens_out: data.usage.completion_tokens,
            },
        };
    };
}
package/dist/tools/image.js ADDED
@@ -0,0 +1,125 @@
import { writeFileSync, mkdirSync } from "node:fs";
import { dirname, isAbsolute, resolve } from "node:path";
import { z } from "zod";
import { podFetch } from "../client.js";
export const imageInputSchema = {
    prompt: z.string().min(1).describe("Image generation prompt (English or Russian — Flux schnell understands both)"),
    size: z
        .string()
        .regex(/^\d+x\d+$/, "format WxH, e.g. 1024x1024")
        .default("1024x1024")
        .describe("Image dimensions, 64-4096 per axis, multiple of 8 recommended"),
    n: z.number().int().min(1).max(4).default(1).describe("Number of images to generate (1-4)"),
    model: z
        .enum(["flux-schnell"])
        .default("flux-schnell")
        .describe("Model alias. Currently only flux-schnell is available; flux-dev will appear after pull-flux.sh --dev"),
    save_path: z
        .string()
        .optional()
        .describe("Optional: path to save the PNG to disk. Relative paths resolve from process.cwd() (= repo root when the MCP is started via .mcp.json). For n>1 — suffixes _0/_1/... are inserted before the extension. Use when the host (terminal Claude Code) does not render inline images. If omitted — content is returned inline (Zed/Desktop rendering)."),
};
export const imageOutputSchema = {
    images_count: z.number(),
    width: z.number(),
    height: z.number(),
    size_bytes_total: z.number(),
    model: z.string(),
    saved_paths: z.array(z.string()),
};
/**
 * Build saved path for N-th image. If `save_path` ends with .png — use as-is for index=0,
 * insert `_<i>` before extension for i>0. If no extension — append `_<i>.png`.
 */
function buildSavePath(savePath, index, n) {
    const abs = isAbsolute(savePath) ? savePath : resolve(process.cwd(), savePath);
    if (n === 1)
        return abs;
    const m = /^(.+)(\.[^.\\/]+)$/.exec(abs);
    if (m)
        return `${m[1]}_${index}${m[2]}`;
    return `${abs}_${index}.png`;
}
export function makeImageHandler(cfg) {
    return async (input) => {
        const result = await podFetch(cfg, "/images/generations", {
            method: "POST",
            timeoutMs: 240_000, // image-gen cold start can take up to 60s + 4 images × 6s
            body: JSON.stringify({
                model: input.model,
                prompt: input.prompt,
                size: input.size,
                n: input.n,
                response_format: "b64_json",
            }),
        });
        if (!result.ok) {
            return {
                content: [
                    {
                        type: "text",
                        text: `❌ Image generation failed: ${result.kind} — ${result.message}`,
                    },
                ],
                isError: true,
            };
        }
        const data = result.data;
        const [wStr, hStr] = input.size.split("x");
        const width = Number(wStr);
        const height = Number(hStr);
        let totalBytes = 0;
        const savedPaths = [];
        const content = [];
        for (let i = 0; i < data.data.length; i += 1) {
            const item = data.data[i];
            if (!item?.b64_json)
                continue;
            const buf = Buffer.from(item.b64_json, "base64");
            totalBytes += buf.length;
            if (input.save_path) {
                const outPath = buildSavePath(input.save_path, i, data.data.length);
                try {
                    mkdirSync(dirname(outPath), { recursive: true });
                    writeFileSync(outPath, buf);
                    savedPaths.push(outPath);
                }
                catch (err) {
                    return {
                        content: [
                            {
                                type: "text",
                                text: `❌ Save to disk failed: ${err instanceof Error ? err.message : err} (path=${outPath})`,
                            },
                        ],
                        isError: true,
                    };
                }
            }
            else {
                // Inline mode: return base64 as image content for the host to
                // render (Zed Agent Panel / Claude Desktop / claude.ai).
                content.push({
                    type: "image",
                    data: item.b64_json,
                    mimeType: "image/png",
                });
            }
        }
        const summary = input.save_path
            ? `Saved ${savedPaths.length} file(s) ${width}×${height} (${Math.round(totalBytes / 1024)} KB total) to disk:\n${savedPaths.map((p) => `  ${p}`).join("\n")}\n\nPrompt: ${input.prompt}`
            : `Generated ${data.data.length} image(s) ${width}×${height} via ${input.model} on tele-gpu-pilot pod. Total ~${Math.round(totalBytes / 1024)} KB.`;
        content.push({ type: "text", text: summary });
        return {
            content,
            structuredContent: {
                images_count: data.data.length,
                width,
                height,
                size_bytes_total: totalBytes,
                model: input.model,
                saved_paths: savedPaths,
            },
        };
    };
}
package/dist/tools/status.js ADDED
@@ -0,0 +1,50 @@
import { z } from "zod";
import { podFetch } from "../client.js";
export const statusInputSchema = {};
export const statusOutputSchema = {
    alive: z.boolean(),
    url: z.string(),
    models_available: z.array(z.string()),
    models_count: z.number(),
    error: z.string().optional(),
};
export function makeStatusHandler(cfg) {
    return async () => {
        const result = await podFetch(cfg, "/models", {
            method: "GET",
            timeoutMs: 10_000,
        });
        if (!result.ok) {
            return {
                content: [
                    {
                        type: "text",
                        text: `❌ Pod status: ${result.kind} — ${result.message}\n\nURL: ${cfg.url}`,
                    },
                ],
                structuredContent: {
                    alive: false,
                    url: cfg.url,
                    models_available: [],
                    models_count: 0,
                    error: result.message,
                },
            };
        }
        const models = result.data.data.map((m) => m.id);
        return {
            content: [
                {
                    type: "text",
                    text: `✅ Pod alive at ${cfg.url}\nAvailable models (per claude-code token whitelist):\n${models.map((m) => `  - ${m}`).join("\n")}`,
                },
            ],
            structuredContent: {
                alive: true,
                url: cfg.url,
                models_available: models,
                models_count: models.length,
            },
        };
    };
}
package/package.json ADDED
@@ -0,0 +1,54 @@
{
  "name": "@vglu/tele-gpu-pilot-mcp",
  "version": "0.1.0",
  "description": "MCP server exposing tele-gpu-pilot pod inference (chat / image / status) to Claude Code, Cursor, and any MCP-compatible client via stdio. Self-hosted Flux schnell + Gemma 4 + Qwen 2.5-Coder on RunPod A40.",
  "type": "module",
  "engines": {
    "node": ">=22"
  },
  "main": "dist/index.js",
  "bin": {
    "tele-gpu-pilot-mcp": "dist/index.js"
  },
  "files": [
    "dist/",
    "README.md",
    "LICENSE"
  ],
  "scripts": {
    "build": "tsc",
    "start": "node dist/index.js",
    "dev": "tsx watch src/index.ts",
    "typecheck": "tsc --noEmit",
    "prepublishOnly": "npm run build"
  },
  "keywords": [
    "mcp",
    "model-context-protocol",
    "claude",
    "claude-code",
    "cursor",
    "flux",
    "image-generation",
    "self-hosted",
    "runpod",
    "ollama",
    "gemma",
    "qwen"
  ],
  "author": "Vitaliy Glushchenko",
  "license": "MIT",
  "homepage": "https://tele-gpu-pilot.sims-service.com",
  "publishConfig": {
    "access": "public"
  },
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.0.4",
    "zod": "^3.24.0"
  },
  "devDependencies": {
    "@types/node": "^22.10.0",
    "tsx": "^4.19.0",
    "typescript": "^5.7.0"
  }
}