npm - offgrid-ai - Versions diffs - 0.9.0 → 0.9.3 - Mend

offgrid-ai 0.9.0 → 0.9.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

package/README.md +26 -25
package/package.json +1 -1
package/src/autodetect.mjs +6 -3
package/src/backends.mjs +5 -10
package/src/benchmark/pi-runner.mjs +3 -1
package/src/benchmark/prepare.mjs +2 -1
package/src/benchmark/stream-renderer.mjs +0 -1
package/src/commands/run.mjs +6 -1
package/src/commands/status.mjs +40 -9
package/src/model-name.mjs +220 -0
package/src/process.mjs +26 -2
package/src/scan.mjs +9 -20

package/README.md CHANGED

@@ -2,28 +2,29 @@
 # offgrid-ai
-**Privacy-first CLI for running local AI models on your own machine.**
+**Helper CLI for running local AI models on Mac with llama.cpp, ollama, and oMLX.**
 [![node](https://img.shields.io/badge/node-20%2B-3c873a)](package.json)
 [![platform](https://img.shields.io/badge/platform-macOS%20%7C%20Linux-blue)]()
-Install • Pick a model • Start chatting
-```bash
-curl -fsSL https://raw.githubusercontent.com/eeshansrivastava89/offgrid-ai/main/install.sh | bash
-```
 </div>
 ## What is offgrid-ai?
-offgrid-ai is a command-line tool that lets you run AI models locally. Everything stays on your computer. No API keys, no remote servers, no data leaving your machine.
+offgrid-ai is a command-line tool that lets you run AI models locally. Running local models with llama.cpp, ollama, or oMLX has a steep learning curve compared to cloud-based models, so offgrid-ai is designed to abstract away the complexity while still providing a powerful and flexible way to run local models.
+This is the recommended workflow:
-It works with:
+1. Download models from **LM Studio**, **Ollama**, or **oMLX**
+2. Do minimal configuration using the `offgrid-ai` command
+3. Run the model with `offgrid-ai` with Pi in interactive mode
-- Models from **LM Studio**
-- **Ollama** models
-- **oMLX** models on Apple Silicon
-- GGUF models from **Hugging Face** or other sources
+## Core Features
+- Auto-detects available models from LM Studio, Ollama, and oMLX
+- Auto-detects MTP (multi-token prediction) or QAT (quantization aware training) models, and applies the correct flags for llama.cpp
+- Auto-applies the optimal flags for the model type in llama.cpp
+- Start / stop llama.cpp server automatically for chat sessions
 ## Quick start
@@ -35,7 +36,7 @@ Open your terminal and run:
 curl -fsSL https://raw.githubusercontent.com/eeshansrivastava89/offgrid-ai/main/install.sh | bash
 ```
-This installs offgrid-ai and anything else it needs. Then open a new terminal window and run:
+This installs offgrid-ai and dependencies (node, npm, and llama.cpp). Then open a new terminal window and run:
 ```bash
 offgrid-ai
@@ -53,14 +54,8 @@ The curl installer is recommended for first-time setup because it also verifies
 The first time you run offgrid-ai, it looks for models already on your machine. If it does not find any, it tells you how to get one.
-Supported ways to get models:
+<img width="808" height="274" alt="image" src="https://github.com/user-attachments/assets/6e1583ab-65db-423c-b0eb-b627586fbf86" />
-| Source | Example command |
-|---|---|
-| LM Studio | `lms get qwen/qwen3.5-9b` |
-| Ollama | `ollama pull gemma3:4b` |
-| oMLX | Use `omlx start` |
-| Hugging Face | Download a GGUF file |
 ### 3. Start chatting
@@ -68,23 +63,29 @@ Supported ways to get models:
 offgrid-ai
 ```
+<img width="786" height="281" alt="image" src="https://github.com/user-attachments/assets/03cb1e06-d461-4bdf-ad82-f0692e5ba5c6" />
 Pick a model from the list and press Enter. offgrid-ai configures the rest and opens the Pi coding agent.
+<img width="786" height="499" alt="image" src="https://github.com/user-attachments/assets/223e1455-c69c-4405-a91c-5bac1b9fc9bd" />
 ## Everyday commands
 ```bash
-offgrid-ai              # start a model
-offgrid-ai status       # see what's running
+offgrid-ai              # primary entry-point for the CLI
+offgrid-ai status       # see if any model is running
 offgrid-ai stop         # stop the running model
-offgrid-ai benchmark    # run a benchmark
+offgrid-ai benchmark    # run a benchmark paired with my local llm benchmark runner
 offgrid-ai uninstall    # remove offgrid-ai
 ```
 ## What can I do with it?
-- **Chat with local models** — no internet required after setup.
-- **Run benchmarks** — compare how different models perform on creative or data-science tasks.
-- **Keep data private** — everything happens on your machine.
+- **Chat with local models** — you download the models yourself, and then offgrid-ai helps configure and run them
+- **Run benchmarks** — compare how different models perform on creative or data-science tasks. Pairs with my other [local llm benchmark runner](https://github.com/eeshansrivastava89/local-llm-visual-benchmark)
+- **Keep data private** — everything runs on your machine without any cloud connections
 ## Need help?

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "offgrid-ai",
-  "version": "0.9.0",
+  "version": "0.9.3",
   "description": "Privacy-first CLI for running local LLMs — discover, configure, run, benchmark",
   "author": "Eeshan Srivastava (https://eeshans.com)",
   "type": "module",

package/src/autodetect.mjs CHANGED

@@ -2,13 +2,13 @@ import { basename } from "node:path";
 import { existsSync } from "node:fs";
 import { readGgufMetadata } from "./gguf.mjs";
 import { defaultFlagsForBackend } from "./backends.mjs";
+import { parseModelName } from "./model-name.mjs";
 // ── Detect model capabilities from GGUF metadata ──────────────────────────
 export function detectCapabilities(modelPath, mmprojPath) {
   const meta = safeReadGgufMetadata(modelPath);
   const mmprojMeta = mmprojPath ? safeReadGgufMetadata(mmprojPath) : {};
-  const name = basename(modelPath).toLowerCase();
   const pathHints = String(modelPath).toLowerCase();
   // Architecture
@@ -33,8 +33,11 @@ export function detectCapabilities(modelPath, mmprojPath) {
   // Do not treat all Qwen models as MTP; require an explicit filename or metadata hint.
   const mtp = /\bmtp\b|draft-mtp|multi-token/i.test(pathHints) || Object.keys(meta).some((key) => /mtp|draft|speculative/i.test(key));
-  // Quantization
-  const quant = name.match(/(Q\d_K_[A-Z]+|Q\d_[01]|UD-[A-Z0-9_]+)/i)?.[1] ?? null;
+  // Quantization — use parseModelName (single path) for filename-based extraction.
+  // GGUF metadata does not store a standardized quant field, so the filename
+  // is the authoritative source for quant identification.
+  const parsed = parseModelName(basename(modelPath).replace(/\.gguf$/i, ""), "local-gguf");
+  const quant = parsed.quant;
   // Context size from metadata, fallback to name hints
   const metaCtx = architecture
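
The MTP check in this hunk combines a filename hint with a scan of GGUF metadata keys. A minimal sketch of that logic in isolation follows; `looksLikeMtp` is a hypothetical name and the metadata key in the usage note is made up for illustration, so treat this as a reading aid rather than the package's code.

```javascript
// Sketch only: mirrors the two regexes shown in the hunk above.
// `looksLikeMtp` is a hypothetical helper, not part of offgrid-ai.
function looksLikeMtp(modelPath, meta = {}) {
  const pathHints = String(modelPath).toLowerCase();
  // Filename/path hint: "mtp" as a standalone word, "draft-mtp", or "multi-token".
  const pathHit = /\bmtp\b|draft-mtp|multi-token/i.test(pathHints);
  // Metadata hint: any key mentioning mtp, draft, or speculative.
  const metaHit = Object.keys(meta).some((key) => /mtp|draft|speculative/i.test(key));
  return pathHit || metaHit;
}
```

For example, `looksLikeMtp("/models/qwen3-mtp-Q4_K_M.gguf")` is true and `looksLikeMtp("/models/qwen3-8b-Q4_K_M.gguf")` is false. Because `_` counts as a word character, `\bmtp\b` does not fire on an underscore-delimited name such as `qwen3_mtp.gguf`; in that case only the metadata keys can flag the model.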

package/src/backends.mjs CHANGED

@@ -1,5 +1,6 @@
 import { findLlamaServer } from "./config.mjs";
 import { scanGgufModels } from "./scan.mjs";
+import { parseModelName } from "./model-name.mjs";
 // ── Backend definitions ────────────────────────────────────────────────────
@@ -97,7 +98,7 @@ async function scanOllamaModels() {
     .filter((model) => isLocalOllamaModel(model))
     .map((model) => ({
       id: model.name,
-      label: ollamaLabel(model.name),
+      label: parseModelName(model.name, "ollama").display,
       aliasSuggestion: model.name,
       sizeBytes: model.size ?? 0,
       quant: model.details?.quantization_level,
@@ -120,7 +121,7 @@ async function scanOmlxModels() {
     .filter((model) => isChatOmlxModel(model))
     .map((model) => ({
       id: model.id,
-      label: omlxLabel(model.id),
+      label: parseModelName(model.id, "omlx").display,
       aliasSuggestion: model.id,
       sizeBytes: 0,
       quant: null,
@@ -142,15 +143,9 @@ function isLocalOllamaModel(model) {
 function isChatOmlxModel(model) {
   if (typeof model?.id !== "string" || !model.id.trim()) return false;
   const type = String(model.type ?? model.model_type ?? "").toLowerCase();
-  if (["embedding", "embeddings", "reranker", "tool", "converter"].includes(type)) return false;
+  if (["embedding", "embeddings", "reranker", "tool", "converter", "markitdown"].includes(type)) return false;
   if (Object.hasOwn(model, "max_model_len") && model.max_model_len === null) return false;
   return true;
 }
-function ollamaLabel(name) {
-  return name.replace(/[-_]/g, " ").replace(/^gemma\b/i, "Gemma").replace(/^qwen/i, "Qwen");
-}
-function omlxLabel(id) {
-  return id.replace(/[-_]/g, " ").replace(/^gemma-4/i, "Gemma 4").replace(/^qwen/i, "Qwen");
-}
+// (ollamaLabel and omlxLabel removed — parseModelName in model-name.mjs is the single path)

package/src/benchmark/pi-runner.mjs CHANGED

@@ -5,7 +5,7 @@ import { join } from "node:path";
 import { spawn } from "node:child_process";
 import {
   BENCH_COLORS, renderStreamEvent,
-  formatToolCall,
+  formatToolCall, printFinalLine,
 } from "./stream-renderer.mjs";
 import { piModelString } from "./shared.mjs";
@@ -212,6 +212,8 @@ export async function runBenchmarkInPi(profile, runDirectory, { signal } = {}) {
         return;
       }
+      printFinalLine(BENCH_COLORS.info("Pi benchmark finished"));
       if (runResult.exitCode !== 0) {
         runResult.error = { message: `Pi exited with code ${runResult.exitCode}` };
         resolve(runResult);

package/src/benchmark/prepare.mjs CHANGED

@@ -4,6 +4,7 @@ import { mkdir, writeFile } from "node:fs/promises";
 import { join } from "node:path";
 import { pc, renderRows, renderSection } from "../ui.mjs";
 import { slugModelId, createRunId, buildToolPrompt } from "./shared.mjs";
+import { parseModelName } from "../model-name.mjs";
 function harnessDisplayName(id) {
   if (id === "pi") return "Pi";
@@ -54,7 +55,7 @@ export async function prepareBenchmarkRun({ repoPath, benchmark, kind, modelId,
     kind,
     runId,
     benchmark: { id: benchmark.id, title: benchmark.title, description: benchmark.description, prompt: benchmark.prompt },
-    model: { id: modelId, slug: modelSlug },
+    model: { id: modelId, slug: modelSlug, displayName: parseModelName(modelId, modelSource === "ollama" ? "ollama" : modelSource === "omlx" ? "omlx" : "local-gguf").display },
     status: "prepared",
     createdAt: now.toISOString(),
     updatedAt: now.toISOString(),
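
The nested ternary added in this hunk maps `modelSource` onto one of the three source labels that `parseModelName` accepts, defaulting to `"local-gguf"`. One way to read it is as the lookup below; `parseSourceFor` and `KNOWN_SOURCES` are hypothetical names introduced here for illustration only.

```javascript
// Sketch only: equivalent mapping to the nested ternary in the hunk above.
// `KNOWN_SOURCES` and `parseSourceFor` are hypothetical names.
const KNOWN_SOURCES = new Set(["ollama", "omlx"]);

function parseSourceFor(modelSource) {
  // Anything that is not an Ollama or oMLX source is treated as a local GGUF file.
  return KNOWN_SOURCES.has(modelSource) ? modelSource : "local-gguf";
}
```

Under this reading, an unrecognized or missing `modelSource` falls through to `"local-gguf"`, which matches the behavior of the ternary as written.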

package/src/benchmark/stream-renderer.mjs CHANGED

@@ -142,7 +142,6 @@ export function renderStreamEvent(parsed, state, opts = {}) {
     }
     case "agent_end":
       clearStatusLine();
-      printFinalLine(BENCH_COLORS.info("Pi benchmark finished"));
       break;
     default:
       break;

package/src/commands/run.mjs CHANGED

@@ -2,7 +2,7 @@ import { existsSync } from "node:fs";
 import { ensureDirs } from "../config.mjs";
 import { backendFor } from "../backends.mjs";
 import { normalizeProfile, readProfile, saveProfile } from "../profiles.mjs";
-import { startServer, stopProfile, waitForReady, serverReady, serverMatchesProfile } from "../process.mjs";
+import { startServer, stopProfile, waitForReady, serverReady, serverMatchesProfile, modelAvailableOnServer } from "../process.mjs";
 import { syncPiConfig, hasPiModel, launchPi, hasPi } from "../harness-pi.mjs";
 import { tailFriendly } from "../logs.mjs";
 import { estimateMemory } from "../estimate.mjs";
@@ -33,6 +33,11 @@ export async function runProfile(profile, options = {}) {
     if (!(await serverReady(profile.baseUrl))) {
       throw new Error(`${backend.label} is not running at ${profile.baseUrl}. Start it and try again.`);
     }
+    const available = await modelAvailableOnServer(profile);
+    if (!available) {
+      const modelId = profile.omlxModel ?? profile.ollamaModel ?? profile.modelAlias ?? profile.label;
+      throw new Error(`${modelId} is not available on ${backend.label} at ${profile.baseUrl}.`);
+    }
     console.log(pc.green(`[ready] ${backend.label} at ${profile.baseUrl}`));
   } else {
     const startup = await ensureLocalServer(profile, backend, options);

package/src/commands/status.mjs CHANGED

@@ -13,17 +13,48 @@ export async function statusCommand() {
   }
   const running = statuses.filter((item) => item.status.running);
-  if (running.length === 0) {
-    console.log(renderCard("Status", renderRows([
-      ["Running now", pc.dim("none")],
-      ["Ready setups", profiles.length > 0 ? String(profiles.length) : pc.dim("none")],
-      ["Next step", profiles.length > 0 ? "Run offgrid-ai to start chatting" : pc.yellow("Run offgrid-ai to set up a model")],
-    ]), { formatBorder: pc.dim }));
-    return;
+  const managedUpMissing = statuses.filter((item) => {
+    const backend = backendFor(item.profile.backend);
+    return backend.type === "managed-server" && item.status.serverUp && !item.status.modelAvailable;
+  });
+  const managedUpNotLoaded = statuses.filter((item) => {
+    const backend = backendFor(item.profile.backend);
+    return backend.type === "managed-server" && item.status.serverUp && item.status.modelAvailable && !item.status.modelLoaded;
+  });
+  const summaryRows = [
+    ["Running now", running.length > 0 ? pc.green(`${running.length} model${running.length === 1 ? "" : "s"}`) : pc.dim("none")],
+    ["Ready setups", profiles.length > 0 ? String(profiles.length) : pc.dim("none")],
+  ];
+  if (managedUpMissing.length > 0) {
+    summaryRows.push(["Server up, model missing", pc.yellow(String(managedUpMissing.length))]);
+  }
+  if (managedUpNotLoaded.length > 0) {
+    summaryRows.push(["Server up, model not loaded", pc.yellow(String(managedUpNotLoaded.length))]);
+  }
+  summaryRows.push(["Next step", profiles.length > 0 ? "Run offgrid-ai to start chatting" : pc.yellow("Run offgrid-ai to set up a model")]);
+  console.log(renderCard("Status", renderRows(summaryRows), { formatBorder: running.length > 0 ? pc.green : pc.dim }));
+  if (managedUpMissing.length > 0 || managedUpNotLoaded.length > 0) {
+    const detailRows = [];
+    for (const { profile, status } of [...managedUpMissing, ...managedUpNotLoaded]) {
+      const backend = backendFor(profile.backend);
+      const modelId = profile.omlxModel ?? profile.ollamaModel ?? profile.modelAlias ?? profile.id;
+      const state = status.modelAvailable
+        ? pc.yellow("server up · model not loaded")
+        : pc.red("server up · model missing");
+      detailRows.push([`${profile.label} (${modelId})`, state]);
+      detailRows.push(["Server", `${backend.label} at ${profile.baseUrl}`]);
+    }
+    console.log("\n" + renderCard("Managed servers", renderRows(detailRows), { formatBorder: pc.yellow }));
   }
-  console.log(renderCard("Status", renderRows([
-    ["Running now", pc.green(`${running.length} model${running.length === 1 ? "" : "s"}`)],
+  if (running.length === 0) return;
+  console.log("\n" + renderCard("Running", renderRows([
     ["Stop", "offgrid-ai stop"],
   ]), { formatBorder: pc.green }));
   for (const { profile, status } of running) {
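
The new status output distinguishes three situations for managed servers: server up with the model missing, server up with the model available but not loaded, and running. A simplified sketch of that classification follows, assuming a status object with the `serverUp`, `modelAvailable`, and `modelLoaded` fields seen in this diff. `classifyManaged` is a hypothetical helper, and unlike the actual command it returns exactly one label per profile.

```javascript
// Sketch only: one label per managed-server profile.
// `classifyManaged` is a hypothetical name; the field names come from the diff.
function classifyManaged(status) {
  if (!status.serverUp) return "server down";
  if (!status.modelAvailable) return "server up, model missing";
  if (!status.modelLoaded) return "server up, model not loaded";
  return "running";
}
```

In the diff itself, `running` is computed as `ready && modelLoaded` without consulting `modelAvailable`, so a profile could in principle appear both as running and as missing; the sketch collapses that edge case for readability.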

package/src/model-name.mjs ADDED

@@ -0,0 +1,220 @@
+// ── Single path for parsing and formatting model names ─────────────────────
+//
+// Every model display name in offgrid-ai goes through parseModelName().
+// No other function should format, title-case, or dissect a model name.
+//
+// The returned `id` is always the raw identifier (untouched) and is used for
+// API calls, profile IDs, Pi config matching, and benchmark directory slugs.
+// The returned `display` is the human-readable string shown in pickers, details,
+// and benchmark metadata.
+// ── Known model families ────────────────────────────────────────────────
+//
+// Mapped to their title-case form. Matched as prefix tokens so "qwen"
+// matches "qwen3", "qwen2.5", etc.
+const FAMILY_TITLE_CASE = {
+  "deepseek-r": "DeepSeek-R",
+  "deepseek": "DeepSeek",
+  "starcoder2": "StarCoder2",
+  "starcoder": "StarCoder",
+  "command-r": "Command-R",
+  "command": "Command",
+  "codestral": "Codestral",
+  "mistral": "Mistral",
+  "mixtral": "Mixtral",
+  "mathstral": "Mathstral",
+  "pixtral": "Pixtral",
+  "gemma": "Gemma",
+  "qwen": "Qwen",
+  "llama": "Llama",
+  "phi": "Phi",
+  "yi": "Yi",
+  "zephyr": "Zephyr",
+  "internlm": "InternLM",
+  "cohere": "Cohere",
+  "falcon": "Falcon",
+  "baichuan": "Baichuan",
+  "mamba": "Mamba",
+  "solar": "Solar",
+  "granite": "Granite",
+  "dbrx": "DBRX",
+  "stablelm": "StableLM",
+};
+// Sort families by length descending so longer families match first
+const SORTED_FAMILIES = Object.keys(FAMILY_TITLE_CASE).sort((a, b) => b.length - a.length);
+// ── Quant patterns (order matters — longer/more-specific first) ────────
+const QUANT_PATTERNS = [
+  /[-_]UD-[A-Z0-9_]+/i,
+  /[-_]IQ[0-9_]+(?:_[A-Z]+)?/i,
+  /[-_]Q\d_K_[A-Z]+/i,
+  /[-_]Q\d_[01]/i,
+  /[-_]F(?:16|32)/i,
+  /[-_]BF16/i,
+];
+// ── Tag tokens extracted from the name ──────────────────────────────────
+const TAG_TOKENS = [
+  "it", "instruct", "chat", "code", "base", "vision", "mtp",
+  "mmproj", "draft", "assistant",
+];
+// ── Main entry point ───────────────────────────────────────────────────
+/**
+ * Parse a raw model identifier into a structured display name.
+ *
+ * @param {string} rawId  The raw identifier: GGUF filename (no .gguf),
+ *                        Ollama model name, or oMLX model id.
+ * @param {"local-gguf"|"ollama"|"omlx"} source  Where this name came from.
+ * @returns {{ publisher: string|null, model: string, params: string|null,
+ *             quant: string|null, tags: string[], display: string,
+ *             sort: string, id: string }}
+ */
+export function parseModelName(rawId, source) {
+  const id = rawId; // never modify the raw id
+  // 1. Extract publisher (anything before the first /)
+  let publisher = null;
+  let name = rawId;
+  const slashIdx = rawId.indexOf("/");
+  if (slashIdx !== -1) {
+    publisher = rawId.slice(0, slashIdx);
+    name = rawId.slice(slashIdx + 1);
+  }
+  // 2. For Ollama, split on : to separate model from tag (e.g. "gemma3:4b")
+  //    The tag after : is a model size/variant identifier — not a GGUF quant.
+  let ollamaTag = null;
+  if (source === "ollama") {
+    const colonIdx = name.lastIndexOf(":");
+    if (colonIdx !== -1) {
+      ollamaTag = name.slice(colonIdx + 1);
+      name = name.slice(0, colonIdx);
+    }
+  }
+  // 3. Extract quant (GGUF quantization suffix)
+  let quant = null;
+  for (const pattern of QUANT_PATTERNS) {
+    const match = name.match(pattern);
+    if (match) {
+      quant = match[0].replace(/^[-_]/, "");
+      name = name.slice(0, match.index) + name.slice(match.index + match[0].length);
+      break;
+    }
+  }
+  // 4. Extract known tags as hyphen/underscore-delimited tokens
+  const tags = [];
+  for (const tag of TAG_TOKENS) {
+    const tagRegex = new RegExp(`(?:^|[-_])${tag}(?:$|[-_])`, "i");
+    if (tagRegex.test(name)) {
+      tags.push(tag);
+      // Remove the tag token from the name
+      name = name.replace(new RegExp(`(?:^|[-_])${tag}(?=[-_]|$)`, "i"), (m) => {
+        // Preserve the leading hyphen/underscore boundary
+        return m.startsWith("-") || m.startsWith("_") ? "" : "";
+      });
+    }
+  }
+  // Clean up leftover separators
+  name = name.replace(/[-_]{2,}/g, "-").replace(/^[-_]+|[-_]+$/g, "");
+  // 5. For Ollama, re-attach the tag as part of the model name
+  //    (Ollama tags like "4b" or "30b-a3b" are size variants, not quants)
+  if (ollamaTag) {
+    name = name + "-" + ollamaTag;
+  }
+  // 6. Title-case the remaining model name
+  let model = titleCaseModel(name);
+  // If nothing is left after parsing, fall back to the raw name
+  if (!model || model.trim() === "") {
+    model = rawId.includes("/") ? rawId : rawId.replace(/[-_]/g, " ");
+  }
+  // 7. Extract params (size like 30B, 12B) for sort/filter convenience
+  const params = extractParams(model);
+  // 8. Build display string
+  const display = buildDisplay(publisher, model, tags, quant);
+  // 9. Build sort key (lowercase, no publisher, for alphabetical ordering)
+  const sort = model.toLowerCase().replace(/[-_]/g, " ");
+  return { publisher, model, params, quant, tags, display, sort, id };
+}
+// ── Display builder ────────────────────────────────────────────────────
+function buildDisplay(publisher, model, tags, quant) {
+  const parts = [];
+  if (publisher) {
+    parts.push(publisher);
+  }
+  let modelPart = model;
+  if (tags.length > 0) {
+    modelPart += ` (${tags.join(", ")})`;
+  }
+  parts.push(modelPart);
+  if (quant) {
+    parts.push(quant);
+  }
+  return parts.join(" › ");
+}
+// ── Params extraction ──────────────────────────────────────────────────
+function extractParams(model) {
+  const match = model.match(/\b(\d+(?:\.\d+)?)\s*B\b/);
+  return match ? match[1] + "B" : null;
+}
+// ── Title-case model names ────────────────────────────────────────────
+function titleCaseModel(name) {
+  // Replace hyphens and underscores with spaces
+  let result = name.replace(/[-_]/g, " ");
+  // Title-case known families (prefix match so "qwen" matches "qwen3", etc.)
+  // Insert a space between the family name and a following digit/version.
+  for (const family of SORTED_FAMILIES) {
+    const pattern = new RegExp(`\\b${family}(?=[0-9])`, "gi");
+    result = result.replace(pattern, FAMILY_TITLE_CASE[family] + " ");
+    // Also match family at end of word (no digit following)
+    const patternEnd = new RegExp(`\\b${family}(?![a-z0-9])`, "gi");
+    result = result.replace(patternEnd, FAMILY_TITLE_CASE[family]);
+  }
+  // Title-case param sizes (30b → 30B, 12b → 12B, 0.5b → 0.5B)
+  result = result.replace(/\b(\d+(?:\.\d+)?)\s*[bB]\b/g, (_, num) => {
+    return num + "B";
+  });
+  // Title-case version numbers that follow a family name (Gemma 3, Qwen 2.5)
+  // Pattern: family name followed by a space then a bare digit sequence
+  // that's not a param size (not followed by B/b).
+  // We already have "Gemma 4", "Qwen 3" etc. from family + spacing.
+  // Just ensure the numbers look clean.
+  // Title-case "it" and "instruct" if they survived tag extraction
+  result = result.replace(/\bit\b/g, "IT");
+  result = result.replace(/\binstruct\b/gi, "Instruct");
+  // Title-case "r1", "r2" etc. (DeepSeek-R1, etc.)
+  result = result.replace(/\br(\d+)\b/gi, (_, num) => `R${num}`);
+  // Title-case standalone aXb patterns (A3B, A12B — active parameters)
+  result = result.replace(/\ba(\d+)\s*b\b/gi, (_, num) => `A${num}B`);
+  // Clean up extra spaces
+  result = result.replace(/\s{2,}/g, " ").trim();
+  return result;
+}
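
Steps 2 and 3 of `parseModelName` (the Ollama tag split and the quant extraction) are easy to try out on their own. The sketch below copies `QUANT_PATTERNS` from the added file and wraps the two steps in hypothetical helpers named `splitOllamaTag` and `splitQuant`; it is an illustration of the logic shown above, not a substitute for the module.

```javascript
// Patterns copied from the added model-name.mjs; order matters (first match wins).
const QUANT_PATTERNS = [
  /[-_]UD-[A-Z0-9_]+/i,
  /[-_]IQ[0-9_]+(?:_[A-Z]+)?/i,
  /[-_]Q\d_K_[A-Z]+/i,
  /[-_]Q\d_[01]/i,
  /[-_]F(?:16|32)/i,
  /[-_]BF16/i,
];

// Hypothetical helper: step 3, pull the first matching quant suffix out of the name.
function splitQuant(name) {
  for (const pattern of QUANT_PATTERNS) {
    const match = name.match(pattern);
    if (match) {
      return {
        quant: match[0].replace(/^[-_]/, ""),
        rest: name.slice(0, match.index) + name.slice(match.index + match[0].length),
      };
    }
  }
  return { quant: null, rest: name };
}

// Hypothetical helper: step 2, split "model:tag" on the last colon.
function splitOllamaTag(name) {
  const colonIdx = name.lastIndexOf(":");
  if (colonIdx === -1) return { base: name, tag: null };
  return { base: name.slice(0, colonIdx), tag: name.slice(colonIdx + 1) };
}
```

With these helpers, `splitQuant("Qwen3-30B-A3B-Q4_K_M")` yields quant `Q4_K_M` and rest `Qwen3-30B-A3B`, and `splitOllamaTag("gemma3:4b")` yields base `gemma3` and tag `4b`. One thing worth checking against real filenames: as written, the IQ pattern's `[0-9_]+` consumes the underscore before the optional group can, so `splitQuant("Llama-3.2-3B-Instruct-IQ4_XS")` yields quant `IQ4_` and leaves `XS` attached to the rest of the name.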

package/src/process.mjs CHANGED

@@ -132,12 +132,27 @@ export async function modelLoadedOnServer(profile) {
   return matches;
 }
+export async function modelAvailableOnServer(profile) {
+  const backend = backendFor(profile.backend);
+  if (backend.id === "ollama") {
+    return modelIdsMatch(await ollamaAvailableModelIds(profile), expectedModelIds(profile));
+  }
+  if (backend.id === "omlx") {
+    // /v1/models lists discovered models; an ID must exist there to be usable.
+    return modelIdsMatch(await serverModelIds(profile.baseUrl), expectedModelIds(profile));
+  }
+  // Local servers are tied to a specific model file via their command argv.
+  return true;
+}
 export async function profileRuntimeStatus(profile) {
   const backend = backendFor(profile.backend);
   if (backend.type === "managed-server") {
     const ready = await serverReady(profile.baseUrl);
-    const modelLoaded = ready ? await modelLoadedOnServer(profile) : false;
-    return { state: null, pid: null, running: ready && modelLoaded, ready, serverUp: ready, modelLoaded, rssBytes: null, startedAt: null };
+    const [modelLoaded, modelAvailable] = ready
+      ? await Promise.all([modelLoadedOnServer(profile), modelAvailableOnServer(profile)])
+      : [false, false];
+    return { state: null, pid: null, running: ready && modelLoaded, ready, serverUp: ready, modelLoaded, modelAvailable, rssBytes: null, startedAt: null };
   }
   const state = await readState(profile.id);
   const running = Boolean(state?.pid && pidAlive(state.pid));
@@ -211,6 +226,15 @@ async function ollamaLoadedModelIds(profile) {
     .filter(Boolean);
 }
+async function ollamaAvailableModelIds(profile) {
+  const result = await fetchJson(`${apiRootUrl(profile.baseUrl)}/api/tags`);
+  if (!result.ok) return [];
+  return (Array.isArray(result.data?.models) ? result.data.models : [])
+    .flatMap((model) => [model?.name, model?.model])
+    .map((id) => String(id ?? "").trim())
+    .filter(Boolean);
+}
 async function omlxLoadedModelIds(profile) {
   const statusResult = await fetchJson(`${profile.baseUrl.replace(/\/+$/u, "")}/models/status`);
   const fromStatus = statusResult.ok
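
The new `ollamaAvailableModelIds` helper flattens the `models` array of Ollama's `/api/tags` response into a list of identifiers. The sketch below shows the same flattening as a pure function over an already-fetched payload, so it can be tried without a running server. `idsFromOllamaTags` and `anyIdMatches` are hypothetical names, and the exact-match comparison is an assumption, since the package's `modelIdsMatch` is not shown in this diff.

```javascript
// Sketch only: same flattening as ollamaAvailableModelIds, minus the HTTP call.
function idsFromOllamaTags(payload) {
  const models = Array.isArray(payload?.models) ? payload.models : [];
  return models
    .flatMap((model) => [model?.name, model?.model]) // both fields may carry the id
    .map((id) => String(id ?? "").trim())
    .filter(Boolean); // drop empty strings from missing fields
}

// Hypothetical stand-in for modelIdsMatch: exact string match (an assumption).
function anyIdMatches(available, expected) {
  const known = new Set(available);
  return expected.some((id) => known.has(id));
}
```

Given a payload such as `{ models: [{ name: "gemma3:4b", model: "gemma3:4b" }, { name: "qwen3:8b" }] }`, the first function returns the ids with duplicates preserved, and `anyIdMatches` reports whether any expected id is among them. A malformed or empty payload yields an empty list, which the caller would treat as "model not available".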

package/src/scan.mjs CHANGED

@@ -3,6 +3,7 @@ import { readdir } from "node:fs/promises";
 import { basename, dirname, join } from "node:path";
 import { getModelScanDirs } from "./config.mjs";
 import { readGgufMetadata } from "./gguf.mjs";
+import { parseModelName } from "./model-name.mjs";
 // ── Scan for GGUF models and MTP drafters ────────────────────────────────
@@ -48,6 +49,7 @@ async function scanOneDir(root) {
     const mmprojPath = mmprojs.find((candidate) => dirname(candidate) === dir) ?? null;
     const name = basename(path).replace(/\.gguf$/i, "");
     const sizeBytes = statSync(path).size;
+    const parsed = parseModelName(name, "local-gguf");
     // Read GGUF metadata to detect drafter architecture
     const meta = safeReadGgufMetadata(path);
@@ -57,9 +59,9 @@ async function scanOneDir(root) {
       // This is an MTP drafter model, not a main model
       drafters.push({
         path,
-        label: labelFromName(name),
-        aliasSuggestion: aliasFromName(name),
-        quant: quantFromName(name),
+        label: parsed.display,
+        aliasSuggestion: parsed.id,
+        quant: parsed.quant,
         sizeBytes,
         architecture,
         targetHint: drafterTargetHint(name),
@@ -70,9 +72,9 @@ async function scanOneDir(root) {
       models.push({
         path,
         mmprojPath,
-        label: labelFromName(name),
-        aliasSuggestion: aliasFromName(name),
-        quant: quantFromName(name),
+        label: parsed.display,
+        aliasSuggestion: parsed.id,
+        quant: parsed.quant,
         sizeBytes,
         backend: "llama-cpp",
         source: "local-gguf",
@@ -143,20 +145,7 @@ async function findFiles(root, predicate) {
   return result;
 }
-function labelFromName(name) {
-  return name
-    .replace(/-/g, " ")
-    .replace(/\bqwen/i, "Qwen")
-    .replace(/q4_k_m/i, "Q4_K_M");
-}
-function aliasFromName(name) {
-  return name.replace(/-Q4_K_M$/i, "-GGUF");
-}
-function quantFromName(name) {
-  return name.match(/(Q\d_K_[A-Z]+|Q\d_[01]|UD-[A-Z0-9_]+)/)?.[1];
-}
+// (labelFromName, aliasFromName, quantFromName removed — parseModelName in model-name.mjs is the single path)
 function safeReadGgufMetadata(path) {