npm - mohdel - Versions diffs - 0.106.0 → 0.107.0 - Mend

mohdel 0.106.0 → 0.107.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15) hide show

package/README.md +53 -66
package/config/curated.example.json +81 -0
package/config/curated.schema.json +156 -0
package/package.json +4 -4
package/src/cli/ask.js +38 -0
package/src/cli/check.js +5 -4
package/src/cli/default.js +2 -2
package/src/cli/doctor.js +180 -0
package/src/cli/index.js +4 -0
package/src/cli/model.js +51 -0
package/src/cli/onboard.js +7 -4
package/src/lib/common.js +15 -1
package/src/lib/curated-cache.js +3 -2
package/src/lib/rank.js +2 -1
package/src/lib/select.js +3 -2

package/README.md CHANGED Viewed

@@ -1,54 +1,36 @@
 # Mohdel
-Self-hosted LLM gateway with an embeddable SDK. Process-isolated, OpenTelemetry-native inference across 11 providers — streaming, tools, thinking control, image generation — without orchestration. Run `thin-gate` as a subprocess for fault isolation, cross-process quota, and any-language HTTP callers; or use the Node factory in-process for CLI tools, scripts, and single-process services.
+One Node API and one CLI for 11 LLM providers — call any model with the same `answer()` shape, get tokens and per-call USD cost back, swap models by changing one string. Self-hosted: your keys, your infra, no SaaS proxy in the path.
-Providers: Anthropic, OpenAI, Gemini, Mistral, Groq, xAI, Cerebras, Fireworks, DeepSeek, OpenRouter, Novita.
-Node 22+, ES modules.
-This README covers install, the `mo` CLI, and configuration. For the JS library guide see [INTEGRATION.md](INTEGRATION.md). For design rationale see [ARCHITECTURE.md](ARCHITECTURE.md). For logging conventions see [LOGGING.md](LOGGING.md).
-Three planes: JS client over a unix socket, Rust thin-gate as the scheduler + state owner, JS session as the provider executor. The `mohdel()` factory path runs the same session inline for single-process consumers. See the **Architecture** section below for a tour.
-## What mohdel is not
-Scope-capping is deliberate. If you're shopping for any of the following, mohdel is the wrong layer — use it *alongside* your framework of choice, not instead of it.
-- **Not an orchestrator.** No chains, no agents, no memory, no prompt templates, no retrieval. Wrap mohdel with LangChain, LangGraph, LlamaIndex, Vercel AI SDK, or your own tool loop — mohdel exposes the inference primitive, orchestration stays in your application.
-- **Not a retry / fallback engine.** Errors are classified (`retryable`, `severity`, `type`) so the caller can decide, but mohdel never retries or swaps models silently. Silent model-swapping would conflict with existing multi-model logic upstream; the caller owns the retry budget and fallback choice.
-- **Not a response cache.** The `cache: true` flag on envelopes is for provider-side prompt caching (Anthropic, OpenAI) — not mohdel-level memoization. Caching inference *results* is orchestration-policy territory and depends on invariants only the caller knows.
-- **Not a context-window / token manager.** No pre-call token count, no projected-cost guard. The caller owns what goes in the prompt and is the source of truth for what counts.
-- **Not a SaaS proxy.** Self-hosted. Your API keys, your infra. No routing through a third party, no vendor lock-in.
-See [ARCHITECTURE.md §Design principles](ARCHITECTURE.md#design-principles) for the full rationale behind each.
-## Observability out of the box
-Every call emits:
+```bash
+npm install -g mohdel
+mo                                     # interactive setup — pick a provider, paste your API key
+mo ask gemini/gemini-3-flash-preview "why is the sky blue"
+```
-- **OpenTelemetry span** (`mohdel.session.answer`) under the caller's `traceparent`, with GenAI semantic-convention attributes (`gen_ai.request.model`, `gen_ai.system`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`) plus mohdel's own (`mohdel.status`, `mohdel.cost`, `mohdel.thinking_tokens`, `mohdel.time_to_first_token_ms`, `mohdel.cooldown` on fast-fail).
-- **Trace-linked logs** — every stderr log line carries `{traceId, spanId, callId, authId, provider, model}`. Dump logs + traces into the same collector (SigNoz, Honeycomb, Jaeger + Loki) and they're correlated for free. No per-call instrumentation code.
-- **Gate-side OTLP metrics** (when running `thin-gate`): `mohdel.sessions.{alive,respawned,spawn_failures}`, `mohdel.calls{provider,status}`, `mohdel.call.duration_ms`, `mohdel.cooldown.rejections`, `mohdel.quota.rejections`, `mohdel.policy.errors`.
+Providers: Anthropic, OpenAI, Gemini, Mistral, Groq, xAI, Cerebras, Fireworks, DeepSeek, OpenRouter, Novita. Node 22+, ES modules.
-One endpoint for everything: set `OTEL_EXPORTER_OTLP_ENDPOINT` and spans + metrics flow to it over gRPC. No-op when unset — zero overhead for callers who aren't wired. See [INTEGRATION.md §OpenTelemetry](INTEGRATION.md#opentelemetry) and [LOGGING.md](LOGGING.md) for details.
+## Why mohdel
-The OTel SDK packages (`@opentelemetry/sdk-node`, `@opentelemetry/exporter-trace-otlp-grpc`) are **`optionalDependencies`** — installed by default, but `npm install --omit=optional` skips them (along with their gRPC transitive tree). If you do that and later want trace export, install them explicitly:
+- **One interface across providers.** Same `answer()` call, same event stream, same `{ status, output, inputTokens, outputTokens, cost }` result. Switching from `anthropic/claude-sonnet-4-6` to `openai/gpt-5.4-mini` is one string change — adapter differences stay inside mohdel.
+- **Real numbers on every call.** Token counts and per-call USD cost computed from your own pricing catalog (`curated.json`) — not estimates, not provider-specific shapes. See [docs/CATALOG.md](docs/CATALOG.md) for the catalog format.
+- **Observability without instrumentation.** OpenTelemetry spans, trace-linked logs, and OTLP metrics over one endpoint. Set `OTEL_EXPORTER_OTLP_ENDPOINT`; everything else is wired.
+- **Two integration paths, same API.** In-process factory for CLI tools, scripts, single-process services. Optional `thin-gate` subprocess for fault isolation, cross-process quota, and any-language HTTP callers — no code change to switch.
+- **Self-hosted, no vendor in the path.** API keys live in `~/.config/mohdel/`. Mohdel calls provider APIs directly; nothing routes through a third party.
-```bash
-npm install @opentelemetry/sdk-node @opentelemetry/exporter-trace-otlp-grpc
-```
+## Documentation
-`@opentelemetry/api` stays in `dependencies` — the no-op tracer needs it regardless of whether export is wired.
+- [INTEGRATION.md](INTEGRATION.md) — JS library guide (factory, client, answer options, tools, streaming, vision, errors, OTel)
+- [docs/COOKBOOK.md](docs/COOKBOOK.md) — copy-paste recipes (summarize a file, stream, swap providers, tools, vision, batch + cost)
+- [docs/CATALOG.md](docs/CATALOG.md) — `curated.json` walkthrough with worked examples
+- [docs/GLOSSARY.md](docs/GLOSSARY.md) — short definitions for envelope, thin-gate, session, creator vs provider, status, …
+- [ARCHITECTURE.md](ARCHITECTURE.md) — design rationale, three-plane architecture
+- [PROTOCOL.md](PROTOCOL.md) — wire format for porting clients/sessions to other languages
+- [LOGGING.md](LOGGING.md) — log levels, prefixes, pino integration
 ## Quick Start
-```bash
-npm install -g mohdel
-mo                                     # interactive setup — pick a provider, paste your API key
-mo ask gemini/gemini-3-flash-preview "why is the sky blue"
-```
-That's it. `mo` guides you through getting an API key (Gemini, Groq, and Cerebras all have free tiers).
+The three lines at the top of this README are the whole onboarding: install, run `mo` to pick a provider and paste your API key, then `mo ask`. Gemini, Groq, and Cerebras all have free tiers — start there if you don't already have a paid key.
 Model IDs always use the `<provider>/<model>` format:
@@ -59,6 +41,18 @@ openai/gpt-5.4-mini
 groq/llama-4-scout-17b-16e-instruct
 ```
+## What mohdel is not
+Scope-capping is deliberate. If you're shopping for any of the following, mohdel is the wrong layer — use it *alongside* your framework of choice, not instead of it.
+- **Not an orchestrator.** No chains, no agents, no memory, no prompt templates, no retrieval. Wrap mohdel with LangChain, LangGraph, LlamaIndex, Vercel AI SDK, or your own tool loop — mohdel exposes the inference primitive, orchestration stays in your application.
+- **Not a retry / fallback engine.** Errors are classified (`retryable`, `severity`, `type`) so the caller can decide, but mohdel never retries or swaps models silently. Silent model-swapping would conflict with existing multi-model logic upstream; the caller owns the retry budget and fallback choice.
+- **Not a response cache.** The `cache: true` flag on envelopes is for provider-side prompt caching (Anthropic, OpenAI) — not mohdel-level memoization. Caching inference *results* is orchestration-policy territory and depends on invariants only the caller knows.
+- **Not a context-window / token manager.** No pre-call token count, no projected-cost guard. The caller owns what goes in the prompt and is the source of truth for what counts.
+- **Not a SaaS proxy.** Self-hosted. Your API keys, your infra. No routing through a third party, no vendor lock-in.
+See [ARCHITECTURE.md §Design principles](ARCHITECTURE.md#design-principles) for the full rationale behind each.
 ## CLI
 ```bash
@@ -141,6 +135,24 @@ No subprocess; the factory runs the same session adapters inline. Right for CLI
 For the full API — initialization, alias resolution, answer options, response shape, tool use, streaming, vision, error handling, OpenTelemetry, sub-path exports — see **[INTEGRATION.md](INTEGRATION.md)**.
+## Observability
+Every call emits:
+- **OpenTelemetry span** (`mohdel.session.answer`) under the caller's `traceparent`, with GenAI semantic-convention attributes (`gen_ai.request.model`, `gen_ai.system`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`) plus mohdel's own (`mohdel.status`, `mohdel.cost`, `mohdel.thinking_tokens`, `mohdel.time_to_first_token_ms`, `mohdel.cooldown` on fast-fail).
+- **Trace-linked logs** — every stderr log line carries `{traceId, spanId, callId, authId, provider, model}`. Dump logs + traces into the same collector (SigNoz, Honeycomb, Jaeger + Loki) and they're correlated for free. No per-call instrumentation code.
+- **Gate-side OTLP metrics** (when running `thin-gate`): `mohdel.sessions.{alive,respawned,spawn_failures}`, `mohdel.calls{provider,status}`, `mohdel.call.duration_ms`, `mohdel.cooldown.rejections`, `mohdel.quota.rejections`, `mohdel.policy.errors`.
+One endpoint for everything: set `OTEL_EXPORTER_OTLP_ENDPOINT` and spans + metrics flow to it over gRPC. No-op when unset — zero overhead for callers who aren't wired. See [INTEGRATION.md §OpenTelemetry](INTEGRATION.md#opentelemetry) and [LOGGING.md](LOGGING.md) for details.
+The OTel SDK packages (`@opentelemetry/sdk-node`, `@opentelemetry/exporter-trace-otlp-grpc`) are **`optionalDependencies`** — installed by default, but `npm install --omit=optional` skips them (along with their gRPC transitive tree). If you do that and later want trace export, install them explicitly:
+```bash
+npm install @opentelemetry/sdk-node @opentelemetry/exporter-trace-otlp-grpc
+```
+`@opentelemetry/api` stays in `dependencies` — the no-op tracer needs it regardless of whether export is wired.
 ## Architecture
 Mohdel splits into three planes that can be deployed independently:
@@ -176,26 +188,7 @@ With no session-bin configured, thin-gate runs in demo mode: `POST /v1/call` ret
 ### Calling from JS
-```js
-import { call } from 'mohdel/client'
-const envelope = {
-  callId: 'c-1',
-  authId: 'u-1',
-  auth: { key: process.env.ANTHROPIC_API_SK },
-  model: 'anthropic/claude-haiku-4-5',
-  prompt: 'Say hi.',
-  outputBudget: 100
-}
-for await (const ev of call(envelope, { socketPath: '/tmp/mohdel-data.sock' })) {
-  if (ev.type === 'delta') process.stdout.write(ev.delta.delta)
-  else if (ev.type === 'done') console.log('\n→ status:', ev.result.status, 'cost:', ev.result.cost)
-  else if (ev.type === 'error') console.error('error:', ev.error.message)
-}
-```
-Client surface is deliberately tiny: `call(envelope, { socketPath, signal? })`. Pass an `AbortSignal` to cancel in flight; thin-gate will forward a cancel control message to the session and reuse it on the pool. The envelope is the flat `answer(prompt, options)` surface plus transport metadata (`callId`, `authId`, `auth.key`, optional `traceparent`); see [`js/core/envelope.js`](js/core/envelope.js) for the full field list.
+The client snippet under [Library Usage](#library-usage) above is the full surface: `call(envelope, { socketPath, signal? })` returns an async iterable of events. Pass an `AbortSignal` to cancel in flight; thin-gate forwards a cancel control message to the session and reuses it on the pool. The envelope is the flat `answer(prompt, options)` surface plus transport metadata (`callId`, `authId`, `auth.key`, optional `traceparent`); see [`js/core/envelope.js`](js/core/envelope.js) for the full field list.
 ### Canonical types (frozen wire contract)
@@ -366,12 +359,6 @@ Fork the repository and submit a pull request. Code style: Node 22+, ES modules,
 **Mohdel's wire is language-agnostic.** The JS client is the first implementation, not the only one — a Python / Go / Ruby / Swift / Elixir / ... client is a great starter contribution. See [CONTRIBUTING.md §Porting a client to another language](CONTRIBUTING.md#porting-a-client-to-another-language) and [PROTOCOL.md](PROTOCOL.md).
-## See Also
-- [INTEGRATION.md](INTEGRATION.md) — embed mohdel in your code (factory, model proxy, answer options, tool use, streaming, vision, errors, OTel)
-- [ARCHITECTURE.md](ARCHITECTURE.md) — design decisions and rationale
-- [LOGGING.md](LOGGING.md) — log levels, prefixes, pino integration
 ## License
 MIT. See `LICENSE`.

package/config/curated.example.json ADDED Viewed

@@ -0,0 +1,81 @@
+{
+  "$schema": "./curated.schema.json",
+  "_comment": "Worked examples for ~/.config/mohdel/curated.json. See docs/CATALOG.md for the field reference. Each top-level key is '<provider>/<model-id>'. The provider segment is the routing key; the model-id segment must match the value of the 'model' field below (which is the literal id mohdel sends to the provider's API).",
+  "anthropic/claude-haiku-4-5": {
+    "_comment_a": "Minimal-but-useful entry. Required fields only: model, creator, inputFormat. Everything else is recommended — without prices, 'cost' on results will be 0; without contextTokenLimit, callers can't enforce input budgets.",
+    "model": "claude-haiku-4-5-20251001",
+    "creator": "anthropic",
+    "provider": "anthropic",
+    "sdk": "anthropic",
+    "label": "Claude Haiku 4.5",
+    "inputFormat": ["text", "image"],
+    "inputPrice": 1,
+    "outputPrice": 5,
+    "contextTokenLimit": 200000,
+    "outputTokenLimit": 64000,
+    "tags": ["fast", "chat"]
+  },
+  "anthropic/claude-sonnet-4-6": {
+    "_comment_b": "Full-featured entry. Adds cache pricing (provider-side prompt caching), thinking effort levels (mohdel translates 'low'/'medium'/'high'/etc. to the provider's native budget), default thinking effort, and a leaderboard tuple. The leaderboard is [intelligence, speed, latency] — used by 'mo rank'.",
+    "model": "claude-sonnet-4-6",
+    "creator": "anthropic",
+    "provider": "anthropic",
+    "sdk": "anthropic",
+    "label": "Claude Sonnet 4.6",
+    "inputFormat": ["text", "image"],
+    "inputPrice": 3,
+    "outputPrice": 15,
+    "cacheWritePrice": 3.75,
+    "cacheReadPrice": 0.30,
+    "contextTokenLimit": 1000000,
+    "outputTokenLimit": 128000,
+    "defaultThinkingEffort": "medium",
+    "thinkingEffortLevels": {
+      "low": 100,
+      "medium": 500,
+      "high": 2000,
+      "max": 5000
+    },
+    "tags": ["chat", "tool-loop", "vision"],
+    "leaderboard": [80, 95, 1.2]
+  },
+  "anthropic/claude-3-7-sonnet": {
+    "_comment_c": "Deprecated stub: a one-field entry that redirects callers to the replacement. 'mo' will refuse to use this id and point at the target. Stubs do NOT need any other fields.",
+    "deprecated": "anthropic/claude-sonnet-4-6"
+  },
+  "novita/flux-2-dev": {
+    "_comment_d": "Image-generation entry. type:'image' selects the image dispatcher (no streaming, no tools). imageEndpoint is the provider-side endpoint name; imagePrice is per-image USD.",
+    "model": "flux-2-dev",
+    "creator": "bfl",
+    "provider": "novita",
+    "label": "Flux 2 Dev",
+    "inputFormat": ["text"],
+    "type": "image",
+    "imagePrice": 0.012,
+    "imageEndpoint": "flux-2-dev",
+    "imageDefaultSize": "1024x1024",
+    "tags": ["image"]
+  },
+  "openai/gpt-5.4-mini": {
+    "_comment_e": "Entry with custom rate limits. rpmLimit and tpmLimit override the provider-level defaults in providers.json. rateLimitScope:'model' means the limit is per-model; 'provider' means it joins the provider-level pool.",
+    "model": "gpt-5.4-mini",
+    "creator": "openai",
+    "provider": "openai",
+    "sdk": "openai",
+    "label": "GPT-5.4 Mini",
+    "inputFormat": ["text", "image"],
+    "inputPrice": 0.15,
+    "outputPrice": 0.60,
+    "contextTokenLimit": 400000,
+    "outputTokenLimit": 16384,
+    "rpmLimit": 500,
+    "tpmLimit": 2000000,
+    "rateLimitScope": "model",
+    "tags": ["chat", "cheap"]
+  }
+}

package/config/curated.schema.json ADDED Viewed

@@ -0,0 +1,156 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "https://github.com/clbrge/mohdel/blob/main/config/curated.schema.json",
+  "title": "Mohdel curated.json",
+  "description": "Schema for ~/.config/mohdel/curated.json — see docs/CATALOG.md for the full reference.",
+  "type": "object",
+  "propertyNames": {
+    "pattern": "^([a-z0-9_-]+/.+|_[a-zA-Z0-9_-]+)$",
+    "description": "Model id in the form '<provider>/<model>'. Underscore-prefixed keys (like '_comment') are allowed for inline notes."
+  },
+  "additionalProperties": {
+    "oneOf": [
+      { "$ref": "#/$defs/deprecatedStub" },
+      { "$ref": "#/$defs/modelEntry" },
+      { "type": "string", "description": "Inline comment value for underscore-prefixed keys." }
+    ]
+  },
+  "$defs": {
+    "deprecatedStub": {
+      "type": "object",
+      "description": "Redirect from a retired model id to its replacement.",
+      "required": ["deprecated"],
+      "additionalProperties": true,
+      "properties": {
+        "deprecated": {
+          "type": "string",
+          "description": "Replacement model id (e.g. 'anthropic/claude-sonnet-4-6')."
+        }
+      }
+    },
+    "modelEntry": {
+      "type": "object",
+      "description": "A real model entry. 'model', 'creator', and 'inputFormat' are required.",
+      "required": ["model", "creator", "inputFormat"],
+      "additionalProperties": true,
+      "properties": {
+        "model": {
+          "type": "string",
+          "description": "Literal model id sent to the provider's API."
+        },
+        "creator": {
+          "type": "string",
+          "description": "Organization that trained the model (e.g. 'anthropic', 'openai', 'alibaba')."
+        },
+        "inputFormat": {
+          "type": "array",
+          "description": "Accepted input modalities. Defaults to ['text'].",
+          "items": { "type": "string", "enum": ["text", "image", "video"] },
+          "minItems": 1,
+          "uniqueItems": true
+        },
+        "provider": {
+          "type": "string",
+          "description": "Routing provider. Defaults to the provider segment of the catalog key."
+        },
+        "sdk": { "type": "string", "description": "SDK adapter to use (some providers require this)." },
+        "type": {
+          "type": "string",
+          "enum": ["model", "image"],
+          "default": "model",
+          "description": "'model' for chat/completion, 'image' for image generation."
+        },
+        "label": { "type": "string", "description": "Human-readable name shown in UIs." },
+        "description": { "type": "string" },
+        "version": { "type": "string" },
+        "createdAt": { "type": "string", "format": "date-time" },
+        "created": { "type": "number", "description": "Unix timestamp (seconds)." },
+        "inputPrice": {
+          "oneOf": [
+            { "type": "number", "minimum": 0 },
+            { "type": "object", "required": ["default"], "additionalProperties": { "type": "number" }, "properties": { "default": { "type": "number" } } }
+          ],
+          "description": "USD per 1M input tokens. Object form is for tiered pricing — must include a 'default' key."
+        },
+        "outputPrice": {
+          "oneOf": [
+            { "type": "number", "minimum": 0 },
+            { "type": "object", "required": ["default"], "additionalProperties": { "type": "number" }, "properties": { "default": { "type": "number" } } }
+          ],
+          "description": "USD per 1M output tokens."
+        },
+        "thinkingPrice": {
+          "oneOf": [
+            { "type": "number", "minimum": 0 },
+            { "type": "object", "required": ["default"], "additionalProperties": { "type": "number" }, "properties": { "default": { "type": "number" } } }
+          ],
+          "description": "USD per 1M thinking/reasoning tokens (when the provider bills these separately)."
+        },
+        "cacheWritePrice": { "type": "number", "minimum": 0, "description": "USD per 1M tokens written to provider-side prompt cache." },
+        "cacheReadPrice": { "type": "number", "minimum": 0, "description": "USD per 1M tokens served from provider-side prompt cache." },
+        "contextTokenLimit": { "type": "integer", "minimum": 1, "description": "Maximum total tokens (input + output)." },
+        "outputTokenLimit": { "type": "integer", "minimum": 1, "description": "Maximum output tokens per call." },
+        "thinkingTokenLimit": { "type": "integer", "minimum": 1, "description": "Maximum thinking tokens per call (when separate from output)." },
+        "tokenizerHeadroom": { "type": "number", "exclusiveMinimum": 0, "description": "Multiplier applied to local token estimates to account for tokenizer drift." },
+        "thinkingEffortLevels": {
+          "oneOf": [
+            { "type": "null" },
+            {
+              "type": "object",
+              "description": "Map effort name → provider-native budget. Standard names: low/medium/high/xhigh/max/none.",
+              "additionalProperties": { "type": "number", "minimum": 0 }
+            }
+          ]
+        },
+        "defaultThinkingEffort": { "type": "string", "description": "Effort level used when the envelope omits 'outputEffort'." },
+        "tags": {
+          "type": "array",
+          "items": { "type": "string", "pattern": "^[a-zA-Z][a-zA-Z0-9._-]{0,31}$" },
+          "uniqueItems": true,
+          "description": "Free-form classification tags (used by 'mo bench --tag', 'mo rank', and caller-side selection)."
+        },
+        "aliases": {
+          "type": "array",
+          "items": { "type": "string" },
+          "description": "Alternative ids that should resolve to this entry."
+        },
+        "replaces": {
+          "type": "array",
+          "items": { "type": "string" },
+          "description": "Older model ids this one supersedes."
+        },
+        "leaderboard": {
+          "type": "array",
+          "items": { "type": "number" },
+          "minItems": 3,
+          "maxItems": 3,
+          "description": "[intelligence, speed, latency] triple — drives 'mo rank'."
+        },
+        "leaderboardNote": { "type": "string" },
+        "supportsTools": { "type": "boolean", "description": "Set false to mark a model as tool-less." },
+        "imagePrice": { "type": "number", "minimum": 0, "description": "USD per generated image (image-type entries)." },
+        "imageEndpoint": { "type": "string", "description": "Provider-side image endpoint name." },
+        "imageDefaultSize": { "type": "string", "description": "Default size when envelope omits one (e.g. '1024x1024')." },
+        "rpmLimit": { "type": "integer", "minimum": 1, "description": "Requests per minute. Overrides provider default." },
+        "tpmLimit": { "type": "integer", "minimum": 1, "description": "Tokens per minute. Overrides provider default." },
+        "rateLimitScope": {
+          "type": "string",
+          "enum": ["model", "provider"],
+          "description": "'model' = private budget. 'provider' = shared with provider-level pool."
+        },
+        "deprecated": { "type": "string", "description": "If present, this entry is treated as a stub (use 'deprecatedStub' shape)." },
+        "suspended": { "type": "string", "description": "Reason this model is temporarily disabled." },
+        "displayName": { "type": "string", "deprecated": true, "description": "Deprecated — use 'label'." }
+      }
+    }
+  }
+}

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "mohdel",
-  "version": "0.106.0",
+  "version": "0.107.0",
   "license": "MIT",
   "author": {
     "name": "Christophe Le Bars",
@@ -87,12 +87,12 @@
     "@opentelemetry/exporter-trace-otlp-grpc": "^0.218.0",
     "@opentelemetry/sdk-node": "^0.218.0",
     "chalk": "^5.4.0",
-    "mohdel-thin-gate-linux-x64-gnu": "0.106.0"
+    "mohdel-thin-gate-linux-x64-gnu": "0.107.0"
   },
   "dependencies": {
-    "@anthropic-ai/sdk": "^0.95.2",
+    "@anthropic-ai/sdk": "^0.96.0",
     "@cerebras/cerebras_cloud_sdk": "^1.61.1",
-    "@google/genai": "^2.2.0",
+    "@google/genai": "^2.3.0",
     "@opentelemetry/api": "^1.9.1",
     "env-paths": "^4.0.0",
     "groq-sdk": "^1.2.0",

package/src/cli/ask.js CHANGED Viewed

@@ -3,6 +3,42 @@ import { loadDefaultEnv } from '../lib/common.js'
 const noop = () => {}
+// Friendly next-step hints for common ask-time failures. Pure pattern match on
+// err.message — keeps the lib layer neutral, but gives CLI users a copy-pasteable
+// command instead of just an error.
+const hintsForError = (err, modelId) => {
+  const msg = String(err?.message || '')
+  const detail = String(err?.detail || '')
+  const both = `${msg}\n${detail}`
+  const provider = modelId.includes('/') ? modelId.split('/')[0] : null
+  const hints = []
+  if (/not found in curated models/i.test(both)) {
+    if (provider) {
+      hints.push(`→ run:  mo curate ${provider}        # add upstream models from this provider`)
+      hints.push(`→ or:   mo model add ${modelId}      # add this one manually`)
+    } else {
+      hints.push('→ run:  mo ls                       # list available models')
+    }
+    hints.push('→ see:  docs/CATALOG.md             # catalog format reference')
+  }
+  if (/API key not found/i.test(both) || /AUTH_INVALID/i.test(err?.type || '') || /401|unauthorized|invalid api key/i.test(both)) {
+    if (provider) hints.push(`→ run:  mo setup ${provider}`)
+    else hints.push('→ run:  mo                          # interactive provider/key setup')
+  }
+  if (/deprecated/i.test(both) && /replacement/i.test(both)) {
+    hints.push('→ run:  mo check                    # find broken deprecation links in curated.json')
+  }
+  if (/Provider configuration for/i.test(both)) {
+    hints.push('→ see:  docs/CATALOG.md             # the provider segment must match a known adapter')
+  }
+  return hints
+}
 export async function runAsk (args) {
   if (args.includes('-h') || args.includes('--help')) {
     console.log(`mohdel ask — one-shot inference, pipeable
@@ -97,6 +133,7 @@ Examples:
     model = mo.use(modelId)
   } catch (err) {
     console.error(err.message)
+    for (const h of hintsForError(err, modelId)) console.error(h)
     process.exit(1)
   }
@@ -155,6 +192,7 @@ Examples:
     if (summary.length) process.stderr.write(`${summary.join(', ')}\n`)
   } catch (err) {
     console.error(`Error: ${err.detail || err.message}`)
+    for (const h of hintsForError(err, modelId)) console.error(h)
     process.exit(1)
   }
 }

package/src/cli/check.js CHANGED Viewed

@@ -1,7 +1,7 @@
 import { label, err, warn, ok } from './colors.js'
 import providers from '../lib/providers.js'
 import { validate, isValidTag } from '../lib/schema.js'
-import { getCuratedModels, loadDefaultEnv } from '../lib/common.js'
+import { getCuratedModels, loadDefaultEnv, catalogEntries, catalogValues } from '../lib/common.js'
 // --- Local validation ---
@@ -10,7 +10,7 @@ const checkLocal = (curated) => {
   const warnings = []
   const knownProviders = new Set(Object.keys(providers))
-  for (const [key, spec] of Object.entries(curated)) {
+  for (const [key, spec] of catalogEntries(curated)) {
     const [keyProvider] = key.split('/')
     if (spec.deprecated) {
@@ -87,8 +87,9 @@ detection, file an issue — it'll be rebuilt on the /session stack.`)
   const json = args.includes('--json')
   const curated = await getCuratedModels()
-  const active = Object.values(curated).filter(s => !s.deprecated).length
-  const deprecated = Object.values(curated).length - active
+  const all = catalogValues(curated)
+  const active = all.filter(s => !s.deprecated).length
+  const deprecated = all.length - active
   if (!json) {
     console.log(`${label('Catalog:')} ${active} active, ${deprecated} deprecated\n`)

package/src/cli/default.js CHANGED Viewed

@@ -1,12 +1,12 @@
 import { intro, outro, select, isCancel, cancel } from '@clack/prompts'
-import { getCuratedModels, CONFIG_PATH, saveConfig } from '../lib/common.js'
+import { getCuratedModels, CONFIG_PATH, saveConfig, catalogEntries } from '../lib/common.js'
 import providers from '../lib/providers.js'
 export async function runDefault () {
   intro('Mohdel — Set Default Model')
   const curated = await getCuratedModels()
-  const modelOptions = Object.entries(curated).map(([modelId, info]) => ({
+  const modelOptions = catalogEntries(curated).map(([modelId, info]) => ({
     value: modelId,
     label: `${info.label} (${modelId})`
   }))

package/src/cli/doctor.js ADDED Viewed

@@ -0,0 +1,180 @@
+import { existsSync } from 'fs'
+import { label, meta, ok, warn, err, inactive } from './colors.js'
+import providers from '../lib/providers.js'
+import { validate, isValidTag } from '../lib/schema.js'
+import {
+  CONFIG_DIR, CURATED_PATH, CONFIG_PATH, ENV_PATH,
+  getCuratedModels, getConfig, loadDefaultEnv, getAPIKey, catalogEntries
+} from '../lib/common.js'
+const row = (status, name, detail = '') =>
+  `  ${status}  ${name.padEnd(20)} ${meta(detail)}`
+export async function runDoctor (args) {
+  if (args.includes('-h') || args.includes('--help')) {
+    console.log(`mohdel doctor — check that your install is wired up
+Usage:
+  mo doctor [--json]
+What it checks:
+  - Config directory and environment file exist
+  - At least one provider API key is set
+  - curated.json parses and passes schema validation
+  - Default model (if set) resolves to a real entry
+Exit code:
+  0  no errors (warnings allowed)
+  1  one or more errors — fix them before relying on the install`)
+    process.exit(0)
+  }
+  const json = args.includes('--json')
+  loadDefaultEnv()
+  const report = {
+    configDir: { ok: false, path: CONFIG_DIR },
+    envFile: { ok: false, path: ENV_PATH },
+    curatedFile: { ok: false, path: CURATED_PATH, active: 0, deprecated: 0 },
+    keys: { configured: [], missing: [] },
+    schema: { errors: [], warnings: [] },
+    defaultModel: { set: false, id: null, resolves: false },
+    errors: [],
+    warnings: []
+  }
+  // 1. Config dir + env file
+  report.configDir.ok = existsSync(CONFIG_DIR)
+  report.envFile.ok = existsSync(ENV_PATH)
+  if (!report.configDir.ok) report.warnings.push('config directory does not exist (will be created on first save)')
+  if (!report.envFile.ok) report.warnings.push(`no ${ENV_PATH} — set API keys there or via shell env`)
+  // 2. API keys per provider
+  for (const [name, def] of Object.entries(providers)) {
+    if (getAPIKey(def.apiKeyEnv)) {
+      report.keys.configured.push({ provider: name, envVar: def.apiKeyEnv })
+    } else {
+      report.keys.missing.push({ provider: name, envVar: def.apiKeyEnv })
+    }
+  }
+  if (!report.keys.configured.length) {
+    report.errors.push('no API keys configured — run "mo" to set one up')
+  }
+  // 3. curated.json — exists, parses, validates
+  let curated = null
+  try {
+    curated = await getCuratedModels()
+    report.curatedFile.ok = true
+    const entries = catalogEntries(curated)
+    report.curatedFile.active = entries.filter(([, s]) => !s.deprecated).length
+    report.curatedFile.deprecated = entries.length - report.curatedFile.active
+    // Schema validation (same logic as 'mo check', condensed)
+    const knownProviders = new Set(Object.keys(providers))
+    for (const [key, spec] of catalogEntries(curated)) {
+      if (spec.deprecated) {
+        if (!curated[spec.deprecated]) {
+          report.schema.errors.push(`${key}: deprecated target '${spec.deprecated}' missing`)
+        }
+        continue
+      }
+      for (const issue of validate(spec, key)) {
+        if (issue.severity === 'error') report.schema.errors.push(`${key}: ${issue.field} — ${issue.message}`)
+        else report.schema.warnings.push(`${key}: ${issue.field} — ${issue.message}`)
+      }
+      const [keyProvider] = key.split('/')
+      if (!knownProviders.has(keyProvider)) {
+        report.schema.errors.push(`${key}: provider '${keyProvider}' not in providers.js`)
+      }
+      if (Array.isArray(spec.tags)) {
+        for (const t of spec.tags) {
+          if (!isValidTag(t)) report.schema.warnings.push(`${key}: invalid tag "${t}"`)
+        }
+      }
+    }
+    if (report.schema.errors.length) {
+      report.errors.push(`${report.schema.errors.length} schema error(s) in curated.json — run "mo check" for details`)
+    }
+  } catch (e) {
+    report.errors.push(`curated.json: ${e.message}`)
+  }
+  // 4. Default model
+  if (existsSync(CONFIG_PATH)) {
+    try {
+      const cfg = await getConfig()
+      if (cfg.defaultModel) {
+        report.defaultModel.set = true
+        report.defaultModel.id = cfg.defaultModel
+        report.defaultModel.resolves = !!(curated && curated[cfg.defaultModel])
+        if (!report.defaultModel.resolves) {
+          report.errors.push(`default model '${cfg.defaultModel}' is not in curated.json`)
+        }
+      }
+    } catch {
+      report.warnings.push('default.json present but failed to parse')
+    }
+  } else {
+    report.warnings.push('no default model set — pass <provider/model> to "mo ask", or run "mo default"')
+  }
+  if (json) {
+    console.log(JSON.stringify(report, null, 2))
+    process.exit(report.errors.length ? 1 : 0)
+  }
+  // Pretty output
+  console.log(label('Mohdel doctor\n'))
+  console.log(label('Configuration'))
+  console.log(row(report.configDir.ok ? ok('✓') : warn('!'), 'Config dir', report.configDir.path))
+  console.log(row(report.envFile.ok ? ok('✓') : warn('!'), 'Env file', report.envFile.ok ? report.envFile.path : `${report.envFile.path} (missing)`))
+  if (report.curatedFile.ok) {
+    console.log(row(ok('✓'), 'curated.json', `${report.curatedFile.active} active, ${report.curatedFile.deprecated} deprecated`))
+  } else {
+    console.log(row(err('✗'), 'curated.json', 'failed to load'))
+  }
+  console.log()
+  console.log(label(`API keys (${report.keys.configured.length} of ${Object.keys(providers).length})`))
+  for (const k of report.keys.configured) {
+    console.log(row(ok('✓'), k.provider, k.envVar))
+  }
+  for (const k of report.keys.missing) {
+    console.log(row(inactive('○'), k.provider, `${k.envVar} (unset)`))
+  }
+  console.log()
+  console.log(label('Catalog validation'))
+  if (report.schema.errors.length) {
+    console.log(`  ${err('✗')} ${report.schema.errors.length} error(s) ${meta('— run "mo check" for details')}`)
+  } else {
+    console.log(`  ${ok('✓')} no errors`)
+  }
+  if (report.schema.warnings.length) {
+    console.log(`  ${warn('!')} ${report.schema.warnings.length} warning(s) ${meta('— run "mo check" for details')}`)
+  }
+  console.log()
+  console.log(label('Default model'))
+  if (report.defaultModel.set) {
+    const status = report.defaultModel.resolves ? ok('✓') : err('✗')
+    const note = report.defaultModel.resolves ? '' : '(not in curated.json)'
+    console.log(row(status, report.defaultModel.id, note))
+  } else {
+    console.log(row(inactive('○'), 'not set', 'pass <provider/model> to mo ask, or run "mo default"'))
+  }
+  console.log()
+  if (report.errors.length) {
+    console.log(`${err('✗')} ${report.errors.length} error(s):`)
+    for (const e of report.errors) console.log(`  ${err('✗')} ${e}`)
+    process.exit(1)
+  } else if (report.warnings.length) {
+    console.log(`${warn('!')} ${report.warnings.length} warning(s) — install is usable but not fully configured`)
+    for (const w of report.warnings) console.log(`  ${warn('!')} ${w}`)
+  } else {
+    console.log(`${ok('✓')} ready`)
+  }
+}

package/src/cli/index.js CHANGED Viewed

@@ -57,6 +57,7 @@ Commands:
   ask <provider/model> [prompt]           One-shot inference (pipeable)
   default                                 Set default model (interactive)
+  doctor                                  Check that your install is wired up
 Aliases:
   models                model list
@@ -126,6 +127,9 @@ const resolvedArgs = alias ? [...alias.inject, ...args] : args
 if (resolved === 'default') {
   const { runDefault } = await import('./default.js')
   await runDefault()
+} else if (resolved === 'doctor') {
+  const { runDoctor } = await import('./doctor.js')
+  await runDoctor(resolvedArgs)
 } else if (resolved === 'ask') {
   const { runAsk } = await import('./ask.js')
   await runAsk(resolvedArgs)

package/src/cli/model.js CHANGED Viewed

@@ -250,9 +250,35 @@ ${meta('tags:')}         ${(info.tags || []).map(t => tag(t)).join(', ') || meta
   if (action === 'add') {
     const modelId = arg1
+    if (modelId === '-h' || modelId === '--help') {
+      console.log(`mohdel model add — add a model entry to ~/.config/mohdel/curated.json
+Usage:
+  model add <provider>/<model-id>
+What it does:
+  1. Resolves <provider> against the known provider list (anthropic, openai, …)
+  2. Pre-fills 'model', 'provider', 'sdk' from that resolution
+  3. If your API key is set, fetches upstream model metadata (context, pricing, …)
+  4. Prompts for any missing required field
+Examples:
+  mo model add fireworks/deepseek-r1
+  mo model add anthropic/claude-haiku-4-5
+Required fields (asked if not pre-filled):
+  model        the literal id sent to the provider's API
+  creator      who trained the model (e.g. anthropic, openai, alibaba)
+  inputFormat  subset of [text, image, video]
+See docs/CATALOG.md for the full field reference, and
+config/curated.example.json for ready-to-copy entries.`)
+      process.exit(0)
+    }
     if (!modelId || !modelId.includes('/')) {
       console.error('Usage: model add <provider>/<model-id>')
       console.error('Example: mo model add fireworks/deepseek-r1')
+      console.error('Run "mo model add --help" for details.')
       process.exit(1)
     }
@@ -324,6 +350,31 @@ ${meta('tags:')}         ${(info.tags || []).map(t => tag(t)).join(', ') || meta
   }
   if (action === 'curate') {
+    if (arg1 === '-h' || arg1 === '--help') {
+      console.log(`mohdel model curate — bulk-add upstream models to your catalog
+Usage:
+  curate                  Pick a provider from a menu
+  curate <provider>       Curate models from one provider directly
+What it does:
+  Lists every model your API key can see at <provider>, lets you select
+  which to add to ~/.config/mohdel/curated.json. Pre-fills 'model',
+  'provider', 'sdk', and any metadata the SDK can return.
+Examples:
+  mo curate                  # interactive provider picker
+  mo curate anthropic
+  mo curate openai
+Tip: after curating, fill in the things only you know — prices, contextTokenLimit,
+tags, thinkingEffortLevels — with 'mo model set <id> <key> <value>' or by editing
+~/.config/mohdel/curated.json directly. See docs/CATALOG.md for the field reference
+and config/curated.example.json for ready-to-copy entries.
+Requires an API key for the chosen provider — run 'mo' to configure one.`)
+      process.exit(0)
+    }
     const { initializeAPIs, processModels } = await import('../lib/select.js')
     const { api, providersWithKeys } = await initializeAPIs()

package/src/cli/onboard.js CHANGED Viewed

@@ -139,10 +139,13 @@ export async function runOnboard () {
       }
     }
     console.log(`\n${meta('Commands:')}
-  mo model list          Browse models
-  mo model show <model>  Model details
-  mo default             Set default model
-  mo --help              All commands`)
+  mo ask <model> "..."     One-shot inference (pipeable)
+  mo doctor                Check install health
+  mo model list            Browse curated models
+  mo model show <model>    Model details
+  mo default               Set default model
+  mo provider setup <p>    Add another provider
+  mo --help                All commands`)
     return
   }

package/src/lib/common.js CHANGED Viewed

@@ -17,6 +17,15 @@ export const EXCLUDED_PATH = join(CONFIG_DIR, 'excluded.json')
 export const PROVIDERS_CONFIG_PATH = join(CONFIG_DIR, 'providers.json')
 export const ENV_PATH = join(CONFIG_DIR, 'environment')
+// Meta keys (e.g. $schema for JSON Schema editors, _* for inline notes) live at
+// the top level of curated.json alongside model entries. They're preserved on
+// load/save but excluded from every iteration site so they don't pollute alias
+// maps, rank indices, suggestion search, or pickers.
+export const isMetaKey = (key) => key.startsWith('$') || key.startsWith('_')
+export const catalogEntries = (catalog) => Object.entries(catalog).filter(([k]) => !isMetaKey(k))
+export const catalogKeys = (catalog) => Object.keys(catalog).filter(k => !isMetaKey(k))
+export const catalogValues = (catalog) => catalogEntries(catalog).map(([, v]) => v)
 const DEFAULT_CURATED = {}
 const DEFAULT_EXCLUDED = {}
@@ -115,6 +124,11 @@ const createFileOperation = (filePath, defaultValue = {}, operationType) => {
       if (typeof loadedData === 'object' && loadedData !== null && !Array.isArray(loadedData)) {
         const processedData = {}
         for (const [key, entryValue] of Object.entries(loadedData)) {
+          // Meta keys are preserved on round-trip but skipped from entry processing.
+          if (isMetaKey(key)) {
+            processedData[key] = entryValue
+            continue
+          }
           if (typeof entryValue !== 'object' || entryValue === null || Array.isArray(entryValue)) {
             processedData[key] = entryValue
             continue
@@ -131,7 +145,7 @@ const createFileOperation = (filePath, defaultValue = {}, operationType) => {
         }
         if (operationType === 'curated models') {
-          for (const [curatedKey, entry] of Object.entries(processedData)) {
+          for (const [curatedKey, entry] of catalogEntries(processedData)) {
             const issues = validate(entry, curatedKey)
             for (const issue of issues) {
               moduleLogger.warn(`[mohdel:schema] ${curatedKey}: ${issue.field} — ${issue.message}`)

package/src/lib/curated-cache.js CHANGED Viewed

@@ -1,4 +1,4 @@
-import { getCuratedModels, saveCuratedModels } from './common.js'
+import { getCuratedModels, saveCuratedModels, isMetaKey, catalogKeys } from './common.js'
 let curatedCache = null
 let aliasMapCache = null
@@ -17,6 +17,7 @@ const buildAliasMap = (curatedModels) => {
   // Pass 1: count names and cache parsed results
   for (const fullModelId in curatedModels) {
+    if (isMetaKey(fullModelId)) continue
     const { provider, model: modelName } = getMohdelModel(fullModelId)
     const baseMatch = modelName.match(BASE_NAME_RE)
     const baseName = baseMatch?.[1] || null
@@ -79,7 +80,7 @@ export const suggestModels = (query, maxResults = 5) => {
   const q = query.toLowerCase()
   const scored = []
-  for (const fullId of Object.keys(curatedCache)) {
+  for (const fullId of catalogKeys(curatedCache)) {
     if (curatedCache[fullId].deprecated) continue
     const entry = curatedCache[fullId]
     const label = (entry.label || '').toLowerCase()

package/src/lib/rank.js CHANGED Viewed

@@ -6,6 +6,7 @@ import { join, dirname } from 'path'
 import { fileURLToPath } from 'url'
 import { existsSync } from 'fs'
 import { CACHE_DIR } from './cache.js'
+import { catalogEntries } from './common.js'
 const __dirname = dirname(fileURLToPath(import.meta.url))
 const CONFIG_PATH = join(__dirname, '..', '..', 'config', 'benchmarks.json')
@@ -254,7 +255,7 @@ const computeGroupScores = (available, benchmarks) => {
 const buildCuratedIndex = (curated) => {
   const index = new Map()
-  for (const [key, entry] of Object.entries(curated)) {
+  for (const [key, entry] of catalogEntries(curated)) {
     if (entry.deprecated) continue
     const modelPart = key.split('/').slice(1).join('/')
     index.set(modelPart, key)

package/src/lib/select.js CHANGED Viewed

@@ -7,7 +7,8 @@ import {
   saveCuratedModels,
   saveExcludedModels,
   loadEnvFile,
-  loadDefaultEnv
+  loadDefaultEnv,
+  catalogEntries
 } from './common.js'
 import { getMohdelModel } from './curated-cache.js'
 import { stripUnknown } from './schema.js'
@@ -81,7 +82,7 @@ const findReplacementCandidates = (providerName, modelId, curated) => {
   const candidates = []
-  for (const [curatedKey, curatedInfo] of Object.entries(curated)) {
+  for (const [curatedKey, curatedInfo] of catalogEntries(curated)) {
     const { provider: curProviderName, model: curModelId } = getMohdelModel(curatedKey)
     if (curProviderName === providerName && curModelId !== modelId) {
       if (baseRegExp.test(curModelId)) {