mohdel 0.105.2 → 0.107.0

package/README.md CHANGED
@@ -1,54 +1,36 @@
1
1
  # Mohdel
2
2
 
3
- Self-hosted LLM gateway with an embeddable SDK. Process-isolated, OpenTelemetry-native inference across 11 providers — streaming, tools, thinking control, image generation — without orchestration. Run `thin-gate` as a subprocess for fault isolation, cross-process quota, and any-language HTTP callers; or use the Node factory in-process for CLI tools, scripts, and single-process services.
3
+ One Node API and one CLI for 11 LLM providers — call any model with the same `answer()` shape, get tokens and per-call USD cost back, swap models by changing one string. Self-hosted: your keys, your infra, no SaaS proxy in the path.
4
4
 
5
- Providers: Anthropic, OpenAI, Gemini, Mistral, Groq, xAI, Cerebras, Fireworks, DeepSeek, OpenRouter, Novita.
6
-
7
- Node 22+, ES modules.
8
-
9
- This README covers install, the `mo` CLI, and configuration. For the JS library guide see [INTEGRATION.md](INTEGRATION.md). For design rationale see [ARCHITECTURE.md](ARCHITECTURE.md). For logging conventions see [LOGGING.md](LOGGING.md).
10
-
11
- Three planes: JS client over a unix socket, Rust thin-gate as the scheduler + state owner, JS session as the provider executor. The `mohdel()` factory path runs the same session inline for single-process consumers. See the **Architecture** section below for a tour.
12
-
13
- ## What mohdel is not
14
-
15
- Scope-capping is deliberate. If you're shopping for any of the following, mohdel is the wrong layer — use it *alongside* your framework of choice, not instead of it.
16
-
17
- - **Not an orchestrator.** No chains, no agents, no memory, no prompt templates, no retrieval. Wrap mohdel with LangChain, LangGraph, LlamaIndex, Vercel AI SDK, or your own tool loop — mohdel exposes the inference primitive, orchestration stays in your application.
18
- - **Not a retry / fallback engine.** Errors are classified (`retryable`, `severity`, `type`) so the caller can decide, but mohdel never retries or swaps models silently. Silent model-swapping would conflict with existing multi-model logic upstream; the caller owns the retry budget and fallback choice.
19
- - **Not a response cache.** The `cache: true` flag on envelopes is for provider-side prompt caching (Anthropic, OpenAI) — not mohdel-level memoization. Caching inference *results* is orchestration-policy territory and depends on invariants only the caller knows.
20
- - **Not a context-window / token manager.** No pre-call token count, no projected-cost guard. The caller owns what goes in the prompt and is the source of truth for what counts.
21
- - **Not a SaaS proxy.** Self-hosted. Your API keys, your infra. No routing through a third party, no vendor lock-in.
22
-
23
- See [ARCHITECTURE.md §Design principles](ARCHITECTURE.md#design-principles) for the full rationale behind each.
24
-
25
- ## Observability out of the box
26
-
27
- Every call emits:
5
+ ```bash
6
+ npm install -g mohdel
7
+ mo # interactive setup — pick a provider, paste your API key
8
+ mo ask gemini/gemini-3-flash-preview "why is the sky blue"
9
+ ```
28
10
 
29
- - **OpenTelemetry span** (`mohdel.session.answer`) under the caller's `traceparent`, with GenAI semantic-convention attributes (`gen_ai.request.model`, `gen_ai.system`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`) plus mohdel's own (`mohdel.status`, `mohdel.cost`, `mohdel.thinking_tokens`, `mohdel.time_to_first_token_ms`, `mohdel.cooldown` on fast-fail).
30
- - **Trace-linked logs** — every stderr log line carries `{traceId, spanId, callId, authId, provider, model}`. Dump logs + traces into the same collector (SigNoz, Honeycomb, Jaeger + Loki) and they're correlated for free. No per-call instrumentation code.
31
- - **Gate-side OTLP metrics** (when running `thin-gate`): `mohdel.sessions.{alive,respawned,spawn_failures}`, `mohdel.calls{provider,status}`, `mohdel.call.duration_ms`, `mohdel.cooldown.rejections`, `mohdel.quota.rejections`, `mohdel.policy.errors`.
11
+ Providers: Anthropic, OpenAI, Gemini, Mistral, Groq, xAI, Cerebras, Fireworks, DeepSeek, OpenRouter, Novita. Node 22+, ES modules.
32
12
 
33
- One endpoint for everything: set `OTEL_EXPORTER_OTLP_ENDPOINT` and spans + metrics flow to it over gRPC. No-op when unset — zero overhead for callers who aren't wired. See [INTEGRATION.md §OpenTelemetry](INTEGRATION.md#opentelemetry) and [LOGGING.md](LOGGING.md) for details.
13
+ ## Why mohdel
34
14
 
35
- The OTel SDK packages (`@opentelemetry/sdk-node`, `@opentelemetry/exporter-trace-otlp-grpc`) are **`optionalDependencies`**: installed by default, but `npm install --omit=optional` skips them (along with their gRPC transitive tree). If you do that and later want trace export, install them explicitly:
15
+ - **One interface across providers.** Same `answer()` call, same event stream, same `{ status, output, inputTokens, outputTokens, cost }` result. Switching from `anthropic/claude-sonnet-4-6` to `openai/gpt-5.4-mini` is one string change; adapter differences stay inside mohdel.
16
+ - **Real numbers on every call.** Token counts and per-call USD cost computed from your own pricing catalog (`curated.json`) — not estimates, not provider-specific shapes. See [docs/CATALOG.md](docs/CATALOG.md) for the catalog format.
17
+ - **Observability without instrumentation.** OpenTelemetry spans, trace-linked logs, and OTLP metrics over one endpoint. Set `OTEL_EXPORTER_OTLP_ENDPOINT`; everything else is wired.
18
+ - **Two integration paths, same API.** In-process factory for CLI tools, scripts, single-process services. Optional `thin-gate` subprocess for fault isolation, cross-process quota, and any-language HTTP callers — no code change to switch.
19
+ - **Self-hosted, no vendor in the path.** API keys live in `~/.config/mohdel/`. Mohdel calls provider APIs directly; nothing routes through a third party.
36
20
 
37
- ```bash
38
- npm install @opentelemetry/sdk-node @opentelemetry/exporter-trace-otlp-grpc
39
- ```
21
+ ## Documentation
40
22
 
41
- `@opentelemetry/api` stays in `dependencies`; the no-op tracer needs it regardless of whether export is wired.
23
+ - [INTEGRATION.md](INTEGRATION.md): JS library guide (factory, client, answer options, tools, streaming, vision, errors, OTel)
24
+ - [docs/COOKBOOK.md](docs/COOKBOOK.md) — copy-paste recipes (summarize a file, stream, swap providers, tools, vision, batch + cost)
25
+ - [docs/CATALOG.md](docs/CATALOG.md) — `curated.json` walkthrough with worked examples
26
+ - [docs/GLOSSARY.md](docs/GLOSSARY.md) — short definitions for envelope, thin-gate, session, creator vs provider, status, …
27
+ - [ARCHITECTURE.md](ARCHITECTURE.md) — design rationale, three-plane architecture
28
+ - [PROTOCOL.md](PROTOCOL.md) — wire format for porting clients/sessions to other languages
29
+ - [LOGGING.md](LOGGING.md) — log levels, prefixes, pino integration
42
30
 
43
31
  ## Quick Start
44
32
 
45
- ```bash
46
- npm install -g mohdel
47
- mo # interactive setup — pick a provider, paste your API key
48
- mo ask gemini/gemini-3-flash-preview "why is the sky blue"
49
- ```
50
-
51
- That's it. `mo` guides you through getting an API key (Gemini, Groq, and Cerebras all have free tiers).
33
+ The three lines at the top of this README are the whole onboarding: install, run `mo` to pick a provider and paste your API key, then `mo ask`. Gemini, Groq, and Cerebras all have free tiers — start there if you don't already have a paid key.
52
34
 
53
35
  Model IDs always use the `<provider>/<model>` format:
54
36
 
@@ -59,6 +41,18 @@ openai/gpt-5.4-mini
59
41
  groq/llama-4-scout-17b-16e-instruct
60
42
  ```
61
43
 
44
+ ## What mohdel is not
45
+
46
+ Scope-capping is deliberate. If you're shopping for any of the following, mohdel is the wrong layer — use it *alongside* your framework of choice, not instead of it.
47
+
48
+ - **Not an orchestrator.** No chains, no agents, no memory, no prompt templates, no retrieval. Wrap mohdel with LangChain, LangGraph, LlamaIndex, Vercel AI SDK, or your own tool loop — mohdel exposes the inference primitive, orchestration stays in your application.
49
+ - **Not a retry / fallback engine.** Errors are classified (`retryable`, `severity`, `type`) so the caller can decide, but mohdel never retries or swaps models silently. Silent model-swapping would conflict with existing multi-model logic upstream; the caller owns the retry budget and fallback choice.
50
+ - **Not a response cache.** The `cache: true` flag on envelopes is for provider-side prompt caching (Anthropic, OpenAI) — not mohdel-level memoization. Caching inference *results* is orchestration-policy territory and depends on invariants only the caller knows.
51
+ - **Not a context-window / token manager.** No pre-call token count, no projected-cost guard. The caller owns what goes in the prompt and is the source of truth for what counts.
52
+ - **Not a SaaS proxy.** Self-hosted. Your API keys, your infra. No routing through a third party, no vendor lock-in.
53
+
54
+ See [ARCHITECTURE.md §Design principles](ARCHITECTURE.md#design-principles) for the full rationale behind each.
55
+
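Since the README leaves the retry budget with the caller, a caller-side wrapper might look like the sketch below. It is not part of mohdel: `callWithRetry` and its options are hypothetical names, and it assumes the `retryable` classification is exposed as a boolean on the thrown error, which the README does not spell out.

```javascript
// Hypothetical caller-side helper; mohdel itself never retries.
// Assumption: a failed call rejects with an error carrying `retryable: true`
// when the failure is worth another attempt.
async function callWithRetry (attempt, { maxAttempts = 3, baseDelayMs = 200 } = {}) {
  let lastError
  for (let i = 1; i <= maxAttempts; i++) {
    try {
      // `attempt` receives the 1-based attempt number and performs one call.
      return await attempt(i)
    } catch (err) {
      lastError = err
      // Give up on non-retryable errors or once the budget is spent.
      if (!err || err.retryable !== true || i === maxAttempts) break
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** (i - 1)))
    }
  }
  throw lastError
}
```

A fallback to a second model would live in the same place: catch the final error and invoke `attempt` with a different model id, so the swap is explicit in application code.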
62
56
  ## CLI
63
57
 
64
58
  ```bash
@@ -141,6 +135,24 @@ No subprocess; the factory runs the same session adapters inline. Right for CLI
141
135
 
142
136
  For the full API — initialization, alias resolution, answer options, response shape, tool use, streaming, vision, error handling, OpenTelemetry, sub-path exports — see **[INTEGRATION.md](INTEGRATION.md)**.
143
137
 
138
+ ## Observability
139
+
140
+ Every call emits:
141
+
142
+ - **OpenTelemetry span** (`mohdel.session.answer`) under the caller's `traceparent`, with GenAI semantic-convention attributes (`gen_ai.request.model`, `gen_ai.system`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`) plus mohdel's own (`mohdel.status`, `mohdel.cost`, `mohdel.thinking_tokens`, `mohdel.time_to_first_token_ms`, `mohdel.cooldown` on fast-fail).
143
+ - **Trace-linked logs** — every stderr log line carries `{traceId, spanId, callId, authId, provider, model}`. Dump logs + traces into the same collector (SigNoz, Honeycomb, Jaeger + Loki) and they're correlated for free. No per-call instrumentation code.
144
+ - **Gate-side OTLP metrics** (when running `thin-gate`): `mohdel.sessions.{alive,respawned,spawn_failures}`, `mohdel.calls{provider,status}`, `mohdel.call.duration_ms`, `mohdel.cooldown.rejections`, `mohdel.quota.rejections`, `mohdel.policy.errors`.
145
+
146
+ One endpoint for everything: set `OTEL_EXPORTER_OTLP_ENDPOINT` and spans + metrics flow to it over gRPC. No-op when unset — zero overhead for callers who aren't wired. See [INTEGRATION.md §OpenTelemetry](INTEGRATION.md#opentelemetry) and [LOGGING.md](LOGGING.md) for details.
147
+
148
+ The OTel SDK packages (`@opentelemetry/sdk-node`, `@opentelemetry/exporter-trace-otlp-grpc`) are **`optionalDependencies`** — installed by default, but `npm install --omit=optional` skips them (along with their gRPC transitive tree). If you do that and later want trace export, install them explicitly:
149
+
150
+ ```bash
151
+ npm install @opentelemetry/sdk-node @opentelemetry/exporter-trace-otlp-grpc
152
+ ```
153
+
154
+ `@opentelemetry/api` stays in `dependencies` — the no-op tracer needs it regardless of whether export is wired.
155
+
144
156
  ## Architecture
145
157
 
146
158
  Mohdel splits into three planes that can be deployed independently:
@@ -176,26 +188,7 @@ With no session-bin configured, thin-gate runs in demo mode: `POST /v1/call` ret
176
188
 
177
189
  ### Calling from JS
178
190
 
179
- ```js
180
- import { call } from 'mohdel/client'
181
-
182
- const envelope = {
183
- callId: 'c-1',
184
- authId: 'u-1',
185
- auth: { key: process.env.ANTHROPIC_API_SK },
186
- model: 'anthropic/claude-haiku-4-5',
187
- prompt: 'Say hi.',
188
- outputBudget: 100
189
- }
190
-
191
- for await (const ev of call(envelope, { socketPath: '/tmp/mohdel-data.sock' })) {
192
- if (ev.type === 'delta') process.stdout.write(ev.delta.delta)
193
- else if (ev.type === 'done') console.log('\n→ status:', ev.result.status, 'cost:', ev.result.cost)
194
- else if (ev.type === 'error') console.error('error:', ev.error.message)
195
- }
196
- ```
197
-
198
- Client surface is deliberately tiny: `call(envelope, { socketPath, signal? })`. Pass an `AbortSignal` to cancel in flight; thin-gate will forward a cancel control message to the session and reuse it on the pool. The envelope is the flat `answer(prompt, options)` surface plus transport metadata (`callId`, `authId`, `auth.key`, optional `traceparent`); see [`js/core/envelope.js`](js/core/envelope.js) for the full field list.
191
+ The client snippet under [Library Usage](#library-usage) above is the full surface: `call(envelope, { socketPath, signal? })` returns an async iterable of events. Pass an `AbortSignal` to cancel in flight; thin-gate forwards a cancel control message to the session and reuses it on the pool. The envelope is the flat `answer(prompt, options)` surface plus transport metadata (`callId`, `authId`, `auth.key`, optional `traceparent`); see [`js/core/envelope.js`](js/core/envelope.js) for the full field list.
199
192
 
200
193
  ### Canonical types (frozen wire contract)
201
194
 
@@ -366,12 +359,6 @@ Fork the repository and submit a pull request. Code style: Node 22+, ES modules,
366
359
 
367
360
  **Mohdel's wire is language-agnostic.** The JS client is the first implementation, not the only one — a Python / Go / Ruby / Swift / Elixir / ... client is a great starter contribution. See [CONTRIBUTING.md §Porting a client to another language](CONTRIBUTING.md#porting-a-client-to-another-language) and [PROTOCOL.md](PROTOCOL.md).
368
361
 
369
- ## See Also
370
-
371
- - [INTEGRATION.md](INTEGRATION.md) — embed mohdel in your code (factory, model proxy, answer options, tool use, streaming, vision, errors, OTel)
372
- - [ARCHITECTURE.md](ARCHITECTURE.md) — design decisions and rationale
373
- - [LOGGING.md](LOGGING.md) — log levels, prefixes, pino integration
374
-
375
362
  ## License
376
363
 
377
364
  MIT. See `LICENSE`.
@@ -0,0 +1,81 @@
1
+ {
2
+ "$schema": "./curated.schema.json",
3
+ "_comment": "Worked examples for ~/.config/mohdel/curated.json. See docs/CATALOG.md for the field reference. Each top-level key is '<provider>/<model-id>'. The provider segment is the routing key; the model-id segment must match the value of the 'model' field below (which is the literal id mohdel sends to the provider's API).",
4
+
5
+ "anthropic/claude-haiku-4-5": {
6
+ "_comment_a": "Minimal-but-useful entry. Required fields only: model, creator, inputFormat. Everything else is recommended — without prices, 'cost' on results will be 0; without contextTokenLimit, callers can't enforce input budgets.",
7
+ "model": "claude-haiku-4-5-20251001",
8
+ "creator": "anthropic",
9
+ "provider": "anthropic",
10
+ "sdk": "anthropic",
11
+ "label": "Claude Haiku 4.5",
12
+ "inputFormat": ["text", "image"],
13
+ "inputPrice": 1,
14
+ "outputPrice": 5,
15
+ "contextTokenLimit": 200000,
16
+ "outputTokenLimit": 64000,
17
+ "tags": ["fast", "chat"]
18
+ },
19
+
20
+ "anthropic/claude-sonnet-4-6": {
21
+ "_comment_b": "Full-featured entry. Adds cache pricing (provider-side prompt caching), thinking effort levels (mohdel translates 'low'/'medium'/'high'/etc. to the provider's native budget), default thinking effort, and a leaderboard tuple. The leaderboard is [intelligence, speed, latency] — used by 'mo rank'.",
22
+ "model": "claude-sonnet-4-6",
23
+ "creator": "anthropic",
24
+ "provider": "anthropic",
25
+ "sdk": "anthropic",
26
+ "label": "Claude Sonnet 4.6",
27
+ "inputFormat": ["text", "image"],
28
+ "inputPrice": 3,
29
+ "outputPrice": 15,
30
+ "cacheWritePrice": 3.75,
31
+ "cacheReadPrice": 0.30,
32
+ "contextTokenLimit": 1000000,
33
+ "outputTokenLimit": 128000,
34
+ "defaultThinkingEffort": "medium",
35
+ "thinkingEffortLevels": {
36
+ "low": 100,
37
+ "medium": 500,
38
+ "high": 2000,
39
+ "max": 5000
40
+ },
41
+ "tags": ["chat", "tool-loop", "vision"],
42
+ "leaderboard": [80, 95, 1.2]
43
+ },
44
+
45
+ "anthropic/claude-3-7-sonnet": {
46
+ "_comment_c": "Deprecated stub: a one-field entry that redirects callers to the replacement. 'mo' will refuse to use this id and point at the target. Stubs do NOT need any other fields.",
47
+ "deprecated": "anthropic/claude-sonnet-4-6"
48
+ },
49
+
50
+ "novita/flux-2-dev": {
51
+ "_comment_d": "Image-generation entry. type:'image' selects the image dispatcher (no streaming, no tools). imageEndpoint is the provider-side endpoint name; imagePrice is per-image USD.",
52
+ "model": "flux-2-dev",
53
+ "creator": "bfl",
54
+ "provider": "novita",
55
+ "label": "Flux 2 Dev",
56
+ "inputFormat": ["text"],
57
+ "type": "image",
58
+ "imagePrice": 0.012,
59
+ "imageEndpoint": "flux-2-dev",
60
+ "imageDefaultSize": "1024x1024",
61
+ "tags": ["image"]
62
+ },
63
+
64
+ "openai/gpt-5.4-mini": {
65
+ "_comment_e": "Entry with custom rate limits. rpmLimit and tpmLimit override the provider-level defaults in providers.json. rateLimitScope:'model' means the limit is per-model; 'provider' means it joins the provider-level pool.",
66
+ "model": "gpt-5.4-mini",
67
+ "creator": "openai",
68
+ "provider": "openai",
69
+ "sdk": "openai",
70
+ "label": "GPT-5.4 Mini",
71
+ "inputFormat": ["text", "image"],
72
+ "inputPrice": 0.15,
73
+ "outputPrice": 0.60,
74
+ "contextTokenLimit": 400000,
75
+ "outputTokenLimit": 16384,
76
+ "rpmLimit": 500,
77
+ "tpmLimit": 2000000,
78
+ "rateLimitScope": "model",
79
+ "tags": ["chat", "cheap"]
80
+ }
81
+ }
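The catalog comments above describe three kinds of top-level keys: underscore-prefixed notes, real model entries, and deprecated stubs that point at a replacement. A reader of the file could tell them apart roughly as follows. This is an illustrative sketch, not mohdel's loader; `lookupModel` and its return shape are invented for the example.

```javascript
// Hypothetical lookup over a parsed curated.json object.
function lookupModel (catalog, id) {
  const slash = id.indexOf('/')
  // Underscore-prefixed keys such as `_comment` are notes, not models,
  // and a model key needs a non-empty provider segment before the slash.
  if (id.startsWith('_') || slash < 1 || !Object.hasOwn(catalog, id)) {
    return { kind: 'unknown', id }
  }
  const value = catalog[id]
  if (typeof value !== 'object' || value === null) return { kind: 'unknown', id }
  // A stub only redirects; report the replacement instead of swapping silently.
  if (typeof value.deprecated === 'string') {
    return { kind: 'deprecated', id, replacement: value.deprecated }
  }
  return {
    kind: 'entry',
    id,
    // `provider` falls back to the key's provider segment when omitted.
    provider: value.provider ?? id.slice(0, slash),
    model: value.model,
    entry: value
  }
}
```

Returning the replacement rather than resolving it mirrors the stub comment, which says `mo` refuses the old id and points at the target.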
@@ -0,0 +1,156 @@
1
+ {
2
+ "$schema": "https://json-schema.org/draft/2020-12/schema",
3
+ "$id": "https://github.com/clbrge/mohdel/blob/main/config/curated.schema.json",
4
+ "title": "Mohdel curated.json",
5
+ "description": "Schema for ~/.config/mohdel/curated.json — see docs/CATALOG.md for the full reference.",
6
+ "type": "object",
7
+ "propertyNames": {
8
+ "pattern": "^([a-z0-9_-]+/.+|_[a-zA-Z0-9_-]+)$",
9
+ "description": "Model id in the form '<provider>/<model>'. Underscore-prefixed keys (like '_comment') are allowed for inline notes."
10
+ },
11
+ "additionalProperties": {
12
+ "oneOf": [
13
+ { "$ref": "#/$defs/deprecatedStub" },
14
+ { "$ref": "#/$defs/modelEntry" },
15
+ { "type": "string", "description": "Inline comment value for underscore-prefixed keys." }
16
+ ]
17
+ },
18
+ "$defs": {
19
+ "deprecatedStub": {
20
+ "type": "object",
21
+ "description": "Redirect from a retired model id to its replacement.",
22
+ "required": ["deprecated"],
23
+ "additionalProperties": true,
24
+ "properties": {
25
+ "deprecated": {
26
+ "type": "string",
27
+ "description": "Replacement model id (e.g. 'anthropic/claude-sonnet-4-6')."
28
+ }
29
+ }
30
+ },
31
+ "modelEntry": {
32
+ "type": "object",
33
+ "description": "A real model entry. 'model', 'creator', and 'inputFormat' are required.",
34
+ "required": ["model", "creator", "inputFormat"],
35
+ "additionalProperties": true,
36
+ "properties": {
37
+ "model": {
38
+ "type": "string",
39
+ "description": "Literal model id sent to the provider's API."
40
+ },
41
+ "creator": {
42
+ "type": "string",
43
+ "description": "Organization that trained the model (e.g. 'anthropic', 'openai', 'alibaba')."
44
+ },
45
+ "inputFormat": {
46
+ "type": "array",
47
+ "description": "Accepted input modalities. Defaults to ['text'].",
48
+ "items": { "type": "string", "enum": ["text", "image", "video"] },
49
+ "minItems": 1,
50
+ "uniqueItems": true
51
+ },
52
+ "provider": {
53
+ "type": "string",
54
+ "description": "Routing provider. Defaults to the provider segment of the catalog key."
55
+ },
56
+ "sdk": { "type": "string", "description": "SDK adapter to use (some providers require this)." },
57
+ "type": {
58
+ "type": "string",
59
+ "enum": ["model", "image"],
60
+ "default": "model",
61
+ "description": "'model' for chat/completion, 'image' for image generation."
62
+ },
63
+ "label": { "type": "string", "description": "Human-readable name shown in UIs." },
64
+ "description": { "type": "string" },
65
+ "version": { "type": "string" },
66
+ "createdAt": { "type": "string", "format": "date-time" },
67
+ "created": { "type": "number", "description": "Unix timestamp (seconds)." },
68
+
69
+ "inputPrice": {
70
+ "oneOf": [
71
+ { "type": "number", "minimum": 0 },
72
+ { "type": "object", "required": ["default"], "additionalProperties": { "type": "number" }, "properties": { "default": { "type": "number" } } }
73
+ ],
74
+ "description": "USD per 1M input tokens. Object form is for tiered pricing — must include a 'default' key."
75
+ },
76
+ "outputPrice": {
77
+ "oneOf": [
78
+ { "type": "number", "minimum": 0 },
79
+ { "type": "object", "required": ["default"], "additionalProperties": { "type": "number" }, "properties": { "default": { "type": "number" } } }
80
+ ],
81
+ "description": "USD per 1M output tokens."
82
+ },
83
+ "thinkingPrice": {
84
+ "oneOf": [
85
+ { "type": "number", "minimum": 0 },
86
+ { "type": "object", "required": ["default"], "additionalProperties": { "type": "number" }, "properties": { "default": { "type": "number" } } }
87
+ ],
88
+ "description": "USD per 1M thinking/reasoning tokens (when the provider bills these separately)."
89
+ },
90
+ "cacheWritePrice": { "type": "number", "minimum": 0, "description": "USD per 1M tokens written to provider-side prompt cache." },
91
+ "cacheReadPrice": { "type": "number", "minimum": 0, "description": "USD per 1M tokens served from provider-side prompt cache." },
92
+
93
+ "contextTokenLimit": { "type": "integer", "minimum": 1, "description": "Maximum total tokens (input + output)." },
94
+ "outputTokenLimit": { "type": "integer", "minimum": 1, "description": "Maximum output tokens per call." },
95
+ "thinkingTokenLimit": { "type": "integer", "minimum": 1, "description": "Maximum thinking tokens per call (when separate from output)." },
96
+ "tokenizerHeadroom": { "type": "number", "exclusiveMinimum": 0, "description": "Multiplier applied to local token estimates to account for tokenizer drift." },
97
+
98
+ "thinkingEffortLevels": {
99
+ "oneOf": [
100
+ { "type": "null" },
101
+ {
102
+ "type": "object",
103
+ "description": "Map effort name → provider-native budget. Standard names: low/medium/high/xhigh/max/none.",
104
+ "additionalProperties": { "type": "number", "minimum": 0 }
105
+ }
106
+ ]
107
+ },
108
+ "defaultThinkingEffort": { "type": "string", "description": "Effort level used when the envelope omits 'outputEffort'." },
109
+
110
+ "tags": {
111
+ "type": "array",
112
+ "items": { "type": "string", "pattern": "^[a-zA-Z][a-zA-Z0-9._-]{0,31}$" },
113
+ "uniqueItems": true,
114
+ "description": "Free-form classification tags (used by 'mo bench --tag', 'mo rank', and caller-side selection)."
115
+ },
116
+ "aliases": {
117
+ "type": "array",
118
+ "items": { "type": "string" },
119
+ "description": "Alternative ids that should resolve to this entry."
120
+ },
121
+ "replaces": {
122
+ "type": "array",
123
+ "items": { "type": "string" },
124
+ "description": "Older model ids this one supersedes."
125
+ },
126
+ "leaderboard": {
127
+ "type": "array",
128
+ "items": { "type": "number" },
129
+ "minItems": 3,
130
+ "maxItems": 3,
131
+ "description": "[intelligence, speed, latency] triple — drives 'mo rank'."
132
+ },
133
+ "leaderboardNote": { "type": "string" },
134
+
135
+ "supportsTools": { "type": "boolean", "description": "Set false to mark a model as tool-less." },
136
+
137
+ "imagePrice": { "type": "number", "minimum": 0, "description": "USD per generated image (image-type entries)." },
138
+ "imageEndpoint": { "type": "string", "description": "Provider-side image endpoint name." },
139
+ "imageDefaultSize": { "type": "string", "description": "Default size when envelope omits one (e.g. '1024x1024')." },
140
+
141
+ "rpmLimit": { "type": "integer", "minimum": 1, "description": "Requests per minute. Overrides provider default." },
142
+ "tpmLimit": { "type": "integer", "minimum": 1, "description": "Tokens per minute. Overrides provider default." },
143
+ "rateLimitScope": {
144
+ "type": "string",
145
+ "enum": ["model", "provider"],
146
+ "description": "'model' = private budget. 'provider' = shared with provider-level pool."
147
+ },
148
+
149
+ "deprecated": { "type": "string", "description": "If present, this entry is treated as a stub (use 'deprecatedStub' shape)." },
150
+ "suspended": { "type": "string", "description": "Reason this model is temporarily disabled." },
151
+
152
+ "displayName": { "type": "string", "deprecated": true, "description": "Deprecated — use 'label'." }
153
+ }
154
+ }
155
+ }
156
+ }
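The schema states prices in USD per 1M tokens, so a per-call cost can be approximated with simple arithmetic. The sketch below is an assumption about how such a figure could be derived, not mohdel's actual cost routine: `estimateCostUsd` is a hypothetical helper, it ignores cache and thinking prices, and for the tiered object form it only reads the `default` key.

```javascript
// Hypothetical helper: reduce a price field to a flat USD-per-1M-tokens number.
function flatPrice (price) {
  if (price == null) return 0 // no price configured: contributes 0 to the cost
  return typeof price === 'object' ? price.default : price
}

// Approximate cost of one call from a catalog entry and token counts.
function estimateCostUsd (entry, { inputTokens = 0, outputTokens = 0 } = {}) {
  const usdTimesMillion =
    inputTokens * flatPrice(entry.inputPrice) +
    outputTokens * flatPrice(entry.outputPrice)
  return usdTimesMillion / 1e6
}

// Prices taken from the example Haiku entry: 1 USD in, 5 USD out per 1M tokens.
const haiku = { inputPrice: 1, outputPrice: 5 }
console.log(estimateCostUsd(haiku, { inputTokens: 12000, outputTokens: 800 })) // 0.016
```

With no prices on the entry the helper returns 0, which matches the example catalog's note that `cost` is 0 when prices are missing.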
@@ -57,6 +57,14 @@
57
57
  * Opaque per-user ID passed to the provider for tracking / abuse
58
58
  * monitoring.
59
59
  *
60
+ * @property {number} [idleHeartbeatMs]
61
+ * When set, the session emits a synthetic `{type:'idle', sinceMs}`
62
+ * event whenever the adapter has been silent for at least this
63
+ * many milliseconds, and re-emits every `idleHeartbeatMs` while
64
+ * the gap persists. The consumer decides whether to act (log,
65
+ * bump a watchdog, abort via its own AbortSignal). Mohdel never
66
+ * aborts on its own. Omitting the field disables the heartbeat.
67
+ *
60
68
  * @property {Object<string, object>} [providerOptions]
61
69
  * Namespaced bag of provider-specific knobs that don't fit the
62
70
  * shared envelope. Keys are provider names; values are arbitrary
@@ -139,5 +147,6 @@ export const ENVELOPE_FIELDS = Object.freeze([
139
147
  'toolChoice',
140
148
  'parallelToolCalls',
141
149
  'identifier',
150
+ 'idleHeartbeatMs',
142
151
  'providerOptions'
143
152
  ])
package/js/core/events.js CHANGED
@@ -1,8 +1,13 @@
1
1
  /**
2
2
  * Event union for the session → thin-gate → client stream.
3
3
  *
4
- * Three events:
4
+ * Four events:
5
5
  * - `delta` — streaming chunk (message text or function-call args).
6
+ * - `idle` — synthetic event emitted when the adapter has been
7
+ * silent for at least `idleHeartbeatMs` (opt-in via
8
+ * `CallEnvelope.idleHeartbeatMs`). Carries the elapsed
9
+ * gap so the consumer can drive its own stall policy.
10
+ * Never terminal; further events may follow.
6
11
  * - `done` — terminal with the full `AnswerResult`.
7
12
  * - `error` — wire-format error (serializable `TypedError`).
8
13
  *
@@ -12,7 +17,7 @@
12
17
  */
13
18
 
14
19
  /**
15
- * @typedef {(DeltaEvent | DoneEvent | ErrorEvent)} Event
20
+ * @typedef {(DeltaEvent | IdleEvent | DoneEvent | ErrorEvent)} Event
16
21
  */
17
22
 
18
23
  /**
@@ -23,6 +28,23 @@
23
28
  * @property {DeltaChunk} delta
24
29
  */
25
30
 
31
+ /**
32
+ * Synthetic heartbeat emitted by `session/run.js` while the adapter
33
+ * is silent — i.e. no real event (delta / done / error / tool call)
34
+ * for at least `idleHeartbeatMs`. The consumer decides what to do
35
+ * (log, bump a watchdog, abort via its own AbortSignal). Mohdel
36
+ * never aborts on its own.
37
+ *
38
+ * Re-emitted every `idleHeartbeatMs` while the gap persists; the
39
+ * timer resets on the next real event.
40
+ *
41
+ * @typedef {object} IdleEvent
42
+ * @property {'idle'} type
43
+ * @property {number} sinceMs
44
+ * Milliseconds since the last real event (or since the call
45
+ * started, when no event has arrived yet).
46
+ */
47
+
26
48
  /**
27
49
  * @typedef {object} DeltaChunk
28
50
  * @property {('message'|'function_call')} type
@@ -109,7 +131,7 @@
109
131
  * ToolCall back unchanged so the adapter can re-attach it.
110
132
  */
111
133
 
112
- export const EVENT_TYPES = Object.freeze(['delta', 'done', 'error'])
134
+ export const EVENT_TYPES = Object.freeze(['delta', 'idle', 'done', 'error'])
113
135
 
114
136
  /**
115
137
  * @param {unknown} x
@@ -51,6 +51,7 @@ import { createRealtimeDeltaBuffer } from '../../src/lib/utils.js'
51
51
  * | `images` / `videos` / `cache` | flat fields on envelope |
52
52
  * | `tools` / `toolChoice` / `parallelToolCalls` | flat fields on envelope |
53
53
  * | `identifier` | envelope.identifier (adapter maps to provider field) |
54
+ * | `idleHeartbeatMs` | envelope.idleHeartbeatMs (session emits idle events) |
54
55
  * | `realtimeHandler` / `bufferOpts` | drained via createRealtimeDeltaBuffer |
55
56
  * | `providerOrder` / `providerAllow` / `providerDeny`| envelope.providerOptions.openrouter |
56
57
  * | `traceparent` / `baggage` | envelope transport metadata |
@@ -183,6 +184,7 @@ function toEnvelope ({ modelKey, configuration, prompt, options }) {
183
184
  if (options.toolChoice) envelope.toolChoice = options.toolChoice
184
185
  if (options.parallelToolCalls === false) envelope.parallelToolCalls = false
185
186
  if (options.identifier) envelope.identifier = options.identifier
187
+ if (options.idleHeartbeatMs !== undefined) envelope.idleHeartbeatMs = options.idleHeartbeatMs
186
188
 
187
189
  // OpenRouter routing prefs ride in their own bag to keep the flat
188
190
  // envelope clean. The openrouter adapter reads this via
@@ -0,0 +1,77 @@
1
+ /**
2
+ * Idle-heartbeat wrapper for an adapter's event stream.
3
+ *
4
+ * `withIdleHeartbeat(source, idleMs)` consumes `source` and re-emits
5
+ * every event it yields. When `source` is silent for at least
6
+ * `idleMs`, an `{type:'idle', sinceMs}` event is yielded; while the
7
+ * silence persists, further idle events are yielded every `idleMs`.
8
+ * The timer resets on every real event.
9
+ *
10
+ * Idle events are advisory — mohdel does not abort the call on its
11
+ * own. Consumers decide whether to log, bump a watchdog, or trigger
12
+ * an external AbortSignal.
13
+ *
14
+ * The in-flight `iterator.next()` is reused across timer firings so
15
+ * no real event is dropped: when the timer wins the race, the
16
+ * underlying promise stays pending and the next loop iteration
17
+ * attaches a fresh race to the same promise.
18
+ *
19
+ * @module session/_idle_heartbeat
20
+ */
21
+
22
+ /**
23
+ * @template T
24
+ * @param {AsyncIterable<T>} source
25
+ * @param {number | undefined | null} idleMs
26
+ * When falsy or non-positive, the source is yielded through
27
+ * unchanged (no timer is set up).
28
+ * @returns {AsyncGenerator<T | import('#core/events.js').IdleEvent>}
29
+ */
30
+ export async function * withIdleHeartbeat (source, idleMs) {
31
+ if (!idleMs || idleMs <= 0) {
32
+ yield * source
33
+ return
34
+ }
35
+
36
+ const iter = source[Symbol.asyncIterator]()
37
+ let lastAt = Date.now()
38
+ /** @type {Promise<IteratorResult<T>> | null} */
39
+ let pending = null
40
+
41
+ try {
42
+ while (true) {
43
+ if (!pending) pending = iter.next()
44
+
45
+ /** @type {NodeJS.Timeout | undefined} */
46
+ let timer
47
+ /** @type {{idle: true} | {real: IteratorResult<T>} | {err: unknown}} */
48
+ const winner = await new Promise(resolve => {
49
+ timer = setTimeout(() => resolve({ idle: true }), idleMs)
50
+ pending.then(
51
+ r => resolve({ real: r }),
52
+ e => resolve({ err: e })
53
+ )
54
+ })
55
+ clearTimeout(timer)
56
+
57
+ if ('idle' in winner) {
58
+ yield /** @type {import('#core/events.js').IdleEvent} */ ({
59
+ type: 'idle',
60
+ sinceMs: Date.now() - lastAt
61
+ })
62
+ continue
63
+ }
64
+
65
+ pending = null
66
+ if ('err' in winner) throw winner.err
67
+ if (winner.real.done) return
68
+ lastAt = Date.now()
69
+ yield winner.real.value
70
+ }
71
+ } finally {
72
+ // Best-effort cleanup if the consumer abandons us mid-stream.
73
+ if (typeof iter.return === 'function') {
74
+ try { await iter.return() } catch { /* ignore */ }
75
+ }
76
+ }
77
+ }
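Because idle events are advisory, the stall policy belongs to the consumer. The sketch below shows one possible policy: abort through the caller's own `AbortController` once the reported silence passes a threshold. `consumeWithStallLimit` and `maxSilenceMs` are hypothetical names; the event field shapes (`ev.delta.delta`, `ev.result`, `ev.error.message`, `ev.sinceMs`) follow the typedefs and the client snippet in this diff.

```javascript
// Hypothetical consumer-side watchdog over a mohdel event stream.
// `events` is any async iterable of { type: 'delta' | 'idle' | 'done' | 'error', ... }.
async function consumeWithStallLimit (events, controller, maxSilenceMs) {
  let text = ''
  for await (const ev of events) {
    if (ev.type === 'idle') {
      // Idle is never terminal; act only once the gap is too long.
      if (ev.sinceMs >= maxSilenceMs) {
        controller.abort(new Error(`stalled for ${ev.sinceMs} ms`))
        break // leaving the loop also closes the underlying iterator
      }
      continue
    }
    if (ev.type === 'delta') {
      if (ev.delta.type === 'message') text += ev.delta.delta
    } else if (ev.type === 'done') {
      return { status: 'done', text, result: ev.result }
    } else if (ev.type === 'error') {
      throw new Error(ev.error.message)
    }
  }
  return { status: controller.signal.aborted ? 'aborted' : 'ended', text }
}
```

The same `controller.signal` would be the one handed to the call, so aborting here cancels the in-flight request; softer policies (logging, bumping a watchdog) fit in the same `idle` branch.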
@@ -239,6 +239,10 @@ const providerOverrides = {
239
239
  if (code === 'rate_limit_exceeded') return tierResult(detail)
240
240
  return undefined
241
241
  },
242
+ xiaomi (_err, code, detail) {
243
+ if (code === 'rate_limit_exceeded') return tierResult(detail)
244
+ return undefined
245
+ },
242
246
  novita (_err, code, detail) {
243
247
  if (code === 'rate_limit_exceeded') return tierResult(detail)
244
248
  return undefined
@@ -24,6 +24,7 @@ import { novita } from './novita.js'
24
24
  import { openai } from './openai.js'
25
25
  import { openrouter } from './openrouter.js'
26
26
  import { xai } from './xai.js'
27
+ import { xiaomi } from './xiaomi.js'
27
28
 
28
29
  export const adapters = Object.freeze({
29
30
  anthropic,
@@ -38,7 +39,8 @@ export const adapters = Object.freeze({
38
39
  novita,
39
40
  openai,
40
41
  openrouter,
41
- xai
42
+ xai,
43
+ xiaomi
42
44
  })
43
45
 
44
46
  /**
@@ -0,0 +1,35 @@
1
+ /**
2
+ * Xiaomi MiMo adapter — OpenAI-compatible chat completions against
3
+ * api.xiaomimimo.com. Standard `reasoning_content` field handling
4
+ * is provided by the shared core; this module just wires the base
5
+ * URL and provider tag.
6
+ *
7
+ * @module session/adapters/xiaomi
8
+ */
9
+
10
+ import OpenAI from 'openai'
11
+
12
+ import { runChatCompletions } from './_chat_completions.js'
13
+ import { streamingDispatcher } from './_dispatcher.js'
14
+
15
+ const BASE_URL = 'https://api.xiaomimimo.com/v1'
16
+
17
+ /**
18
+ * @param {import('#core/envelope.js').CallEnvelope} envelope
19
+ * @param {{client?: any, signal?: AbortSignal, log?: any, span?: any}} [deps]
20
+ * @returns {AsyncGenerator<import('#core/events.js').Event>}
21
+ */
22
+ export async function * xiaomi (envelope, deps = {}) {
23
+ const client = deps.client ?? new OpenAI({
24
+ apiKey: envelope.auth.key,
25
+ baseURL: envelope.auth.baseURL || BASE_URL,
26
+ fetchOptions: { dispatcher: streamingDispatcher() }
27
+ })
28
+ yield * runChatCompletions(envelope, client, {
29
+ provider: 'xiaomi'
30
+ }, {
31
+ signal: deps.signal,
32
+ log: deps.log,
33
+ span: deps.span
34
+ })
35
+ }
package/js/session/run.js CHANGED
@@ -29,6 +29,7 @@ import { getProviderLimits } from './adapters/_providers.js'
29
29
  import { providerOf, catalogKey, effortOf } from '#core/model-id.js'
30
30
  import * as defaultCooldown from './_cooldown.js'
31
31
  import * as defaultLimiter from './_rate_limiter.js'
32
+ import { withIdleHeartbeat } from './_idle_heartbeat.js'
32
33
  import { logger as defaultLogger } from './_logger.js'
33
34
  import {
34
35
  startSpan,
@@ -175,7 +176,18 @@ export async function * run (envelope, {
175
176
  let lastFrameAt = startedAt
176
177
  let maxInterFrameMs = 0
177
178
  try {
178
- for await (const ev of adapter(envelope, { signal, log, span })) {
179
+ const adapterStream = adapter(envelope, { signal, log, span })
180
+ const heartbeated = withIdleHeartbeat(adapterStream, envelope.idleHeartbeatMs)
181
+ for await (const ev of heartbeated) {
182
+ // Idle events are synthetic — they're emitted *because* nothing
183
+ // else has happened. Don't reset the inter-frame clock (that
184
+ // would mask the very stall the heartbeat is reporting) and
185
+ // don't run them through the terminal/cooldown branches.
186
+ if (ev.type === 'idle') {
187
+ yield ev
188
+ continue
189
+ }
190
+
179
191
  const now = Date.now()
180
192
  const gap = now - lastFrameAt
181
193
  if (gap > maxInterFrameMs) maxInterFrameMs = gap
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "mohdel",
3
- "version": "0.105.2",
3
+ "version": "0.107.0",
4
4
  "license": "MIT",
5
5
  "author": {
6
6
  "name": "Christophe Le Bars",
@@ -87,12 +87,12 @@
87
87
  "@opentelemetry/exporter-trace-otlp-grpc": "^0.218.0",
88
88
  "@opentelemetry/sdk-node": "^0.218.0",
89
89
  "chalk": "^5.4.0",
90
- "mohdel-thin-gate-linux-x64-gnu": "0.105.2"
90
+ "mohdel-thin-gate-linux-x64-gnu": "0.107.0"
91
91
  },
92
92
  "dependencies": {
93
- "@anthropic-ai/sdk": "^0.95.2",
93
+ "@anthropic-ai/sdk": "^0.96.0",
94
94
  "@cerebras/cerebras_cloud_sdk": "^1.61.1",
95
- "@google/genai": "^2.2.0",
95
+ "@google/genai": "^2.3.0",
96
96
  "@opentelemetry/api": "^1.9.1",
97
97
  "env-paths": "^4.0.0",
98
98
  "groq-sdk": "^1.2.0",
package/src/cli/ask.js CHANGED
@@ -3,6 +3,42 @@ import { loadDefaultEnv } from '../lib/common.js'
3
3
 
4
4
  const noop = () => {}
5
5
 
6
+ // Friendly next-step hints for common ask-time failures. Pure pattern match on
7
+ // err.message, err.detail, and err.type, which keeps the lib layer neutral but
8
+ // gives CLI users a copy-pasteable command instead of just an error.
9
+ const hintsForError = (err, modelId) => {
10
+ const msg = String(err?.message || '')
11
+ const detail = String(err?.detail || '')
12
+ const both = `${msg}\n${detail}`
13
+ const provider = modelId.includes('/') ? modelId.split('/')[0] : null
14
+ const hints = []
15
+
16
+ if (/not found in curated models/i.test(both)) {
17
+ if (provider) {
18
+ hints.push(`→ run: mo curate ${provider} # add upstream models from this provider`)
19
+ hints.push(`→ or: mo model add ${modelId} # add this one manually`)
20
+ } else {
21
+ hints.push('→ run: mo ls # list available models')
22
+ }
23
+ hints.push('→ see: docs/CATALOG.md # catalog format reference')
24
+ }
25
+
26
+ if (/API key not found/i.test(both) || /AUTH_INVALID/i.test(err?.type || '') || /401|unauthorized|invalid api key/i.test(both)) {
27
+ if (provider) hints.push(`→ run: mo setup ${provider}`)
28
+ else hints.push('→ run: mo # interactive provider/key setup')
29
+ }
30
+
31
+ if (/deprecated/i.test(both) && /replacement/i.test(both)) {
32
+ hints.push('→ run: mo check # find broken deprecation links in curated.json')
33
+ }
34
+
35
+ if (/Provider configuration for/i.test(both)) {
36
+ hints.push('→ see: docs/CATALOG.md # the provider segment must match a known adapter')
37
+ }
38
+
39
+ return hints
40
+ }
41
+
6
42
  export async function runAsk (args) {
7
43
  if (args.includes('-h') || args.includes('--help')) {
8
44
  console.log(`mohdel ask — one-shot inference, pipeable
@@ -97,6 +133,7 @@ Examples:
97
133
  model = mo.use(modelId)
98
134
  } catch (err) {
99
135
  console.error(err.message)
136
+ for (const h of hintsForError(err, modelId)) console.error(h)
100
137
  process.exit(1)
101
138
  }
102
139
 
@@ -155,6 +192,7 @@ Examples:
155
192
  if (summary.length) process.stderr.write(`${summary.join(', ')}\n`)
156
193
  } catch (err) {
157
194
  console.error(`Error: ${err.detail || err.message}`)
195
+ for (const h of hintsForError(err, modelId)) console.error(h)
158
196
  process.exit(1)
159
197
  }
160
198
  }
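Note: a reduced sketch of the hint-matching idea added to `ask.js` above, to show what a caller would see. The name `hintsFor` and the sample error message are assumptions made for illustration. Only two of the patterns from the diff are reproduced, and the real error text produced by the library may differ.

```javascript
// Illustrative only: a cut-down version of the pattern match in the diff.
// `hintsFor` is a hypothetical name; the error text below is made up.
const hintsFor = (err, modelId) => {
  const text = `${String(err?.message || '')}\n${String(err?.detail || '')}`
  const provider = modelId.includes('/') ? modelId.split('/')[0] : null
  const hints = []

  // Unknown model: point at the curate command for that provider.
  if (/not found in curated models/i.test(text)) {
    hints.push(provider ? `→ run: mo curate ${provider}` : '→ run: mo ls')
  }

  // Missing key: point at provider setup.
  if (/API key not found/i.test(text)) {
    hints.push(provider ? `→ run: mo setup ${provider}` : '→ run: mo')
  }

  return hints
}

const example = hintsFor(
  new Error('Model fireworks/foo not found in curated models'),
  'fireworks/foo'
)
console.log(example.join('\n'))
```

Because the match is on message text alone, an error whose wording changes upstream simply produces no hints. That appears to be the intended failure mode for a CLI-only convenience.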
package/src/cli/check.js CHANGED
@@ -1,7 +1,7 @@
1
1
  import { label, err, warn, ok } from './colors.js'
2
2
  import providers from '../lib/providers.js'
3
3
  import { validate, isValidTag } from '../lib/schema.js'
4
- import { getCuratedModels, loadDefaultEnv } from '../lib/common.js'
4
+ import { getCuratedModels, loadDefaultEnv, catalogEntries, catalogValues } from '../lib/common.js'
5
5
 
6
6
  // --- Local validation ---
7
7
 
@@ -10,7 +10,7 @@ const checkLocal = (curated) => {
10
10
  const warnings = []
11
11
  const knownProviders = new Set(Object.keys(providers))
12
12
 
13
- for (const [key, spec] of Object.entries(curated)) {
13
+ for (const [key, spec] of catalogEntries(curated)) {
14
14
  const [keyProvider] = key.split('/')
15
15
 
16
16
  if (spec.deprecated) {
@@ -87,8 +87,9 @@ detection, file an issue — it'll be rebuilt on the /session stack.`)
87
87
  const json = args.includes('--json')
88
88
 
89
89
  const curated = await getCuratedModels()
90
- const active = Object.values(curated).filter(s => !s.deprecated).length
91
- const deprecated = Object.values(curated).length - active
90
+ const all = catalogValues(curated)
91
+ const active = all.filter(s => !s.deprecated).length
92
+ const deprecated = all.length - active
92
93
 
93
94
  if (!json) {
94
95
  console.log(`${label('Catalog:')} ${active} active, ${deprecated} deprecated\n`)
@@ -1,12 +1,12 @@
1
1
  import { intro, outro, select, isCancel, cancel } from '@clack/prompts'
2
- import { getCuratedModels, CONFIG_PATH, saveConfig } from '../lib/common.js'
2
+ import { getCuratedModels, CONFIG_PATH, saveConfig, catalogEntries } from '../lib/common.js'
3
3
  import providers from '../lib/providers.js'
4
4
 
5
5
  export async function runDefault () {
6
6
  intro('Mohdel — Set Default Model')
7
7
 
8
8
  const curated = await getCuratedModels()
9
- const modelOptions = Object.entries(curated).map(([modelId, info]) => ({
9
+ const modelOptions = catalogEntries(curated).map(([modelId, info]) => ({
10
10
  value: modelId,
11
11
  label: `${info.label} (${modelId})`
12
12
  }))
@@ -0,0 +1,180 @@
1
+ import { existsSync } from 'fs'
2
+ import { label, meta, ok, warn, err, inactive } from './colors.js'
3
+ import providers from '../lib/providers.js'
4
+ import { validate, isValidTag } from '../lib/schema.js'
5
+ import {
6
+ CONFIG_DIR, CURATED_PATH, CONFIG_PATH, ENV_PATH,
7
+ getCuratedModels, getConfig, loadDefaultEnv, getAPIKey, catalogEntries
8
+ } from '../lib/common.js'
9
+
10
+ const row = (status, name, detail = '') =>
11
+ ` ${status} ${name.padEnd(20)} ${meta(detail)}`
12
+
13
+ export async function runDoctor (args) {
14
+ if (args.includes('-h') || args.includes('--help')) {
15
+ console.log(`mohdel doctor — check that your install is wired up
16
+
17
+ Usage:
18
+ mo doctor [--json]
19
+
20
+ What it checks:
21
+ - Config directory and environment file exist
22
+ - At least one provider API key is set
23
+ - curated.json parses and passes schema validation
24
+ - Default model (if set) resolves to a real entry
25
+
26
+ Exit code:
27
+ 0 no errors (warnings allowed)
28
+ 1 one or more errors — fix them before relying on the install`)
29
+ process.exit(0)
30
+ }
31
+
32
+ const json = args.includes('--json')
33
+ loadDefaultEnv()
34
+
35
+ const report = {
36
+ configDir: { ok: false, path: CONFIG_DIR },
37
+ envFile: { ok: false, path: ENV_PATH },
38
+ curatedFile: { ok: false, path: CURATED_PATH, active: 0, deprecated: 0 },
39
+ keys: { configured: [], missing: [] },
40
+ schema: { errors: [], warnings: [] },
41
+ defaultModel: { set: false, id: null, resolves: false },
42
+ errors: [],
43
+ warnings: []
44
+ }
45
+
46
+ // 1. Config dir + env file
47
+ report.configDir.ok = existsSync(CONFIG_DIR)
48
+ report.envFile.ok = existsSync(ENV_PATH)
49
+ if (!report.configDir.ok) report.warnings.push('config directory does not exist (will be created on first save)')
50
+ if (!report.envFile.ok) report.warnings.push(`no ${ENV_PATH} — set API keys there or via shell env`)
51
+
52
+ // 2. API keys per provider
53
+ for (const [name, def] of Object.entries(providers)) {
54
+ if (getAPIKey(def.apiKeyEnv)) {
55
+ report.keys.configured.push({ provider: name, envVar: def.apiKeyEnv })
56
+ } else {
57
+ report.keys.missing.push({ provider: name, envVar: def.apiKeyEnv })
58
+ }
59
+ }
60
+ if (!report.keys.configured.length) {
61
+ report.errors.push('no API keys configured — run "mo" to set one up')
62
+ }
63
+
64
+ // 3. curated.json — exists, parses, validates
65
+ let curated = null
66
+ try {
67
+ curated = await getCuratedModels()
68
+ report.curatedFile.ok = true
69
+ const entries = catalogEntries(curated)
70
+ report.curatedFile.active = entries.filter(([, s]) => !s.deprecated).length
71
+ report.curatedFile.deprecated = entries.length - report.curatedFile.active
72
+
73
+ // Schema validation (same logic as 'mo check', condensed)
74
+ const knownProviders = new Set(Object.keys(providers))
75
+ for (const [key, spec] of catalogEntries(curated)) {
76
+ if (spec.deprecated) {
77
+ if (!curated[spec.deprecated]) {
78
+ report.schema.errors.push(`${key}: deprecated target '${spec.deprecated}' missing`)
79
+ }
80
+ continue
81
+ }
82
+ for (const issue of validate(spec, key)) {
83
+ if (issue.severity === 'error') report.schema.errors.push(`${key}: ${issue.field} — ${issue.message}`)
84
+ else report.schema.warnings.push(`${key}: ${issue.field} — ${issue.message}`)
85
+ }
86
+ const [keyProvider] = key.split('/')
87
+ if (!knownProviders.has(keyProvider)) {
88
+ report.schema.errors.push(`${key}: provider '${keyProvider}' not in providers.js`)
89
+ }
90
+ if (Array.isArray(spec.tags)) {
91
+ for (const t of spec.tags) {
92
+ if (!isValidTag(t)) report.schema.warnings.push(`${key}: invalid tag "${t}"`)
93
+ }
94
+ }
95
+ }
96
+ if (report.schema.errors.length) {
97
+ report.errors.push(`${report.schema.errors.length} schema error(s) in curated.json — run "mo check" for details`)
98
+ }
99
+ } catch (e) {
100
+ report.errors.push(`curated.json: ${e.message}`)
101
+ }
102
+
103
+ // 4. Default model
104
+ if (existsSync(CONFIG_PATH)) {
105
+ try {
106
+ const cfg = await getConfig()
107
+ if (cfg.defaultModel) {
108
+ report.defaultModel.set = true
109
+ report.defaultModel.id = cfg.defaultModel
110
+ report.defaultModel.resolves = !!(curated && curated[cfg.defaultModel])
111
+ if (!report.defaultModel.resolves) {
112
+ report.errors.push(`default model '${cfg.defaultModel}' is not in curated.json`)
113
+ }
114
+ }
115
+ } catch {
116
+ report.warnings.push('default.json present but failed to parse')
117
+ }
118
+ } else {
119
+ report.warnings.push('no default model set — pass <provider/model> to "mo ask", or run "mo default"')
120
+ }
121
+
122
+ if (json) {
123
+ console.log(JSON.stringify(report, null, 2))
124
+ process.exit(report.errors.length ? 1 : 0)
125
+ }
126
+
127
+ // Pretty output
128
+ console.log(label('Mohdel doctor\n'))
129
+
130
+ console.log(label('Configuration'))
131
+ console.log(row(report.configDir.ok ? ok('✓') : warn('!'), 'Config dir', report.configDir.path))
132
+ console.log(row(report.envFile.ok ? ok('✓') : warn('!'), 'Env file', report.envFile.ok ? report.envFile.path : `${report.envFile.path} (missing)`))
133
+ if (report.curatedFile.ok) {
134
+ console.log(row(ok('✓'), 'curated.json', `${report.curatedFile.active} active, ${report.curatedFile.deprecated} deprecated`))
135
+ } else {
136
+ console.log(row(err('✗'), 'curated.json', 'failed to load'))
137
+ }
138
+
139
+ console.log()
140
+ console.log(label(`API keys (${report.keys.configured.length} of ${Object.keys(providers).length})`))
141
+ for (const k of report.keys.configured) {
142
+ console.log(row(ok('✓'), k.provider, k.envVar))
143
+ }
144
+ for (const k of report.keys.missing) {
145
+ console.log(row(inactive('○'), k.provider, `${k.envVar} (unset)`))
146
+ }
147
+
148
+ console.log()
149
+ console.log(label('Catalog validation'))
150
+ if (report.schema.errors.length) {
151
+ console.log(` ${err('✗')} ${report.schema.errors.length} error(s) ${meta('— run "mo check" for details')}`)
152
+ } else {
153
+ console.log(` ${ok('✓')} no errors`)
154
+ }
155
+ if (report.schema.warnings.length) {
156
+ console.log(` ${warn('!')} ${report.schema.warnings.length} warning(s) ${meta('— run "mo check" for details')}`)
157
+ }
158
+
159
+ console.log()
160
+ console.log(label('Default model'))
161
+ if (report.defaultModel.set) {
162
+ const status = report.defaultModel.resolves ? ok('✓') : err('✗')
163
+ const note = report.defaultModel.resolves ? '' : '(not in curated.json)'
164
+ console.log(row(status, report.defaultModel.id, note))
165
+ } else {
166
+ console.log(row(inactive('○'), 'not set', 'pass <provider/model> to mo ask, or run "mo default"'))
167
+ }
168
+
169
+ console.log()
170
+ if (report.errors.length) {
171
+ console.log(`${err('✗')} ${report.errors.length} error(s):`)
172
+ for (const e of report.errors) console.log(` ${err('✗')} ${e}`)
173
+ process.exit(1)
174
+ } else if (report.warnings.length) {
175
+ console.log(`${warn('!')} ${report.warnings.length} warning(s) — install is usable but not fully configured`)
176
+ for (const w of report.warnings) console.log(` ${warn('!')} ${w}`)
177
+ } else {
178
+ console.log(`${ok('✓')} ready`)
179
+ }
180
+ }
package/src/cli/index.js CHANGED
@@ -57,6 +57,7 @@ Commands:
57
57
  ask <provider/model> [prompt] One-shot inference (pipeable)
58
58
 
59
59
  default Set default model (interactive)
60
+ doctor Check that your install is wired up
60
61
 
61
62
  Aliases:
62
63
  models model list
@@ -126,6 +127,9 @@ const resolvedArgs = alias ? [...alias.inject, ...args] : args
126
127
  if (resolved === 'default') {
127
128
  const { runDefault } = await import('./default.js')
128
129
  await runDefault()
130
+ } else if (resolved === 'doctor') {
131
+ const { runDoctor } = await import('./doctor.js')
132
+ await runDoctor(resolvedArgs)
129
133
  } else if (resolved === 'ask') {
130
134
  const { runAsk } = await import('./ask.js')
131
135
  await runAsk(resolvedArgs)
package/src/cli/model.js CHANGED
@@ -250,9 +250,35 @@ ${meta('tags:')} ${(info.tags || []).map(t => tag(t)).join(', ') || meta
250
250
 
251
251
  if (action === 'add') {
252
252
  const modelId = arg1
253
+ if (modelId === '-h' || modelId === '--help') {
254
+ console.log(`mohdel model add — add a model entry to ~/.config/mohdel/curated.json
255
+
256
+ Usage:
257
+ model add <provider>/<model-id>
258
+
259
+ What it does:
260
+ 1. Resolves <provider> against the known provider list (anthropic, openai, …)
261
+ 2. Pre-fills 'model', 'provider', 'sdk' from that resolution
262
+ 3. If your API key is set, fetches upstream model metadata (context, pricing, …)
263
+ 4. Prompts for any missing required field
264
+
265
+ Examples:
266
+ mo model add fireworks/deepseek-r1
267
+ mo model add anthropic/claude-haiku-4-5
268
+
269
+ Required fields (asked if not pre-filled):
270
+ model the literal id sent to the provider's API
271
+ creator who trained the model (e.g. anthropic, openai, alibaba)
272
+ inputFormat subset of [text, image, video]
273
+
274
+ See docs/CATALOG.md for the full field reference, and
275
+ config/curated.example.json for ready-to-copy entries.`)
276
+ process.exit(0)
277
+ }
253
278
  if (!modelId || !modelId.includes('/')) {
254
279
  console.error('Usage: model add <provider>/<model-id>')
255
280
  console.error('Example: mo model add fireworks/deepseek-r1')
281
+ console.error('Run "mo model add --help" for details.')
256
282
  process.exit(1)
257
283
  }
258
284
 
@@ -324,6 +350,31 @@ ${meta('tags:')} ${(info.tags || []).map(t => tag(t)).join(', ') || meta
324
350
  }
325
351
 
326
352
  if (action === 'curate') {
353
+ if (arg1 === '-h' || arg1 === '--help') {
354
+ console.log(`mohdel model curate — bulk-add upstream models to your catalog
355
+
356
+ Usage:
357
+ curate Pick a provider from a menu
358
+ curate <provider> Curate models from one provider directly
359
+
360
+ What it does:
361
+ Lists every model your API key can see at <provider>, lets you select
362
+ which to add to ~/.config/mohdel/curated.json. Pre-fills 'model',
363
+ 'provider', 'sdk', and any metadata the SDK can return.
364
+
365
+ Examples:
366
+ mo curate # interactive provider picker
367
+ mo curate anthropic
368
+ mo curate openai
369
+
370
+ Tip: after curating, fill in the things only you know — prices, contextTokenLimit,
371
+ tags, thinkingEffortLevels — with 'mo model set <id> <key> <value>' or by editing
372
+ ~/.config/mohdel/curated.json directly. See docs/CATALOG.md for the field reference
373
+ and config/curated.example.json for ready-to-copy entries.
374
+
375
+ Requires an API key for the chosen provider — run 'mo' to configure one.`)
376
+ process.exit(0)
377
+ }
327
378
  const { initializeAPIs, processModels } = await import('../lib/select.js')
328
379
  const { api, providersWithKeys } = await initializeAPIs()
329
380
 
@@ -139,10 +139,13 @@ export async function runOnboard () {
139
139
  }
140
140
  }
141
141
  console.log(`\n${meta('Commands:')}
142
- mo model list Browse models
143
- mo model show <model> Model details
144
- mo default Set default model
145
- mo --help All commands`)
142
+ mo ask <model> "..." One-shot inference (pipeable)
143
+ mo doctor Check install health
144
+ mo model list Browse curated models
145
+ mo model show <model> Model details
146
+ mo default Set default model
147
+ mo provider setup <p> Add another provider
148
+ mo --help All commands`)
146
149
  return
147
150
  }
148
151
 
package/src/lib/common.js CHANGED
@@ -17,6 +17,15 @@ export const EXCLUDED_PATH = join(CONFIG_DIR, 'excluded.json')
17
17
  export const PROVIDERS_CONFIG_PATH = join(CONFIG_DIR, 'providers.json')
18
18
  export const ENV_PATH = join(CONFIG_DIR, 'environment')
19
19
 
20
+ // Meta keys (e.g. $schema for JSON Schema editors, _* for inline notes) live at
21
+ // the top level of curated.json alongside model entries. They're preserved on
22
+ // load/save but excluded from every iteration site so they don't pollute alias
23
+ // maps, rank indices, suggestion search, or pickers.
24
+ export const isMetaKey = (key) => key.startsWith('$') || key.startsWith('_')
25
+ export const catalogEntries = (catalog) => Object.entries(catalog).filter(([k]) => !isMetaKey(k))
26
+ export const catalogKeys = (catalog) => Object.keys(catalog).filter(k => !isMetaKey(k))
27
+ export const catalogValues = (catalog) => catalogEntries(catalog).map(([, v]) => v)
28
+
20
29
  const DEFAULT_CURATED = {}
21
30
 
22
31
  const DEFAULT_EXCLUDED = {}
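Note: a short sketch of how the meta-key helpers added above behave. The three helper bodies are copied from the diff; the `catalog` object is made-up sample data, and the `example/old-model` entry is hypothetical. It illustrates that `$`- and `_`-prefixed keys stay in the object but are skipped by iteration.

```javascript
// Helper bodies as they appear in the diff.
const isMetaKey = (key) => key.startsWith('$') || key.startsWith('_')
const catalogEntries = (catalog) => Object.entries(catalog).filter(([k]) => !isMetaKey(k))
const catalogKeys = (catalog) => Object.keys(catalog).filter(k => !isMetaKey(k))

// Made-up sample catalog: two meta keys plus three model entries.
const catalog = {
  $schema: './curated.schema.json',
  _note: 'prices reviewed by hand',
  'anthropic/claude-haiku-4-5': { label: 'Haiku' },
  'fireworks/deepseek-r1': { label: 'DeepSeek R1' },
  'example/old-model': { deprecated: 'fireworks/deepseek-r1' }
}

// Only model ids come back; meta keys are filtered out.
const modelIds = catalogKeys(catalog)

// Same counting approach the diff uses in check.js and doctor.js.
const active = catalogEntries(catalog).filter(([, spec]) => !spec.deprecated).length
const deprecated = modelIds.length - active

console.log(modelIds)
console.log({ active, deprecated })
```

Counting with plain `Object.values(catalog)` would treat `$schema` and `_note` as entries. Both are strings with no `deprecated` field, so they would be counted as active models, which is presumably why the diff switches the counting sites to the filtered helpers.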
@@ -115,6 +124,11 @@ const createFileOperation = (filePath, defaultValue = {}, operationType) => {
115
124
  if (typeof loadedData === 'object' && loadedData !== null && !Array.isArray(loadedData)) {
116
125
  const processedData = {}
117
126
  for (const [key, entryValue] of Object.entries(loadedData)) {
127
+ // Meta keys are preserved on round-trip but skipped from entry processing.
128
+ if (isMetaKey(key)) {
129
+ processedData[key] = entryValue
130
+ continue
131
+ }
118
132
  if (typeof entryValue !== 'object' || entryValue === null || Array.isArray(entryValue)) {
119
133
  processedData[key] = entryValue
120
134
  continue
@@ -131,7 +145,7 @@ const createFileOperation = (filePath, defaultValue = {}, operationType) => {
131
145
  }
132
146
 
133
147
  if (operationType === 'curated models') {
134
- for (const [curatedKey, entry] of Object.entries(processedData)) {
148
+ for (const [curatedKey, entry] of catalogEntries(processedData)) {
135
149
  const issues = validate(entry, curatedKey)
136
150
  for (const issue of issues) {
137
151
  moduleLogger.warn(`[mohdel:schema] ${curatedKey}: ${issue.field} — ${issue.message}`)
@@ -39,6 +39,11 @@ const creators = {
39
39
  logo: 'minimax.svg',
40
40
  description: 'Minimax offers versatile Chinese-first chat and coding models tuned for fast, cost-aware assistants and enterprise integrations.'
41
41
  },
42
+ mistral: {
43
+ label: 'Mistral',
44
+ logo: 'mistral.svg',
45
+ description: 'Mistral AI ships strong open-weight and proprietary models with a focus on European hosting, multilingual quality, and efficient deployment.'
46
+ },
42
47
  moonshotai: {
43
48
  label: 'Moonshot AI',
44
49
  logo: 'moonshotai.svg',
@@ -1,4 +1,4 @@
1
- import { getCuratedModels, saveCuratedModels } from './common.js'
1
+ import { getCuratedModels, saveCuratedModels, isMetaKey, catalogKeys } from './common.js'
2
2
 
3
3
  let curatedCache = null
4
4
  let aliasMapCache = null
@@ -17,6 +17,7 @@ const buildAliasMap = (curatedModels) => {
17
17
 
18
18
  // Pass 1: count names and cache parsed results
19
19
  for (const fullModelId in curatedModels) {
20
+ if (isMetaKey(fullModelId)) continue
20
21
  const { provider, model: modelName } = getMohdelModel(fullModelId)
21
22
  const baseMatch = modelName.match(BASE_NAME_RE)
22
23
  const baseName = baseMatch?.[1] || null
@@ -79,7 +80,7 @@ export const suggestModels = (query, maxResults = 5) => {
79
80
  const q = query.toLowerCase()
80
81
  const scored = []
81
82
 
82
- for (const fullId of Object.keys(curatedCache)) {
83
+ for (const fullId of catalogKeys(curatedCache)) {
83
84
  if (curatedCache[fullId].deprecated) continue
84
85
  const entry = curatedCache[fullId]
85
86
  const label = (entry.label || '').toLowerCase()
@@ -95,6 +95,15 @@ const providers = {
95
95
  creators: ['xai'],
96
96
  contextSemantics: 'shared',
97
97
  outputCapStrategy: 'accept'
98
+ },
99
+ xiaomi: {
100
+ sdk: 'openai',
101
+ api: 'chatCompletions',
102
+ apiKeyEnv: 'XIAOMI_API_SK',
103
+ createConfiguration: apiKey => ({ baseURL: 'https://api.xiaomimimo.com/v1', apiKey }),
104
+ creators: ['xiaomi'],
105
+ contextSemantics: 'shared',
106
+ outputCapStrategy: 'accept'
98
107
  }
99
108
  }
100
109
 
package/src/lib/rank.js CHANGED
@@ -6,6 +6,7 @@ import { join, dirname } from 'path'
6
6
  import { fileURLToPath } from 'url'
7
7
  import { existsSync } from 'fs'
8
8
  import { CACHE_DIR } from './cache.js'
9
+ import { catalogEntries } from './common.js'
9
10
 
10
11
  const __dirname = dirname(fileURLToPath(import.meta.url))
11
12
  const CONFIG_PATH = join(__dirname, '..', '..', 'config', 'benchmarks.json')
@@ -254,7 +255,7 @@ const computeGroupScores = (available, benchmarks) => {
254
255
 
255
256
  const buildCuratedIndex = (curated) => {
256
257
  const index = new Map()
257
- for (const [key, entry] of Object.entries(curated)) {
258
+ for (const [key, entry] of catalogEntries(curated)) {
258
259
  if (entry.deprecated) continue
259
260
  const modelPart = key.split('/').slice(1).join('/')
260
261
  index.set(modelPart, key)
package/src/lib/select.js CHANGED
@@ -7,7 +7,8 @@ import {
7
7
  saveCuratedModels,
8
8
  saveExcludedModels,
9
9
  loadEnvFile,
10
- loadDefaultEnv
10
+ loadDefaultEnv,
11
+ catalogEntries
11
12
  } from './common.js'
12
13
  import { getMohdelModel } from './curated-cache.js'
13
14
  import { stripUnknown } from './schema.js'
@@ -81,7 +82,7 @@ const findReplacementCandidates = (providerName, modelId, curated) => {
81
82
 
82
83
  const candidates = []
83
84
 
84
- for (const [curatedKey, curatedInfo] of Object.entries(curated)) {
85
+ for (const [curatedKey, curatedInfo] of catalogEntries(curated)) {
85
86
  const { provider: curProviderName, model: curModelId } = getMohdelModel(curatedKey)
86
87
  if (curProviderName === providerName && curModelId !== modelId) {
87
88
  if (baseRegExp.test(curModelId)) {