modelmix 4.4.12 → 4.4.16
- package/README.md +12 -11
- package/demo/gemini.js +12 -9
- package/demo/gpt-realtime.js +22 -0
- package/demo/{gpt51.js → gpt54.js} +2 -2
- package/demo/package-lock.json +11 -1
- package/demo/package.json +2 -1
- package/index.js +364 -3
- package/package.json +6 -5
- package/skills/modelmix/SKILL.md +183 -78
package/README.md
CHANGED

@@ -135,9 +135,10 @@ Here's a comprehensive list of available methods:

 | Method | Provider | Model | Price (I/O) per 1 M tokens |
 | ------------------ | ---------- | ------------------------------ | -------------------------- |
+| `gpt54()` | OpenAI | gpt-5.4 | [\$2.50 / \$15.00][1] |
 | `gpt52()` | OpenAI | gpt-5.2 | [\$1.75 / \$14.00][1] |
 | `gpt51()` | OpenAI | gpt-5.1 | [\$1.25 / \$10.00][1] |
-| `
+| `gpt53codex()` | OpenAI | gpt-5.3-codex | [\$1.25 / \$14.00][1] |
 | `gpt5mini()` | OpenAI | gpt-5-mini | [\$0.25 / \$2.00][1] |
 | `gpt5nano()` | OpenAI | gpt-5-nano | [\$0.05 / \$0.40][1] |
 | `gpt41()` | OpenAI | gpt-4.1 | [\$2.00 / \$8.00][1] |

@@ -148,8 +149,8 @@ Here's a comprehensive list of available methods:
 | `opus45[think]()` | Anthropic | claude-opus-4-5-20251101 | [\$5.00 / \$25.00][2] |
 | `sonnet46[think]()`| Anthropic | claude-sonnet-4-6 | [\$3.00 / \$15.00][2] |
 | `sonnet45[think]()`| Anthropic | claude-sonnet-4-5-20250929 | [\$3.00 / \$15.00][2] |
-| `haiku35()` | Anthropic | claude-3-5-haiku-20241022 | [\$0.80 / \$4.00][2] |
 | `haiku45[think]()` | Anthropic | claude-haiku-4-5-20251001 | [\$1.00 / \$5.00][2] |
+| `gemini31pro()` | Google | gemini-3.1-pro-preview | [\$2.00 / \$12.00][3] |
 | `gemini3pro()` | Google | gemini-3-pro-preview | [\$2.00 / \$12.00][3] |
 | `gemini3flash()` | Google | gemini-3-flash-preview | [\$0.50 / \$3.00][3] |
 | `gemini25pro()` | Google | gemini-2.5-pro | [\$1.25 / \$10.00][3] |

@@ -161,8 +162,6 @@ Here's a comprehensive list of available methods:
 | `minimaxM25()` | MiniMax | MiniMax-M2.5 | [\$0.30 / \$1.20][9] |
 | `sonar()` | Perplexity | sonar | [\$1.00 / \$1.00][4] |
 | `sonarPro()` | Perplexity | sonar-pro | [\$3.00 / \$15.00][4] |
-| `scout()` | Groq | Llama-4-Scout-17B-16E-Instruct | [\$0.11 / \$0.34][5] |
-| `maverick()` | Groq | Maverick-17B-128E-Instruct-FP8 | [\$0.20 / \$0.60][5] |
 | `hermes3()` | Lambda | Hermes-3-Llama-3.1-405B-FP8 | [\$0.80 / \$0.80][8] |
 | `qwen3()` | Together | Qwen3-235B-A22B-fp8-tput | [\$0.20 / \$0.60][7] |
 | `kimiK2()` | Together | Kimi-K2-Instruct | [\$1.00 / \$3.00][7] |

@@ -345,11 +344,11 @@ Descriptions support **descriptor objects** with `description`, `required`, `enu

 ```javascript
 const result = await model.json(
-    { name: '
+    { name: 'Martin', age: 22, sex: 'male' },
     {
         name: { description: 'Name of the actor', required: false },
-        age: 'Age of the actor',
-        sex: { description: 'Gender', enum: ['
+        age: 'Age of the actor', // string still works
+        sex: { description: 'Gender', enum: ['male', 'female', null], default: null }
     }
 );
 ```

@@ -406,7 +405,9 @@ Every response from `raw()` now includes a `tokens` object with the following st
   tokens: {
     input: 150, // Number of tokens in the prompt/input
     output: 75, // Number of tokens in the completion/output
-    total: 225
+    total: 225, // Total tokens used (input + output)
+    cost: 0.0012, // Estimated cost in USD (null if model not in pricing table)
+    speed: 42 // Output tokens per second (int)
   }
 }
 ```

@@ -418,10 +419,10 @@ After calling `message()` or `json()`, use `lastRaw` to access the complete resp
 ```javascript
 const text = await model.message();
 console.log(model.lastRaw.tokens);
-// { input: 122, output: 86, total: 541, cost: 0.000319 }
+// { input: 122, output: 86, total: 541, cost: 0.000319, speed: 38 }
 ```

-The `cost` field is the estimated cost in USD based on the model's pricing per 1M tokens (input/output). If the model is not found in the pricing table, `cost` will be `null`.
+The `cost` field is the estimated cost in USD based on the model's pricing per 1M tokens (input/output). If the model is not found in the pricing table, `cost` will be `null`. The `speed` field is the generation speed measured in output tokens per second (integer).

 ## 🐛 Enabling Debug Mode

@@ -515,7 +516,7 @@ new ModelMix(args = { options: {}, config: {} })
 - `message`: The text response from the model
 - `think`: Reasoning/thinking content (if available)
 - `toolCalls`: Array of tool calls made by the model (if any)
-- `tokens`: Object with `input`, `output`,
+- `tokens`: Object with `input`, `output`, `total` token counts, `cost` (USD), and `speed` (output tokens/sec)
 - `response`: The raw API response
 - `stream(callback)`: Sends the message and streams the response, invoking the callback with each streamed part.
 - `json(schemaExample, descriptions = {}, options = {})`: Forces the model to return a response in a specific JSON format.
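The README hunks above document the extended `tokens` object and its per-1M-token pricing table. As a hedged sketch of how a `cost` value can be derived from that table (the package's actual `calculateCost` implementation is not shown in this diff; `estimateCost` and the formula below are illustrative assumptions):

```javascript
// Hypothetical helper re-deriving a `tokens.cost` figure from the README's
// per-1M-token prices; not ModelMix's real calculateCost.
const PRICING = { 'gpt-5.4': [2.50, 15.00] }; // [input, output] USD per 1M tokens

function estimateCost(model, tokens) {
    const price = PRICING[model];
    if (!price) return null; // README: cost is null when the model is not in the table
    const [inPrice, outPrice] = price;
    return (tokens.input * inPrice + tokens.output * outPrice) / 1e6;
}

console.log(estimateCost('gpt-5.4', { input: 150, output: 75 })); // 0.0015
console.log(estimateCost('no-such-model', { input: 10, output: 10 })); // null
```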
package/demo/gemini.js
CHANGED

@@ -1,5 +1,5 @@
 import { ModelMix, MixGoogle } from '../index.js';
-try { process.loadEnvFile(); } catch {}
+try { process.loadEnvFile(); } catch { }

 const mmix = new ModelMix({
     options: {

@@ -12,9 +12,9 @@ const mmix = new ModelMix({
     }
 });

-// Using
+// Using gemini3flash (Gemini 3 Flash) with built-in method
 console.log("\n" + '--------| gemini25flash() |--------');
-const flash = await mmix.
+const flash = await mmix.gemini3flash()
     .addText('Hi there! Do you like cats?')
     .message();

@@ -22,20 +22,23 @@ console.log(flash);

 // Using gemini3pro (Gemini 3 Pro) with custom config
 console.log("\n" + '--------| gemini3pro() with JSON response |--------');
-const pro = mmix.new().
+const pro = mmix.new().gemini31pro();

 pro.addText('Give me a fun fact about cats');
-
+
+const jsonExampleAndSchema = {
     fact: 'A fun fact about cats',
-    category: 'animal behavior'
-}
+    category: 'animal behavior'
+};
+
+const jsonResponse = await pro.json(jsonExampleAndSchema, jsonExampleAndSchema);

 console.log(jsonResponse);

 // Using attach method with MixGoogle for custom model
 console.log("\n" + '--------| Custom Gemini with attach() |--------');
-mmix.attach('gemini-2.5-flash', new MixGoogle());
+const customModel = mmix.new().attach('gemini-2.5-flash', new MixGoogle());

-const custom = await
+const custom = await customModel.addText('Tell me a short joke about cats.').message();
 console.log(custom);

package/demo/gpt-realtime.js
ADDED

@@ -0,0 +1,22 @@
+import { ModelMix } from '../index.js';
+try { process.loadEnvFile(); } catch {}
+
+const mmix = new ModelMix({
+    config: {
+        debug: 3
+    }
+});
+
+console.log('\n--------| gptRealtime() |--------');
+
+const realtime = mmix.gptRealtimeMini({
+    options: {
+        stream: true
+    }
+});
+
+realtime.addText('Explain quantum entanglement in simple terms.');
+const response = await realtime.stream(({ delta }) => {
+    process.stdout.write(delta || '');
+});
+console.log('\n\n[done]\n', response.tokens);
package/demo/{gpt51.js → gpt54.js}
RENAMED

@@ -8,10 +8,10 @@ const mmix = new ModelMix({
     }
 });

-console.log("\n" + '--------|
+console.log("\n" + '--------| gpt54() |--------');

 const gptArgs = { options: { reasoning_effort: "none", verbosity: "low" } };
-const gpt = mmix.
+const gpt = mmix.gpt54(gptArgs);

 gpt.addText("Explain quantum entanglement in simple terms.");
 const response = await gpt.message();
package/demo/package-lock.json
CHANGED

@@ -11,7 +11,8 @@
     "dependencies": {
         "dotenv": "^17.2.3",
         "isolated-vm": "^6.0.2",
-        "lemonlog": "^1.1.4"
+        "lemonlog": "^1.1.4",
+        "pathmix": "^1.0.0"
     }
 },
 ".api/apis/pplx": {

@@ -290,6 +291,15 @@
         "wrappy": "1"
     }
 },
+"node_modules/pathmix": {
+    "version": "1.0.0",
+    "resolved": "https://registry.npmjs.org/pathmix/-/pathmix-1.0.0.tgz",
+    "integrity": "sha512-oLbvoOKuyV6TjkKLEYqH5O+q+d+qZwtRNzMrBI93IsCYN0liDw8W8aZq3BPvIaF4jJU+igeO/1p6lCwFfy8E5Q==",
+    "license": "ISC",
+    "engines": {
+        "node": ">=16.0.0"
+    }
+},
 "node_modules/prebuild-install": {
     "version": "7.1.3",
     "resolved": "https://registry.npmjs.org/prebuild-install/-/prebuild-install-7.1.3.tgz",
package/demo/package.json
CHANGED
package/index.js
CHANGED

@@ -5,6 +5,7 @@ const { inspect } = require('util');
 const log = require('lemonlog')('ModelMix');
 const Bottleneck = require('bottleneck');
 const path = require('path');
+const WebSocket = require('ws');
 const generateJsonSchema = require('./schema');
 const { Client } = require("@modelcontextprotocol/sdk/client/index.js");
 const { StdioClientTransport } = require("@modelcontextprotocol/sdk/client/stdio.js");

@@ -14,6 +15,11 @@ const { MCPToolsManager } = require('./mcp-tools');
 // Based on provider pricing pages linked in README
 const MODEL_PRICING = {
     // OpenAI
+    'gpt-realtime-mini': [0.60, 2.40],
+    'gpt-realtime': [4.00, 16.00],
+    'gpt-5.4': [2.50, 15.00],
+    'gpt-5.4-pro': [30, 180.00],
+    'gpt-5.3-codex': [1.75, 14.00],
     'gpt-5.2': [1.75, 14.00],
     'gpt-5.2-chat-latest': [1.75, 14.00],
     'gpt-5.1': [1.25, 10.00],

@@ -37,6 +43,7 @@ const MODEL_PRICING = {
     'claude-3-5-haiku-20241022': [0.80, 4.00],
     'claude-haiku-4-5-20251001': [1.00, 5.00],
     // Google
+    'gemini-3.1-pro-preview':[2.00, 12.00],
     'gemini-3-pro-preview': [2.00, 12.00],
     'gemini-3-flash-preview': [0.50, 3.00],
     'gemini-2.5-pro': [1.25, 10.00],

@@ -267,6 +274,21 @@ class ModelMix {
     gpt52({ options = {}, config = {} } = {}) {
         return this.attach('gpt-5.2', new MixOpenAI({ options, config }));
     }
+    gpt54({ options = {}, config = {} } = {}) {
+        return this.attach('gpt-5.4', new MixOpenAIResponses({ options, config }));
+    }
+    gpt54pro({ options = {}, config = {} } = {}) {
+        return this.attach('gpt-5.4-pro', new MixOpenAIResponses({ options, config }));
+    }
+    gptRealtime({ options = {}, config = {} } = {}) {
+        return this.attach('gpt-realtime', new MixOpenAIWebSocket({ options, config }));
+    }
+    gptRealtimeMini({ options = {}, config = {} } = {}) {
+        return this.attach('gpt-realtime-mini', new MixOpenAIWebSocket({ options, config }));
+    }
+    gpt53codex({ options = {}, config = {} } = {}) {
+        return this.attach('gpt-5.3-codex', new MixOpenAIResponses({ options, config }));
+    }
     gpt52chat({ options = {}, config = {} } = {}) {
         return this.attach('gpt-5.2-chat-latest', new MixOpenAI({ options, config }));
     }

@@ -341,6 +363,9 @@ class ModelMix {
     gemini25flash({ options = {}, config = {} } = {}) {
         return this.attach('gemini-2.5-flash', new MixGoogle({ options, config }));
     }
+    gemini31pro({ options = {}, config = {} } = {}) {
+        return this.attach('gemini-3.1-pro-preview', new MixGoogle({ options, config }));
+    }
     gemini3pro({ options = {}, config = {} } = {}) {
         return this.attach('gemini-3-pro-preview', new MixGoogle({ options, config }));
     }

@@ -889,11 +914,14 @@ class ModelMix {
             providerInstance.streamCallback = this.streamCallback;
         }

+        const startTime = Date.now();
         const result = await providerInstance.create({ options: currentOptions, config: currentConfig });
+        const elapsedMs = Date.now() - startTime;

-        // Calculate cost based on model pricing
         if (result.tokens) {
             result.tokens.cost = ModelMix.calculateCost(currentModelKey, result.tokens);
+            const elapsedSec = elapsedMs / 1000;
+            result.tokens.speed = elapsedSec > 0 ? Math.round(result.tokens.output / elapsedSec) : 0;
         }

         if (result.toolCalls && result.toolCalls.length > 0) {

@@ -935,7 +963,7 @@ class ModelMix {
         // debug level 2: Readable summary of output
         if (currentConfig.debug >= 2) {
             const tokenInfo = result.tokens
-                ? ` ${result.tokens.input} → ${result.tokens.output} tok` + (result.tokens.cost != null ? ` $${result.tokens.cost.toFixed(4)}` : '')
+                ? ` ${result.tokens.input} → ${result.tokens.output} tok` + (result.tokens.speed ? ` ${result.tokens.speed} t/s` : '') + (result.tokens.cost != null ? ` $${result.tokens.cost.toFixed(4)}` : '')
                 : '';
             console.log(`✓${tokenInfo}\n${ModelMix.formatOutputSummary(result, currentConfig.debug).trim()}`);
         }

@@ -1492,6 +1520,339 @@ class MixOpenAI extends MixCustom {
     }
 }

+class MixOpenAIResponses extends MixOpenAI {
+    async create({ config = {}, options = {} } = {}) {
+
+        // Keep GPT/o-model option normalization behavior
+        if (options.model?.startsWith('o')) {
+            delete options.max_tokens;
+            delete options.temperature;
+        }
+        if (options.model?.includes('gpt-5')) {
+            if (options.max_tokens) {
+                options.max_completion_tokens = options.max_tokens;
+                delete options.max_tokens;
+            }
+            delete options.temperature;
+        }
+
+        const responsesUrl = this.config.url.replace('/chat/completions', '/responses');
+        const request = MixOpenAIResponses.buildResponsesRequest(options);
+        const response = await axios.post(responsesUrl, request, {
+            headers: this.headers
+        });
+
+        return MixOpenAIResponses.processResponsesResponse(response);
+    }
+
+    static buildResponsesRequest(options = {}) {
+        const request = {
+            model: options.model,
+            input: MixOpenAIResponses.messagesToResponsesInput(options.messages),
+            stream: false
+        };
+
+        if (options.reasoning_effort) request.reasoning = { effort: options.reasoning_effort };
+        if (options.verbosity) request.text = { verbosity: options.verbosity };
+
+        if (typeof options.max_completion_tokens === 'number') {
+            request.max_output_tokens = options.max_completion_tokens;
+        } else if (typeof options.max_tokens === 'number') {
+            request.max_output_tokens = options.max_tokens;
+        }
+
+        if (typeof options.temperature === 'number') request.temperature = options.temperature;
+        if (typeof options.top_p === 'number') request.top_p = options.top_p;
+        if (typeof options.presence_penalty === 'number') request.presence_penalty = options.presence_penalty;
+        if (typeof options.frequency_penalty === 'number') request.frequency_penalty = options.frequency_penalty;
+        if (options.stop !== undefined) request.stop = options.stop;
+        if (typeof options.n === 'number') request.n = options.n;
+        if (options.logit_bias !== undefined) request.logit_bias = options.logit_bias;
+        if (options.user !== undefined) request.user = options.user;
+
+        return request;
+    }
+
+    static processResponsesResponse(response) {
+        const message = MixOpenAIResponses.extractResponsesMessage(response.data);
+        return {
+            message,
+            think: null,
+            toolCalls: [],
+            tokens: MixOpenAIResponses.extractResponsesTokens(response.data),
+            response: response.data
+        };
+    }
+
+    static extractResponsesTokens(data) {
+        if (data.usage) {
+            return {
+                input: data.usage.input_tokens || 0,
+                output: data.usage.output_tokens || 0,
+                total: data.usage.total_tokens || ((data.usage.input_tokens || 0) + (data.usage.output_tokens || 0))
+            };
+        }
+        return {
+            input: 0,
+            output: 0,
+            total: 0
+        };
+    }
+
+    static extractResponsesMessage(data) {
+        if (!Array.isArray(data.output)) return '';
+        return data.output
+            .filter(item => item.type === 'message')
+            .flatMap(item => Array.isArray(item.content) ? item.content : [])
+            .filter(content => content.type === 'output_text' && typeof content.text === 'string')
+            .map(content => content.text)
+            .join('\n')
+            .trim();
+    }
+
+    static messagesToResponsesInput(messages = []) {
+        const mapped = [];
+
+        for (const message of messages) {
+            if (!message || !message.role) continue;
+            if (message.tool_calls || message.role === 'tool') continue;
+
+            let text = '';
+            if (typeof message.content === 'string') {
+                text = message.content;
+            } else if (Array.isArray(message.content)) {
+                text = message.content
+                    .filter(item => item && item.type === 'text' && typeof item.text === 'string')
+                    .map(item => item.text)
+                    .join('\n');
+            }
+
+            if (!text) continue;
+            mapped.push({
+                role: message.role,
+                content: [{ type: 'input_text', text }]
+            });
+        }
+
+        return mapped;
+    }
+}
+
+class MixOpenAIWebSocket extends MixOpenAIResponses {
+    getDefaultConfig(customConfig) {
+        return super.getDefaultConfig({
+            realtimeUrl: 'wss://api.openai.com/v1/realtime',
+            websocketTimeoutMs: 120000,
+            ...customConfig
+        });
+    }
+
+    async create({ config = {}, options = {} } = {}) {
+        if (options.model?.startsWith('o')) {
+            delete options.max_tokens;
+            delete options.temperature;
+        }
+        if (options.model?.includes('gpt-5')) {
+            if (options.max_tokens) {
+                options.max_completion_tokens = options.max_tokens;
+                delete options.max_tokens;
+            }
+            delete options.temperature;
+        }
+
+        const mergedConfig = { ...this.config, ...config };
+        const realtimeUrl = `${mergedConfig.realtimeUrl}?model=${encodeURIComponent(options.model)}`;
+        const timeoutMs = mergedConfig.websocketTimeoutMs || 120000;
+
+        return await new Promise((resolve, reject) => {
+            const ws = new WebSocket(realtimeUrl, {
+                headers: {
+                    authorization: `Bearer ${mergedConfig.apiKey}`
+                }
+            });
+
+            const events = [];
+            let message = '';
+            let settled = false;
+            let finalResponse = null;
+
+            const timeout = setTimeout(() => {
+                if (settled) return;
+                settled = true;
+                ws.close();
+                reject({
+                    message: `Realtime WebSocket timed out after ${timeoutMs}ms`,
+                    statusCode: null,
+                    details: null,
+                    config: mergedConfig,
+                    options
+                });
+            }, timeoutMs);
+
+            const cleanUp = () => clearTimeout(timeout);
+
+            ws.on('open', () => {
+                const session = {
+                    type: 'realtime',
+                    output_modalities: ['text']
+                };
+
+                if (mergedConfig.system) session.instructions = mergedConfig.system;
+                if (Array.isArray(options.tools) && options.tools.length > 0) {
+                    session.tools = options.tools;
+                }
+
+                ws.send(JSON.stringify({ type: 'session.update', session }));
+
+                const items = MixOpenAIWebSocket.messagesToConversationItems(options.messages);
+                for (const item of items) {
+                    ws.send(JSON.stringify({
+                        type: 'conversation.item.create',
+                        item
+                    }));
+                }
+
+                const responseConfig = { output_modalities: ['text'] };
+                if (typeof options.max_completion_tokens === 'number') {
+                    responseConfig.max_output_tokens = Math.min(options.max_completion_tokens, 4096);
+                } else if (typeof options.max_tokens === 'number') {
+                    responseConfig.max_output_tokens = Math.min(options.max_tokens, 4096);
+                }
+                if (Array.isArray(options.tools) && options.tools.length > 0) responseConfig.tools = options.tools;
+
+                ws.send(JSON.stringify({
+                    type: 'response.create',
+                    response: responseConfig
+                }));
+            });
+
+            ws.on('message', raw => {
+                let event;
+                try {
+                    event = JSON.parse(raw.toString());
+                } catch {
+                    return;
+                }
+
+                events.push(event);
+
+                const isTextDeltaEvent = event.type === 'response.text.delta' || event.type === 'response.output_text.delta';
+                if (isTextDeltaEvent) {
+                    const delta = MixOpenAIWebSocket.extractDelta(event);
+                    if (delta) {
+                        message += delta;
+                        if (this.streamCallback) {
+                            this.streamCallback({ response: event, message, delta });
+                        }
+                    }
+                    return;
+                }
+
+                if (event.type === 'response.done') {
+                    finalResponse = event.response || null;
+                    if (!message && finalResponse) {
+                        message = MixOpenAIResponses.extractResponsesMessage(finalResponse);
+                    }
+
+                    if (!settled) {
+                        settled = true;
+                        cleanUp();
+                        ws.close();
+                        resolve({
+                            message: message.trim(),
+                            think: null,
+                            toolCalls: [],
+                            tokens: MixOpenAIResponses.extractResponsesTokens(finalResponse || {}),
+                            response: {
+                                response: finalResponse,
+                                events
+                            }
+                        });
+                    }
+                    return;
+                }
+
+                if (event.type === 'error' && !settled) {
+                    settled = true;
+                    cleanUp();
+                    ws.close();
+                    reject({
+                        message: event.error?.message || 'Realtime WebSocket error',
+                        statusCode: null,
+                        details: event.error || event,
+                        config: mergedConfig,
+                        options
+                    });
+                }
+            });
+
+            ws.on('error', error => {
+                if (settled) return;
+                settled = true;
+                cleanUp();
+                reject({
+                    message: error.message || 'Realtime WebSocket connection error',
+                    statusCode: null,
+                    details: null,
+                    stack: error.stack,
+                    config: mergedConfig,
+                    options
+                });
+            });
+
+            ws.on('close', () => {
+                if (settled) return;
+                settled = true;
+                cleanUp();
+                reject({
+                    message: 'Realtime WebSocket closed before response.done',
+                    statusCode: null,
+                    details: null,
+                    config: mergedConfig,
+                    options
+                });
+            });
+        });
+    }
+
+    static messagesToConversationItems(messages = []) {
+        const items = [];
+
+        for (const message of messages) {
+            if (!message || !message.role) continue;
+            if (message.role === 'tool' || message.tool_calls) continue;
+
+            const role = message.role === 'assistant' ? 'assistant' : (message.role === 'system' ? 'system' : 'user');
+            const content = [];
+
+            if (typeof message.content === 'string') {
+                content.push({
+                    type: role === 'assistant' ? 'text' : 'input_text',
+                    text: message.content
+                });
+            } else if (Array.isArray(message.content)) {
+                for (const item of message.content) {
+                    if (!item || item.type !== 'text' || typeof item.text !== 'string') continue;
+                    content.push({
+                        type: role === 'assistant' ? 'text' : 'input_text',
+                        text: item.text
+                    });
+                }
+            }
+
+            if (content.length === 0) continue;
+            items.push({ type: 'message', role, content });
+        }
+
+        return items;
+    }
+
+    static extractDelta(event) {
+        if (typeof event.delta === 'string') return event.delta;
+        return '';
+    }
+}
+
 class MixOpenRouter extends MixOpenAI {
     getDefaultConfig(customConfig) {

@@ -2266,4 +2627,4 @@ class MixGoogle extends MixCustom {
     }
 }

-module.exports = { MixCustom, ModelMix, MixAnthropic, MixMiniMax, MixOpenAI, MixOpenRouter, MixPerplexity, MixOllama, MixLMStudio, MixGroq, MixTogether, MixGrok, MixCerebras, MixGoogle, MixFireworks };
+module.exports = { MixCustom, ModelMix, MixAnthropic, MixMiniMax, MixOpenAI, MixOpenAIResponses, MixOpenAIWebSocket, MixOpenRouter, MixPerplexity, MixOllama, MixLMStudio, MixGroq, MixTogether, MixGrok, MixCerebras, MixGoogle, MixFireworks };
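The `speed` computation added to `index.js` above is a plain tokens-over-wall-clock ratio with a division-by-zero guard. Isolated as a standalone sketch (the function name is illustrative; in the package this logic is inline):

```javascript
// Mirrors the diff's speed calculation: output tokens per second, rounded to
// an integer, returning 0 when no time elapsed instead of Infinity.
function tokensPerSecond(outputTokens, elapsedMs) {
    const elapsedSec = elapsedMs / 1000;
    return elapsedSec > 0 ? Math.round(outputTokens / elapsedSec) : 0;
}

console.log(tokensPerSecond(75, 1800)); // 42
console.log(tokensPerSecond(75, 0)); // 0
```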
package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
     "name": "modelmix",
-    "version": "4.4.
+    "version": "4.4.16",
     "description": "🧬 Reliable interface with automatic fallback for AI LLMs.",
     "main": "index.js",
     "repository": {

@@ -16,7 +16,7 @@
     "openai",
     "anthropic",
     "agent",
-    "
+    "realtime",
     "gpt",
     "claude",
     "llama",

@@ -47,16 +47,17 @@
     },
     "homepage": "https://github.com/clasen/ModelMix#readme",
     "dependencies": {
-        "@modelcontextprotocol/sdk": "^1.
+        "@modelcontextprotocol/sdk": "^1.27.1",
         "axios": "^1.13.5",
         "bottleneck": "^2.19.5",
         "file-type": "^16.5.4",
         "form-data": "^4.0.4",
-        "lemonlog": "^1.2.0"
+        "lemonlog": "^1.2.0",
+        "ws": "^8.19.0"
     },
     "devDependencies": {
         "chai": "^5.2.1",
-        "mocha": "^11.
+        "mocha": "^11.7.5",
         "nock": "^14.0.9",
         "sinon": "^21.0.0"
     },
package/skills/modelmix/SKILL.md
CHANGED

@@ -1,41 +1,50 @@
 ---
 name: modelmix
-description: Instructions for using the ModelMix Node.js library to interact with multiple AI LLM providers through a unified interface. Use when
+description: Instructions for using the ModelMix Node.js library to interact with multiple AI LLM providers through a unified interface. Use when writing code that calls AI models (OpenAI, Anthropic, Google, Groq, Perplexity, Grok, MiniMax, Fireworks, Together, Lambda, Cerebras, OpenRouter, Ollama, LM Studio), chaining models with fallback, getting structured JSON from LLMs, adding MCP tools, streaming responses, managing multi-provider AI workflows, round-robin load balancing, or rate limiting API requests in Node.js. Also use when the user mentions "modelmix", "ModelMix", asks to "call an LLM", "query a model", "add AI to my app", or wants to integrate any supported provider.
+metadata:
+    tags: [llm, ai, openai, anthropic, google, groq, perplexity, grok, mcp, streaming, json-output]
 ---

 # ModelMix Library Skill

 ## Overview

-ModelMix is a Node.js library
+ModelMix is a Node.js library providing a unified fluent API to interact with multiple AI LLM providers. It handles automatic fallback between models, round-robin load balancing, structured JSON output, streaming, MCP tool integration, custom local tools, rate limiting, and token tracking.

 Use this skill when:
 - Integrating one or more AI models into a Node.js project
-- Chaining models with automatic fallback
+- Chaining models with automatic fallback or round-robin
 - Extracting structured JSON from LLMs
 - Adding MCP tools or custom tools to models
+- Streaming responses from any provider
 - Working with templates and file-based prompts
+- Tracking token usage and costs

-Do NOT use
+Do NOT use for:
 - Python or non-Node.js projects
 - Direct HTTP calls to LLM APIs (use ModelMix instead)

-##
+## Quick Reference

+- [Installation](#installation)
+- [Creating an instance](#creating-an-instance)
+- [Attaching models](#attaching-models)
 - [Get a text response](#get-a-text-response)
 - [Get structured JSON](#get-structured-json)
 - [Stream a response](#stream-a-response)
-- [
-- [
+- [Extract a code block](#extract-a-code-block)
+- [Get raw response (tokens, thinking, tool calls)](#get-raw-response)
+- [Access full response with lastRaw](#access-full-response-with-lastraw)
 - [Add images](#add-images)
-- [
+- [Templates with placeholders](#templates-with-placeholders)
 - [Round-robin load balancing](#round-robin-load-balancing)
-- [MCP integration
-- [Custom local tools
-- [Rate limiting
-- [Debug mode](#debug-mode)
-- [Use free-tier models](#use-free-tier-models)
+- [MCP integration](#mcp-integration)
+- [Custom local tools](#custom-local-tools)
+- [Rate limiting](#rate-limiting)
 - [Conversation history](#conversation-history)
+- [Debug mode](#debug-mode)
+- [Free-tier models](#free-tier-models)
+- [Multi-provider routing](#multi-provider-routing)

 ## Installation

@@ -54,49 +63,77 @@ import { ModelMix } from 'modelmix';
 ### Creating an Instance

 ```javascript
-// Static factory (preferred)
 const model = ModelMix.new();

-// With global options
 const model = ModelMix.new({
     options: { max_tokens: 4096, temperature: 0.7 },
     config: {
         system: "You are a helpful assistant.",
-        max_history: 5,
-        debug: 0,
-        roundRobin: false
+        max_history: 5, // -1 = unlimited, 0 = none (default), N = keep last N
+        debug: 0, // 0=silent, 1=minimal, 2=summary, 3=full, 4=verbose
+        roundRobin: false // false=fallback, true=rotate models
     }
 });
 ```

-### Attaching Models
+### Attaching Models

-Chain shorthand methods to attach providers. First model is primary; others are fallbacks:
|
|
81
|
+
Chain shorthand methods to attach providers. First model is primary; others are fallbacks (or rotated if `roundRobin: true`):
|
|
75
82
|
|
|
76
83
|
```javascript
|
|
77
84
|
const model = ModelMix.new()
|
|
78
85
|
.sonnet46() // primary
|
|
79
|
-
.gpt52()
|
|
86
|
+
.gpt52() // fallback 1
|
|
80
87
|
.gemini3flash() // fallback 2
|
|
81
88
|
.addText("Hello!")
|
|
82
89
|
```
|
|
83
90
|
|
|
84
|
-
If `
|
|
91
|
+
If `sonnet46` fails, it automatically tries `gpt52`, then `gemini3flash`.
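Conceptually, the fallback chain behaves like the sketch below. This is an illustration only, not ModelMix's internal code, and `withFallback` is a hypothetical helper name:

```javascript
// Illustration of fallback semantics: try each model caller in order and
// return the first successful result. Not part of the ModelMix API.
async function withFallback(callers, prompt) {
  let lastError;
  for (const call of callers) {
    try {
      return await call(prompt); // first success wins
    } catch (err) {
      lastError = err; // this provider failed; try the next one
    }
  }
  throw lastError; // every provider failed
}
```

With `roundRobin: true` the same list is rotated across requests instead of always being tried in this order.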
|
|
85
92
|
|
|
86
93
|
## Available Model Shorthands
|
|
87
94
|
|
|
88
|
-
|
|
89
|
-
|
|
90
|
-
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
95
|
-
|
|
96
|
-
|
|
97
|
-
|
|
95
|
+
### OpenAI
|
|
96
|
+
`gpt54()` `gpt53codex()` `gpt52()` `gpt52chat()` `gpt51()` `gpt5()` `gpt5mini()` `gpt5nano()` `gpt45()` `gpt41()` `gpt41mini()` `gpt41nano()` `o3()` `o4mini()`
|
|
97
|
+
|
|
98
|
+
### Anthropic
|
|
99
|
+
`opus46()` `opus45()` `opus41()` `sonnet46()` `sonnet45()` `sonnet4()` `sonnet37()` `haiku45()` `haiku35()`
|
|
100
|
+
|
|
101
|
+
Thinking variants: append `think` — e.g. `opus46think()` `opus45think()` `opus41think()` `sonnet46think()` `sonnet45think()` `sonnet4think()` `sonnet37think()` `haiku45think()`
|
|
102
|
+
|
|
103
|
+
### Google
|
|
104
|
+
`gemini3pro()` `gemini3flash()` `gemini25pro()` `gemini25flash()`
|
|
105
|
+
|
|
106
|
+
### Grok
|
|
107
|
+
`grok4()` `grok41()` `grok41think()` `grok3()` `grok3mini()`
|
|
108
|
+
|
|
109
|
+
### Perplexity
|
|
110
|
+
`sonar()` `sonarPro()`
|
|
111
|
+
|
|
112
|
+
### Groq
|
|
113
|
+
`scout()` `maverick()`
|
|
114
|
+
|
|
115
|
+
### Together
|
|
116
|
+
`qwen3()` `kimiK2()` `kimiK2think()` `kimiK25think()` `gptOss()`
|
|
117
|
+
|
|
118
|
+
### MiniMax
|
|
119
|
+
`minimaxM25()` `minimaxM21()` `minimaxM2()` `minimaxM2Stable()`
|
|
120
|
+
|
|
121
|
+
### Fireworks
|
|
122
|
+
`deepseekV32()` `GLM5()` `GLM47()`
|
|
123
|
+
|
|
124
|
+
### Cerebras
|
|
125
|
+
`GLM46()`
|
|
126
|
+
|
|
127
|
+
### OpenRouter
|
|
128
|
+
`GLM45()`
|
|
129
|
+
|
|
130
|
+
### Multi-provider (auto-fallback across free/paid tiers)
|
|
131
|
+
`deepseekR1()` `hermes3()` `scout()` `maverick()` `kimiK2()` `GLM47()`
|
|
98
132
|
|
|
99
|
-
|
|
133
|
+
### Local
|
|
134
|
+
`lmstudio()` — for LM Studio local models
|
|
135
|
+
|
|
136
|
+
Each method accepts optional `{ options, config }` to override per-model settings.
|
|
100
137
|
|
|
101
138
|
## Common Tasks
|
|
102
139
|
|
|
@@ -116,35 +153,30 @@ const result = await ModelMix.new()
|
|
|
116
153
|
.gpt5mini()
|
|
117
154
|
.addText("Name and capital of 3 South American countries.")
|
|
118
155
|
.json(
|
|
119
|
-
{ countries: [{ name: "", capital: "" }] },
|
|
120
|
-
{ countries: [{ name: "country name", capital: "in uppercase" }] },
|
|
121
|
-
{ addNote: true }
|
|
156
|
+
{ countries: [{ name: "", capital: "" }] },
|
|
157
|
+
{ countries: [{ name: "country name", capital: "in uppercase" }] },
|
|
158
|
+
{ addNote: true }
|
|
122
159
|
);
|
|
123
|
-
// result.countries → [{ name: "Brazil", capital: "BRASILIA" }, ...]
|
|
124
160
|
```
|
|
125
161
|
|
|
126
162
|
`json()` signature: `json(schemaExample, schemaDescription?, { addSchema, addExample, addNote }?)`
|
|
127
163
|
|
|
128
164
|
#### Enhanced descriptors
|
|
129
165
|
|
|
130
|
-
Descriptions can be
|
|
166
|
+
Descriptions can be strings or descriptor objects with metadata:
|
|
131
167
|
|
|
132
168
|
```javascript
|
|
133
169
|
const result = await model.json(
|
|
134
170
|
{ name: 'martin', age: 22, sex: 'Male' },
|
|
135
171
|
{
|
|
136
172
|
name: { description: 'Name of the actor', required: false },
|
|
137
|
-
age: 'Age of the actor',
|
|
173
|
+
age: 'Age of the actor',
|
|
138
174
|
sex: { description: 'Gender', enum: ['Male', 'Female', null] }
|
|
139
175
|
}
|
|
140
176
|
);
|
|
141
177
|
```
|
|
142
178
|
|
|
143
|
-
Descriptor properties:
|
|
144
|
-
- `description` (string) — field description
|
|
145
|
-
- `required` (boolean, default `true`) — if `false`: removed from required array, type becomes nullable
|
|
146
|
-
- `enum` (array) — allowed values; if includes `null`, type auto-becomes nullable
|
|
147
|
-
- `default` (any) — default value
|
|
179
|
+
Descriptor properties: `description` (string), `required` (boolean, default true — if false, field becomes nullable), `enum` (array — if includes null, type auto-becomes nullable), `default` (any).
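As an illustration of those rules, a descriptor could be mapped to a JSON Schema property like this (a sketch only; `descriptorToProperty` is a hypothetical name, not ModelMix's actual schema builder):

```javascript
// Sketch: convert one descriptor (string or object) into a JSON Schema
// property, applying the rules above. Illustration, not ModelMix internals.
function descriptorToProperty(desc, baseType = 'string') {
  if (typeof desc === 'string') {
    return { type: baseType, description: desc }; // plain string = description
  }
  const prop = { type: baseType };
  if (desc.description) prop.description = desc.description;
  if (desc.enum) prop.enum = desc.enum;
  if ('default' in desc) prop.default = desc.default;
  // required: false, or an enum containing null, makes the type nullable
  if (desc.required === false || (desc.enum && desc.enum.includes(null))) {
    prop.type = [baseType, 'null'];
  }
  return prop;
}
```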
|
|
148
180
|
|
|
149
181
|
#### Array auto-wrap
|
|
150
182
|
|
|
@@ -166,7 +198,19 @@ await ModelMix.new()
|
|
|
166
198
|
});
|
|
167
199
|
```
|
|
168
200
|
|
|
169
|
-
###
|
|
201
|
+
### Extract a code block
|
|
202
|
+
|
|
203
|
+
```javascript
|
|
204
|
+
const code = await ModelMix.new()
|
|
205
|
+
.gpt5mini()
|
|
206
|
+
.addText("Write a hello world function in JavaScript.")
|
|
207
|
+
.block();
|
|
208
|
+
// Returns only the content inside the first code block
|
|
209
|
+
```
|
|
210
|
+
|
|
211
|
+
`block()` accepts `{ addSystemExtra }` (default true) — adds system instructions that tell the model to wrap output in a code block.
|
|
212
|
+
|
|
213
|
+
### Get raw response
|
|
170
214
|
|
|
171
215
|
```javascript
|
|
172
216
|
const raw = await ModelMix.new()
|
|
@@ -176,15 +220,15 @@ const raw = await ModelMix.new()
|
|
|
176
220
|
// raw.message, raw.think, raw.tokens, raw.toolCalls, raw.response
|
|
177
221
|
```
|
|
178
222
|
|
|
179
|
-
### Access full response
|
|
223
|
+
### Access full response with lastRaw
|
|
180
224
|
|
|
181
|
-
After calling `message()`, `json()`, `block()`, or `stream()`, use `lastRaw` to access the complete response
|
|
225
|
+
After calling `message()`, `json()`, `block()`, or `stream()`, use `lastRaw` to access the complete response:
|
|
182
226
|
|
|
183
227
|
```javascript
|
|
184
228
|
const model = ModelMix.new().gpt5mini().addText("Hello!");
|
|
185
229
|
const text = await model.message();
|
|
186
230
|
console.log(model.lastRaw.tokens);
|
|
187
|
-
// { input: 122, output: 86, total: 541, cost: 0.000319 }
|
|
231
|
+
// { input: 122, output: 86, total: 541, cost: 0.000319, speed: 38 }
|
|
188
232
|
console.log(model.lastRaw.think); // reasoning content (if available)
|
|
189
233
|
console.log(model.lastRaw.response); // raw API response
|
|
190
234
|
```
|
|
@@ -193,13 +237,16 @@ console.log(model.lastRaw.response); // raw API response
|
|
|
193
237
|
|
|
194
238
|
```javascript
|
|
195
239
|
const model = ModelMix.new().sonnet45();
|
|
196
|
-
model.addImage('./photo.jpg');
|
|
197
|
-
model.addImageFromUrl('https://example.com/img.png');
|
|
240
|
+
model.addImage('./photo.jpg'); // from file
|
|
241
|
+
model.addImageFromUrl('https://example.com/img.png'); // from URL
|
|
242
|
+
model.addImageFromBuffer(imageBuffer); // from Buffer
|
|
198
243
|
model.addText('Describe this image.');
|
|
199
244
|
const description = await model.message();
|
|
200
245
|
```
|
|
201
246
|
|
|
202
|
-
|
|
247
|
+
All image methods accept an optional second argument `{ role }` (default `"user"`).
|
|
248
|
+
|
|
249
|
+
### Templates with placeholders
|
|
203
250
|
|
|
204
251
|
```javascript
|
|
205
252
|
const model = ModelMix.new().gpt5mini();
|
|
@@ -221,12 +268,11 @@ const pool = ModelMix.new({ config: { roundRobin: true } })
|
|
|
221
268
|
.sonnet45()
|
|
222
269
|
.gemini3flash();
|
|
223
270
|
|
|
224
|
-
// Each call rotates to the next model
|
|
225
271
|
const r1 = await pool.new().addText("Request 1").message();
|
|
226
272
|
const r2 = await pool.new().addText("Request 2").message();
|
|
227
273
|
```
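Conceptually, the rotation works like this (an illustration, not ModelMix internals; `makeRoundRobin` is a hypothetical helper):

```javascript
// Sketch of round-robin selection: each request takes the next model in
// the list, wrapping around to the start. Not ModelMix's internal code.
function makeRoundRobin(models) {
  let next = 0;
  return () => models[next++ % models.length];
}
```

Each `pool.new()` request gets the next attached model, spreading load across providers.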
|
|
228
274
|
|
|
229
|
-
### MCP integration
|
|
275
|
+
### MCP integration
|
|
230
276
|
|
|
231
277
|
```javascript
|
|
232
278
|
const model = ModelMix.new({ config: { max_history: 10 } }).gpt5nano();
|
|
@@ -238,7 +284,7 @@ console.log(await model.message());
|
|
|
238
284
|
|
|
239
285
|
Requires `BRAVE_API_KEY` in `.env` for Brave Search MCP.
|
|
240
286
|
|
|
241
|
-
### Custom local tools
|
|
287
|
+
### Custom local tools
|
|
242
288
|
|
|
243
289
|
```javascript
|
|
244
290
|
const model = ModelMix.new({ config: { max_history: 10 } }).gpt5mini();
|
|
@@ -259,7 +305,18 @@ model.addText("What's the weather in Tokyo?");
|
|
|
259
305
|
console.log(await model.message());
|
|
260
306
|
```
|
|
261
307
|
|
|
262
|
-
|
|
308
|
+
Register multiple tools at once:
|
|
309
|
+
|
|
310
|
+
```javascript
|
|
311
|
+
model.addTools([
|
|
312
|
+
{ tool: { name: "tool_a", description: "...", inputSchema: {...} }, callback: async (args) => {...} },
|
|
313
|
+
{ tool: { name: "tool_b", description: "...", inputSchema: {...} }, callback: async (args) => {...} }
|
|
314
|
+
]);
|
|
315
|
+
```
|
|
316
|
+
|
|
317
|
+
Manage tools: `model.removeTool("tool_a")` and `model.listTools()` → `{ local, mcp }`.
|
|
318
|
+
|
|
319
|
+
### Rate limiting
|
|
263
320
|
|
|
264
321
|
```javascript
|
|
265
322
|
const model = ModelMix.new({
|
|
@@ -272,20 +329,31 @@ const model = ModelMix.new({
|
|
|
272
329
|
}).gpt5mini();
|
|
273
330
|
```
|
|
274
331
|
|
|
332
|
+
### Conversation history
|
|
333
|
+
|
|
334
|
+
```javascript
|
|
335
|
+
const chat = ModelMix.new({ config: { max_history: 10 } }).gpt5mini();
|
|
336
|
+
chat.addText("My name is Martin.");
|
|
337
|
+
await chat.message();
|
|
338
|
+
chat.addText("What's my name?");
|
|
339
|
+
const reply = await chat.message(); // "Martin"
|
|
340
|
+
```
|
|
341
|
+
|
|
342
|
+
`max_history`: 0 = no history (default), N = keep last N exchanges, -1 = unlimited.
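A sketch of that trimming policy (hypothetical helper; it assumes one exchange is a user/assistant message pair, which is an assumption for illustration, not documented ModelMix internals):

```javascript
// Sketch: apply the max_history policy to a chat message array.
// Assumes one "exchange" = one user/assistant pair. Illustration only.
function trimHistory(messages, maxHistory) {
  if (maxHistory === -1) return messages;   // unlimited
  if (maxHistory === 0) return [];          // keep nothing
  return messages.slice(-maxHistory * 2);   // keep the last N pairs
}
```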
|
|
343
|
+
|
|
275
344
|
### Debug mode
|
|
276
345
|
|
|
277
346
|
```javascript
|
|
278
347
|
const model = ModelMix.new({
|
|
279
|
-
config: { debug: 2 } // 0=silent, 1=minimal, 2=summary, 3=full
|
|
348
|
+
config: { debug: 2 } // 0=silent, 1=minimal, 2=summary, 3=full, 4=verbose
|
|
280
349
|
}).gpt5mini();
|
|
281
350
|
```
|
|
282
351
|
|
|
283
|
-
For full debug output, also set
|
|
352
|
+
For full debug output, also set: `DEBUG=ModelMix* node script.js`
|
|
284
353
|
|
|
285
|
-
###
|
|
354
|
+
### Free-tier models
|
|
286
355
|
|
|
287
356
|
```javascript
|
|
288
|
-
// These use providers with free quotas (OpenRouter, Groq, Cerebras)
|
|
289
357
|
const model = ModelMix.new()
|
|
290
358
|
.gptOss()
|
|
291
359
|
.kimiK2()
|
|
@@ -295,48 +363,61 @@ const model = ModelMix.new()
|
|
|
295
363
|
console.log(await model.message());
|
|
296
364
|
```
|
|
297
365
|
|
|
298
|
-
|
|
366
|
+
These use providers with free quotas (OpenRouter, Groq, Cerebras). If one runs out of quota, ModelMix falls back to the next.
|
|
367
|
+
|
|
368
|
+
### Multi-provider routing
|
|
369
|
+
|
|
370
|
+
Some model shorthands register the same model across multiple providers for maximum resilience. Control which providers are enabled via the `mix` parameter:
|
|
299
371
|
|
|
300
372
|
```javascript
|
|
301
|
-
const
|
|
302
|
-
|
|
303
|
-
|
|
304
|
-
|
|
305
|
-
|
|
373
|
+
const model = ModelMix.new({
|
|
374
|
+
mix: {
|
|
375
|
+
openrouter: true, // default: true
|
|
376
|
+
cerebras: true, // default: true
|
|
377
|
+
groq: true, // default: true
|
|
378
|
+
together: false, // default: false
|
|
379
|
+
lambda: false, // default: false
|
|
380
|
+
minimax: false, // default: false
|
|
381
|
+
fireworks: false // default: false
|
|
382
|
+
}
|
|
383
|
+
}).deepseekR1();
|
|
306
384
|
```
|
|
307
385
|
|
|
308
386
|
## Agent Usage Rules
|
|
309
387
|
|
|
310
|
-
-
|
|
311
|
-
- Use `ModelMix.new()` static factory
|
|
388
|
+
- Check `package.json` for `modelmix` before running `npm install`.
|
|
389
|
+
- Use `ModelMix.new()` static factory (not `new ModelMix()`).
|
|
312
390
|
- Store API keys in `.env` and load with `dotenv/config` or `process.loadEnvFile()`. Never hardcode keys.
|
|
313
391
|
- Chain models for resilience: primary model first, fallbacks after.
|
|
314
|
-
- When using MCP tools or `addTool()`, set `max_history` to at least 3.
|
|
315
|
-
- Use `.json()` for structured output instead of parsing text manually. Use descriptor objects `{ description, required, enum, default }`
|
|
392
|
+
- When using MCP tools or `addTool()`, set `max_history` to at least 3 — tool call/response pairs consume history slots.
|
|
393
|
+
- Use `.json()` for structured output instead of parsing text manually. Use descriptor objects `{ description, required, enum, default }` for richer schema control.
|
|
316
394
|
- Use `.message()` for simple text, `.raw()` when you need tokens/thinking/toolCalls.
|
|
317
395
|
- For thinking models, append `think` to the method name (e.g. `sonnet45think()`).
|
|
318
396
|
- Template placeholders use `{key}` syntax in both system prompts and user messages.
|
|
319
|
-
- The library uses CommonJS internally
|
|
320
|
-
-
|
|
397
|
+
- The library uses CommonJS internally but supports ESM import via `{ ModelMix }`.
|
|
398
|
+
- GPT-5+ models automatically use `max_completion_tokens` instead of `max_tokens`.
|
|
399
|
+
- o-series models (o3, o4mini) automatically strip `max_tokens` and `temperature` since those APIs don't support them.
|
|
400
|
+
- `addText()`, `addImage()`, `addImageFromUrl()`, and `addImageFromBuffer()` all accept `{ role }` as second argument (default `"user"`).
|
|
321
401
|
|
|
322
402
|
## API Quick Reference
|
|
323
403
|
|
|
324
404
|
| Method | Returns | Description |
|
|
325
405
|
| --- | --- | --- |
|
|
326
|
-
| `.addText(text)` | `this` | Add user message |
|
|
327
|
-
| `.addTextFromFile(path)` | `this` | Add user message from file |
|
|
406
|
+
| `.addText(text, {role?})` | `this` | Add user message |
|
|
407
|
+
| `.addTextFromFile(path, {role?})` | `this` | Add user message from file |
|
|
328
408
|
| `.setSystem(text)` | `this` | Set system prompt |
|
|
329
409
|
| `.setSystemFromFile(path)` | `this` | Set system prompt from file |
|
|
330
|
-
| `.addImage(path)` | `this` | Add image from file |
|
|
331
|
-
| `.addImageFromUrl(url)` | `this` | Add image from URL or data URI |
|
|
410
|
+
| `.addImage(path, {role?})` | `this` | Add image from file |
|
|
411
|
+
| `.addImageFromUrl(url, {role?})` | `this` | Add image from URL or data URI |
|
|
412
|
+
| `.addImageFromBuffer(buffer, {role?})` | `this` | Add image from Buffer |
|
|
332
413
|
| `.replace({})` | `this` | Set placeholder replacements |
|
|
333
414
|
| `.replaceKeyFromFile(key, path)` | `this` | Replace placeholder with file content |
|
|
334
415
|
| `.message()` | `Promise<string>` | Get text response |
|
|
335
|
-
| `.json(example, desc?, opts?)` | `Promise<object\|array>` | Get structured JSON
|
|
416
|
+
| `.json(example, desc?, opts?)` | `Promise<object\|array>` | Get structured JSON |
|
|
336
417
|
| `.raw()` | `Promise<{message, think, toolCalls, tokens, response}>` | Full response |
|
|
337
|
-
| `.lastRaw` | `object \| null` | Full response from last
|
|
418
|
+
| `.lastRaw` | `object \| null` | Full response from last call |
|
|
338
419
|
| `.stream(callback)` | `Promise` | Stream response |
|
|
339
|
-
| `.block()` | `Promise<string>` | Extract code block from response |
|
|
420
|
+
| `.block({addSystemExtra?})` | `Promise<string>` | Extract code block from response |
|
|
340
421
|
| `.addMCP(package)` | `Promise` | Add MCP server tools |
|
|
341
422
|
| `.addTool(def, callback)` | `this` | Register custom local tool |
|
|
342
423
|
| `.addTools([{tool, callback}])` | `this` | Register multiple tools |
|
|
@@ -345,6 +426,30 @@ const reply = await chat.message(); // "Martin"
|
|
|
345
426
|
| `.new()` | `ModelMix` | Clone instance sharing models |
|
|
346
427
|
| `.attach(key, provider)` | `this` | Attach custom provider |
|
|
347
428
|
|
|
429
|
+
## Available Provider Classes
|
|
430
|
+
|
|
431
|
+
`MixOpenAI` `MixAnthropic` `MixGoogle` `MixPerplexity` `MixGroq` `MixTogether` `MixGrok` `MixOpenRouter` `MixOllama` `MixLMStudio` `MixCustom` `MixCerebras` `MixFireworks` `MixMiniMax` `MixLambda`
|
|
432
|
+
|
|
433
|
+
## Troubleshooting
|
|
434
|
+
|
|
435
|
+
**Model fails with "API key not found"**
|
|
436
|
+
The provider's API key env var is not set. Add it to `.env` and ensure it loads before ModelMix runs. Each provider looks for its standard env var (e.g. `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`).
|
|
437
|
+
|
|
438
|
+
**Tool calls not working**
|
|
439
|
+
Set `max_history` to at least 3. Tool call/response pairs are stored in history and the model needs to see them to complete the conversation loop.
|
|
440
|
+
|
|
441
|
+
**JSON response parsing fails**
|
|
442
|
+
Add `{ addNote: true }` to the `json()` options — this injects instructions about JSON escaping that prevent common parsing errors. For complex schemas, also try `{ addExample: true }`.
|
|
443
|
+
|
|
444
|
+
**Model returns empty or truncated response**
|
|
445
|
+
Increase `max_tokens` in options. Default is 8192 but some tasks need more. For GPT-5+ models, `max_completion_tokens` is used automatically.
|
|
446
|
+
|
|
447
|
+
**Rate limit errors**
|
|
448
|
+
Configure Bottleneck: `config: { bottleneck: { maxConcurrent: 2, minTime: 2000 } }`. This throttles requests to stay within provider limits.
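Conceptually, `minTime` spaces out request start times like this simplified sketch (not Bottleneck or ModelMix code; `maxConcurrent`, which also caps jobs in flight, is omitted for brevity):

```javascript
// Sketch: schedule jobs so that at least `minTime` ms separate the start
// of consecutive jobs, similar to Bottleneck's minTime option.
function makeMinTimeLimiter(minTime) {
  let nextSlot = 0; // earliest timestamp the next job may start
  return (job) => {
    const now = Date.now();
    const startAt = Math.max(now, nextSlot);
    nextSlot = startAt + minTime;
    return new Promise((resolve) => {
      setTimeout(() => resolve(job()), startAt - now);
    });
  };
}
```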
|
|
449
|
+
|
|
450
|
+
**MCP server fails to connect**
|
|
451
|
+
Ensure the MCP package is installed (`npm install @modelcontextprotocol/server-brave-search`) and required env vars are set. Call `addMCP()` with `await` — it's async.
|
|
452
|
+
|
|
348
453
|
## References
|
|
349
454
|
|
|
350
455
|
- [GitHub Repository](https://github.com/clasen/ModelMix)
|