smoltalk 0.0.53 → 0.0.55
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +75 -23
- package/dist/client.js +19 -10
- package/dist/model.js +4 -2
- package/dist/models.d.ts +50 -19
- package/dist/models.js +33 -9
- package/dist/strategies/index.d.ts +1 -1
- package/dist/strategies/index.js +11 -1
- package/dist/types.d.ts +89 -2
- package/package.json +1 -2
package/README.md
CHANGED

@@ -1,6 +1,6 @@
 # Smoltalk
 
-Smoltalk
+Smoltalk exposes a common API to different LLM providers. There are other packages that do this, but Smoltalk allows you to build strategies on top of it. Here is a simple example.
 
 ## Install
 
@@ -8,26 +8,83 @@ Smoltalk is a package that exposes a common interface across different LLM provi
 pnpm install smoltalk
 ```
 
-##
+## Hello world example
 
 ```typescript
-import {
-
-const client = getClient({
-  openAiApiKey: process.env.OPENAI_API_KEY || "",
-  googleApiKey: process.env.GEMINI_API_KEY || "",
-  logLevel: "debug",
-  model: "gemini-2.0-flash-lite",
-});
+import { text, userMessage } from "smoltalk";
 
 async function main() {
-  const
-
+  const messages = [userMessage("Write me a 10 word story.")];
+  const response = await text({
+    messages,
+    model: "gpt-5.4",
+  });
+  console.log(response);
 }
 
 main();
 ```
 
+This is functionality that other packages also offer.
+<details>
+<summary>Response</summary>
+
+```
+{
+  success: true,
+  value: {
+    output: 'Clock stopped; everyone smiled as tomorrow finally arrived before yesterday.',
+    toolCalls: [],
+    usage: {
+      inputTokens: 14,
+      outputTokens: 15,
+      cachedInputTokens: 0,
+      totalTokens: 29
+    },
+    cost: {
+      inputCost: 0.000035,
+      outputCost: 0.000225,
+      cachedInputCost: undefined,
+      totalCost: 0.00026,
+      currency: 'USD'
+    },
+    model: 'gpt-5.4'
+  }
+}
+```
+</details>
+
+What if you wanted to have fallbacks in case the OpenAI API was down? Just change the `model` field:
+
+```ts
+const response = await text({
+  messages,
+  model: fallback("gpt-5.4", "gemini-2.5-flash-lite"),
+  // or multiple fallbacks:
+  // model: fallback("gpt-5.4", ["gemini-2.5-flash-lite", "gemini-3-flash-preview"]),
+});
+```
+
+Or what if you wanted to try a couple of models and take the first response?
+
+```ts
+const response = await text({
+  messages,
+  model: race("gpt-5.4", "gemini-2.5-flash-lite", "o4-mini"),
+});
+```
+
+Or combine them:
+
+```ts
+const response = await text({
+  messages,
+  model: race(fallback("gpt-5.4", "gemini-2.5-flash-lite"), "o4-mini"),
+});
+```
+
+You get the idea.
+
 ## Longer tutorial
 To use Smoltalk, you first create a client:
 
@@ -157,20 +214,15 @@ Detects when the model is stuck in a repetitive tool-call loop.
 | `intervention` | `string` | Action to take: `"remove-tool"`, `"remove-all-tools"`, `"throw-error"`, or `"halt-execution"`. |
 | `excludeTools` | `string[]` | Tool names to ignore when counting consecutive calls. |
 
-##
+## Limitations
+Smoltalk has support for a limited number of providers right now, and is mostly focused on the stateless APIs for text completion, though I plan to add support for more providers as well as image and speech models later. Smoltalk is also a personal project, and there are alternatives backed by companies:
+
 - Langchain
-OpenRouter
+- OpenRouter
 - Vercel AI
 
-These are all good options, but they are quite heavy, and I wanted a lighter option. That said, you may be better off with one of the above alternatives:
-- They are backed by a business and are more likely to be responsive.
-- They support way more functionality and providers. Smoltalk currently supports just a subset of functionality for OpenAI and Google.
-
-## Functionality
-Smoltalk pretty much lets you generate text using an OpenAI or Google model, with support for function calling and structured output, and that's it. I will add functionality and providers sporadically when I have time and need.
-
 ## Contributing
-
+Contributions are welcome. Any of the following contributions would be helpful:
 - Adding support for API parameters or endpoints
 - Adding support for different providers
-- Updating the list of models
+- Updating the list of models
package/dist/client.js
CHANGED

@@ -26,29 +26,38 @@ export function getClient(config) {
         }
         provider = model.provider;
     }
-    const
+    const resolvedKeys = {
+        openAiApiKey: config.openAiApiKey || process.env.OPENAI_API_KEY,
+        googleApiKey: config.googleApiKey || process.env.GEMINI_API_KEY,
+        anthropicApiKey: config.anthropicApiKey || process.env.ANTHROPIC_API_KEY,
+    };
+    const clientConfig = {
+        ...config,
+        ...resolvedKeys,
+        model: modelName,
+    };
     switch (provider) {
         case "anthropic":
-            if (!
-                throw new SmolError("No Anthropic API key provided. Please provide an Anthropic API key in the config using anthropicApiKey.");
+            if (!resolvedKeys.anthropicApiKey) {
+                throw new SmolError("No Anthropic API key provided. Please provide an Anthropic API key in the config using anthropicApiKey, or set the ANTHROPIC_API_KEY environment variable.");
             }
             return new SmolAnthropic({
                 ...clientConfig,
-                anthropicApiKey:
+                anthropicApiKey: resolvedKeys.anthropicApiKey,
             });
         case "openai":
-            if (!
-                throw new SmolError("No OpenAI API key provided. Please provide an OpenAI API key in the config using openAiApiKey.");
+            if (!resolvedKeys.openAiApiKey) {
+                throw new SmolError("No OpenAI API key provided. Please provide an OpenAI API key in the config using openAiApiKey, or set the OPENAI_API_KEY environment variable.");
             }
            return new SmolOpenAi(clientConfig);
         case "openai-responses":
-            if (!
-                throw new SmolError("No OpenAI API key provided. Please provide an OpenAI API key in the config using openAiApiKey.");
+            if (!resolvedKeys.openAiApiKey) {
+                throw new SmolError("No OpenAI API key provided. Please provide an OpenAI API key in the config using openAiApiKey, or set the OPENAI_API_KEY environment variable.");
             }
             return new SmolOpenAiResponses(clientConfig);
         case "google":
-            if (!
-                throw new SmolError("No Google API key provided. Please provide a Google API key in the config using googleApiKey.");
+            if (!resolvedKeys.googleApiKey) {
+                throw new SmolError("No Google API key provided. Please provide a Google API key in the config using googleApiKey, or set the GEMINI_API_KEY environment variable.");
             }
             return new SmolGoogle(clientConfig);
         case "ollama":
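The diff above changes how `getClient` resolves API keys: an explicitly configured key wins, otherwise the provider's environment variable is used. A minimal standalone sketch of that precedence (the `resolveKeys` helper name and sample values here are hypothetical; only the `config || env` ordering mirrors the diff):

```typescript
type KeyConfig = {
  openAiApiKey?: string;
  googleApiKey?: string;
  anthropicApiKey?: string;
};

// An explicitly configured key takes precedence; otherwise fall back to
// the provider's environment variable (passed in here for testability).
function resolveKeys(
  config: KeyConfig,
  env: Record<string, string | undefined>
): KeyConfig {
  return {
    openAiApiKey: config.openAiApiKey || env.OPENAI_API_KEY,
    googleApiKey: config.googleApiKey || env.GEMINI_API_KEY,
    anthropicApiKey: config.anthropicApiKey || env.ANTHROPIC_API_KEY,
  };
}
```

Note that in the real code the lookup reads `process.env` directly, so setting the environment variable is now enough; the key fields in the config become optional overrides.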
package/dist/model.js
CHANGED

@@ -1,4 +1,4 @@
-import { getModel, isTextModel, textModels, } from "./models.js";
+import { getModel, isTextModel, textModels, registeredTextModels, } from "./models.js";
 import { SmolError } from "./smolError.js";
 import { ModelConfigSchema, ModelNameAndProviderSchema, ModelNameSchema, } from "./strategies/types.js";
 import { round } from "./util.js";
@@ -41,7 +41,7 @@ export class Model {
        }
        return undefined;
    }
-    resolveModel(models = textModels) {
+    resolveModel(models = [...registeredTextModels, ...textModels]) {
        if (ModelNameSchema.safeParse(this.model).success) {
            return this.model;
        }
@@ -120,6 +120,8 @@ export class Model {
                return (m.inputTokenCost ?? 0) + (m.outputTokenCost ?? 0);
            case "large-context":
                return m.maxInputTokens;
+            default:
+                throw new SmolError(`Unknown optimization: ${optimization}`);
        }
    }
    isLowerBetter(optimization) {
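The new `default` branch makes the optimization metric fail loudly instead of falling through and returning `undefined`. A standalone sketch of that switch (the `"cheap"` case label is my stand-in for the off-screen case name; `"large-context"`, the cost formula, and the throw-on-unknown behavior come from the diff):

```typescript
type ModelInfo = {
  modelName: string;
  inputTokenCost?: number;
  outputTokenCost?: number;
  maxInputTokens: number;
};

type Optimization = "cheap" | "large-context";

// Score a model for a given optimization target. Lower is better for
// cost; higher is better for context size (isLowerBetter decides which).
function metric(m: ModelInfo, optimization: Optimization): number {
  switch (optimization) {
    case "cheap":
      // combined per-million-token price, input + output
      return (m.inputTokenCost ?? 0) + (m.outputTokenCost ?? 0);
    case "large-context":
      return m.maxInputTokens;
    default:
      // mirrors the diff: an unrecognized target now throws instead of
      // silently returning undefined
      throw new Error(`Unknown optimization: ${optimization}`);
  }
}
```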
package/dist/models.d.ts
CHANGED

@@ -13,7 +13,7 @@ export declare const ProviderSchema: z.ZodEnum<{
 export type Provider = z.infer<typeof ProviderSchema>;
 export type BaseModel = {
     modelName: string;
-    provider:
+    provider: string;
     description?: string;
     inputTokenCost?: number;
     cachedInputTokenCost?: number;
@@ -32,7 +32,6 @@ export type ImageModel = BaseModel & {
 };
 export type TextModel = BaseModel & {
     type: "text";
-    modelName: string;
     maxInputTokens: number;
     maxOutputTokens: number;
     outputTokensPerSecond?: number;
@@ -68,7 +67,7 @@ export declare const speechToTextModels: readonly [{
 export declare const textModels: readonly [{
     readonly type: "text";
     readonly modelName: "gpt-4o-mini";
-    readonly description: "GPT-4o mini (
+    readonly description: "GPT-4o mini ('o' for 'omni') is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and model outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency. Knowledge cutoff: July 2025.";
     readonly maxInputTokens: 128000;
     readonly maxOutputTokens: 16384;
     readonly inputTokenCost: 0.15;
@@ -79,7 +78,7 @@ export declare const textModels: readonly [{
 }, {
     readonly type: "text";
     readonly modelName: "gpt-4o";
-    readonly description: "GPT-4o (
+    readonly description: "GPT-4o ('o' for 'omni') is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). Knowledge cutoff: April 2024.";
     readonly maxInputTokens: 128000;
     readonly maxOutputTokens: 16384;
     readonly inputTokenCost: 2.5;
@@ -90,7 +89,7 @@ export declare const textModels: readonly [{
 }, {
     readonly type: "text";
     readonly modelName: "o3";
-    readonly description: "o3 is a reasoning model that sets a new standard for math, science, coding, visual reasoning tasks, and technical writing. Part of the o-series of reasoning models.
+    readonly description: "o3 is a reasoning model that sets a new standard for math, science, coding, visual reasoning tasks, and technical writing. Part of the o-series of reasoning models. Knowledge cutoff: June 2024.";
     readonly maxInputTokens: 200000;
     readonly maxOutputTokens: 100000;
     readonly inputTokenCost: 2;
@@ -108,8 +107,8 @@ export declare const textModels: readonly [{
 }, {
     readonly type: "text";
     readonly modelName: "o3-mini";
-    readonly description: "o3-mini is our most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.
-    readonly maxInputTokens:
+    readonly description: "o3-mini is our most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks. Knowledge cutoff: June 2024.";
+    readonly maxInputTokens: 500000;
     readonly maxOutputTokens: 100000;
     readonly inputTokenCost: 1.1;
     readonly cachedInputTokenCost: 0.55;
@@ -129,9 +128,9 @@ export declare const textModels: readonly [{
     readonly description: "Latest small o-series model optimized for fast, effective reasoning with exceptional performance in coding and visual tasks. Knowledge cutoff: June 2024.";
     readonly maxInputTokens: 200000;
     readonly maxOutputTokens: 100000;
-    readonly inputTokenCost:
-    readonly cachedInputTokenCost: 0.
-    readonly outputTokenCost:
+    readonly inputTokenCost: 0.6;
+    readonly cachedInputTokenCost: 0.3;
+    readonly outputTokenCost: 2.4;
     readonly outputTokensPerSecond: 135;
     readonly reasoning: {
         readonly levels: readonly ["low", "medium", "high"];
@@ -325,6 +324,20 @@ export declare const textModels: readonly [{
         readonly outputsSignatures: false;
     };
     readonly provider: "openai";
+}, {
+    readonly type: "text";
+    readonly modelName: "gpt-5.2-pro";
+    readonly description: "GPT-5.2 Pro uses more compute for complex reasoning tasks. 400K context window. Knowledge cutoff: August 2025.";
+    readonly maxInputTokens: 400000;
+    readonly maxOutputTokens: 128000;
+    readonly inputTokenCost: 21;
+    readonly outputTokenCost: 168;
+    readonly reasoning: {
+        readonly canDisable: false;
+        readonly outputsThinking: false;
+        readonly outputsSignatures: false;
+    };
+    readonly provider: "openai";
 }, {
     readonly type: "text";
     readonly modelName: "gpt-5.4";
@@ -676,7 +689,11 @@ export type ImageModelName = (typeof imageModels)[number]["modelName"];
 export type SpeechToTextModelName = (typeof speechToTextModels)[number]["modelName"];
 export type EmbeddingsModelName = (typeof embeddingsModels)[number]["modelName"];
 export type ModelName = TextModelName | ImageModelName | SpeechToTextModelName;
-export declare
+export declare const registeredTextModels: TextModel[];
+export declare function registerTextModel(model: Omit<TextModel, "type"> & {
+    type?: "text";
+}): void;
+export declare function getModel(modelName: ModelName): TextModel | {
     readonly type: "speech-to-text";
     readonly modelName: "whisper-local";
     readonly provider: "local";
@@ -688,7 +705,7 @@ export declare function getModel(modelName: ModelName): {
 } | {
     readonly type: "text";
     readonly modelName: "gpt-4o-mini";
-    readonly description: "GPT-4o mini (
+    readonly description: "GPT-4o mini ('o' for 'omni') is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and model outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency. Knowledge cutoff: July 2025.";
     readonly maxInputTokens: 128000;
     readonly maxOutputTokens: 16384;
     readonly inputTokenCost: 0.15;
@@ -699,7 +716,7 @@ export declare function getModel(modelName: ModelName): {
 } | {
     readonly type: "text";
     readonly modelName: "gpt-4o";
-    readonly description: "GPT-4o (
+    readonly description: "GPT-4o ('o' for 'omni') is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). Knowledge cutoff: April 2024.";
     readonly maxInputTokens: 128000;
     readonly maxOutputTokens: 16384;
     readonly inputTokenCost: 2.5;
@@ -710,7 +727,7 @@ export declare function getModel(modelName: ModelName): {
 } | {
     readonly type: "text";
     readonly modelName: "o3";
-    readonly description: "o3 is a reasoning model that sets a new standard for math, science, coding, visual reasoning tasks, and technical writing. Part of the o-series of reasoning models.
+    readonly description: "o3 is a reasoning model that sets a new standard for math, science, coding, visual reasoning tasks, and technical writing. Part of the o-series of reasoning models. Knowledge cutoff: June 2024.";
     readonly maxInputTokens: 200000;
     readonly maxOutputTokens: 100000;
     readonly inputTokenCost: 2;
@@ -728,8 +745,8 @@ export declare function getModel(modelName: ModelName): {
 } | {
     readonly type: "text";
     readonly modelName: "o3-mini";
-    readonly description: "o3-mini is our most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.
-    readonly maxInputTokens:
+    readonly description: "o3-mini is our most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks. Knowledge cutoff: June 2024.";
+    readonly maxInputTokens: 500000;
     readonly maxOutputTokens: 100000;
     readonly inputTokenCost: 1.1;
     readonly cachedInputTokenCost: 0.55;
@@ -749,9 +766,9 @@ export declare function getModel(modelName: ModelName): {
     readonly description: "Latest small o-series model optimized for fast, effective reasoning with exceptional performance in coding and visual tasks. Knowledge cutoff: June 2024.";
     readonly maxInputTokens: 200000;
     readonly maxOutputTokens: 100000;
-    readonly inputTokenCost:
-    readonly cachedInputTokenCost: 0.
-    readonly outputTokenCost:
+    readonly inputTokenCost: 0.6;
+    readonly cachedInputTokenCost: 0.3;
+    readonly outputTokenCost: 2.4;
     readonly outputTokensPerSecond: 135;
     readonly reasoning: {
         readonly levels: readonly ["low", "medium", "high"];
@@ -945,6 +962,20 @@ export declare function getModel(modelName: ModelName): {
         readonly outputsSignatures: false;
     };
     readonly provider: "openai";
+} | {
+    readonly type: "text";
+    readonly modelName: "gpt-5.2-pro";
+    readonly description: "GPT-5.2 Pro uses more compute for complex reasoning tasks. 400K context window. Knowledge cutoff: August 2025.";
+    readonly maxInputTokens: 400000;
+    readonly maxOutputTokens: 128000;
+    readonly inputTokenCost: 21;
+    readonly outputTokenCost: 168;
+    readonly reasoning: {
+        readonly canDisable: false;
+        readonly outputsThinking: false;
+        readonly outputsSignatures: false;
+    };
+    readonly provider: "openai";
 } | {
     readonly type: "text";
     readonly modelName: "gpt-5.4";
package/dist/models.js
CHANGED

@@ -33,7 +33,7 @@ export const textModels = [
     {
         type: "text",
         modelName: "gpt-4o-mini",
-        description: "GPT-4o mini (
+        description: "GPT-4o mini ('o' for 'omni') is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and model outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency. Knowledge cutoff: July 2025.",
         maxInputTokens: 128000,
         maxOutputTokens: 16384,
         inputTokenCost: 0.15,
@@ -45,7 +45,7 @@ export const textModels = [
     {
         type: "text",
         modelName: "gpt-4o",
-        description: "GPT-4o (
+        description: "GPT-4o ('o' for 'omni') is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). Knowledge cutoff: April 2024.",
         maxInputTokens: 128000,
         maxOutputTokens: 16384,
         inputTokenCost: 2.5,
@@ -57,7 +57,7 @@ export const textModels = [
     {
         type: "text",
         modelName: "o3",
-        description: "o3 is a reasoning model that sets a new standard for math, science, coding, visual reasoning tasks, and technical writing. Part of the o-series of reasoning models.
+        description: "o3 is a reasoning model that sets a new standard for math, science, coding, visual reasoning tasks, and technical writing. Part of the o-series of reasoning models. Knowledge cutoff: June 2024.",
         maxInputTokens: 200000,
         maxOutputTokens: 100000,
         inputTokenCost: 2,
@@ -76,8 +76,8 @@ export const textModels = [
     {
         type: "text",
         modelName: "o3-mini",
-        description: "o3-mini is our most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.
-        maxInputTokens:
+        description: "o3-mini is our most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks. Knowledge cutoff: June 2024.",
+        maxInputTokens: 500000,
         maxOutputTokens: 100000,
         inputTokenCost: 1.1,
         cachedInputTokenCost: 0.55,
@@ -98,9 +98,9 @@ export const textModels = [
         description: "Latest small o-series model optimized for fast, effective reasoning with exceptional performance in coding and visual tasks. Knowledge cutoff: June 2024.",
         maxInputTokens: 200000,
         maxOutputTokens: 100000,
-        inputTokenCost:
-        cachedInputTokenCost: 0.
-        outputTokenCost:
+        inputTokenCost: 0.6,
+        cachedInputTokenCost: 0.3,
+        outputTokenCost: 2.4,
         outputTokensPerSecond: 135,
         reasoning: {
             levels: ["low", "medium", "high"],
@@ -308,6 +308,21 @@ export const textModels = [
         },
         provider: "openai",
     },
+    {
+        type: "text",
+        modelName: "gpt-5.2-pro",
+        description: "GPT-5.2 Pro uses more compute for complex reasoning tasks. 400K context window. Knowledge cutoff: August 2025.",
+        maxInputTokens: 400000,
+        maxOutputTokens: 128000,
+        inputTokenCost: 21,
+        outputTokenCost: 168,
+        reasoning: {
+            canDisable: false,
+            outputsThinking: false,
+            outputsSignatures: false,
+        },
+        provider: "openai",
+    },
     {
         type: "text",
         modelName: "gpt-5.4",
@@ -715,8 +730,17 @@ export const imageModels = [
 export const embeddingsModels = [
     { type: "embeddings", modelName: "text-embedding-3-small", tokenCost: 0.02 },
 ];
+export const registeredTextModels = [];
+export function registerTextModel(model) {
+    registeredTextModels.push({ ...model, type: "text" });
+}
 export function getModel(modelName) {
-    const allModels = [
+    const allModels = [
+        ...textModels,
+        ...imageModels,
+        ...speechToTextModels,
+        ...registeredTextModels,
+    ];
     return allModels.find((model) => model.modelName === modelName);
 }
 export function isImageModel(model) {
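The new `registerTextModel` export lets callers add their own models at runtime; both `getModel` and `Model.resolveModel` then consider them alongside the built-in catalog. A standalone sketch of the registry mechanics (the `builtInModels` sample entry and `findModel` name are hypothetical; the push-then-spread lookup mirrors the diff):

```typescript
type TextModel = {
  type: "text";
  modelName: string;
  provider: string;
  maxInputTokens: number;
  maxOutputTokens: number;
  inputTokenCost?: number;
  outputTokenCost?: number;
};

// Built-in models ship with the package; registered ones are added at runtime.
const builtInModels: TextModel[] = [
  {
    type: "text",
    modelName: "gpt-4o-mini",
    provider: "openai",
    maxInputTokens: 128000,
    maxOutputTokens: 16384,
  },
];
const registeredTextModels: TextModel[] = [];

// The "type" field is optional on input and normalized to "text".
function registerTextModel(model: Omit<TextModel, "type"> & { type?: "text" }): void {
  registeredTextModels.push({ ...model, type: "text" });
}

// Lookup considers both the built-in catalog and runtime registrations.
function findModel(modelName: string): TextModel | undefined {
  return [...builtInModels, ...registeredTextModels].find(
    (m) => m.modelName === modelName
  );
}
```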
package/dist/strategies/index.d.ts
CHANGED

@@ -7,5 +7,5 @@ export * from "./raceStrategy.js";
 export * from "./types.js";
 export declare function race(...strategies: ModelParam[]): Strategy;
 export declare function id(model: ModelLike): Strategy;
-export declare function fallback(primaryStrategy: ModelParam, config: FallbackStrategyConfig): Strategy;
+export declare function fallback(primaryStrategy: ModelParam, config: FallbackStrategyConfig | string | string[]): Strategy;
 export declare function fromJSON(json: StrategyJSON): Strategy;
package/dist/strategies/index.js
CHANGED

@@ -14,7 +14,17 @@ export function id(model) {
     return new IDStrategy(model);
 }
 export function fallback(primaryStrategy, config) {
-
+    let resolvedConfig;
+    if (typeof config === "string") {
+        resolvedConfig = { error: [config] };
+    }
+    else if (Array.isArray(config)) {
+        resolvedConfig = { error: config };
+    }
+    else {
+        resolvedConfig = config;
+    }
+    return new FallbackStrategy(primaryStrategy, resolvedConfig);
 }
 export function fromJSON(json) {
     if (IDStrategyJSONSchema.safeParse(json).success) {
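The widened `fallback` signature pairs with the normalization above: a bare model name or a list of names is shorthand for a config whose `error` field lists the fallback models. Sketched as a standalone function (the `normalizeFallbackConfig` name is mine; the branch logic is lifted from the diff):

```typescript
type FallbackStrategyConfig = { error: string[] };

// A string becomes a single-element error-fallback list, an array becomes
// the list directly, and a full config object passes through unchanged.
function normalizeFallbackConfig(
  config: FallbackStrategyConfig | string | string[]
): FallbackStrategyConfig {
  if (typeof config === "string") {
    return { error: [config] };
  }
  if (Array.isArray(config)) {
    return { error: config };
  }
  return config;
}
```

This is what makes the README's `fallback("gpt-5.4", "gemini-2.5-flash-lite")` shorthand work.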
package/dist/types.d.ts
CHANGED

@@ -4,7 +4,7 @@ import z, { ZodType } from "zod";
 import { Message } from "./classes/message/index.js";
 import { ToolCall } from "./classes/ToolCall.js";
 import { Model } from "./model.js";
-import { ModelName
+import { ModelName } from "./models.js";
 import { ModelConfig, ModelNameAndProvider, Strategy, StrategyJSON } from "./strategies/types.js";
 import { Result } from "./types/result.js";
 export type ThinkingBlock = {
@@ -72,8 +72,95 @@ export type SmolConfig = {
     anthropicApiKey?: string;
     ollamaApiKey?: string;
     ollamaHost?: string;
+    /**
+    The given model determines both
+    - what client is used
+    - what strategy is executed.
+
+    ## 1. Specifying a model directly
+    The simplest case is to specify the name of a model from lib/models.ts.
+    Example:
+
+    ```
+    model: "claude-sonnet-4-6"
+    ```
+
+    ## 2. Specifying a model config (letting Smoltalk pick the model)
+    You can instead also choose to let Smoltalk pick the model that it thinks
+    will be best for certain parameters. For example:
+    ```
+    model: {
+      // find the fastest model
+      optimizeFor: ["speed"],
+
+      // from either Anthropic or Google, whichever is faster
+      providers: ["anthropic", "google"],
+      limit: {
+        // 1 mil input tokens + 1 mil output tokens together
+        // should cost less than $10 for the models being considered
+        cost: 10,
+      },
+    }
+    ```
+
+    This can be a good option because as better models come out,
+    you won't need to update your code. You can just update Smoltalk
+    and it will pick the best model automatically.
+
+    ## 3. Specifying a strategy
+    Finally, you can instead specify a strategy to execute. For example:
+
+    ```
+    model: {
+      type: "race",
+      params: {
+        strategies: ["gemini-2.5-flash-lite", "gemini-2.5-pro"],
+      },
+    }
+    ```
+
+    In this case, Smoltalk will run your request using both LLMs simultaneously,
+    and take the response that finishes first.
+
+    You can also choose to specify fallbacks in case the first model
+    returns an error for some reason. This can be a good way to try something
+    with a fast model and then use a slower but more powerful model if the first one fails.
+
+    ```
+    model: {
+      type: "fallback",
+      params: {
+        primaryStrategy: "gemini-2.5-flash-lite",
+        config: {
+          error: ["gemini-2.5-pro"],
+        },
+      },
+    }
+    ```
+
+    You can of course combine strategies together to create more complex behavior:
+
+    ```
+    const geminiLiteWithFallback = {
+      type: "fallback",
+      params: {
+        primaryStrategy: "gemini-2.5-flash-lite",
+        config: {
+          error: ["gemini-2.5-pro"],
+        },
+      },
+    };
+
+    model: {
+      type: "race",
+      params: {
+        strategies: ["gemini-2.5-pro", geminiLiteWithFallback],
+      },
+    }
+    ```
+    */
     model: ModelParam;
-    provider?:
+    provider?: string;
     logLevel?: LogLevel;
     statelog?: Partial<{
         host: string;
package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "smoltalk",
-  "version": "0.0.
+  "version": "0.0.55",
   "description": "A common interface for LLM APIs",
   "homepage": "https://github.com/egonSchiele/smoltalk",
   "scripts": {
@@ -33,7 +33,6 @@
   "devDependencies": {
     "@types/node": "^25.0.3",
     "prettier": "^3.7.4",
-    "termcolors": "github:egonSchiele/termcolors",
     "typedoc": "^0.28.15",
     "typescript": "^5.9.3",
     "vitest": "^4.0.16"