smoltalk 0.0.53 → 0.0.55

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,6 +1,6 @@
 # Smoltalk
 
-Smoltalk is a package that exposes a common interface across different LLM providers. It exists because I think it's important to have an npm package that allows users to try out different kinds of LLMs, and prevents vendor lock-in. Using a different LLM should be as simple as switching out a model name.
+Smoltalk exposes a common API to different LLM providers. There are other packages that do this, but Smoltalk allows you to build strategies on top of it. Here is a simple example.
 
 ## Install
 
@@ -8,26 +8,83 @@ Smoltalk is a package that exposes a common interface across different LLM provi
 pnpm install smoltalk
 ```
 
-## Quickstart
+## Hello world example
 
 ```typescript
-import { getClient } from "smoltalk";
-
-const client = getClient({
-  openAiApiKey: process.env.OPENAI_API_KEY || "",
-  googleApiKey: process.env.GEMINI_API_KEY || "",
-  logLevel: "debug",
-  model: "gemini-2.0-flash-lite",
-});
+import { text, userMessage } from "smoltalk";
 
 async function main() {
-  const resp = await client.prompt("Hello, how are you?");
-  console.log(resp);
+  const messages = [userMessage("Write me a 10 word story.")];
+  const response = await text({
+    messages,
+    model: "gpt-5.4",
+  });
+  console.log(response);
 }
 
 main();
 ```
 
+This is functionality that other packages allow.
+<details>
+<summary>Response</summary>
+
+```
+{
+  success: true,
+  value: {
+    output: 'Clock stopped; everyone smiled as tomorrow finally arrived before yesterday.',
+    toolCalls: [],
+    usage: {
+      inputTokens: 14,
+      outputTokens: 15,
+      cachedInputTokens: 0,
+      totalTokens: 29
+    },
+    cost: {
+      inputCost: 0.000035,
+      outputCost: 0.000225,
+      cachedInputCost: undefined,
+      totalCost: 0.00026,
+      currency: 'USD'
+    },
+    model: 'gpt-5.4'
+  }
+}
+```
+</details>
+
+What if you wanted to have fallbacks in case the OpenAI API was down? Just change the `model` field:
+
+```ts
+const response = await text({
+  messages,
+  model: fallback("gpt-5.4", "gemini-2.5-flash-lite"),
+  // or multiple fallbacks:
+  // model: fallback("gpt-5.4", ["gemini-2.5-flash-lite", "gemini-3-flash-preview"]),
+});
+```
+
+Or what if you wanted to try a couple of models and take the first response?
+
+```ts
+const response = await text({
+  messages,
+  model: race("gpt-5.4", "gemini-2.5-flash-lite", "o4-mini"),
+});
+```
+
+Or combine them:
+
+```ts
+const response = await text({
+  messages,
+  model: race(fallback("gpt-5.4", "gemini-2.5-flash-lite"), "o4-mini"),
+});
+```
+
+You get the idea.
+
 ## Longer tutorial
 To use Smoltak, you first create a client:
 
@@ -157,20 +214,15 @@ Detects when the model is stuck in a repetitive tool-call loop.
 | `intervention` | `string` | Action to take: `"remove-tool"`, `"remove-all-tools"`, `"throw-error"`, or `"halt-execution"`. |
 | `excludeTools` | `string[]` | Tool names to ignore when counting consecutive calls. |
 
-## Prior art
+## Limitations
+Smoltalk has support for a limited number of providers right now, and is mostly focused on the stateless APIs for text completion, though I plan to add support for more providers as well as image and speech models later. Smoltalk is also a personal project, and there are alternatives backed by companies:
+
 - Langchain
-OpenRouter
+- OpenRouter
 - Vercel AI
 
-These are all good options, but they are quite heavy, and I wanted a lighter option. That said, you may be better off with one of the above alternatives:
-- They are backed by a business and are more likely to be responsive.
-- They support way more functionality and providers. Smoltalk currently supports just a subset of functionality for OpenAI and Google.
-
-## Functionality
-Smoltalk pretty much lets you generate text using an OpenAI or Google model, with support for function calling and structured output, and that's it. I will add functionality and providers sporadically when I have time and need.
-
 ## Contributing
-This repo could use some help! Any of the following contributions would be helpful:
+Contributions are welcome. Any of the following contributions would be helpful:
 - Adding support for API parameters or endpoints
 - Adding support for different providers
-- Updating the list of models
+- Updating the list of models
package/dist/client.js CHANGED
@@ -26,29 +26,38 @@ export function getClient(config) {
         }
         provider = model.provider;
     }
-    const clientConfig = { ...config, model: modelName };
+    const resolvedKeys = {
+        openAiApiKey: config.openAiApiKey || process.env.OPENAI_API_KEY,
+        googleApiKey: config.googleApiKey || process.env.GEMINI_API_KEY,
+        anthropicApiKey: config.anthropicApiKey || process.env.ANTHROPIC_API_KEY,
+    };
+    const clientConfig = {
+        ...config,
+        ...resolvedKeys,
+        model: modelName,
+    };
     switch (provider) {
         case "anthropic":
-            if (!config.anthropicApiKey) {
-                throw new SmolError("No Anthropic API key provided. Please provide an Anthropic API key in the config using anthropicApiKey.");
+            if (!resolvedKeys.anthropicApiKey) {
+                throw new SmolError("No Anthropic API key provided. Please provide an Anthropic API key in the config using anthropicApiKey, or set the ANTHROPIC_API_KEY environment variable.");
             }
             return new SmolAnthropic({
                 ...clientConfig,
-                anthropicApiKey: config.anthropicApiKey,
+                anthropicApiKey: resolvedKeys.anthropicApiKey,
             });
         case "openai":
-            if (!config.openAiApiKey) {
-                throw new SmolError("No OpenAI API key provided. Please provide an OpenAI API key in the config using openAiApiKey.");
+            if (!resolvedKeys.openAiApiKey) {
+                throw new SmolError("No OpenAI API key provided. Please provide an OpenAI API key in the config using openAiApiKey, or set the OPENAI_API_KEY environment variable.");
             }
             return new SmolOpenAi(clientConfig);
         case "openai-responses":
-            if (!config.openAiApiKey) {
-                throw new SmolError("No OpenAI API key provided. Please provide an OpenAI API key in the config using openAiApiKey.");
+            if (!resolvedKeys.openAiApiKey) {
+                throw new SmolError("No OpenAI API key provided. Please provide an OpenAI API key in the config using openAiApiKey, or set the OPENAI_API_KEY environment variable.");
             }
             return new SmolOpenAiResponses(clientConfig);
         case "google":
-            if (!config.googleApiKey) {
-                throw new SmolError("No Google API key provided. Please provide a Google API key in the config using googleApiKey.");
+            if (!resolvedKeys.googleApiKey) {
+                throw new SmolError("No Google API key provided. Please provide a Google API key in the config using googleApiKey, or set the GEMINI_API_KEY environment variable.");
             }
             return new SmolGoogle(clientConfig);
         case "ollama":
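The hunk above changes key lookup from config-only to "explicit config value first, then environment variable." As a standalone sketch of that resolution order (`resolveKey` and the `env` record are illustrative stand-ins, not part of the smoltalk API):

```typescript
// Hypothetical sketch of the key-resolution order added in client.js:
// an explicit config value wins; otherwise fall back to the environment.
type Env = Record<string, string | undefined>;

function resolveKey(
  configKey: string | undefined,
  envVar: string,
  env: Env
): string | undefined {
  return configKey || env[envVar];
}

const env: Env = { OPENAI_API_KEY: "sk-from-env" };

console.log(resolveKey(undefined, "OPENAI_API_KEY", env)); // falls back to the env value
console.log(resolveKey("sk-explicit", "OPENAI_API_KEY", env)); // explicit config wins
```

Note that `||` (rather than `??`) means an empty-string config key also falls through to the environment variable, which matches the old README's `process.env.OPENAI_API_KEY || ""` idiom.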
package/dist/model.js CHANGED
@@ -1,4 +1,4 @@
-import { getModel, isTextModel, textModels, } from "./models.js";
+import { getModel, isTextModel, textModels, registeredTextModels, } from "./models.js";
 import { SmolError } from "./smolError.js";
 import { ModelConfigSchema, ModelNameAndProviderSchema, ModelNameSchema, } from "./strategies/types.js";
 import { round } from "./util.js";
@@ -41,7 +41,7 @@ export class Model {
         }
         return undefined;
     }
-    resolveModel(models = textModels) {
+    resolveModel(models = [...registeredTextModels, ...textModels]) {
         if (ModelNameSchema.safeParse(this.model).success) {
             return this.model;
         }
@@ -120,6 +120,8 @@ export class Model {
                 return (m.inputTokenCost ?? 0) + (m.outputTokenCost ?? 0);
             case "large-context":
                 return m.maxInputTokens;
+            default:
+                throw new SmolError(`Unknown optimization: ${optimization}`);
         }
     }
     isLowerBetter(optimization) {
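The new `default` branch turns an unrecognized optimization into a thrown error instead of the switch falling through and implicitly returning `undefined`. A trimmed, standalone sketch of that pattern (a plain `Error` stands in for `SmolError`, and `optimizationValue` plus the reduced model shape are illustrative, not the package's actual signatures):

```typescript
// Sketch of the exhaustive switch with a throwing default, as added above.
type Optimization = "cost" | "large-context";

interface ModelInfo {
  inputTokenCost?: number;
  outputTokenCost?: number;
  maxInputTokens: number;
}

function optimizationValue(m: ModelInfo, optimization: Optimization): number {
  switch (optimization) {
    case "cost":
      // Cost per 1M input tokens plus 1M output tokens.
      return (m.inputTokenCost ?? 0) + (m.outputTokenCost ?? 0);
    case "large-context":
      return m.maxInputTokens;
    default:
      // Unreachable for well-typed callers, but catches bad runtime input
      // (e.g. a config object parsed from JSON) loudly instead of silently.
      throw new Error(`Unknown optimization: ${optimization}`);
  }
}

console.log(optimizationValue({ inputTokenCost: 2, outputTokenCost: 8, maxInputTokens: 200000 }, "cost")); // 10
```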
package/dist/models.d.ts CHANGED
@@ -13,7 +13,7 @@ export declare const ProviderSchema: z.ZodEnum<{
 export type Provider = z.infer<typeof ProviderSchema>;
 export type BaseModel = {
     modelName: string;
-    provider: Provider;
+    provider: string;
     description?: string;
     inputTokenCost?: number;
     cachedInputTokenCost?: number;
@@ -32,7 +32,6 @@ export type ImageModel = BaseModel & {
 };
 export type TextModel = BaseModel & {
     type: "text";
-    modelName: string;
     maxInputTokens: number;
     maxOutputTokens: number;
     outputTokensPerSecond?: number;
@@ -68,7 +67,7 @@ export declare const speechToTextModels: readonly [{
 export declare const textModels: readonly [{
     readonly type: "text";
     readonly modelName: "gpt-4o-mini";
-    readonly description: "GPT-4o mini (o for omni) is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and model outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency. The knowledge cutoff for GPT-4o-mini models is October, 2023.";
+    readonly description: "GPT-4o mini ('o' for 'omni') is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and model outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency. Knowledge cutoff: July 2025.";
     readonly maxInputTokens: 128000;
     readonly maxOutputTokens: 16384;
     readonly inputTokenCost: 0.15;
@@ -79,7 +78,7 @@ export declare const textModels: readonly [{
 }, {
     readonly type: "text";
     readonly modelName: "gpt-4o";
-    readonly description: "GPT-4o (o for omni) is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). The knowledge cutoff for GPT-4o-mini models is October, 2023.";
+    readonly description: "GPT-4o ('o' for 'omni') is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). Knowledge cutoff: April 2024.";
     readonly maxInputTokens: 128000;
     readonly maxOutputTokens: 16384;
     readonly inputTokenCost: 2.5;
@@ -90,7 +89,7 @@ export declare const textModels: readonly [{
 }, {
     readonly type: "text";
     readonly modelName: "o3";
-    readonly description: "o3 is a reasoning model that sets a new standard for math, science, coding, visual reasoning tasks, and technical writing. Part of the o-series of reasoning models. The knowledge cutoff for o3 models is October, 2023.";
+    readonly description: "o3 is a reasoning model that sets a new standard for math, science, coding, visual reasoning tasks, and technical writing. Part of the o-series of reasoning models. Knowledge cutoff: June 2024.";
     readonly maxInputTokens: 200000;
     readonly maxOutputTokens: 100000;
     readonly inputTokenCost: 2;
@@ -108,8 +107,8 @@ export declare const textModels: readonly [{
 }, {
     readonly type: "text";
     readonly modelName: "o3-mini";
-    readonly description: "o3-mini is our most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.The knowledge cutoff for o3-mini models is October, 2023.";
-    readonly maxInputTokens: 200000;
+    readonly description: "o3-mini is our most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks. Knowledge cutoff: June 2024.";
+    readonly maxInputTokens: 500000;
     readonly maxOutputTokens: 100000;
     readonly inputTokenCost: 1.1;
     readonly cachedInputTokenCost: 0.55;
@@ -129,9 +128,9 @@ export declare const textModels: readonly [{
     readonly description: "Latest small o-series model optimized for fast, effective reasoning with exceptional performance in coding and visual tasks. Knowledge cutoff: June 2024.";
     readonly maxInputTokens: 200000;
     readonly maxOutputTokens: 100000;
-    readonly inputTokenCost: 1.1;
-    readonly cachedInputTokenCost: 0.275;
-    readonly outputTokenCost: 4.4;
+    readonly inputTokenCost: 0.6;
+    readonly cachedInputTokenCost: 0.3;
+    readonly outputTokenCost: 2.4;
     readonly outputTokensPerSecond: 135;
     readonly reasoning: {
         readonly levels: readonly ["low", "medium", "high"];
@@ -325,6 +324,20 @@ export declare const textModels: readonly [{
         readonly outputsSignatures: false;
     };
     readonly provider: "openai";
+}, {
+    readonly type: "text";
+    readonly modelName: "gpt-5.2-pro";
+    readonly description: "GPT-5.2 Pro uses more compute for complex reasoning tasks. 400K context window. Knowledge cutoff: August 2025.";
+    readonly maxInputTokens: 400000;
+    readonly maxOutputTokens: 128000;
+    readonly inputTokenCost: 21;
+    readonly outputTokenCost: 168;
+    readonly reasoning: {
+        readonly canDisable: false;
+        readonly outputsThinking: false;
+        readonly outputsSignatures: false;
+    };
+    readonly provider: "openai";
 }, {
     readonly type: "text";
     readonly modelName: "gpt-5.4";
@@ -676,7 +689,11 @@ export type ImageModelName = (typeof imageModels)[number]["modelName"];
 export type SpeechToTextModelName = (typeof speechToTextModels)[number]["modelName"];
 export type EmbeddingsModelName = (typeof embeddingsModels)[number]["modelName"];
 export type ModelName = TextModelName | ImageModelName | SpeechToTextModelName;
-export declare function getModel(modelName: ModelName): {
+export declare const registeredTextModels: TextModel[];
+export declare function registerTextModel(model: Omit<TextModel, "type"> & {
+    type?: "text";
+}): void;
+export declare function getModel(modelName: ModelName): TextModel | {
     readonly type: "speech-to-text";
     readonly modelName: "whisper-local";
     readonly provider: "local";
@@ -688,7 +705,7 @@ export declare function getModel(modelName: ModelName): {
 } | {
     readonly type: "text";
     readonly modelName: "gpt-4o-mini";
-    readonly description: "GPT-4o mini (o for omni) is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and model outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency. The knowledge cutoff for GPT-4o-mini models is October, 2023.";
+    readonly description: "GPT-4o mini ('o' for 'omni') is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and model outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency. Knowledge cutoff: July 2025.";
     readonly maxInputTokens: 128000;
     readonly maxOutputTokens: 16384;
     readonly inputTokenCost: 0.15;
@@ -699,7 +716,7 @@ export declare function getModel(modelName: ModelName): {
 } | {
     readonly type: "text";
     readonly modelName: "gpt-4o";
-    readonly description: "GPT-4o (o for omni) is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). The knowledge cutoff for GPT-4o-mini models is October, 2023.";
+    readonly description: "GPT-4o ('o' for 'omni') is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). Knowledge cutoff: April 2024.";
     readonly maxInputTokens: 128000;
     readonly maxOutputTokens: 16384;
     readonly inputTokenCost: 2.5;
@@ -710,7 +727,7 @@ export declare function getModel(modelName: ModelName): {
 } | {
     readonly type: "text";
     readonly modelName: "o3";
-    readonly description: "o3 is a reasoning model that sets a new standard for math, science, coding, visual reasoning tasks, and technical writing. Part of the o-series of reasoning models. The knowledge cutoff for o3 models is October, 2023.";
+    readonly description: "o3 is a reasoning model that sets a new standard for math, science, coding, visual reasoning tasks, and technical writing. Part of the o-series of reasoning models. Knowledge cutoff: June 2024.";
     readonly maxInputTokens: 200000;
     readonly maxOutputTokens: 100000;
     readonly inputTokenCost: 2;
@@ -728,8 +745,8 @@ export declare function getModel(modelName: ModelName): {
 } | {
     readonly type: "text";
     readonly modelName: "o3-mini";
-    readonly description: "o3-mini is our most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.The knowledge cutoff for o3-mini models is October, 2023.";
-    readonly maxInputTokens: 200000;
+    readonly description: "o3-mini is our most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks. Knowledge cutoff: June 2024.";
+    readonly maxInputTokens: 500000;
     readonly maxOutputTokens: 100000;
     readonly inputTokenCost: 1.1;
     readonly cachedInputTokenCost: 0.55;
@@ -749,9 +766,9 @@ export declare function getModel(modelName: ModelName): {
     readonly description: "Latest small o-series model optimized for fast, effective reasoning with exceptional performance in coding and visual tasks. Knowledge cutoff: June 2024.";
     readonly maxInputTokens: 200000;
     readonly maxOutputTokens: 100000;
-    readonly inputTokenCost: 1.1;
-    readonly cachedInputTokenCost: 0.275;
-    readonly outputTokenCost: 4.4;
+    readonly inputTokenCost: 0.6;
+    readonly cachedInputTokenCost: 0.3;
+    readonly outputTokenCost: 2.4;
     readonly outputTokensPerSecond: 135;
     readonly reasoning: {
         readonly levels: readonly ["low", "medium", "high"];
@@ -945,6 +962,20 @@ export declare function getModel(modelName: ModelName): {
         readonly outputsSignatures: false;
     };
     readonly provider: "openai";
+} | {
+    readonly type: "text";
+    readonly modelName: "gpt-5.2-pro";
+    readonly description: "GPT-5.2 Pro uses more compute for complex reasoning tasks. 400K context window. Knowledge cutoff: August 2025.";
+    readonly maxInputTokens: 400000;
+    readonly maxOutputTokens: 128000;
+    readonly inputTokenCost: 21;
+    readonly outputTokenCost: 168;
+    readonly reasoning: {
+        readonly canDisable: false;
+        readonly outputsThinking: false;
+        readonly outputsSignatures: false;
+    };
+    readonly provider: "openai";
 } | {
     readonly type: "text";
     readonly modelName: "gpt-5.4";
package/dist/models.js CHANGED
@@ -33,7 +33,7 @@ export const textModels = [
     {
         type: "text",
         modelName: "gpt-4o-mini",
-        description: "GPT-4o mini (o for omni) is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and model outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency. The knowledge cutoff for GPT-4o-mini models is October, 2023.",
+        description: "GPT-4o mini ('o' for 'omni') is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and model outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency. Knowledge cutoff: July 2025.",
         maxInputTokens: 128000,
         maxOutputTokens: 16384,
         inputTokenCost: 0.15,
@@ -45,7 +45,7 @@ export const textModels = [
     {
         type: "text",
         modelName: "gpt-4o",
-        description: "GPT-4o (o for omni) is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). The knowledge cutoff for GPT-4o-mini models is October, 2023.",
+        description: "GPT-4o ('o' for 'omni') is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). Knowledge cutoff: April 2024.",
         maxInputTokens: 128000,
         maxOutputTokens: 16384,
         inputTokenCost: 2.5,
@@ -57,7 +57,7 @@ export const textModels = [
     {
         type: "text",
         modelName: "o3",
-        description: "o3 is a reasoning model that sets a new standard for math, science, coding, visual reasoning tasks, and technical writing. Part of the o-series of reasoning models. The knowledge cutoff for o3 models is October, 2023.",
+        description: "o3 is a reasoning model that sets a new standard for math, science, coding, visual reasoning tasks, and technical writing. Part of the o-series of reasoning models. Knowledge cutoff: June 2024.",
         maxInputTokens: 200000,
         maxOutputTokens: 100000,
         inputTokenCost: 2,
@@ -76,8 +76,8 @@ export const textModels = [
     {
         type: "text",
         modelName: "o3-mini",
-        description: "o3-mini is our most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.The knowledge cutoff for o3-mini models is October, 2023.",
-        maxInputTokens: 200000,
+        description: "o3-mini is our most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks. Knowledge cutoff: June 2024.",
+        maxInputTokens: 500000,
         maxOutputTokens: 100000,
         inputTokenCost: 1.1,
         cachedInputTokenCost: 0.55,
@@ -98,9 +98,9 @@ export const textModels = [
         description: "Latest small o-series model optimized for fast, effective reasoning with exceptional performance in coding and visual tasks. Knowledge cutoff: June 2024.",
         maxInputTokens: 200000,
         maxOutputTokens: 100000,
-        inputTokenCost: 1.1,
-        cachedInputTokenCost: 0.275,
-        outputTokenCost: 4.4,
+        inputTokenCost: 0.6,
+        cachedInputTokenCost: 0.3,
+        outputTokenCost: 2.4,
         outputTokensPerSecond: 135,
         reasoning: {
             levels: ["low", "medium", "high"],
@@ -308,6 +308,21 @@ export const textModels = [
         },
         provider: "openai",
     },
+    {
+        type: "text",
+        modelName: "gpt-5.2-pro",
+        description: "GPT-5.2 Pro uses more compute for complex reasoning tasks. 400K context window. Knowledge cutoff: August 2025.",
+        maxInputTokens: 400000,
+        maxOutputTokens: 128000,
+        inputTokenCost: 21,
+        outputTokenCost: 168,
+        reasoning: {
+            canDisable: false,
+            outputsThinking: false,
+            outputsSignatures: false,
+        },
+        provider: "openai",
+    },
     {
         type: "text",
         modelName: "gpt-5.4",
@@ -715,8 +730,17 @@ export const imageModels = [
 export const embeddingsModels = [
     { type: "embeddings", modelName: "text-embedding-3-small", tokenCost: 0.02 },
 ];
+export const registeredTextModels = [];
+export function registerTextModel(model) {
+    registeredTextModels.push({ ...model, type: "text" });
+}
 export function getModel(modelName) {
-    const allModels = [...textModels, ...imageModels, ...speechToTextModels];
+    const allModels = [
+        ...textModels,
+        ...imageModels,
+        ...speechToTextModels,
+        ...registeredTextModels,
+    ];
     return allModels.find((model) => model.modelName === modelName);
 }
 export function isImageModel(model) {
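The `registerTextModel` / `registeredTextModels` pair added above lets callers make `getModel` (and `Model.resolveModel`) aware of custom models at runtime. A trimmed, standalone sketch of that registry (the model shape is reduced to the fields exercised here; the real `TextModel` carries token limits, costs, and more):

```typescript
// Standalone sketch of the new registry: registered models are appended
// to a module-level array that getModel also searches.
type TextModel = { type: "text"; modelName: string; provider: string };

const registeredTextModels: TextModel[] = [];

// Callers may omit `type`; it is normalized to "text" on registration.
function registerTextModel(model: Omit<TextModel, "type"> & { type?: "text" }): void {
  registeredTextModels.push({ ...model, type: "text" });
}

function getModel(modelName: string): TextModel | undefined {
  // The real implementation also searches the built-in text, image,
  // and speech-to-text model lists.
  return registeredTextModels.find((m) => m.modelName === modelName);
}

registerTextModel({ modelName: "my-fine-tune", provider: "openai" });
console.log(getModel("my-fine-tune")?.provider); // "openai"
```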
@@ -7,5 +7,5 @@ export * from "./raceStrategy.js";
 export * from "./types.js";
 export declare function race(...strategies: ModelParam[]): Strategy;
 export declare function id(model: ModelLike): Strategy;
-export declare function fallback(primaryStrategy: ModelParam, config: FallbackStrategyConfig): Strategy;
+export declare function fallback(primaryStrategy: ModelParam, config: FallbackStrategyConfig | string | string[]): Strategy;
 export declare function fromJSON(json: StrategyJSON): Strategy;
@@ -14,7 +14,17 @@ export function id(model) {
     return new IDStrategy(model);
 }
 export function fallback(primaryStrategy, config) {
-    return new FallbackStrategy(primaryStrategy, config);
+    let resolvedConfig;
+    if (typeof config === "string") {
+        resolvedConfig = { error: [config] };
+    }
+    else if (Array.isArray(config)) {
+        resolvedConfig = { error: config };
+    }
+    else {
+        resolvedConfig = config;
+    }
+    return new FallbackStrategy(primaryStrategy, resolvedConfig);
 }
 export function fromJSON(json) {
     if (IDStrategyJSONSchema.safeParse(json).success) {
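The widened `fallback` signature above is what enables the README's new `fallback("gpt-5.4", "gemini-2.5-flash-lite")` shorthand: a bare model name or an array of names is coerced into the full config object. The coercion in isolation (with `FallbackStrategyConfig` reduced to the `error` field exercised here, and `normalizeFallbackConfig` as an illustrative name):

```typescript
// Standalone sketch of the argument normalization added to fallback():
// a string becomes { error: [string] }, an array becomes { error: array },
// and a full config object passes through unchanged.
type FallbackStrategyConfig = { error: string[] };

function normalizeFallbackConfig(
  config: FallbackStrategyConfig | string | string[]
): FallbackStrategyConfig {
  if (typeof config === "string") return { error: [config] };
  if (Array.isArray(config)) return { error: config };
  return config;
}

console.log(normalizeFallbackConfig("gemini-2.5-flash-lite"));
console.log(normalizeFallbackConfig(["gemini-2.5-flash-lite", "gemini-3-flash-preview"]));
```

Note the `typeof` check must come before the `Array.isArray` check only in the sense that both must precede the pass-through case; an object config is whatever is left once strings and arrays are handled.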
package/dist/types.d.ts CHANGED
@@ -4,7 +4,7 @@ import z, { ZodType } from "zod";
 import { Message } from "./classes/message/index.js";
 import { ToolCall } from "./classes/ToolCall.js";
 import { Model } from "./model.js";
-import { ModelName, Provider } from "./models.js";
+import { ModelName } from "./models.js";
 import { ModelConfig, ModelNameAndProvider, Strategy, StrategyJSON } from "./strategies/types.js";
 import { Result } from "./types/result.js";
 export type ThinkingBlock = {
@@ -72,8 +72,95 @@ export type SmolConfig = {
     anthropicApiKey?: string;
     ollamaApiKey?: string;
     ollamaHost?: string;
+    /**
+    The given model determines both
+    - what client is used
+    - what strategy is executed.
+
+    ## 1. Specifying a model directly
+    The simplest case is to specify the name of a model from lib/models.ts.
+    Example:
+
+    ```
+    model: "claude-sonnet-4-6"
+    ```
+
+    ## 2. Specifying a model config (letting Smoltalk pick the model)
+    You can instead also choose to let Smoltalk pick the model that it thinks
+    will be best for certain parameters. For example:
+    ```
+    model: {
+      // find the fastest model
+      optimizeFor: ["speed"],
+
+      // from either Anthropic or Google, whichever is faster
+      providers: ["anthropic", "google"],
+      limit: {
+        // 1 mil input tokens + 1 mil output tokens together
+        // should cost less than $10 for the models being considered
+        cost: 10,
+      },
+    }
+    ```
+
+    This can be a good option because as better models come out,
+    you won't need to update your code. You can just update Smoltalk
+    and it will pick the best model automatically.
+
+    ## 3. Specifying a strategy
+    Finally, you can instead specify a strategy to execute. For example:
+
+    ```
+    model: {
+      type: "race",
+      params: {
+        strategies: ["gemini-2.5-flash-lite", "gemini-2.5-pro"],
+      },
+    }
+    ```
+
+    In this case, Smoltalk will run your request over using both LLMs simultaneously,
+    and take the response that finishes first.
+
+    You can also choose to specify fallbacks in case the first model
+    returns an error for some reason. This can be a good way to try something
+    with a fast model and then use a slower but more powerful model if the first one fails.
+
+    ```
+    model: {
+      type: "fallback",
+      params: {
+        primaryStrategy: "gemini-2.5-flash-lite",
+        config: {
+          error: ["gemini-2.5-pro"],
+        },
+      },
+    }
+    ```
+
+    You can of course combine strategies together to create more complex behavior:
+
+    ```
+    const geminiLiteWithFallback = {
+      type: "fallback",
+      params: {
+        primaryStrategy: "gemini-2.5-flash-lite",
+        config: {
+          error: ["gemini-2.5-pro"],
+        },
+      },
+    };
+
+    model: {
+      type: "race",
+      params: {
+        strategies: ["gemini-2.5-pro", geminiLiteWithFallback],
+      },
+    }
+    ```
+    */
     model: ModelParam;
-    provider?: Provider;
+    provider?: string;
     logLevel?: LogLevel;
     statelog?: Partial<{
         host: string;
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "smoltalk",
-  "version": "0.0.53",
+  "version": "0.0.55",
   "description": "A common interface for LLM APIs",
   "homepage": "https://github.com/egonSchiele/smoltalk",
   "scripts": {
@@ -33,7 +33,6 @@
   "devDependencies": {
     "@types/node": "^25.0.3",
     "prettier": "^3.7.4",
-    "termcolors": "github:egonSchiele/termcolors",
     "typedoc": "^0.28.15",
     "typescript": "^5.9.3",
     "vitest": "^4.0.16"