@pwshub/aisdk 0.0.2 → 0.0.4

package/README.md CHANGED
@@ -2,6 +2,10 @@
 
  A thin, unified AI client for OpenAI, Anthropic, Google, DashScope, and DeepSeek with automatic parameter normalization and fallback support.
 
+ [![npm version](https://badge.fury.io/js/@pwshub%2Faisdk.svg)](https://badge.fury.io/js/@pwshub%2Faisdk)
+ ![CodeQL](https://github.com/pwshub/aisdk/workflows/CodeQL/badge.svg)
+ ![CI test](https://github.com/pwshub/aisdk/workflows/ci-test/badge.svg)
+
  ## Features
 
  - **Unified API**: Single interface for multiple AI providers
@@ -11,6 +15,17 @@ A thin, unified AI client for OpenAI, Anthropic, Google, DashScope, and DeepSeek
  - **Token usage tracking**: Detailed token counts and estimated cost per request
  - **Provider-specific options**: Pass provider-specific parameters when needed
 
+ ## Limitations
+
+ This package is designed for **personal project usage** with a focus on simplicity:
+
+ - **Text-only chat**: Supports basic text generation and conversation
+ - **No streaming**: All responses are returned as complete results
+ - **No multimodal inputs**: Images, audio, video, and file uploads are not supported
+ - **No function calling**: Tool use and function calling features are not available
+
+ For production applications requiring advanced features, consider using the official provider SDKs directly.
+
  ## Installation
 
  ```bash
@@ -80,7 +95,8 @@ Sends a text generation request.
      inputTokens: number,
      outputTokens: number,
      cacheTokens: number,
-     estimatedCost: number // USD
+     reasoningTokens: number, // Reasoning/thinking tokens (0 for non-reasoning models)
+     estimatedCost: number // USD
    }
  }
  ```
@@ -133,6 +149,27 @@ const result = await ai.ask({
  })
  ```
 
+ ### Google (Disable Thinking Mode)
+
+ Gemini 2.5 Pro and other reasoning models use thinking tokens by default. Disable thinking mode to reduce latency and cost:
+
+ ```javascript
+ const result = await ai.ask({
+   model: 'gemini-2.5-pro',
+   apikey: process.env.GOOGLE_API_KEY,
+   prompt: 'What is the capital of Vietnam?',
+   maxTokens: 256,
+   providerOptions: {
+     thinkingConfig: {
+       thinkingBudget: 0, // Disable reasoning tokens
+       includeThoughts: false, // Don't include thought process in response
+     },
+   },
+ })
+ ```
+
+ > **Note:** When thinking mode is enabled (default for Gemini 2.5 Pro), the model may use most of the `maxTokens` budget for reasoning. Set a higher `maxTokens` (e.g., 2048) or disable thinking with `thinkingBudget: 0`.
+
  ### With Fallbacks
 
  ```javascript
@@ -166,6 +203,36 @@ const result = await ai.ask({
  })
  ```
 
+ ### DashScope with Custom Region
+
+ DashScope endpoints vary by region. Use `gatewayUrl` to specify your region:
+
+ ```javascript
+ import { createAi } from '@pwshub/aisdk'
+
+ // Singapore region
+ const aiSingapore = createAi({
+   gatewayUrl: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1',
+ })
+
+ // Virginia region (US)
+ const aiUS = createAi({
+   gatewayUrl: 'https://dashscope-us.aliyuncs.com/compatible-mode/v1',
+ })
+
+ // Beijing region (China)
+ const aiCN = createAi({
+   gatewayUrl: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
+ })
+
+ // Use the regional client
+ const result = await aiSingapore.ask({
+   model: 'qwen3.5-plus',
+   apikey: process.env.DASHSCOPE_API_KEY,
+   prompt: 'Hello from Singapore!',
+ })
+ ```
+
  ### DeepSeek
 
  ```javascript
@@ -178,22 +245,40 @@ const result = await ai.ask({
 
  ## Supported Models
 
- This library does not ship with a predefined list of models. Instead, it accepts **any model** from the supported providers:
+ The library comes with **30 pre-configured models** from all supported providers:
 
- - **OpenAI**: Any OpenAI model
- - **Anthropic**: Any Anthropic model
- - **Google**: Any Google model
- - **DashScope**: Any DashScope model
- - **DeepSeek**: Any DeepSeek model
+ - **OpenAI**: gpt-4.1-nano, gpt-4.1-mini, gpt-4.1, gpt-4o, gpt-4o-mini, gpt-5, gpt-5-mini, gpt-5-nano, gpt-5.1, gpt-5.2, gpt-5.4, o3-mini, o4-mini
+ - **Anthropic**: claude-haiku-4-5, claude-sonnet-4-6, claude-sonnet-4-5, claude-opus-4-6
+ - **Google**: gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.5-pro, gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview
+ - **DashScope**: qwen-flash, qwen3.5-flash, qwen-plus, qwen3.5-plus, qwen-max, qwen3-max
+ - **DeepSeek**: deepseek-chat, deepseek-reasoner
 
- ### Loading Models
+ ### Managing Models
 
- Models are loaded programmatically via `setModels()` from external sources (CMS, API, or local files for evaluation):
+ Models are managed via `addModels()` and `setModels()`:
 
  ```javascript
- import { createAi, setModels } from '@pwshub/aisdk'
+ import { createAi, addModels, setModels, listModels } from '@pwshub/aisdk'
+
+ // List all available models (30 models loaded by default)
+ console.log(listModels())
+
+ // Add more models to the existing list
+ addModels([
+   {
+     id: 'my-custom-model',
+     name: 'my-custom-model',
+     provider: 'openai',
+     input_price: 1,
+     output_price: 2,
+     cache_price: 0.5,
+     max_in: 128000,
+     max_out: 16384,
+     enable: true,
+   },
+ ])
 
- // Load models from your CMS or API
+ // Replace all models with your own list (e.g., from CMS)
  const modelsFromCms = await fetch('https://cms.example.com/api/models').then(r => r.json())
  setModels(modelsFromCms)
 
@@ -205,6 +290,8 @@ const result = await ai.ask({
  })
  ```
 
+ > **Note:** Models are loaded automatically from `src/models.js` when the library is imported. You don't need to call `setModels()` unless you want to use a custom model list.
+
  ### Model Record Format
 
  Each model record should include:
@@ -219,8 +306,6 @@ Each model record should include:
  - `enable`: Boolean to enable/disable the model
  - `supportedParams` (optional): Array of supported parameter names
 
- > **Note**: The `examples/` folder includes `models.json` as a reference for running evaluation scripts.
-
  ## Error Handling
 
  ```javascript
package/index.d.ts CHANGED
@@ -25,6 +25,7 @@ export interface Usage {
    inputTokens: number;
    outputTokens: number;
    cacheTokens: number;
+   reasoningTokens: number;
    estimatedCost: number;
  }
 
@@ -69,5 +70,6 @@ export interface AiClient {
  }
 
  export function createAi(opts?: AiOptions): AiClient;
+ export function addModels(models: ModelRecord[]): void;
  export function setModels(models: ModelRecord[]): void;
  export function listModels(): ModelRecord[];
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@pwshub/aisdk",
-   "version": "0.0.2",
+   "version": "0.0.4",
    "description": "A thin, unified AI client for OpenAI, Anthropic, Google, DashScope, and DeepSeek with automatic param normalization and fallback support",
    "repository": {
      "type": "git",
@@ -11,11 +11,12 @@
      "bun": ">=1.0.0"
    },
    "type": "module",
-   "main": "./src/index.js",
    "exports": {
-     ".": "./src/index.js"
+     ".": {
+       "types": "./index.d.ts",
+       "default": "./src/index.js"
+     }
    },
-   "types": "./index.d.ts",
    "files": [
      "src",
      "index.d.ts"
package/src/index.js CHANGED
@@ -12,7 +12,7 @@
   * temperature: 0.5,
   * })
   * console.log(result.text)
-  * console.log(result.usage) // { inputTokens, outputTokens, cacheTokens, estimatedCost }
+  * console.log(result.usage) // { inputTokens, outputTokens, cacheTokens, reasoningTokens, estimatedCost }
   *
   * @example With fallbacks
   * const result = await ai.ask({
@@ -38,10 +38,21 @@
   * },
   * })
   *
+  * @example Using messages array for multi-turn conversations
+  * const result = await ai.ask({
+  *   model: 'claude-sonnet-4-20250514',
+  *   apikey: 'your-api-key',
+  *   messages: [
+  *     { role: 'user', content: 'What is the capital of Vietnam?' },
+  *     { role: 'assistant', content: 'The capital of Vietnam is Hanoi.' },
+  *     { role: 'user', content: 'What is its population?' },
+  *   ],
+  * })
+  *
   */
 
  import {
-   getModel, listModels, setModels,
+   getModel, listModels, setModels, addModels,
  } from './registry.js'
  import { normalizeConfig } from './config.js'
  import { coerceConfig } from './coerce.js'
@@ -64,8 +75,9 @@ export {
   * @typedef {Object} AskParams
   * @property {string} model - Model ID (must be registered via setModels())
   * @property {string} apikey - API key for the provider
-  * @property {string} prompt - The user message
-  * @property {string} [system] - Optional system prompt
+  * @property {string} [prompt] - The user message (alternative to messages)
+  * @property {string} [system] - Optional system prompt (used with prompt)
+  * @property {import('./providers.js').Message[]} [messages] - Array of messages with role and content (alternative to prompt)
   * @property {string[]} [fallbacks] - Ordered list of fallback model IDs
   * @property {Record<string, unknown>} [providerOptions] - Provider-specific options merged into body
   * @property {number} [temperature]
@@ -81,6 +93,7 @@ export {
   * @property {number} inputTokens
   * @property {number} outputTokens
   * @property {number} cacheTokens
+  * @property {number} reasoningTokens
   * @property {number} estimatedCost - In USD, based on models.json pricing
   */
 
@@ -113,7 +126,7 @@ const extractGenConfig = (params) => {
  const calcCost = (usage, record) => {
    const M = 1_000_000
    const inputCost = (usage.inputTokens / M) * record.input_price
-   const outputCost = (usage.outputTokens / M) * record.output_price
+   const outputCost = ((usage.outputTokens + usage.reasoningTokens) / M) * record.output_price
    const cacheCost = (usage.cacheTokens / M) * record.cache_price
 
    // Round to 8 decimal places to avoid floating point noise
@@ -151,11 +164,11 @@ const callModel = async (modelId, params, gatewayUrl) => {
    const normalizedConfig = normalizeConfig(coerced, providerId, supportedParams, modelId)
 
    const {
-     prompt, system, providerOptions = {},
+     prompt, system, messages, providerOptions = {},
    } = params
 
    /** @type {import('./providers.js').Message[]} */
-   const messages = [
+   const messageList = messages ?? [
      ...(system ? [{
        role: 'system', content: system,
      }] : []),
@@ -165,7 +178,7 @@ const callModel = async (modelId, params, gatewayUrl) => {
    ]
 
    const url = gatewayUrl ?? adapter.url(modelName, apikey)
-   const body = adapter.buildBody(modelName, messages, normalizedConfig, providerOptions)
+   const body = adapter.buildBody(modelName, messageList, normalizedConfig, providerOptions)
 
    let res
    try {
@@ -266,4 +279,4 @@ export const createAi = (opts = {}) => {
    }
  }
 
- export { setModels }
+ export { addModels, setModels, listModels }
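The `calcCost` change above means reasoning tokens are now billed at the model's output rate. A minimal standalone sketch of that formula, using the field names shown in the diff (the `calcCost` helper itself is internal, not part of the published API):

```javascript
// Sketch of the 0.0.4 cost formula: reasoning tokens are charged
// at the same per-million rate as ordinary output tokens.
const calcCost = (usage, record) => {
  const M = 1_000_000
  const inputCost = (usage.inputTokens / M) * record.input_price
  const outputCost = ((usage.outputTokens + usage.reasoningTokens) / M) * record.output_price
  const cacheCost = (usage.cacheTokens / M) * record.cache_price

  // Round to 8 decimal places to avoid floating point noise
  return Math.round((inputCost + outputCost + cacheCost) * 1e8) / 1e8
}

// With gemini-2.5-pro pricing (1.25 in / 10 out), 500 reasoning tokens
// cost the same as 500 extra output tokens.
const cost = calcCost(
  { inputTokens: 1000, outputTokens: 500, reasoningTokens: 500, cacheTokens: 0 },
  { input_price: 1.25, output_price: 10, cache_price: 0.125 },
)
```

Note that on 0.0.2, the same request would have billed only the 500 visible output tokens.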
package/src/models.js ADDED
@@ -0,0 +1,345 @@
+ /**
+  * @fileoverview Default model registry for @pwshub/aisdk.
+  *
+  * This module exports a default list of models that are loaded automatically
+  * when the library is imported. Users can modify this list via addModels()
+  * and setModels() from the main export.
+  */
+
+ /**
+  * @typedef {import('./registry.js').ModelRecord} ModelRecord
+  */
+
+ /** @type {ModelRecord[]} */
+ export const DEFAULT_MODELS = [
+   {
+     id: 'claude-haiku-4-5',
+     name: 'claude-haiku-4-5',
+     provider: 'anthropic',
+     input_price: 1,
+     output_price: 5,
+     cache_price: 0,
+     max_in: 200000,
+     max_out: 64000,
+     enable: true,
+   },
+   {
+     id: 'claude-sonnet-4-6',
+     name: 'claude-sonnet-4-6',
+     provider: 'anthropic',
+     input_price: 3,
+     output_price: 15,
+     cache_price: 0,
+     max_in: 200000,
+     max_out: 64000,
+     enable: true,
+   },
+   {
+     id: 'claude-sonnet-4-5',
+     name: 'claude-sonnet-4-5',
+     provider: 'anthropic',
+     input_price: 3,
+     output_price: 15,
+     cache_price: 0,
+     max_in: 200000,
+     max_out: 1000000,
+     enable: true,
+   },
+   {
+     id: 'claude-opus-4-6',
+     name: 'claude-opus-4-6',
+     provider: 'anthropic',
+     input_price: 5,
+     output_price: 25,
+     cache_price: 0,
+     max_in: 200000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'gemini-2.5-flash',
+     name: 'gemini-2.5-flash',
+     provider: 'google',
+     input_price: 0.3,
+     output_price: 2.5,
+     cache_price: 0.03,
+     max_in: 1048576,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'gemini-2.5-flash-lite',
+     name: 'gemini-2.5-flash-lite',
+     provider: 'google',
+     input_price: 0.1,
+     output_price: 0.4,
+     cache_price: 0.01,
+     max_in: 1048576,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'gemini-2.5-pro',
+     name: 'gemini-2.5-pro',
+     provider: 'google',
+     input_price: 1.25,
+     output_price: 10,
+     cache_price: 0.125,
+     max_in: 1048576,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'gemini-3.1-pro-preview',
+     name: 'gemini-3.1-pro-preview',
+     provider: 'google',
+     input_price: 2,
+     output_price: 12,
+     cache_price: 0.2,
+     max_in: 1048576,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'gemini-3.1-flash-lite-preview',
+     name: 'gemini-3.1-flash-lite-preview',
+     provider: 'google',
+     input_price: 0.25,
+     output_price: 1.5,
+     cache_price: 0.025,
+     max_in: 1048576,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'gpt-4.1-nano',
+     name: 'gpt-4.1-nano',
+     provider: 'openai',
+     input_price: 0.1,
+     output_price: 0.4,
+     cache_price: 0.025,
+     max_in: 1047576,
+     max_out: 32768,
+     enable: true,
+   },
+   {
+     id: 'gpt-4.1-mini',
+     name: 'gpt-4.1-mini',
+     provider: 'openai',
+     input_price: 0.4,
+     output_price: 1.6,
+     cache_price: 0.1,
+     max_in: 1047576,
+     max_out: 32768,
+     enable: true,
+   },
+   {
+     id: 'gpt-4.1',
+     name: 'gpt-4.1',
+     provider: 'openai',
+     input_price: 2,
+     output_price: 8,
+     cache_price: 0.5,
+     max_in: 1047576,
+     max_out: 32768,
+     enable: true,
+   },
+   {
+     id: 'gpt-4o',
+     name: 'gpt-4o',
+     provider: 'openai',
+     input_price: 2.5,
+     output_price: 10,
+     cache_price: 1.25,
+     max_in: 128000,
+     max_out: 16384,
+     enable: true,
+   },
+   {
+     id: 'gpt-4o-mini',
+     name: 'gpt-4o-mini',
+     provider: 'openai',
+     input_price: 0.15,
+     output_price: 0.6,
+     cache_price: 0.075,
+     max_in: 128000,
+     max_out: 16384,
+     enable: true,
+   },
+   {
+     id: 'gpt-5',
+     name: 'gpt-5',
+     provider: 'openai',
+     input_price: 1.25,
+     output_price: 10,
+     cache_price: 0.125,
+     max_in: 400000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'gpt-5-mini',
+     name: 'gpt-5-mini',
+     provider: 'openai',
+     input_price: 0.25,
+     output_price: 2,
+     cache_price: 0.025,
+     max_in: 400000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'gpt-5-nano',
+     name: 'gpt-5-nano',
+     provider: 'openai',
+     input_price: 0.05,
+     output_price: 0.4,
+     cache_price: 0.005,
+     max_in: 400000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'gpt-5.1',
+     name: 'gpt-5.1',
+     provider: 'openai',
+     input_price: 1.25,
+     output_price: 10,
+     cache_price: 0.125,
+     max_in: 400000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'gpt-5.2',
+     name: 'gpt-5.2',
+     provider: 'openai',
+     input_price: 1.75,
+     output_price: 14,
+     cache_price: 0.175,
+     max_in: 400000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'gpt-5.4',
+     name: 'gpt-5.4',
+     provider: 'openai',
+     input_price: 2.5,
+     output_price: 15,
+     cache_price: 0.25,
+     max_in: 1050000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'o3-mini',
+     name: 'o3-mini',
+     provider: 'openai',
+     input_price: 1.1,
+     output_price: 4.4,
+     cache_price: 0.55,
+     max_in: 200000,
+     max_out: 100000,
+     enable: true,
+   },
+   {
+     id: 'o4-mini',
+     name: 'o4-mini',
+     provider: 'openai',
+     input_price: 1.1,
+     output_price: 4.4,
+     cache_price: 0.275,
+     max_in: 200000,
+     max_out: 100000,
+     enable: true,
+   },
+   {
+     id: 'deepseek-chat',
+     name: 'deepseek-chat',
+     provider: 'deepseek',
+     input_price: 0.28,
+     output_price: 0.42,
+     cache_price: 0.028,
+     max_in: 128000,
+     max_out: 8000,
+     enable: true,
+   },
+   {
+     id: 'deepseek-reasoner',
+     name: 'deepseek-reasoner',
+     provider: 'deepseek',
+     input_price: 0.28,
+     output_price: 0.42,
+     cache_price: 0.028,
+     max_in: 128000,
+     max_out: 64000,
+     enable: true,
+   },
+   {
+     id: 'qwen-flash',
+     name: 'qwen-flash',
+     provider: 'dashscope',
+     input_price: 0.05,
+     output_price: 0.4,
+     cache_price: 0,
+     max_in: 995904,
+     max_out: 32768,
+     enable: true,
+   },
+   {
+     id: 'qwen3.5-flash',
+     name: 'qwen3.5-flash',
+     provider: 'dashscope',
+     input_price: 0.1,
+     output_price: 0.4,
+     cache_price: 0,
+     max_in: 983616,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'qwen-plus',
+     name: 'qwen-plus',
+     provider: 'dashscope',
+     input_price: 0.4,
+     output_price: 1.2,
+     cache_price: 0,
+     max_in: 997952,
+     max_out: 32768,
+     enable: true,
+   },
+   {
+     id: 'qwen3.5-plus',
+     name: 'qwen3.5-plus',
+     provider: 'dashscope',
+     input_price: 0.4,
+     output_price: 2.4,
+     cache_price: 0,
+     max_in: 991808,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'qwen-max',
+     name: 'qwen-max',
+     provider: 'dashscope',
+     input_price: 1.6,
+     output_price: 6.4,
+     cache_price: 0,
+     max_in: 30720,
+     max_out: 8192,
+     enable: true,
+   },
+   {
+     id: 'qwen3-max',
+     name: 'qwen3-max',
+     provider: 'dashscope',
+     input_price: 1.2,
+     output_price: 6,
+     cache_price: 0,
+     max_in: 258048,
+     max_out: 65536,
+     enable: true,
+   },
+ ]
package/src/providers.js CHANGED
@@ -23,7 +23,8 @@
   * @typedef {Object} RawUsage
   * @property {number} inputTokens
   * @property {number} outputTokens
-  * @property {number} cacheTokens - 0 when not applicable
+  * @property {number} cacheTokens - 0 when not applicable
+  * @property {number} reasoningTokens - 0 when not applicable
   */
 
  /**
@@ -84,6 +85,7 @@ const openai = {
      inputTokens: data.usage?.prompt_tokens ?? 0,
      outputTokens: data.usage?.completion_tokens ?? 0,
      cacheTokens: data.usage?.prompt_tokens_details?.cached_tokens ?? 0,
+     reasoningTokens: data.usage?.completion_tokens_details?.reasoning_tokens ?? 0,
    }),
  }
 
@@ -119,7 +121,8 @@ const anthropic = {
    extractUsage: (data) => ({
      inputTokens: data.usage?.input_tokens ?? 0,
      outputTokens: data.usage?.output_tokens ?? 0,
-     cacheTokens: data.usage?.cache_read_input_tokens ?? 0,
+     cacheTokens: (data.usage?.cache_read_input_tokens ?? 0) + (data.usage?.cache_creation_input_tokens ?? 0),
+     reasoningTokens: 0,
    }),
  }
 
@@ -136,10 +139,19 @@ const google = {
        role: m.role === 'assistant' ? 'model' : 'user',
        parts: [{ text: m.content }],
      }))
+
+     // Thinking models (e.g., gemini-2.5-pro) need more tokens for reasoning
+     // Set a higher default maxOutputTokens if not specified
+     const hasMaxTokens = config.generationConfig?.maxOutputTokens !== undefined
+     const defaultGenerationConfig = hasMaxTokens ? {} : { maxOutputTokens: 8192 }
+
      return {
        contents,
        ...(system && { systemInstruction: { parts: [{ text: system }] } }),
-       ...config, // includes nested generationConfig
+       generationConfig: {
+         ...defaultGenerationConfig,
+         ...config.generationConfig,
+       },
        ...providerOptions, // safetySettings, thinkingConfig, etc.
      }
    },
@@ -155,17 +167,72 @@ const google = {
        throw new Error('Google response blocked by safety filters')
      }
 
-     const text = candidate.content?.parts?.[0]?.text
-     if (!text) {
+     // Handle different content structures
+     const content = candidate.content
+     if (!content) {
        throw new Error('Google response missing content')
      }
-     return text
+
+     // Gemini 2.5 Pro (thinking model) may return content without parts
+     // when all tokens were used for reasoning
+     if (!content.parts || (Array.isArray(content.parts) && content.parts.length === 0)) {
+       const thoughts = data.usageMetadata?.thoughtsTokenCount ?? 0
+       const totalTokens = data.usageMetadata?.totalTokenCount ?? 0
+
+       if (finishReason === 'MAX_TOKENS' && thoughts > 0) {
+         throw new Error(
+           `Google model used ${thoughts}/${totalTokens} tokens for internal reasoning and has no tokens left for output. ` +
+           `Increase maxTokens to allow room for both thinking and response.`
+         )
+       }
+
+       throw new Error('Google response has no content parts')
+     }
+
+     // Gemini may return parts as array or direct text
+     if (Array.isArray(content.parts)) {
+       // Concatenate all text parts (model may return multiple text blocks)
+       const texts = content.parts.filter((p) => p.text).map((p) => p.text)
+       if (texts.length === 0) {
+         const thoughts = data.usageMetadata?.thoughtsTokenCount ?? 0
+         if (finishReason === 'MAX_TOKENS' && thoughts > 0) {
+           throw new Error(
+             `Google model used ${thoughts}/${data.usageMetadata?.totalTokenCount ?? 0} tokens for internal reasoning and has no tokens left for output. ` +
+             `Increase maxTokens to allow room for both thinking and response.`
+           )
+         }
+         throw new Error('Google response has no text content')
+       }
+       return texts.join('')
+     }
+
+     // Some models may return content directly as string
+     if (typeof content.parts === 'string') {
+       return content.parts
+     }
+
+     throw new Error('Google response missing content')
+   },
+   extractUsage: (data) => {
+     // For Gemini models with reasoning, candidatesTokenCount may be undefined
+     // when all tokens were used for thinking. Calculate output tokens from
+     // totalTokenCount - promptTokenCount to get actual tokens used.
+     const totalTokens = data.usageMetadata?.totalTokenCount ?? 0
+     const promptTokens = data.usageMetadata?.promptTokenCount ?? 0
+     const candidatesTokens = data.usageMetadata?.candidatesTokenCount ?? 0
+     const thoughtsTokens = data.usageMetadata?.thoughtsTokenCount ?? 0
+
+     // outputTokens = actual generated tokens (including reasoning)
+     // If candidatesTokenCount is missing, derive from total - prompt
+     const outputTokens = candidatesTokens || (totalTokens - promptTokens)
+
+     return {
+       inputTokens: promptTokens,
+       outputTokens,
+       cacheTokens: data.usageMetadata?.cachedContentTokenCount ?? 0,
+       reasoningTokens: thoughtsTokens,
+     }
    },
-   extractUsage: (data) => ({
-     inputTokens: data.usageMetadata?.promptTokenCount ?? 0,
-     outputTokens: data.usageMetadata?.candidatesTokenCount ?? 0,
-     cacheTokens: data.usageMetadata?.cachedContentTokenCount ?? 0,
-   }),
  }
 
  /** @type {ProviderAdapter} */
@@ -198,6 +265,7 @@ const dashscope = {
        inputTokens: usage?.input_tokens ?? usage?.prompt_tokens ?? 0,
        outputTokens: usage?.output_tokens ?? usage?.completion_tokens ?? 0,
        cacheTokens: 0,
+       reasoningTokens: 0,
      }
    },
  }
@@ -225,7 +293,8 @@ const deepseek = {
    extractUsage: (data) => ({
      inputTokens: data.usage?.prompt_tokens ?? 0,
      outputTokens: data.usage?.completion_tokens ?? 0,
-     cacheTokens: 0,
+     cacheTokens: data.usage?.prompt_cache_hit_tokens ?? 0,
+     reasoningTokens: data.usage?.completion_tokens_details?.reasoning_tokens ?? 0,
    }),
  }
 
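The new Google `extractUsage` falls back to `totalTokenCount - promptTokenCount` when `candidatesTokenCount` is absent, which happens when a thinking model spends its whole budget on reasoning. A standalone sketch of that derivation, using the `usageMetadata` field names from the diff:

```javascript
// Sketch of the 0.0.4 Google usage extraction. When candidatesTokenCount
// is missing (all tokens consumed by thinking), output is derived from
// totalTokenCount - promptTokenCount so reasoning tokens are still counted.
const extractGoogleUsage = (data) => {
  const totalTokens = data.usageMetadata?.totalTokenCount ?? 0
  const promptTokens = data.usageMetadata?.promptTokenCount ?? 0
  const candidatesTokens = data.usageMetadata?.candidatesTokenCount ?? 0
  const thoughtsTokens = data.usageMetadata?.thoughtsTokenCount ?? 0

  return {
    inputTokens: promptTokens,
    // Prefer the explicit count; otherwise derive it
    outputTokens: candidatesTokens || (totalTokens - promptTokens),
    cacheTokens: data.usageMetadata?.cachedContentTokenCount ?? 0,
    reasoningTokens: thoughtsTokens,
  }
}
```

On 0.0.2, a response whose tokens were entirely thinking tokens would have reported `outputTokens: 0`; the fallback closes that gap.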
package/src/registry.js CHANGED
@@ -1,9 +1,10 @@
  /**
   * @fileoverview Model registry — in-memory store for model records.
   *
-  * Models are loaded programmatically via setModels() from external sources
-  * (CMS, API, or local files for evaluation). This module provides O(1) lookups
-  * at runtime via a Map indexed by model ID.
+  * Default models are loaded automatically from ./models.js at import time.
+  * Users can modify the registry via addModels() and setModels().
+  *
+  * This module provides O(1) lookups at runtime via a Map indexed by model ID.
   *
   * `supportedParams` is optional per record. When absent, the provider's
   * default param set is used.
@@ -11,6 +12,8 @@
   * @typedef {'openai'|'anthropic'|'google'|'dashscope'|'deepseek'} ProviderId
   */
 
+ import { DEFAULT_MODELS } from './models.js'
+
  /**
   * Mirrors the Directus collection schema exactly.
   * `supportedParams` is optional — added later via Directus field.
@@ -48,6 +51,17 @@ const VALID_PROVIDERS = ['openai', 'anthropic', 'google', 'dashscope', 'deepseek
  /** @type {Map<string, ModelRecord>} */
  let REGISTRY = new Map()
 
+ /**
+  * Initializes the registry with default models.
+  * Called automatically at module import.
+  */
+ const initRegistry = () => {
+   REGISTRY = new Map(DEFAULT_MODELS.map((model) => [model.id, model]))
+ }
+
+ // Initialize with default models on import
+ initRegistry()
+
  /**
   * Validates a single model record structure and types.
   *
@@ -143,11 +157,33 @@ export const listModels = () =>
    [...REGISTRY.values()].filter((m) => m.enable)
 
  /**
-  * Programmatically sets the model registry from an array of model records.
-  * Use this when loading models from a CMS or other external source instead of
-  * the built-in models.json file.
+  * Adds one or more models to the registry.
+  * Existing models with the same ID are overwritten.
+  *
+  * @param {ModelRecord[]} models - Array of model records to add
+  * @throws {Error} When models is not an array or contains invalid records
+  */
+ export const addModels = (models) => {
+   if (!Array.isArray(models)) {
+     throw new Error(`addModels expects an array. Got: ${typeof models}`)
+   }
+
+   // Validate each model record
+   models.forEach((model, index) => {
+     validateModelRecord(model, index)
+   })
+
+   // Add models to the registry
+   models.forEach((model) => {
+     REGISTRY.set(model.id, model)
+   })
+ }
+
+ /**
+  * Replaces the entire model registry with a new list of models.
+  * Use this to load models from a CMS or other external source.
   *
-  * @param {ModelRecord[]} models - Array of model records (same format as models.json)
+  * @param {ModelRecord[]} models - Array of model records
   * @throws {Error} When models is not an array or contains invalid records
   */
  export const setModels = (models) => {
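The registry semantics introduced here (merge for `addModels`, replace for `setModels`, enabled-only listing) can be captured in a few lines. This is a simplified sketch built around a local `Map`, with validation omitted; `makeRegistry` is a hypothetical helper, not part of the package:

```javascript
// Sketch of the 0.0.4 registry behavior: addModels() merges (same-ID
// records are overwritten), setModels() replaces the whole map, and
// listModels() returns only enabled records.
const makeRegistry = (defaults) => {
  let registry = new Map(defaults.map((m) => [m.id, m]))
  return {
    addModels: (models) => {
      models.forEach((m) => registry.set(m.id, m))
    },
    setModels: (models) => {
      registry = new Map(models.map((m) => [m.id, m]))
    },
    listModels: () => [...registry.values()].filter((m) => m.enable),
  }
}
```

This mirrors why `initRegistry()` runs at import time: the defaults behave exactly like a list that was passed to `setModels()` before any user code runs.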
package/src/validation.js CHANGED
@@ -9,7 +9,7 @@
   * @typedef {Object} AskParams
   * @property {string} model
   * @property {string} apikey
-  * @property {string} prompt
+  * @property {string} [prompt]
   * @property {string} [system]
   * @property {import('../index.js').Message[]} [messages]
   * @property {number} [temperature]
@@ -42,8 +42,14 @@ export const validateAskOptions = (params) => {
      errors.push('"apikey" must be a non-empty string')
    }
 
-   if (!params.prompt || typeof params.prompt !== 'string') {
-     errors.push('"prompt" must be a non-empty string')
+   // Either prompt or messages must be provided (but not both required)
+   if (params.prompt === undefined && params.messages === undefined) {
+     errors.push('either "prompt" or "messages" must be provided')
+   }
+
+   // When using messages, system can still be provided (will be prepended)
+   if (params.prompt !== undefined && typeof params.prompt !== 'string') {
+     errors.push('"prompt" must be a string')
    }
 
    // Optional string fields
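The validation change above splits the old single check into two independent rules: presence (one of `prompt` or `messages` must exist) and type (`prompt`, when given, must be a string). A standalone sketch of just that fragment, with the rest of `validateAskOptions` omitted:

```javascript
// Sketch of the relaxed 0.0.4 validation: prompt and messages are
// alternatives, so only their combined absence is an error.
const validatePromptOrMessages = (params) => {
  const errors = []

  // Either prompt or messages must be provided
  if (params.prompt === undefined && params.messages === undefined) {
    errors.push('either "prompt" or "messages" must be provided')
  }

  // When prompt is given, it must be a string
  if (params.prompt !== undefined && typeof params.prompt !== 'string') {
    errors.push('"prompt" must be a string')
  }

  return errors
}
```

Note one behavioral consequence: an empty string `prompt: ''` now passes this check (it is a string and it is defined), whereas the 0.0.2 rule rejected it as not "non-empty".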