@pwshub/aisdk 0.0.2 → 0.0.4

package/README.md CHANGED
@@ -2,6 +2,10 @@
 
  A thin, unified AI client for OpenAI, Anthropic, Google, DashScope, and DeepSeek with automatic parameter normalization and fallback support.
 
+ [![npm version](https://badge.fury.io/js/@pwshub%2Faisdk.svg)](https://badge.fury.io/js/@pwshub%2Faisdk)
+ ![CodeQL](https://github.com/pwshub/aisdk/workflows/CodeQL/badge.svg)
+ ![CI test](https://github.com/pwshub/aisdk/workflows/ci-test/badge.svg)
+
  ## Features
 
  - **Unified API**: Single interface for multiple AI providers
@@ -11,6 +15,17 @@ A thin, unified AI client for OpenAI, Anthropic, Google, DashScope, and DeepSeek
  - **Token usage tracking**: Detailed token counts and estimated cost per request
  - **Provider-specific options**: Pass provider-specific parameters when needed
 
+ ## Limitations
+
+ This package is designed for **personal project usage** with a focus on simplicity:
+
+ - **Text-only chat**: Supports basic text generation and conversation
+ - **No streaming**: All responses are returned as complete results
+ - **No multimodal inputs**: Images, audio, video, and file uploads are not supported
+ - **No function calling**: Tool use and function calling features are not available
+
+ For production applications requiring advanced features, consider using the official provider SDKs directly.
+
  ## Installation
 
  ```bash
@@ -80,7 +95,8 @@ Sends a text generation request.
      inputTokens: number,
      outputTokens: number,
      cacheTokens: number,
-     estimatedCost: number // USD
+     reasoningTokens: number, // Reasoning/thinking tokens (0 for non-reasoning models)
+     estimatedCost: number // USD
    }
  }
  ```
@@ -133,6 +149,27 @@ const result = await ai.ask({
  })
  ```
 
+ ### Google (Disable Thinking Mode)
+
+ Gemini 2.5 Pro and other reasoning models use thinking tokens by default. Disable thinking mode to reduce latency and cost:
+
+ ```javascript
+ const result = await ai.ask({
+   model: 'gemini-2.5-pro',
+   apikey: process.env.GOOGLE_API_KEY,
+   prompt: 'What is the capital of Vietnam?',
+   maxTokens: 256,
+   providerOptions: {
+     thinkingConfig: {
+       thinkingBudget: 0, // Disable reasoning tokens
+       includeThoughts: false, // Don't include thought process in response
+     },
+   },
+ })
+ ```
+
+ > **Note:** When thinking mode is enabled (default for Gemini 2.5 Pro), the model may use most of the `maxTokens` budget for reasoning. Set a higher `maxTokens` (e.g., 2048) or disable thinking with `thinkingBudget: 0`.
+
  ### With Fallbacks
 
  ```javascript
@@ -166,6 +203,36 @@ const result = await ai.ask({
  })
  ```
 
+ ### DashScope with Custom Region
+
+ DashScope endpoints vary by region. Use `gatewayUrl` to specify your region:
+
+ ```javascript
+ import { createAi } from '@pwshub/aisdk'
+
+ // Singapore region
+ const aiSingapore = createAi({
+   gatewayUrl: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1',
+ })
+
+ // Virginia region (US)
+ const aiUS = createAi({
+   gatewayUrl: 'https://dashscope-us.aliyuncs.com/compatible-mode/v1',
+ })
+
+ // Beijing region (China)
+ const aiCN = createAi({
+   gatewayUrl: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
+ })
+
+ // Use the regional client
+ const result = await aiSingapore.ask({
+   model: 'qwen3.5-plus',
+   apikey: process.env.DASHSCOPE_API_KEY,
+   prompt: 'Hello from Singapore!',
+ })
+ ```
+
  ### DeepSeek
 
  ```javascript
@@ -178,22 +245,40 @@ const result = await ai.ask({
 
  ## Supported Models
 
- This library does not ship with a predefined list of models. Instead, it accepts **any model** from the supported providers:
+ The library comes with **30 pre-configured models** from all supported providers:
 
- - **OpenAI**: Any OpenAI model
- - **Anthropic**: Any Anthropic model
- - **Google**: Any Google model
- - **DashScope**: Any DashScope model
- - **DeepSeek**: Any DeepSeek model
+ - **OpenAI**: gpt-4.1-nano, gpt-4.1-mini, gpt-4.1, gpt-4o, gpt-4o-mini, gpt-5, gpt-5-mini, gpt-5-nano, gpt-5.1, gpt-5.2, gpt-5.4, o3-mini, o4-mini
+ - **Anthropic**: claude-haiku-4-5, claude-sonnet-4-6, claude-sonnet-4-5, claude-opus-4-6
+ - **Google**: gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.5-pro, gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview
+ - **DashScope**: qwen-flash, qwen3.5-flash, qwen-plus, qwen3.5-plus, qwen-max, qwen3-max
+ - **DeepSeek**: deepseek-chat, deepseek-reasoner
 
- ### Loading Models
+ ### Managing Models
 
- Models are loaded programmatically via `setModels()` from external sources (CMS, API, or local files for evaluation):
+ Models are managed via `addModels()` and `setModels()`:
 
  ```javascript
- import { createAi, setModels } from '@pwshub/aisdk'
+ import { createAi, addModels, setModels, listModels } from '@pwshub/aisdk'
+
+ // List all available models (30 models loaded by default)
+ console.log(listModels())
+
+ // Add more models to the existing list
+ addModels([
+   {
+     id: 'my-custom-model',
+     name: 'my-custom-model',
+     provider: 'openai',
+     input_price: 1,
+     output_price: 2,
+     cache_price: 0.5,
+     max_in: 128000,
+     max_out: 16384,
+     enable: true,
+   },
+ ])
 
- // Load models from your CMS or API
+ // Replace all models with your own list (e.g., from CMS)
  const modelsFromCms = await fetch('https://cms.example.com/api/models').then(r => r.json())
  setModels(modelsFromCms)
 
@@ -205,6 +290,8 @@ const result = await ai.ask({
  })
  ```
 
+ > **Note:** Models are loaded automatically from `src/models.js` when the library is imported. You don't need to call `setModels()` unless you want to use a custom model list.
+
  ### Model Record Format
 
  Each model record should include:
@@ -219,8 +306,6 @@ Each model record should include:
  - `enable`: Boolean to enable/disable the model
  - `supportedParams` (optional): Array of supported parameter names
 
- > **Note**: The `examples/` folder includes `models.json` as a reference for running evaluation scripts.
-
  ## Error Handling
 
  ```javascript
package/index.d.ts CHANGED
@@ -25,6 +25,7 @@ export interface Usage {
    inputTokens: number;
    outputTokens: number;
    cacheTokens: number;
+   reasoningTokens: number;
    estimatedCost: number;
  }
 
@@ -69,5 +70,6 @@ export interface AiClient {
  }
 
  export function createAi(opts?: AiOptions): AiClient;
+ export function addModels(models: ModelRecord[]): void;
  export function setModels(models: ModelRecord[]): void;
  export function listModels(): ModelRecord[];
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@pwshub/aisdk",
-   "version": "0.0.2",
+   "version": "0.0.4",
    "description": "A thin, unified AI client for OpenAI, Anthropic, Google, DashScope, and DeepSeek with automatic param normalization and fallback support",
    "repository": {
      "type": "git",
@@ -11,11 +11,12 @@
      "bun": ">=1.0.0"
    },
    "type": "module",
-   "main": "./src/index.js",
    "exports": {
-     ".": "./src/index.js"
+     ".": {
+       "types": "./index.d.ts",
+       "default": "./src/index.js"
+     }
    },
-   "types": "./index.d.ts",
    "files": [
      "src",
      "index.d.ts"
package/src/index.js CHANGED
@@ -12,7 +12,7 @@
   * temperature: 0.5,
   * })
   * console.log(result.text)
-  * console.log(result.usage) // { inputTokens, outputTokens, cacheTokens, estimatedCost }
+  * console.log(result.usage) // { inputTokens, outputTokens, cacheTokens, reasoningTokens, estimatedCost }
   *
   * @example With fallbacks
   * const result = await ai.ask({
@@ -38,10 +38,21 @@
   * },
   * })
   *
+  * @example Using messages array for multi-turn conversations
+  * const result = await ai.ask({
+  *   model: 'claude-sonnet-4-20250514',
+  *   apikey: 'your-api-key',
+  *   messages: [
+  *     { role: 'user', content: 'What is the capital of Vietnam?' },
+  *     { role: 'assistant', content: 'The capital of Vietnam is Hanoi.' },
+  *     { role: 'user', content: 'What is its population?' },
+  *   ],
+  * })
+  *
   */
 
  import {
-   getModel, listModels, setModels,
+   getModel, listModels, setModels, addModels,
  } from './registry.js'
  import { normalizeConfig } from './config.js'
  import { coerceConfig } from './coerce.js'
@@ -64,8 +75,9 @@ export {
   * @typedef {Object} AskParams
   * @property {string} model - Model ID (must be registered via setModels())
   * @property {string} apikey - API key for the provider
-  * @property {string} prompt - The user message
-  * @property {string} [system] - Optional system prompt
+  * @property {string} [prompt] - The user message (alternative to messages)
+  * @property {string} [system] - Optional system prompt (used with prompt)
+  * @property {import('./providers.js').Message[]} [messages] - Array of messages with role and content (alternative to prompt)
   * @property {string[]} [fallbacks] - Ordered list of fallback model IDs
   * @property {Record<string, unknown>} [providerOptions] - Provider-specific options merged into body
   * @property {number} [temperature]
@@ -81,6 +93,7 @@ export {
   * @property {number} inputTokens
   * @property {number} outputTokens
   * @property {number} cacheTokens
+  * @property {number} reasoningTokens
   * @property {number} estimatedCost - In USD, based on models.json pricing
   */
 
@@ -113,7 +126,7 @@ const extractGenConfig = (params) => {
  const calcCost = (usage, record) => {
    const M = 1_000_000
    const inputCost = (usage.inputTokens / M) * record.input_price
-   const outputCost = (usage.outputTokens / M) * record.output_price
+   const outputCost = ((usage.outputTokens + usage.reasoningTokens) / M) * record.output_price
    const cacheCost = (usage.cacheTokens / M) * record.cache_price
 
    // Round to 8 decimal places to avoid floating point noise
@@ -151,11 +164,11 @@ const callModel = async (modelId, params, gatewayUrl) => {
    const normalizedConfig = normalizeConfig(coerced, providerId, supportedParams, modelId)
 
    const {
-     prompt, system, providerOptions = {},
+     prompt, system, messages, providerOptions = {},
    } = params
 
    /** @type {import('./providers.js').Message[]} */
-   const messages = [
+   const messageList = messages ?? [
      ...(system ? [{
        role: 'system', content: system,
      }] : []),
@@ -165,7 +178,7 @@ const callModel = async (modelId, params, gatewayUrl) => {
    ]
 
    const url = gatewayUrl ?? adapter.url(modelName, apikey)
-   const body = adapter.buildBody(modelName, messages, normalizedConfig, providerOptions)
+   const body = adapter.buildBody(modelName, messageList, normalizedConfig, providerOptions)
 
    let res
    try {
@@ -266,4 +279,4 @@ export const createAi = (opts = {}) => {
    }
  }
 
- export { setModels }
+ export { addModels, setModels, listModels }
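The `calcCost` change above means reasoning tokens are now billed at the model's output rate. A minimal standalone sketch of that formula, using the field names shown in the diff (the `calcCost` helper itself is internal, not part of the published API):

```javascript
// Sketch of the 0.0.4 cost formula: reasoning tokens are charged
// at the same per-million rate as ordinary output tokens.
const calcCost = (usage, record) => {
  const M = 1_000_000
  const inputCost = (usage.inputTokens / M) * record.input_price
  const outputCost = ((usage.outputTokens + usage.reasoningTokens) / M) * record.output_price
  const cacheCost = (usage.cacheTokens / M) * record.cache_price

  // Round to 8 decimal places to avoid floating point noise
  return Math.round((inputCost + outputCost + cacheCost) * 1e8) / 1e8
}

// With gemini-2.5-pro pricing (1.25 in / 10 out), 500 reasoning tokens
// cost the same as 500 extra output tokens.
const cost = calcCost(
  { inputTokens: 1000, outputTokens: 500, reasoningTokens: 500, cacheTokens: 0 },
  { input_price: 1.25, output_price: 10, cache_price: 0.125 },
)
```

Note that on 0.0.2, the same request would have billed only the 500 visible output tokens.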
package/src/models.js ADDED
@@ -0,0 +1,345 @@
+ /**
+  * @fileoverview Default model registry for @pwshub/aisdk.
+  *
+  * This module exports a default list of models that are loaded automatically
+  * when the library is imported. Users can modify this list via addModels()
+  * and setModels() from the main export.
+  */
+
+ /**
+  * @typedef {import('./registry.js').ModelRecord} ModelRecord
+  */
+
+ /** @type {ModelRecord[]} */
+ export const DEFAULT_MODELS = [
+   {
+     id: 'claude-haiku-4-5',
+     name: 'claude-haiku-4-5',
+     provider: 'anthropic',
+     input_price: 1,
+     output_price: 5,
+     cache_price: 0,
+     max_in: 200000,
+     max_out: 64000,
+     enable: true,
+   },
+   {
+     id: 'claude-sonnet-4-6',
+     name: 'claude-sonnet-4-6',
+     provider: 'anthropic',
+     input_price: 3,
+     output_price: 15,
+     cache_price: 0,
+     max_in: 200000,
+     max_out: 64000,
+     enable: true,
+   },
+   {
+     id: 'claude-sonnet-4-5',
+     name: 'claude-sonnet-4-5',
+     provider: 'anthropic',
+     input_price: 3,
+     output_price: 15,
+     cache_price: 0,
+     max_in: 200000,
+     max_out: 1000000,
+     enable: true,
+   },
+   {
+     id: 'claude-opus-4-6',
+     name: 'claude-opus-4-6',
+     provider: 'anthropic',
+     input_price: 5,
+     output_price: 25,
+     cache_price: 0,
+     max_in: 200000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'gemini-2.5-flash',
+     name: 'gemini-2.5-flash',
+     provider: 'google',
+     input_price: 0.3,
+     output_price: 2.5,
+     cache_price: 0.03,
+     max_in: 1048576,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'gemini-2.5-flash-lite',
+     name: 'gemini-2.5-flash-lite',
+     provider: 'google',
+     input_price: 0.1,
+     output_price: 0.4,
+     cache_price: 0.01,
+     max_in: 1048576,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'gemini-2.5-pro',
+     name: 'gemini-2.5-pro',
+     provider: 'google',
+     input_price: 1.25,
+     output_price: 10,
+     cache_price: 0.125,
+     max_in: 1048576,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'gemini-3.1-pro-preview',
+     name: 'gemini-3.1-pro-preview',
+     provider: 'google',
+     input_price: 2,
+     output_price: 12,
+     cache_price: 0.2,
+     max_in: 1048576,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'gemini-3.1-flash-lite-preview',
+     name: 'gemini-3.1-flash-lite-preview',
+     provider: 'google',
+     input_price: 0.25,
+     output_price: 1.5,
+     cache_price: 0.025,
+     max_in: 1048576,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'gpt-4.1-nano',
+     name: 'gpt-4.1-nano',
+     provider: 'openai',
+     input_price: 0.1,
+     output_price: 0.4,
+     cache_price: 0.025,
+     max_in: 1047576,
+     max_out: 32768,
+     enable: true,
+   },
+   {
+     id: 'gpt-4.1-mini',
+     name: 'gpt-4.1-mini',
+     provider: 'openai',
+     input_price: 0.4,
+     output_price: 1.6,
+     cache_price: 0.1,
+     max_in: 1047576,
+     max_out: 32768,
+     enable: true,
+   },
+   {
+     id: 'gpt-4.1',
+     name: 'gpt-4.1',
+     provider: 'openai',
+     input_price: 2,
+     output_price: 8,
+     cache_price: 0.5,
+     max_in: 1047576,
+     max_out: 32768,
+     enable: true,
+   },
+   {
+     id: 'gpt-4o',
+     name: 'gpt-4o',
+     provider: 'openai',
+     input_price: 2.5,
+     output_price: 10,
+     cache_price: 1.25,
+     max_in: 128000,
+     max_out: 16384,
+     enable: true,
+   },
+   {
+     id: 'gpt-4o-mini',
+     name: 'gpt-4o-mini',
+     provider: 'openai',
+     input_price: 0.15,
+     output_price: 0.6,
+     cache_price: 0.075,
+     max_in: 128000,
+     max_out: 16384,
+     enable: true,
+   },
+   {
+     id: 'gpt-5',
+     name: 'gpt-5',
+     provider: 'openai',
+     input_price: 1.25,
+     output_price: 10,
+     cache_price: 0.125,
+     max_in: 400000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'gpt-5-mini',
+     name: 'gpt-5-mini',
+     provider: 'openai',
+     input_price: 0.25,
+     output_price: 2,
+     cache_price: 0.025,
+     max_in: 400000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'gpt-5-nano',
+     name: 'gpt-5-nano',
+     provider: 'openai',
+     input_price: 0.05,
+     output_price: 0.4,
+     cache_price: 0.005,
+     max_in: 400000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'gpt-5.1',
+     name: 'gpt-5.1',
+     provider: 'openai',
+     input_price: 1.25,
+     output_price: 10,
+     cache_price: 0.125,
+     max_in: 400000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'gpt-5.2',
+     name: 'gpt-5.2',
+     provider: 'openai',
+     input_price: 1.75,
+     output_price: 14,
+     cache_price: 0.175,
+     max_in: 400000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'gpt-5.4',
+     name: 'gpt-5.4',
+     provider: 'openai',
+     input_price: 2.5,
+     output_price: 15,
+     cache_price: 0.25,
+     max_in: 1050000,
+     max_out: 128000,
+     enable: true,
+   },
+   {
+     id: 'o3-mini',
+     name: 'o3-mini',
+     provider: 'openai',
+     input_price: 1.1,
+     output_price: 4.4,
+     cache_price: 0.55,
+     max_in: 200000,
+     max_out: 100000,
+     enable: true,
+   },
+   {
+     id: 'o4-mini',
+     name: 'o4-mini',
+     provider: 'openai',
+     input_price: 1.1,
+     output_price: 4.4,
+     cache_price: 0.275,
+     max_in: 200000,
+     max_out: 100000,
+     enable: true,
+   },
+   {
+     id: 'deepseek-chat',
+     name: 'deepseek-chat',
+     provider: 'deepseek',
+     input_price: 0.28,
+     output_price: 0.42,
+     cache_price: 0.028,
+     max_in: 128000,
+     max_out: 8000,
+     enable: true,
+   },
+   {
+     id: 'deepseek-reasoner',
+     name: 'deepseek-reasoner',
+     provider: 'deepseek',
+     input_price: 0.28,
+     output_price: 0.42,
+     cache_price: 0.028,
+     max_in: 128000,
+     max_out: 64000,
+     enable: true,
+   },
+   {
+     id: 'qwen-flash',
+     name: 'qwen-flash',
+     provider: 'dashscope',
+     input_price: 0.05,
+     output_price: 0.4,
+     cache_price: 0,
+     max_in: 995904,
+     max_out: 32768,
+     enable: true,
+   },
+   {
+     id: 'qwen3.5-flash',
+     name: 'qwen3.5-flash',
+     provider: 'dashscope',
+     input_price: 0.1,
+     output_price: 0.4,
+     cache_price: 0,
+     max_in: 983616,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'qwen-plus',
+     name: 'qwen-plus',
+     provider: 'dashscope',
+     input_price: 0.4,
+     output_price: 1.2,
+     cache_price: 0,
+     max_in: 997952,
+     max_out: 32768,
+     enable: true,
+   },
+   {
+     id: 'qwen3.5-plus',
+     name: 'qwen3.5-plus',
+     provider: 'dashscope',
+     input_price: 0.4,
+     output_price: 2.4,
+     cache_price: 0,
+     max_in: 991808,
+     max_out: 65536,
+     enable: true,
+   },
+   {
+     id: 'qwen-max',
+     name: 'qwen-max',
+     provider: 'dashscope',
+     input_price: 1.6,
+     output_price: 6.4,
+     cache_price: 0,
+     max_in: 30720,
+     max_out: 8192,
+     enable: true,
+   },
+   {
+     id: 'qwen3-max',
+     name: 'qwen3-max',
+     provider: 'dashscope',
+     input_price: 1.2,
+     output_price: 6,
+     cache_price: 0,
+     max_in: 258048,
+     max_out: 65536,
+     enable: true,
+   },
+ ]
package/src/providers.js CHANGED
@@ -23,7 +23,8 @@
   * @typedef {Object} RawUsage
   * @property {number} inputTokens
   * @property {number} outputTokens
-  * @property {number} cacheTokens - 0 when not applicable
+  * @property {number} cacheTokens - 0 when not applicable
+  * @property {number} reasoningTokens - 0 when not applicable
   */
 
  /**
@@ -84,6 +85,7 @@ const openai = {
      inputTokens: data.usage?.prompt_tokens ?? 0,
      outputTokens: data.usage?.completion_tokens ?? 0,
      cacheTokens: data.usage?.prompt_tokens_details?.cached_tokens ?? 0,
+     reasoningTokens: data.usage?.completion_tokens_details?.reasoning_tokens ?? 0,
    }),
  }
 
@@ -119,7 +121,8 @@ const anthropic = {
    extractUsage: (data) => ({
      inputTokens: data.usage?.input_tokens ?? 0,
      outputTokens: data.usage?.output_tokens ?? 0,
-     cacheTokens: data.usage?.cache_read_input_tokens ?? 0,
+     cacheTokens: (data.usage?.cache_read_input_tokens ?? 0) + (data.usage?.cache_creation_input_tokens ?? 0),
+     reasoningTokens: 0,
    }),
  }
 
@@ -136,10 +139,19 @@ const google = {
        role: m.role === 'assistant' ? 'model' : 'user',
        parts: [{ text: m.content }],
      }))
+
+     // Thinking models (e.g., gemini-2.5-pro) need more tokens for reasoning
+     // Set a higher default maxOutputTokens if not specified
+     const hasMaxTokens = config.generationConfig?.maxOutputTokens !== undefined
+     const defaultGenerationConfig = hasMaxTokens ? {} : { maxOutputTokens: 8192 }
+
      return {
        contents,
        ...(system && { systemInstruction: { parts: [{ text: system }] } }),
-       ...config, // includes nested generationConfig
+       generationConfig: {
+         ...defaultGenerationConfig,
+         ...config.generationConfig,
+       },
        ...providerOptions, // safetySettings, thinkingConfig, etc.
      }
    },
@@ -155,17 +167,72 @@ const google = {
        throw new Error('Google response blocked by safety filters')
      }
 
-     const text = candidate.content?.parts?.[0]?.text
-     if (!text) {
+     // Handle different content structures
+     const content = candidate.content
+     if (!content) {
        throw new Error('Google response missing content')
      }
-     return text
+
+     // Gemini 2.5 Pro (thinking model) may return content without parts
+     // when all tokens were used for reasoning
+     if (!content.parts || (Array.isArray(content.parts) && content.parts.length === 0)) {
+       const thoughts = data.usageMetadata?.thoughtsTokenCount ?? 0
+       const totalTokens = data.usageMetadata?.totalTokenCount ?? 0
+
+       if (finishReason === 'MAX_TOKENS' && thoughts > 0) {
+         throw new Error(
+           `Google model used ${thoughts}/${totalTokens} tokens for internal reasoning and has no tokens left for output. ` +
+           `Increase maxTokens to allow room for both thinking and response.`
+         )
+       }
+
+       throw new Error('Google response has no content parts')
+     }
+
+     // Gemini may return parts as array or direct text
+     if (Array.isArray(content.parts)) {
+       // Concatenate all text parts (model may return multiple text blocks)
+       const texts = content.parts.filter((p) => p.text).map((p) => p.text)
+       if (texts.length === 0) {
+         const thoughts = data.usageMetadata?.thoughtsTokenCount ?? 0
+         if (finishReason === 'MAX_TOKENS' && thoughts > 0) {
+           throw new Error(
+             `Google model used ${thoughts}/${data.usageMetadata?.totalTokenCount ?? 0} tokens for internal reasoning and has no tokens left for output. ` +
+             `Increase maxTokens to allow room for both thinking and response.`
+           )
+         }
+         throw new Error('Google response has no text content')
+       }
+       return texts.join('')
+     }
+
+     // Some models may return content directly as string
+     if (typeof content.parts === 'string') {
+       return content.parts
+     }
+
+     throw new Error('Google response missing content')
+   },
+   extractUsage: (data) => {
+     // For Gemini models with reasoning, candidatesTokenCount may be undefined
+     // when all tokens were used for thinking. Calculate output tokens from
+     // totalTokenCount - promptTokenCount to get actual tokens used.
+     const totalTokens = data.usageMetadata?.totalTokenCount ?? 0
+     const promptTokens = data.usageMetadata?.promptTokenCount ?? 0
+     const candidatesTokens = data.usageMetadata?.candidatesTokenCount ?? 0
+     const thoughtsTokens = data.usageMetadata?.thoughtsTokenCount ?? 0
+
+     // outputTokens = actual generated tokens (including reasoning)
+     // If candidatesTokenCount is missing, derive from total - prompt
+     const outputTokens = candidatesTokens || (totalTokens - promptTokens)
+
+     return {
+       inputTokens: promptTokens,
+       outputTokens,
+       cacheTokens: data.usageMetadata?.cachedContentTokenCount ?? 0,
+       reasoningTokens: thoughtsTokens,
+     }
    },
-   extractUsage: (data) => ({
-     inputTokens: data.usageMetadata?.promptTokenCount ?? 0,
-     outputTokens: data.usageMetadata?.candidatesTokenCount ?? 0,
-     cacheTokens: data.usageMetadata?.cachedContentTokenCount ?? 0,
-   }),
  }
 
  /** @type {ProviderAdapter} */
@@ -198,6 +265,7 @@ const dashscope = {
        inputTokens: usage?.input_tokens ?? usage?.prompt_tokens ?? 0,
        outputTokens: usage?.output_tokens ?? usage?.completion_tokens ?? 0,
        cacheTokens: 0,
+       reasoningTokens: 0,
      }
    },
  }
@@ -225,7 +293,8 @@ const deepseek = {
    extractUsage: (data) => ({
      inputTokens: data.usage?.prompt_tokens ?? 0,
      outputTokens: data.usage?.completion_tokens ?? 0,
-     cacheTokens: 0,
+     cacheTokens: data.usage?.prompt_cache_hit_tokens ?? 0,
+     reasoningTokens: data.usage?.completion_tokens_details?.reasoning_tokens ?? 0,
    }),
  }
 
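The new Google `extractUsage` falls back to `totalTokenCount - promptTokenCount` when `candidatesTokenCount` is absent, which happens when a thinking model spends its whole budget on reasoning. A standalone sketch of that derivation, using the `usageMetadata` field names from the diff:

```javascript
// Sketch of the 0.0.4 Google usage extraction. When candidatesTokenCount
// is missing (all tokens consumed by thinking), output is derived from
// totalTokenCount - promptTokenCount so reasoning tokens are still counted.
const extractGoogleUsage = (data) => {
  const totalTokens = data.usageMetadata?.totalTokenCount ?? 0
  const promptTokens = data.usageMetadata?.promptTokenCount ?? 0
  const candidatesTokens = data.usageMetadata?.candidatesTokenCount ?? 0
  const thoughtsTokens = data.usageMetadata?.thoughtsTokenCount ?? 0

  return {
    inputTokens: promptTokens,
    // Prefer the explicit count; otherwise derive it
    outputTokens: candidatesTokens || (totalTokens - promptTokens),
    cacheTokens: data.usageMetadata?.cachedContentTokenCount ?? 0,
    reasoningTokens: thoughtsTokens,
  }
}
```

On 0.0.2, a response whose tokens were entirely thinking tokens would have reported `outputTokens: 0`; the fallback closes that gap.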
package/src/registry.js CHANGED
@@ -1,9 +1,10 @@
  /**
   * @fileoverview Model registry — in-memory store for model records.
   *
-  * Models are loaded programmatically via setModels() from external sources
-  * (CMS, API, or local files for evaluation). This module provides O(1) lookups
-  * at runtime via a Map indexed by model ID.
+  * Default models are loaded automatically from ./models.js at import time.
+  * Users can modify the registry via addModels() and setModels().
+  *
+  * This module provides O(1) lookups at runtime via a Map indexed by model ID.
   *
   * `supportedParams` is optional per record. When absent, the provider's
   * default param set is used.
@@ -11,6 +12,8 @@
   * @typedef {'openai'|'anthropic'|'google'|'dashscope'|'deepseek'} ProviderId
   */
 
+ import { DEFAULT_MODELS } from './models.js'
+
  /**
   * Mirrors the Directus collection schema exactly.
   * `supportedParams` is optional — added later via Directus field.
@@ -48,6 +51,17 @@ const VALID_PROVIDERS = ['openai', 'anthropic', 'google', 'dashscope', 'deepseek
  /** @type {Map<string, ModelRecord>} */
  let REGISTRY = new Map()
 
+ /**
+  * Initializes the registry with default models.
+  * Called automatically at module import.
+  */
+ const initRegistry = () => {
+   REGISTRY = new Map(DEFAULT_MODELS.map((model) => [model.id, model]))
+ }
+
+ // Initialize with default models on import
+ initRegistry()
+
  /**
   * Validates a single model record structure and types.
   *
@@ -143,11 +157,33 @@ export const listModels = () =>
    [...REGISTRY.values()].filter((m) => m.enable)
 
  /**
-  * Programmatically sets the model registry from an array of model records.
-  * Use this when loading models from a CMS or other external source instead of
-  * the built-in models.json file.
+  * Adds one or more models to the registry.
+  * Existing models with the same ID are overwritten.
+  *
+  * @param {ModelRecord[]} models - Array of model records to add
+  * @throws {Error} When models is not an array or contains invalid records
+  */
+ export const addModels = (models) => {
+   if (!Array.isArray(models)) {
+     throw new Error(`addModels expects an array. Got: ${typeof models}`)
+   }
+
+   // Validate each model record
+   models.forEach((model, index) => {
+     validateModelRecord(model, index)
+   })
+
+   // Add models to the registry
+   models.forEach((model) => {
+     REGISTRY.set(model.id, model)
+   })
+ }
+
+ /**
+  * Replaces the entire model registry with a new list of models.
+  * Use this to load models from a CMS or other external source.
   *
-  * @param {ModelRecord[]} models - Array of model records (same format as models.json)
+  * @param {ModelRecord[]} models - Array of model records
   * @throws {Error} When models is not an array or contains invalid records
   */
  export const setModels = (models) => {
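The registry semantics introduced here (merge for `addModels`, replace for `setModels`, enabled-only listing) can be captured in a few lines. This is a simplified sketch built around a local `Map`, with validation omitted; `makeRegistry` is a hypothetical helper, not part of the package:

```javascript
// Sketch of the 0.0.4 registry behavior: addModels() merges (same-ID
// records are overwritten), setModels() replaces the whole map, and
// listModels() returns only enabled records.
const makeRegistry = (defaults) => {
  let registry = new Map(defaults.map((m) => [m.id, m]))
  return {
    addModels: (models) => {
      models.forEach((m) => registry.set(m.id, m))
    },
    setModels: (models) => {
      registry = new Map(models.map((m) => [m.id, m]))
    },
    listModels: () => [...registry.values()].filter((m) => m.enable),
  }
}
```

This mirrors why `initRegistry()` runs at import time: the defaults behave exactly like a list that was passed to `setModels()` before any user code runs.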
package/src/validation.js CHANGED
@@ -9,7 +9,7 @@
   * @typedef {Object} AskParams
   * @property {string} model
   * @property {string} apikey
-  * @property {string} prompt
+  * @property {string} [prompt]
   * @property {string} [system]
   * @property {import('../index.js').Message[]} [messages]
   * @property {number} [temperature]
@@ -42,8 +42,14 @@ export const validateAskOptions = (params) => {
      errors.push('"apikey" must be a non-empty string')
    }
 
-   if (!params.prompt || typeof params.prompt !== 'string') {
-     errors.push('"prompt" must be a non-empty string')
+   // Either prompt or messages must be provided (but not both required)
+   if (params.prompt === undefined && params.messages === undefined) {
+     errors.push('either "prompt" or "messages" must be provided')
+   }
+
+   // When using messages, system can still be provided (will be prepended)
+   if (params.prompt !== undefined && typeof params.prompt !== 'string') {
+     errors.push('"prompt" must be a string')
    }
 
    // Optional string fields
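The validation change above splits the old single check into two independent rules: presence (one of `prompt` or `messages` must exist) and type (`prompt`, when given, must be a string). A standalone sketch of just that fragment, with the rest of `validateAskOptions` omitted:

```javascript
// Sketch of the relaxed 0.0.4 validation: prompt and messages are
// alternatives, so only their combined absence is an error.
const validatePromptOrMessages = (params) => {
  const errors = []

  // Either prompt or messages must be provided
  if (params.prompt === undefined && params.messages === undefined) {
    errors.push('either "prompt" or "messages" must be provided')
  }

  // When prompt is given, it must be a string
  if (params.prompt !== undefined && typeof params.prompt !== 'string') {
    errors.push('"prompt" must be a string')
  }

  return errors
}
```

Note one behavioral consequence: an empty string `prompt: ''` now passes this check (it is a string and it is defined), whereas the 0.0.2 rule rejected it as not "non-empty".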