@happyvertical/ai 0.74.8
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/AGENT.md +33 -0
- package/LICENSE +7 -0
- package/README.md +384 -0
- package/dist/chunks/anthropic-BRwbhwIl.js +463 -0
- package/dist/chunks/anthropic-BRwbhwIl.js.map +1 -0
- package/dist/chunks/bedrock-Cf1xUerN.js +808 -0
- package/dist/chunks/bedrock-Cf1xUerN.js.map +1 -0
- package/dist/chunks/bifrost-3mXtQsTj.js +233 -0
- package/dist/chunks/bifrost-3mXtQsTj.js.map +1 -0
- package/dist/chunks/claude-cli-BrHRfkry.js +603 -0
- package/dist/chunks/claude-cli-BrHRfkry.js.map +1 -0
- package/dist/chunks/gateway-admin-C4GFPbZF.js +359 -0
- package/dist/chunks/gateway-admin-C4GFPbZF.js.map +1 -0
- package/dist/chunks/gemini-BfpHXDIQ.js +662 -0
- package/dist/chunks/gemini-BfpHXDIQ.js.map +1 -0
- package/dist/chunks/huggingface-280qv9iv.js +366 -0
- package/dist/chunks/huggingface-280qv9iv.js.map +1 -0
- package/dist/chunks/index-BT4thAvS.js +934 -0
- package/dist/chunks/index-BT4thAvS.js.map +1 -0
- package/dist/chunks/litellm-DhPKa_Jz.js +220 -0
- package/dist/chunks/litellm-DhPKa_Jz.js.map +1 -0
- package/dist/chunks/ollama-Di1ldur0.js +851 -0
- package/dist/chunks/ollama-Di1ldur0.js.map +1 -0
- package/dist/chunks/openai-5snI2diE.js +749 -0
- package/dist/chunks/openai-5snI2diE.js.map +1 -0
- package/dist/chunks/qwen-tts-DgPgdXxG.js +365 -0
- package/dist/chunks/qwen-tts-DgPgdXxG.js.map +1 -0
- package/dist/chunks/usage-DMWiJ2oB.js +21 -0
- package/dist/chunks/usage-DMWiJ2oB.js.map +1 -0
- package/dist/cli/claude-context.d.ts +3 -0
- package/dist/cli/claude-context.d.ts.map +1 -0
- package/dist/cli/claude-context.js +21 -0
- package/dist/cli/claude-context.js.map +1 -0
- package/dist/index.d.ts +20 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +21 -0
- package/dist/index.js.map +1 -0
- package/dist/node/factory.d.ts +27 -0
- package/dist/node/factory.d.ts.map +1 -0
- package/dist/shared/client.d.ts +410 -0
- package/dist/shared/client.d.ts.map +1 -0
- package/dist/shared/factory.d.ts +83 -0
- package/dist/shared/factory.d.ts.map +1 -0
- package/dist/shared/message.d.ts +71 -0
- package/dist/shared/message.d.ts.map +1 -0
- package/dist/shared/providers/anthropic.d.ts +82 -0
- package/dist/shared/providers/anthropic.d.ts.map +1 -0
- package/dist/shared/providers/bedrock.d.ts +49 -0
- package/dist/shared/providers/bedrock.d.ts.map +1 -0
- package/dist/shared/providers/bifrost.d.ts +25 -0
- package/dist/shared/providers/bifrost.d.ts.map +1 -0
- package/dist/shared/providers/claude-cli.d.ts +139 -0
- package/dist/shared/providers/claude-cli.d.ts.map +1 -0
- package/dist/shared/providers/gateway-admin.d.ts +35 -0
- package/dist/shared/providers/gateway-admin.d.ts.map +1 -0
- package/dist/shared/providers/gemini.d.ts +116 -0
- package/dist/shared/providers/gemini.d.ts.map +1 -0
- package/dist/shared/providers/huggingface.d.ts +33 -0
- package/dist/shared/providers/huggingface.d.ts.map +1 -0
- package/dist/shared/providers/litellm.d.ts +25 -0
- package/dist/shared/providers/litellm.d.ts.map +1 -0
- package/dist/shared/providers/ollama.d.ts +47 -0
- package/dist/shared/providers/ollama.d.ts.map +1 -0
- package/dist/shared/providers/openai.d.ts +272 -0
- package/dist/shared/providers/openai.d.ts.map +1 -0
- package/dist/shared/providers/qwen-tts.d.ts +85 -0
- package/dist/shared/providers/qwen-tts.d.ts.map +1 -0
- package/dist/shared/providers/usage.d.ts +14 -0
- package/dist/shared/providers/usage.d.ts.map +1 -0
- package/dist/shared/rate-limit.d.ts +13 -0
- package/dist/shared/rate-limit.d.ts.map +1 -0
- package/dist/shared/thread.d.ts +104 -0
- package/dist/shared/thread.d.ts.map +1 -0
- package/dist/shared/types.d.ts +1779 -0
- package/dist/shared/types.d.ts.map +1 -0
- package/metadata.json +35 -0
- package/package.json +62 -0
package/AGENT.md
ADDED
@@ -0,0 +1,33 @@
# @happyvertical/ai

<!-- BEGIN AGENT:GENERATED -->
## Purpose
Standardized AI interface supporting OpenAI, LiteLLM, Bifrost, Ollama, Anthropic, Gemini, Bedrock, Hugging Face, Claude CLI, and Qwen3-TTS

## Package Map
- Package: `@happyvertical/ai`
- Hierarchy path: `@happyvertical/sdk > packages > ai`
- Workspace position: `2 of 30` local packages
- Internal dependencies: `@happyvertical/utils`
- Internal dependents: `@happyvertical/sdk-mcp`
- Knowledge graph files: `AGENT.md`, `metadata.json`, `ecosystem-manifest.json`

## Build & Test
```bash
pnpm --filter @happyvertical/ai build
pnpm --filter @happyvertical/ai test
pnpm --filter @happyvertical/ai clean
```

## Agent Correction Loops
- If module resolution or export errors mention a workspace dependency, build the dependency first (`pnpm --filter @happyvertical/utils build`) and then rerun `pnpm --filter @happyvertical/ai build`.
- If tests or exports fail after API, type, or bundle changes, run `pnpm --filter @happyvertical/ai clean` followed by `pnpm --filter @happyvertical/ai build` and `pnpm --filter @happyvertical/ai test`.
- If failures span multiple packages or Turborepo ordering looks wrong, run `pnpm build` and `pnpm typecheck` from the repo root before retrying package-scoped commands.

## Ecosystem Relationships
- Provides: Standardized AI interface supporting OpenAI, LiteLLM, Bifrost, Ollama, Anthropic, Gemini, Bedrock, Hugging Face, Claude CLI, and Qwen3-TTS
- Implements: none
- Requires: @happyvertical/utils, @anthropic-ai/sdk, @aws-sdk/client-bedrock-runtime, @google/genai, openai
- Stability: stable (Primary package surface is described as implemented and production-oriented.)
<!-- END AGENT:GENERATED -->

package/LICENSE
ADDED
@@ -0,0 +1,7 @@
Copyright <2025> <Happy Vertical Corporation>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,384 @@
# @happyvertical/ai

Unified interface for AI model interactions across multiple providers. Supports OpenAI, LiteLLM, Bifrost, Ollama, Anthropic Claude, Google Gemini, AWS Bedrock, Hugging Face, Claude CLI, and Qwen3-TTS with a consistent API for chat, completions, embeddings, streaming, function calling, image operations, text-to-speech, and gateway admin provisioning where available.

## Installation

```bash
pnpm add @happyvertical/ai
```

Requires `@happyvertical/utils` as a peer dependency.

## Quick Start

```typescript
import { getAI } from '@happyvertical/ai';

const ai = await getAI({
  type: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  defaultModel: 'gpt-4o'
});

// Chat completion
const response = await ai.chat([
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'What is TypeScript?' }
]);
console.log(response.content);

// Simple message (convenience wrapper around chat)
const reply = await ai.message('Explain generics in one sentence');

// Streaming
for await (const chunk of ai.stream([
  { role: 'user', content: 'Write a haiku' }
])) {
  process.stdout.write(chunk);
}
```

## Providers

```typescript
// OpenAI (default when type is omitted)
const openai = await getAI({ apiKey: 'sk-...' });

// LiteLLM (OpenAI-compatible gateway)
const litellm = await getAI({
  type: 'litellm',
  apiKey: process.env.LITELLM_API_KEY!,
  baseUrl: process.env.LITELLM_BASE_URL || 'https://llm.happyvertical.com/v1',
  defaultModel: process.env.LITELLM_MODEL, // Use a model id returned by /v1/models
});

// Bifrost (OpenAI-compatible gateway with governance admin APIs)
const bifrost = await getAI({
  type: 'bifrost',
  apiKey: process.env.BIFROST_API_KEY!,
  adminUser: process.env.BIFROST_ADMIN_USER,
  adminPassword: process.env.BIFROST_ADMIN_PASSWORD,
  adminUrl: process.env.BIFROST_ADMIN_URL,
  baseUrl: process.env.BIFROST_BASE_URL || 'http://localhost:8080',
  defaultModel: process.env.BIFROST_MODEL,
});

// Ollama (local by default)
const ollama = await getAI({
  type: 'ollama',
  baseUrl: process.env.OLLAMA_BASE_URL || process.env.OLLAMA_HOST || 'http://localhost:11434',
  apiKey: process.env.OLLAMA_API_KEY, // Optional, only needed for remote/cloud hosts
  defaultModel: process.env.OLLAMA_MODEL, // Optional; otherwise the first compatible local model is selected
});

// Bare host:port values are also accepted and normalized to http://
const ollamaNode = await getAI({
  type: 'ollama',
  baseUrl: 'warthog:11434',
});

// Anthropic Claude
const claude = await getAI({ type: 'anthropic', apiKey: process.env.ANTHROPIC_API_KEY! });

// Google Gemini
const gemini = await getAI({ type: 'gemini', apiKey: process.env.GEMINI_API_KEY! });

// AWS Bedrock
const bedrock = await getAI({
  type: 'bedrock',
  region: 'us-east-1',
  credentials: { accessKeyId: '...', secretAccessKey: '...' }
});

// Hugging Face
const hf = await getAI({ type: 'huggingface', apiToken: process.env.HF_TOKEN! });

// Claude CLI (uses Claude Max subscription, no API key needed)
const cli = await getAI({ type: 'claude-cli', defaultModel: 'sonnet' });

// Qwen3-TTS (text-to-speech only)
const tts = await getAI({ type: 'qwen3-tts', endpoint: 'http://localhost:8880' });
```
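
The Ollama example above notes that bare `host:port` values are normalized to `http://`. A minimal sketch of what such normalization might look like follows. `normalizeBaseUrl` is a hypothetical helper written for illustration, not part of the package's public API, and the real implementation may handle more cases.

```typescript
// Hypothetical helper: illustrates the documented behavior, not the package's code.
function normalizeBaseUrl(input: string): string {
  const trimmed = input.trim();
  // Values that already carry an http:// or https:// scheme pass through unchanged.
  if (/^https?:\/\//i.test(trimmed)) {
    return trimmed;
  }
  // Bare host:port values get an http:// prefix.
  return `http://${trimmed}`;
}

normalizeBaseUrl('warthog:11434'); // returns 'http://warthog:11434'
```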

## Gateway Admin

Gateway providers that support provisioning expose `ai.admin`.

```typescript
const ai = await getAI({
  type: 'bifrost',
  apiKey: process.env.BIFROST_API_KEY!,
  adminUrl: process.env.BIFROST_ADMIN_URL || 'http://localhost:8080',
  adminUser: process.env.BIFROST_ADMIN_USER!,
  adminPassword: process.env.BIFROST_ADMIN_PASSWORD!,
  baseUrl: 'http://localhost:8080',
});

const project = await ai.admin!.createProject({
  name: 'Tenant A Production',
  tenantId: 'customer-tenant-a',
  budget: { maxLimit: 100, resetDuration: '1M' },
});

const key = await ai.admin!.createVirtualKey({
  name: 'Tenant A API Key',
  projectId: project.id,
  providerConfigs: [
    {
      provider: 'openai',
      weight: 1,
      allowedModels: ['gpt-4o-mini'],
    },
  ],
  keyIds: ['*'],
  budget: { maxLimit: 25, resetDuration: '1M' },
  rateLimit: {
    tokenMaxLimit: 10000,
    tokenResetDuration: '1h',
    requestMaxLimit: 100,
    requestResetDuration: '1m',
  },
});

console.log(key.key);
```

LiteLLM uses the same SDK surface, mapping projects to LiteLLM teams and virtual keys to `/key/generate`.

## Opt-In Rate-Limit Pacing

Use `rateLimit` when multiple calls share the same provider budget and you want
`getAI()` to serialize requests, honor `Retry-After` hints, and retry only
rate-limit failures.

Pacing is enabled when:
- you set `enabled: true`, or
- you omit `enabled` and set any pacing field such as `key`, `cooldownMs`, `initialDelayMs`, or `maxAttempts`

```typescript
const ai = await getAI({
  type: 'gemini',
  apiKey: process.env.GEMINI_API_KEY!,
  defaultModel: 'gemini-2.5-flash',
  rateLimit: {
    enabled: true,
    key: 'gemini:shared-batch-key',
    cooldownMs: 2000,
    initialDelayMs: 15000,
    maxAttempts: 3,
  },
});
```

- `key` coordinates pacing across multiple clients in the same process
- `cooldownMs` spaces successful calls that share the same budget
- `initialDelayMs` is the fallback retry delay when the provider omits `Retry-After`
- `maxAttempts` counts the first call plus any rate-limit retries

When `rateLimit` is omitted, or `enabled: false` is set explicitly, `getAI()`
behaves exactly as it did before.
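
The retry semantics described in this section (`maxAttempts` counts the first call, and a `Retry-After` hint takes priority over `initialDelayMs`) can be sketched as a pure function. `RetryPolicy` and `nextRetryDelayMs` are hypothetical names used for illustration only. The sketch assumes the hint is expressed in seconds, and the package's actual scheduler may differ.

```typescript
interface RetryPolicy {
  initialDelayMs: number; // fallback delay when no Retry-After hint exists
  maxAttempts: number; // total attempts, including the first call
}

/**
 * Returns how long to wait before the next attempt, or null when the
 * attempt budget is exhausted. `attemptsMade` counts calls already made.
 */
function nextRetryDelayMs(
  attemptsMade: number,
  policy: RetryPolicy,
  retryAfterSeconds?: number,
): number | null {
  if (attemptsMade >= policy.maxAttempts) {
    return null; // no attempts left: surface the rate-limit error to the caller
  }
  if (retryAfterSeconds !== undefined) {
    return retryAfterSeconds * 1000; // provider hint wins over the fallback
  }
  return policy.initialDelayMs;
}
```

With `maxAttempts: 3`, this sketch allows at most two retries after the initial call.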

### `rateLimit` Options

| Field | Type | Default | Notes |
|------|------|---------|------|
| `enabled` | `boolean` | unset | Set to `true` for explicit opt-in, or `false` to force pacing off even if other pacing fields are present |
| `key` | `string` | derived | Shared budget key; clients with the same key coordinate with each other |
| `cooldownMs` | `number` | `0` | Minimum delay after a successful call before the next call with the same key |
| `initialDelayMs` | `number` | `5000` | Fallback retry delay when the provider does not return `Retry-After` |
| `maxAttempts` | `number` | `3` | Total attempts, including the initial call |
| `requestsPerMinute` | `number` | provider-specific | Used by `qwen3-tts` local token-bucket limiting |
| `maxConcurrent` | `number` | provider-specific | Used by `qwen3-tts` local concurrency limiting |

- If `key` is omitted, `@happyvertical/ai` derives a provider-scoped key from the configured credentials
- Setting any of `key`, `cooldownMs`, `initialDelayMs`, or `maxAttempts` also opts in when `enabled` is omitted
- Only normalized rate-limit failures are retried
- `stream()` is left unchanged; pacing is applied to the promise-returning request methods
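
The opt-in rules above can be summarized in a small predicate. This is a sketch under stated assumptions: `RateLimitOptionsLike` and `isPacingEnabled` are hypothetical names, and the sketch assumes that `requestsPerMinute` and `maxConcurrent` on their own do not opt in, because the list above names only the other four fields.

```typescript
// Local stand-in for the documented option fields; not the package's exported type.
interface RateLimitOptionsLike {
  enabled?: boolean;
  key?: string;
  cooldownMs?: number;
  initialDelayMs?: number;
  maxAttempts?: number;
  requestsPerMinute?: number;
  maxConcurrent?: number;
}

function isPacingEnabled(rateLimit?: RateLimitOptionsLike): boolean {
  if (!rateLimit) return false; // option omitted: pacing stays off
  // An explicit `enabled` value always wins, including `false`.
  if (rateLimit.enabled !== undefined) return rateLimit.enabled;
  // Otherwise any of the four named pacing fields opts in implicitly.
  return (
    rateLimit.key !== undefined ||
    rateLimit.cooldownMs !== undefined ||
    rateLimit.initialDelayMs !== undefined ||
    rateLimit.maxAttempts !== undefined
  );
}
```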

Example quota-sensitive batch workload:

```typescript
const ai = await getAI({
  type: 'gemini',
  apiKey: process.env.GEMINI_API_KEY!,
  defaultModel: 'gemini-2.5-flash',
  rateLimit: {
    enabled: true,
    key: 'praeco:multi-site-analysis',
    cooldownMs: 2000,
    initialDelayMs: 15000,
    maxAttempts: 3,
  },
});

for (const site of sites) {
  const summary = await ai.message(`Summarize anomalies for ${site.name}`);
  console.log(site.name, summary);
}
```
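
To make the effect of `key` and `cooldownMs` in a loop like the one above more concrete, here is a rough sketch of a per-key cooldown registry. Everything in it is hypothetical (`nextAllowedAt`, `reserveSlot`, the injected `now` clock). It also simplifies by spacing calls from their start time, whereas the README describes the cooldown as applying after a successful call.

```typescript
// Hypothetical in-process registry: budget key -> earliest time (ms) the next call may start.
const nextAllowedAt = new Map<string, number>();

/**
 * Reserves the next slot for `key` and returns how long the caller should
 * wait (in ms) before starting its request. `now` is injected so the
 * behavior can be checked without real timers.
 */
function reserveSlot(key: string, cooldownMs: number, now: number): number {
  const allowedAt = nextAllowedAt.get(key) ?? now;
  const startAt = Math.max(allowedAt, now);
  // The following caller with the same key must wait until startAt + cooldownMs.
  nextAllowedAt.set(key, startAt + cooldownMs);
  return startAt - now;
}
```

Clients that pass the same key share one entry in the map and therefore pace each other, while clients with different keys proceed independently.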

## Environment Variables

`getAI()` reads `HAVE_AI_*` variables. Explicit options passed to `getAI()` take precedence over those env vars.

| Variable | Purpose |
|----------|---------|
| `HAVE_AI_PROVIDER` / `HAVE_AI_TYPE` | Provider type |
| `HAVE_AI_MODEL` / `HAVE_AI_DEFAULT_MODEL` | Default model |
| `HAVE_AI_API_KEY` | API key (fallback) |
| `HAVE_AI_BASE_URL` | Custom base URL |
| `HAVE_AI_TIMEOUT` | Request timeout (ms) |
| `HAVE_AI_MAX_RETRIES` | Max retry attempts |
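
The precedence rule (explicit options over `HAVE_AI_*` variables) might be sketched as below. This is an illustration only: `resolveOptions` is a hypothetical function, the `timeout` and `maxRetries` option names are assumptions, and the order in which paired variables such as `HAVE_AI_PROVIDER` and `HAVE_AI_TYPE` are consulted is a guess. The environment is passed in as a plain record so the sketch does not depend on Node.js globals.

```typescript
type Env = Record<string, string | undefined>;

// Local stand-in for the options that the table above maps to env vars.
interface ResolvedOptions {
  type?: string;
  defaultModel?: string;
  apiKey?: string;
  baseUrl?: string;
  timeout?: number; // assumed option name
  maxRetries?: number; // assumed option name
}

// Parses a numeric env value, ignoring empty or non-numeric strings.
function toNumber(value: string | undefined): number | undefined {
  if (value === undefined || value.trim() === '') return undefined;
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : undefined;
}

function resolveOptions(explicit: ResolvedOptions, env: Env): ResolvedOptions {
  // `??` falls through to the env var only when the explicit option is unset.
  return {
    type: explicit.type ?? env.HAVE_AI_PROVIDER ?? env.HAVE_AI_TYPE,
    defaultModel: explicit.defaultModel ?? env.HAVE_AI_MODEL ?? env.HAVE_AI_DEFAULT_MODEL,
    apiKey: explicit.apiKey ?? env.HAVE_AI_API_KEY,
    baseUrl: explicit.baseUrl ?? env.HAVE_AI_BASE_URL,
    timeout: explicit.timeout ?? toNumber(env.HAVE_AI_TIMEOUT),
    maxRetries: explicit.maxRetries ?? toNumber(env.HAVE_AI_MAX_RETRIES),
  };
}
```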

### Node Auto-Detection Env Vars

`getAIAuto()` also checks provider-specific Node.js environment variables:

- `LITELLM_BASE_URL`, `LITELLM_API_KEY`
- `OLLAMA_HOST`, `OLLAMA_BASE_URL`, `OLLAMA_API_KEY`
- `OPENAI_API_KEY`
- `ANTHROPIC_API_KEY`
- `GEMINI_API_KEY`, `GOOGLE_API_KEY`
- `HF_TOKEN`
- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_DEFAULT_REGION`

## API Overview

### Factory Functions

- `getAI(options)`: Creates a provider instance by type
- `getAIAuto(options)`: Auto-detects provider from credentials

### AIInterface Methods

All providers implement `AIInterface`:

| Method | Description |
|--------|-------------|
| `chat(messages, options?)` | Chat completion returning `AIResponse` |
| `message(text, options?)` | Simple single-turn convenience method |
| `complete(prompt, options?)` | Text completion |
| `stream(messages, options?)` | Streaming chat (async iterable) |
| `embed(text, options?)` | Text embeddings |
| `embedImage(image, options?)` | Image embeddings (Gemini and Bedrock native, OpenAI and Ollama via describe-then-embed) |
| `describeImage(image, prompt?, options?)` | Image description via vision models |
| `generateImage(prompt, options?)` | Image generation (DALL-E, Imagen, Titan Image Generator, Ollama-compatible image models) |
| `countTokens(text)` | Token count estimation |
| `getModels()` | List available models |
| `getCapabilities()` | Query provider capabilities |
| `synthesizeSpeech(text, options?)` | Text-to-speech synthesis |
| `streamSpeech(text, options?)` | Streaming TTS |
| `cloneVoice(options)` | Clone a voice from audio sample |
| `designVoice(options)` | Design a voice via text description |
| `getVoices(options?)` | List available voices |

### Error Types

All extend `AIError`: `AuthenticationError`, `RateLimitError`, `ModelNotFoundError`, `ContextLengthError`, `ContentFilterError`.

- `AIError.retryable` distinguishes retryable failures from terminal ones
- `RateLimitError.retryAfter` exposes provider retry hints in seconds when available

```typescript
import { RateLimitError } from '@happyvertical/ai';

try {
  await ai.chat(messages);
} catch (error) {
  // Only rate-limit failures flagged as retryable carry a useful retry hint
  if (error instanceof RateLimitError && error.retryable) {
    console.log('retry after seconds:', error.retryAfter);
  }
}
```

### Legacy Classes

`AIClient`, `OpenAIClient`, `AIThread`, and `AIMessageClass` are exported for backward compatibility. New code should use `getAI()` and the `AIInterface` methods.

## Function Calling

```typescript
const response = await ai.chat([
  { role: 'user', content: 'What is the weather in Tokyo?' }
], {
  tools: [{
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get weather for a location',
      parameters: {
        type: 'object',
        properties: { location: { type: 'string' } },
        required: ['location']
      }
    }
  }]
});

if (response.toolCalls) {
  console.log(response.toolCalls[0].function.name);
}
```
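
Once a tool call comes back, application code typically has to route it to a local function. The sketch below shows one possible way to do that. `ToolCallLike`, `ToolHandler`, `dispatchToolCall`, and `exampleHandlers` are hypothetical names, and the `arguments` field being a JSON string follows the OpenAI convention, which is an assumption here because the example above only shows `function.name`.

```typescript
// Local stand-in for the tool-call shape; `arguments` as a JSON string is an assumption.
interface ToolCallLike {
  function: { name: string; arguments: string };
}

type ToolHandler = (args: Record<string, unknown>) => string;

function dispatchToolCall(
  call: ToolCallLike,
  handlers: Record<string, ToolHandler>,
): string {
  const handler = handlers[call.function.name];
  if (!handler) {
    throw new Error(`No handler registered for tool "${call.function.name}"`);
  }
  // Decode the model-supplied arguments before invoking the local function.
  const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
  return handler(args);
}

// Hypothetical handler that returns a canned result instead of a real weather lookup.
const exampleHandlers: Record<string, ToolHandler> = {
  get_weather: (args) => `Weather for ${String(args.location)}: sunny`,
};
```

Keeping handlers in a plain name-to-function map means an unknown tool name fails loudly instead of being silently ignored.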

## Usage Tracking

Track token usage, costs, and performance across all providers with the `onUsage` callback:

```typescript
const ai = await getAI({
  type: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  onUsage: (event) => {
    console.log(`[${event.provider}/${event.model}] ${event.operation}: ${event.usage?.totalTokens} tokens in ${event.duration}ms`);
    // Or: save to database, send to analytics, aggregate in-memory, etc.
  },
});
```

The `UsageEvent` payload:

| Field | Type | Description |
|-------|------|-------------|
| `provider` | `string` | Provider name (`'openai'`, `'anthropic'`, `'gemini'`, etc.) |
| `model` | `string` | Model used (e.g. `'gpt-4o'`, `'claude-3-5-sonnet-20241022'`) |
| `operation` | `string` | `'chat'` \| `'complete'` \| `'message'` \| `'embed'` \| `'stream'` \| ... |
| `usage?` | `TokenUsage` | `{ promptTokens, completionTokens, totalTokens }` (if available) |
| `duration` | `number` | Wall-clock time in milliseconds |
| `timestamp` | `Date` | When the call completed |
| `tags?` | `Record<string, string>` | Merged from global + per-call `usageTags` |

- Works with all providers and methods (`chat`, `complete`, `message`, `embed`, `stream`)
- `complete()` and `message()` report through their underlying `chat()` call
- Errors thrown inside `onUsage` are silently caught and will not affect API results
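
For the "aggregate in-memory" case mentioned above, one possible shape is a small aggregator whose `onUsage` method can be passed as the callback. This is a sketch, not part of the package: `UsageEventLike`, `UsageTotals`, and `createUsageAggregator` are hypothetical names, and the event type only mirrors the fields from the table that the sketch needs.

```typescript
// Local stand-in mirroring a subset of the documented UsageEvent fields.
interface UsageEventLike {
  provider: string;
  model: string;
  operation: string;
  usage?: { promptTokens: number; completionTokens: number; totalTokens: number };
  duration: number;
}

interface UsageTotals {
  calls: number;
  totalTokens: number;
  totalDurationMs: number;
}

function createUsageAggregator() {
  const totals = new Map<string, UsageTotals>();
  return {
    // Pass this method as the `onUsage` option.
    onUsage(event: UsageEventLike): void {
      const key = `${event.provider}/${event.model}`;
      const current = totals.get(key) ?? { calls: 0, totalTokens: 0, totalDurationMs: 0 };
      totals.set(key, {
        calls: current.calls + 1,
        // `usage` is optional, so events without it count as zero tokens.
        totalTokens: current.totalTokens + (event.usage?.totalTokens ?? 0),
        totalDurationMs: current.totalDurationMs + event.duration,
      });
    },
    totalsFor(provider: string, model: string): UsageTotals | undefined {
      return totals.get(`${provider}/${model}`);
    },
  };
}
```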

### Tagging Usage Events

Attach custom tags to correlate usage with features, users, or workflows:

```typescript
// Global tags applied to every call
const ai = await getAI({
  type: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  usageTags: { app: 'indagator', team: 'news' },
  onUsage: (event) => {
    // For the tagged call below:
    // { app: 'indagator', team: 'news', feature: 'summarize', userId: 'u_123' }
    console.log(event.tags);
  },
});

// Per-call tags merge over global tags
await ai.chat(messages, {
  usageTags: { feature: 'summarize', userId: 'u_123' },
});
```
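
The phrase "per-call tags merge over global tags" can be read as a shallow object merge in which per-call values win on key conflicts. A minimal sketch of that reading follows. `mergeUsageTags` is a hypothetical helper, and the package's actual merge logic is not shown in this README.

```typescript
type UsageTags = Record<string, string>;

// Shallow merge: later spreads override earlier ones, so per-call tags win.
function mergeUsageTags(globalTags?: UsageTags, callTags?: UsageTags): UsageTags {
  return { ...globalTags, ...callTags };
}

mergeUsageTags({ app: 'indagator', team: 'news' }, { team: 'sports' });
// returns { app: 'indagator', team: 'sports' }
```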

## Claude Code Context

Install context files for AI-assisted development:

```bash
npx have-ai-context
```

## License

MIT