@just-every/ensemble 0.1.17 → 0.1.19
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +102 -706
- package/dist/index.d.ts +3 -1
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +2 -1
- package/dist/index.js.map +1 -1
- package/dist/tsconfig.tsbuildinfo +1 -1
- package/dist/utils/create_tool_function.d.ts +19 -0
- package/dist/utils/create_tool_function.d.ts.map +1 -0
- package/dist/utils/create_tool_function.js +118 -0
- package/dist/utils/create_tool_function.js.map +1 -0
- package/package.json +1 -1
package/README.md
CHANGED
````diff
@@ -3,28 +3,7 @@
 [](https://www.npmjs.com/package/@just-every/ensemble)
 [](https://github.com/just-every/ensemble/actions)
 
-A unified interface for interacting with multiple LLM providers
-
-## Why Use an Ensemble Approach?
-
-The ensemble pattern - rotating between multiple LLM providers dynamically - offers compelling advantages over relying on a single model. Research has shown that sampling multiple reasoning chains and using consensus answers can improve performance by double-digit margins on complex tasks. By automating this at runtime rather than prompt-engineering time, ensemble delivers more reliable and robust AI interactions.
-
-Beyond accuracy improvements, ensemble requests provide practical benefits for production systems. Different models carry unique training biases and stylistic patterns - rotating between them dilutes individual quirks and prevents conversations from getting "stuck" in one voice. The approach also ensures resilience: when one provider experiences an outage, quota limit, or latency spike, requests seamlessly route to alternatives. You can optimize costs by routing simple tasks to cheaper models while reserving premium models for complex reasoning. Need regex help? Route to a code-specialized model. Need emotional calibration? Use a dialogue expert. The ensemble gives you this granularity without complex conditional logic.
-
-Perhaps most importantly, the ensemble approach future-proofs your application. Model quality and pricing change weekly in the fast-moving LLM landscape. With ensemble, you can trial newcomers on a small percentage of traffic, compare real metrics, then scale up or roll back within minutes - all without changing your code.
-
-## Features
-
-- **Multi-provider support**: Claude, OpenAI, Gemini, Deepseek, Grok, OpenRouter
-- **AsyncGenerator API**: Clean, native async iteration for streaming responses
-- **Simple interface**: Direct async generator pattern matches native LLM APIs
-- **Tool calling**: Function calling support where available
-- **Stream conversion**: Convert streaming events to conversation history for chaining
-- **Image processing**: Image-to-text and image utilities
-- **Cost tracking**: Token usage and cost monitoring
-- **Quota management**: Rate limiting and usage tracking
-- **Pluggable logging**: Configurable request/response logging
-- **Type safety**: Full TypeScript support
+A unified interface for interacting with multiple LLM providers (OpenAI, Anthropic, Google, etc.) with streaming support, tool calling, and embeddings.
 
 ## Installation
 
@@ -32,753 +11,144 @@ Perhaps most importantly, the ensemble approach future-proofs your application.
 npm install @just-every/ensemble
 ```
 
-### Migration from OpenAI SDK
-
-If you're currently using the OpenAI SDK, migration is simple:
-
-```typescript
-// Before:
-import OpenAI from 'openai';
-const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
-
-// After:
-import OpenAIEnsemble from '@just-every/ensemble/openai-compat';
-const client = OpenAIEnsemble;
-
-// Your existing code works unchanged!
-const completion = await client.chat.completions.create({ /* ... */ });
-```
-
 ## Quick Start
 
 ```typescript
 import { request } from '@just-every/ensemble';
 
-// Simple request
-const stream = request('gpt-4o', [
-  { type: 'message', role: 'user', content: 'Hello' }
-]);
-
-
-for await (const event of stream) {
-  if (event.type === 'message_delta') {
-    console.log(event.content);
-  } else if (event.type === 'message_complete') {
-    console.log('Request completed!');
-  } else if (event.type === 'error') {
-    console.error('Request failed:', event.error);
-  }
-}
-
-// With tools
-const toolStream = request('gpt-4o', [
-  { type: 'message', role: 'user', content: 'What is the weather?' }
-], {
-  tools: [{
-    function: async (location: string) => {
-      // Tool implementation
-      return `Weather in ${location}: Sunny, 72°F`;
-    },
-    definition: {
-      type: 'function',
-      function: {
-        name: 'get_weather',
-        description: 'Get current weather',
-        parameters: {
-          type: 'object',
-          properties: {
-            location: { type: 'string' }
-          },
-          required: ['location']
-        }
-      }
-    }
-  }]
-});
-
-// Process tool calls
-for await (const event of toolStream) {
-  if (event.type === 'tool_start') {
-    console.log('Tool called:', event.tool_calls[0].function.name);
-  } else if (event.type === 'message_delta') {
-    console.log(event.content);
-  }
-}
-
-// Early termination
-const earlyStream = request('claude-3-5-sonnet-20241022', [
-  { type: 'message', role: 'user', content: 'Count to 100' }
-]);
-
-let count = 0;
-for await (const event of earlyStream) {
-  if (event.type === 'message_delta') {
-    count++;
-    if (count >= 10) break; // Stop after 10 events
+// Simple streaming request
+for await (const event of request('gpt-4o-mini', [
+  { type: 'message', role: 'user', content: 'Hello!' }
+])) {
+  if (event.type === 'text_delta') {
+    process.stdout.write(event.delta);
   }
 }
 ```
 
````
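The rewritten Quick Start prints each `text_delta` as it arrives and never keeps the whole reply. If you need the full response as a single string, you can accumulate the deltas yourself. A minimal sketch against the event shape shown in the diff above; the `complete` helper is illustrative, not a library export:

```typescript
import { request } from '@just-every/ensemble';

// Collect a streamed reply into one string.
// Assumes the `text_delta` event shape shown in the README diff above.
async function complete(model: string, prompt: string): Promise<string> {
  let text = '';
  for await (const event of request(model, [
    { type: 'message', role: 'user', content: prompt }
  ])) {
    if (event.type === 'text_delta') {
      text += event.delta; // each delta is one small chunk of the reply
    }
  }
  return text;
}

console.log(await complete('gpt-4o-mini', 'Hello!'));
```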
````diff
-## API Reference
-
-### Core Functions
-
-#### `request(model, messages, options?)`
-
-Main function for making LLM requests with streaming responses and automatic tool execution.
-
-**Parameters:**
-- `model` (string): Model identifier (e.g., 'gpt-4o', 'claude-3.5-sonnet', 'gemini-2.0-flash')
-- `messages` (ResponseInput): Array of message objects in the conversation
-- `options` (RequestOptions): Optional configuration object
-
-**Returns:** `AsyncGenerator<EnsembleStreamEvent>` - An async generator that yields streaming events
-
-```typescript
-interface RequestOptions {
-  agentId?: string; // Identifier for logging/tracking
-  tools?: ToolFunction[]; // Array of tool definitions
-  toolChoice?: ToolChoice; // Control tool selection behavior
-  maxToolCalls?: number; // Max rounds of tool execution (default: 10, 0 = disabled)
-  processToolCall?: (toolCalls: ToolCall[]) => Promise<any>; // Custom tool handler
-  modelSettings?: ModelSettings; // Temperature, maxTokens, etc.
-  modelClass?: ModelClassID; // 'standard' | 'code' | 'reasoning' | 'monologue'
-  responseFormat?: ResponseFormat; // JSON mode or structured output
-  maxImageDimension?: number; // Auto-resize images (default: provider-specific)
-  fallbackModels?: string[]; // Models to try if primary fails
-}
-
-// Stream event types
-type EnsembleStreamEvent =
-  | { type: 'text_delta', delta: string }
-  | { type: 'text', text: string }
-  | { type: 'message_delta', content: string }
-  | { type: 'message_complete', content: string }
-  | { type: 'tool_start', tool_calls: ToolCall[] }
-  | { type: 'cost_update', usage: TokenUsage }
-  | { type: 'stream_end', timestamp: string }
-  | { type: 'error', error: Error };
-```
-
-#### `embed(text, options?)`
-
-Generate an embedding vector for the given text using any supported embedding model.
-
-**Parameters:**
-- `text` (string): Text to embed
-- `options` (object): Optional configuration
-  - `model` (string): Specific model to use (e.g., 'text-embedding-3-small')
-  - `modelClass` (ModelClassID): Model class to use (default: 'embedding')
-  - `agentId` (string): Agent identifier for tracking
-  - `opts` (EmbedOpts): Provider-specific embedding options
-
-**Returns:** `Promise<number[]>` - The embedding vector
-
-```typescript
-// Simple embedding
-const embedding = await embed('Hello, world!');
-console.log(`Dimension: ${embedding.length}`);
-
-// With specific model
-const embedding = await embed('Search query', {
-  model: 'text-embedding-3-large'
-});
-
-// With provider options (e.g., for Gemini)
-const embedding = await embed('Document text', {
-  opts: { taskType: 'RETRIEVAL_DOCUMENT' }
-});
-```
-
-
-### Working with Models
-
-#### Model Selection
-
-```typescript
-import { getModelFromClass, findModel, MODEL_REGISTRY } from '@just-every/ensemble';
-
-// Get best model for a specific task type
-const codeModel = getModelFromClass('code'); // Returns best available code model
-const reasoningModel = getModelFromClass('reasoning'); // For complex reasoning tasks
-
-// Check if a model exists
-const modelInfo = findModel('gpt-4o');
-if (modelInfo) {
-  console.log(`Provider: ${modelInfo.provider}`);
-  console.log(`Input cost: $${modelInfo.inputCost}/million tokens`);
-}
-
-// List all available models
-for (const [modelName, info] of Object.entries(MODEL_REGISTRY)) {
-  console.log(`${modelName}: ${info.provider}`);
-}
-```
-
-#### Model Classes
+## Core Functions
 
-
-- **code**: Optimized for programming and technical tasks
-- **reasoning**: Advanced models for complex logical reasoning
-- **monologue**: Models supporting extended thinking/reasoning traces
+### `request(model, messages, options?)`
 
-
+Make streaming LLM requests with automatic tool execution.
 
 ```typescript
-//
-
-  type: 'message';
-
-  content: string | MessageContent[];
-  status?: 'completed' | 'in_progress';
-}
-
-// Multi-modal content
-type MessageContent =
-  | { type: 'input_text', text: string }
-  | { type: 'input_image', image_url: string, detail?: 'auto' | 'low' | 'high' }
-  | { type: 'tool_use', id: string, name: string, arguments: any };
-
-// Tool-related messages
-interface FunctionCall {
-  type: 'function_call';
-  id: string;
-  name: string;
-  arguments: string;
-}
-
-interface FunctionCallOutput {
-  type: 'function_call_output';
-  id: string;
-  output: string;
-}
-```
-
-## Common Use Cases
-
-### 1. Basic Conversations
-
-```typescript
-import { request } from '@just-every/ensemble';
+// Basic usage
+const stream = request('claude-3.5-sonnet', [
+  { type: 'message', role: 'user', content: 'Explain quantum computing' }
+]);
 
-
-for await (const event of request('gpt-4o-mini', [
-  { type: 'message', role: 'user', content: 'Explain quantum computing in simple terms' }
-])) {
+for await (const event of stream) {
   if (event.type === 'text_delta') {
     process.stdout.write(event.delta);
+  } else if (event.type === 'cost_update') {
+    console.log(`Cost: $${event.usage.total_cost}`);
   }
 }
 
-//
-const messages = [
-
-  { type: 'message', role: 'user', content: 'How do I center a div in CSS?' },
-  { type: 'message', role: 'assistant', content: 'Here are several ways...' },
-  { type: 'message', role: 'user', content: 'What about using flexbox?' }
-];
-
-for await (const event of request('claude-3.5-sonnet', messages)) {
-  // Handle streaming response
-}
-```
-
-### 2. Tool Calling & Function Execution
-
-```typescript
-// Define tools with TypeScript types
-interface WeatherParams {
-  city: string;
-  unit?: 'celsius' | 'fahrenheit';
-}
-
-const weatherTool: ToolFunction = {
-  function: async ({ city, unit = 'celsius' }: WeatherParams) => {
-    // Real implementation would call weather API
-    const temp = unit === 'celsius' ? 22 : 72;
-    return `${temp}°${unit[0].toUpperCase()} in ${city}`;
-  },
+// With tools
+const tools = [{
+  function: async ({ city }) => `Weather in ${city}: Sunny, 72°F`,
   definition: {
     type: 'function',
     function: {
       name: 'get_weather',
-      description: 'Get current weather',
+      description: 'Get weather for a city',
       parameters: {
         type: 'object',
         properties: {
-          city: { type: 'string' },
-          unit: {
-            type: 'string',
-            enum: ['celsius', 'fahrenheit'],
-            description: 'Temperature unit'
-          }
+          city: { type: 'string' }
         },
         required: ['city']
       }
     }
   }
-};
-
-// Use with automatic execution
-for await (const event of request('gpt-4o', [
-  { type: 'message', role: 'user', content: 'What\'s the weather in Tokyo and New York?' }
-], { tools: [weatherTool] })) {
-  if (event.type === 'tool_start') {
-    console.log('Calling tool:', event.tool_calls[0].function.name);
-  } else if (event.type === 'text_delta') {
-    process.stdout.write(event.delta);
-  }
-}
-```
-
-### 3. Model Selection Strategies
-
-```typescript
-import { getModelFromClass, request } from '@just-every/ensemble';
-
-// Route based on task type
-async function intelligentRequest(task: string, messages: ResponseInput) {
-  let model: string;
-
-  if (task.includes('code') || task.includes('debug')) {
-    model = getModelFromClass('code'); // Best code model
-  } else if (task.includes('analyze') || task.includes('reasoning')) {
-    model = getModelFromClass('reasoning'); // Best reasoning model
-  } else {
-    model = getModelFromClass('standard'); // Cost-effective general model
-  }
-
-  console.log(`Using ${model} for ${task}`);
-
-  return request(model, messages, {
-    fallbackModels: ['gpt-4o-mini', 'claude-3-5-haiku'] // Fallback options
-  });
-}
-
-// Use model rotation for consensus
-async function consensusRequest(messages: ResponseInput) {
-  const models = ['gpt-4o', 'claude-3.5-sonnet', 'gemini-2.0-flash'];
-  const responses = [];
-
-  for (const model of models) {
-    const stream = request(model, messages);
-    const result = await convertStreamToMessages(stream);
-    responses.push(result.fullResponse);
-  }
-
-  // Analyze responses for consensus
-  return analyzeConsensus(responses);
-}
-```
-
-### 4. Structured Output & JSON Mode
-
-```typescript
-// JSON mode for reliable parsing
-const jsonStream = request('gpt-4o', [
-  { type: 'message', role: 'user', content: 'List 3 programming languages with their pros/cons as JSON' }
-], {
-  responseFormat: { type: 'json_object' }
-});
-
-let jsonContent = '';
-for await (const event of jsonStream) {
-  if (event.type === 'text_delta') {
-    jsonContent += event.delta;
-  }
-}
-
-const data = JSON.parse(jsonContent);
-
-// Structured output with schema validation
-const schema = {
-  type: 'object',
-  properties: {
-    name: { type: 'string' },
-    age: { type: 'number' },
-    skills: {
-      type: 'array',
-      items: { type: 'string' }
-    }
-  },
-  required: ['name', 'age', 'skills']
-};
-
-const structuredStream = request('gpt-4o', [
-  { type: 'message', role: 'user', content: 'Generate a developer profile' }
-], {
-  responseFormat: {
-    type: 'json_schema',
-    json_schema: {
-      name: 'developer_profile',
-      schema: schema,
-      strict: true
-    }
-  }
-});
-```
-
-### 5. Image Processing
+}];
 
-```typescript
-
-
-{
-  type: 'message',
-  role: 'user',
-  content: [
-    { type: 'input_text', text: 'What\'s in this image? Describe any text you see.' },
-    {
-      type: 'input_image',
-      image_url: 'data:image/jpeg;base64,...',
-      detail: 'high' // 'auto' | 'low' | 'high'
-    }
-  ]
-}
-], {
-  maxImageDimension: 2048 // Auto-resize large images
-});
-
-// Multiple images
-const comparison = request('claude-3.5-sonnet', [
-  {
-    type: 'message',
-    role: 'user',
-    content: [
-      { type: 'input_text', text: 'Compare these two designs:' },
-      { type: 'input_image', image_url: 'https://example.com/design1.png' },
-      { type: 'input_image', image_url: 'https://example.com/design2.png' }
-    ]
-  }
-]);
+const stream = request('gpt-4o', [
+  { type: 'message', role: 'user', content: 'What\'s the weather in Paris?' }
], { tools });
 ```
 
````
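The new README keeps the `{ function, definition }` tool shape but drops the typed-parameters variant that the old README demonstrated. For reference, the same weather tool with typed arguments; `WeatherParams` and the mock implementation are illustrative, while the `tools` option and the `text_delta` event come from the diff:

```typescript
import { request } from '@just-every/ensemble';

// Illustrative parameter type for the tool implementation.
interface WeatherParams {
  city: string;
  unit?: 'celsius' | 'fahrenheit';
}

const weatherTool = {
  // Executed automatically when the model calls get_weather.
  function: async ({ city, unit = 'celsius' }: WeatherParams) => {
    const temp = unit === 'celsius' ? 22 : 72; // mock value; call a real API here
    return `${temp}°${unit === 'celsius' ? 'C' : 'F'} in ${city}`;
  },
  definition: {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get current weather for a city',
      parameters: {
        type: 'object',
        properties: {
          city: { type: 'string' },
          unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
        },
        required: ['city']
      }
    }
  }
};

for await (const event of request('gpt-4o', [
  { type: 'message', role: 'user', content: 'What is the weather in Tokyo?' }
], { tools: [weatherTool] })) {
  if (event.type === 'text_delta') process.stdout.write(event.delta);
}
```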
````diff
-### 6. Error Handling
+### `embed(text, options?)`
 
-```typescript
-import { isRateLimitError, isAuthenticationError } from '@just-every/ensemble';
-
-async function robustRequest(model: string, messages: ResponseInput, options?: RequestOptions) {
-  const maxRetries = 3;
-  let lastError;
-
-  for (let i = 0; i < maxRetries; i++) {
-    try {
-      const events = [];
-      for await (const event of request(model, messages, options)) {
-        if (event.type === 'error') {
-          throw event.error;
-        }
-        events.push(event);
-      }
-      return events;
-
-    } catch (error) {
-      lastError = error;
-
-      if (isAuthenticationError(error)) {
-        throw error; // Don't retry auth errors
-      }
-
-      if (isRateLimitError(error)) {
-        const waitTime = error.retryAfter || Math.pow(2, i) * 1000;
-        console.log(`Rate limited. Waiting ${waitTime}ms...`);
-        await new Promise(resolve => setTimeout(resolve, waitTime));
-        continue;
-      }
-
-      // Try fallback model
-      if (options?.fallbackModels?.[i]) {
-        model = options.fallbackModels[i];
-        console.log(`Falling back to ${model}`);
-        continue;
-      }
-    }
-  }
-
-  throw lastError;
-}
-```
-
-## Utilities
-
-### Cost & Usage Tracking
+Generate embeddings for semantic search and RAG applications.
 
 ```typescript
-
-
-
-for await (const event of request('gpt-4o', messages)) {
-  if (event.type === 'cost_update') {
-    console.log(`Tokens: ${event.usage.input_tokens} in, ${event.usage.output_tokens} out`);
-    console.log(`Cost: $${event.usage.total_cost.toFixed(4)}`);
-  }
-}
+// Simple embedding
+const embedding = await embed('Hello, world!');
+console.log(`Dimension: ${embedding.length}`); // e.g., 1536
 
-//
-const
-
-
-}
+// With specific model
+const embedding = await embed('Search query', {
+  model: 'text-embedding-3-large'
+});
 
-//
-
-
-
-const
-
+// Calculate similarity
+function cosineSimilarity(a: number[], b: number[]): number {
+  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
+  const normA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
+  const normB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
+  return dotProduct / (normA * normB);
 }
-```
-
-### Stream Conversion & Chaining
 
-```typescript
-import { convertStreamToMessages, chainRequests } from '@just-every/ensemble';
-
-let currentMessages = [
-  { type: 'message', role: 'user', content: 'Write a haiku about coding' },
-  { type: 'message', role: 'user', content: 'Make it really long' }
-];
-
-let messages = [
-  { type: 'message', role: 'developer', content: 'You are a helpful coding assistant' },
-  { type: 'message', role: 'user', content: 'How do I center a div in CSS?' },
-];
-messages = [...messages, ...(await convertStreamToMessages(request('claude-4-sonnet', messages))).messages];
-messages = [...messages, ...(await convertStreamToMessages(request(getModelFromClass('reasoning_mini'), messages))).messages];
-messages = [...messages, ...(await convertStreamToMessages(request('gemini-2.5-flash', messages))).messages];
-
-
-console.log(result.messages); // Full conversation history
-console.log(result.fullResponse); // Just the assistant's response
-
-// Chain multiple models for multi-step tasks
-const analysis = await chainRequests(
-  [
-    { type: 'message', role: 'user', content: codeToAnalyze }
-  ],
-  [
-    {
-      model: getModelFromClass('code'),
-      systemPrompt: 'Analyze this code for bugs and security issues',
-    },
-    {
-      model: getModelFromClass('reasoning'),
-      systemPrompt: 'Prioritize the issues found and suggest fixes',
-    },
-    {
-      model: 'gpt-4.1-mini',
-      systemPrompt: 'Summarize the analysis in 3 bullet points',
-    }
-  ]);
+const similarity = cosineSimilarity(embedding1, embedding2);
 ```
 
````
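The new `embed` section defines `cosineSimilarity` but stops short of a search loop. A small ranking sketch under the same `embed(text)` signature; the document list and query are illustrative, and the helper is inlined so the snippet stands alone:

```typescript
import { embed } from '@just-every/ensemble';

// Same cosine helper as in the README section above, inlined here.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const norm = (x: number[]) => Math.sqrt(x.reduce((s, v) => s + v * v, 0));
  return dot / (norm(a) * norm(b));
}

const docs = [
  'TypeScript adds static types to JavaScript',
  'The quick brown fox jumps over the lazy dog',
  'Ensemble exposes a unified streaming API for LLM providers'
];

// Embed every document once, then compare each against the query vector.
const docVectors = await Promise.all(docs.map(d => embed(d)));
const queryVector = await embed('static typing for JS');

const ranked = docs
  .map((doc, i) => ({ doc, score: cosineSimilarity(queryVector, docVectors[i]) }))
  .sort((a, b) => b.score - a.score);

console.log(ranked[0].doc); // best match
```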
````diff
-###
+### `chainRequests(messages, requests)`
 
-```typescript
-import { resizeImageForModel, imageToText } from '@just-every/ensemble';
-
-// Auto-resize for specific model requirements
-const resized = await resizeImageForModel(
-  base64ImageData,
-  'gpt-4o', // Different models have different size limits
-  { maxDimension: 2048 }
-);
-
-// Extract text from images
-const extractedText = await imageToText(imageBuffer);
-console.log('Found text:', extractedText);
-```
-
-### Logging & Debugging
+Chain multiple LLM calls, using the output of one as input to the next.
 
 ```typescript
-import { EnsembleLogger } from '@just-every/ensemble';
-
-// Production-ready logger example
-class ProductionLogger implements EnsembleLogger {
-  log_llm_request(agentId: string, providerName: string, model: string, requestData: unknown, timestamp?: Date): string {
-    const requestId = `req_${Date.now()}_${Math.random().toString(36).substr(2, 9)}`;
-
-    // Log to your monitoring system
-    logger.info('LLM Request', {
-      requestId,
-      agentId,
-      provider: providerName,
-      model,
-      timestamp,
-      // Be careful not to log sensitive data
-      messageCount: (requestData as any).messages?.length,
-      hasTools: !!(requestData as any).tools?.length
-    });
-
-    return requestId;
-  }
-
-  log_llm_response(requestId: string | undefined, responseData: unknown, timestamp?: Date): void {
-    const response = responseData as any;
-
-    logger.info('LLM Response', {
-      requestId,
-      timestamp,
-      inputTokens: response.usage?.input_tokens,
-      outputTokens: response.usage?.output_tokens,
-      totalCost: response.usage?.total_cost,
-      cached: response.usage?.cache_creation_input_tokens > 0
-    });
-  }
-
-  log_llm_error(requestId: string | undefined, errorData: unknown, timestamp?: Date): void {
-    logger.error('LLM Error', {
-      requestId,
-      timestamp,
-      error: errorData,
-      // Include retry information if available
-      retryAfter: (errorData as any).retryAfter
-    });
-  }
-}
+import { chainRequests } from '@just-every/ensemble';
 
-
-
-
-
-
-
-    log_llm_request: (agent, provider, model, data) => {
-      console.log(`[${new Date().toISOString()}] → ${provider}/${model}`);
-      return Date.now().toString();
+const result = await chainRequests(
+  [{ type: 'message', role: 'user', content: 'Analyze this code for bugs: ...' }],
+  [
+    {
+      model: 'gpt-4o',
+      systemPrompt: 'You are a code reviewer. Find bugs and security issues.'
     },
-
-
-
+    {
+      model: 'claude-3.5-sonnet',
+      systemPrompt: 'Prioritize the issues found and suggest fixes.'
     },
-
-
+    {
+      model: 'gpt-4o-mini',
+      systemPrompt: 'Summarize the analysis in 3 bullet points.'
     }
-
-
+  ]
+);
+
+console.log(result.fullResponse);
 ```
 
````
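`chainRequests` replaces the removed Stream Conversion walkthrough, but the same pipeline can still be written by hand with `convertStreamToMessages`, whose `{ messages, fullResponse }` result shape appears in the removed lines above. A sketch, with the per-step system prompts of the chained example omitted for brevity:

```typescript
import { request, convertStreamToMessages } from '@just-every/ensemble';

// Run three models in sequence, feeding each reply into the next request.
let messages: any[] = [
  { type: 'message', role: 'user', content: 'Analyze this code for bugs: ...' }
];

for (const model of ['gpt-4o', 'claude-3.5-sonnet', 'gpt-4o-mini']) {
  const result = await convertStreamToMessages(request(model, messages));
  messages = [...messages, ...result.messages]; // append this step's output
  console.log(result.fullResponse);             // intermediate answer
}
```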
````diff
-##
+## Supported Providers
+
+- **OpenAI**: GPT-4o, GPT-4o-mini, o1-preview, o1-mini
+- **Anthropic**: Claude 3.5 Sonnet, Claude 3.5 Haiku
+- **Google**: Gemini 2.0 Flash, Gemini 1.5 Pro
+- **DeepSeek**: DeepSeek Chat, DeepSeek Coder
+- **xAI**: Grok 2, Grok Beta
+- **OpenRouter**: Access to 100+ models
 
-
+## OpenAI SDK Compatibility
 
-
+Drop-in replacement for the OpenAI SDK:
 
 ```typescript
+// Instead of: import OpenAI from 'openai';
 import OpenAIEnsemble from '@just-every/ensemble/openai-compat';
-// Or named imports: import { chat, completions } from '@just-every/ensemble';
-
-// Replace OpenAI client
-const openai = OpenAIEnsemble; // Instead of: new OpenAI({ apiKey: '...' })
-
-// Use exactly like OpenAI SDK - but with any model!
-const completion = await openai.chat.completions.create({
-  model: 'claude-3.5-sonnet', // or 'gpt-4o', 'gemini-2.0-flash', etc.
-  messages: [
-    { role: 'system', content: 'You are a helpful assistant.' },
-    { role: 'user', content: 'Hello!' }
-  ],
-  temperature: 0.7
-});
 
-
-
-
-const stream = await openai.chat.completions.create({
-  model: 'gpt-4o-mini',
-  messages: [{ role: 'user', content: 'Tell me a story' }],
+const completion = await OpenAIEnsemble.chat.completions.create({
+  model: 'claude-3.5-sonnet', // Use any supported model!
+  messages: [{ role: 'user', content: 'Hello!' }],
   stream: true
 });
-
-for await (const chunk of stream) {
-  process.stdout.write(chunk.choices[0].delta.content || '');
-}
-
-// Legacy completions API also supported
-const legacyCompletion = await openai.completions.create({
-  model: 'deepseek-chat',
-  prompt: 'Once upon a time',
-  max_tokens: 100
-});
-```
-
-This compatibility layer supports:
-- All chat.completions.create parameters (temperature, tools, response_format, etc.)
-- Streaming and non-streaming responses
-- Tool/function calling
-- Legacy completions.create API
-- Proper TypeScript types matching OpenAI's SDK
-
-### Custom Model Providers
-
-```typescript
-import { ModelProvider, registerExternalModel } from '@just-every/ensemble';
-
-// Register a custom model
-registerExternalModel({
-  id: 'my-custom-model',
-  provider: 'custom',
-  inputCost: 0.001,
-  outputCost: 0.002,
-  contextWindow: 8192,
-  maxOutput: 4096,
-  supportsTools: true,
-  supportsVision: false,
-  supportsStreaming: true
-});
-
-// Use your custom model
-const stream = request('my-custom-model', messages);
-```
-
-### Performance Optimization
-
-```typescript
-// Batch processing with concurrency control
-async function batchProcess(items: string[], concurrency = 3) {
-  const results = [];
-  const queue = [...items];
-
-  async function worker() {
-    while (queue.length > 0) {
-      const item = queue.shift()!;
-      const stream = request('gpt-4o-mini', [
-        { type: 'message', role: 'user', content: `Process: ${item}` }
-      ]);
-
-      const result = await convertStreamToMessages(stream);
-      results.push({ item, result: result.fullResponse });
-    }
-  }
-
-  // Run workers concurrently
-  await Promise.all(Array(concurrency).fill(null).map(() => worker()));
-  return results;
-}
-
-// Stream multiple requests in parallel
-async function parallelStreaming(prompts: string[]) {
-  const streams = prompts.map(prompt =>
-    request('claude-3.5-haiku', [
-      { type: 'message', role: 'user', content: prompt }
-    ])
-  );
-
-  // Process all streams concurrently
-  const results = await Promise.all(
-    streams.map(stream => convertStreamToMessages(stream))
-  );
-
-  return results.map(r => r.fullResponse);
-}
 ```
 
````
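The trimmed compatibility example now ends at `stream: true` without showing how to consume the stream. The removed lines above spell it out: chunks follow the OpenAI SDK shape, so the usual loop works unchanged:

```typescript
import OpenAIEnsemble from '@just-every/ensemble/openai-compat';

// Streaming through the compatibility layer; chunks use the familiar
// OpenAI shape (choices[0].delta.content), as in the removed example.
const stream = await OpenAIEnsemble.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0].delta.content || '');
}
```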
````diff
 ## Environment Variables
 
-Set up API keys for the providers you want to use:
-
 ```bash
 ANTHROPIC_API_KEY=your_key_here
 OPENAI_API_KEY=your_key_here
@@ -788,6 +158,32 @@ XAI_API_KEY=your_key_here
 OPENROUTER_API_KEY=your_key_here
 ```
 
+## Key Features
+
+- **Unified streaming API** across all LLM providers
+- **Automatic tool execution** with type-safe function calling
+- **Smart model rotation** based on availability and performance
+- **Built-in embeddings** with caching
+- **OpenAI SDK compatibility** - drop-in replacement
+- **Cost tracking** and quota management
+
+## Documentation
+
+- [Model Selection & Management](./docs/models.md)
+- [Advanced Usage](./docs/advanced-usage.md) - Tools, structured output, images
+- [Error Handling](./docs/error-handling.md)
+- [OpenAI Compatibility](./docs/openai-compatibility.md)
+- [Utility Functions](./docs/utilities.md)
+
+## Examples
+
+See the [examples](./examples) directory for:
+- [Basic usage](./examples/basic-request.ts)
+- [Tool calling](./examples/tool-calling.ts)
+- [Embeddings & semantic search](./examples/embeddings.ts)
+- [Model rotation](./examples/model-rotation.ts)
+- [Stream conversion](./examples/stream-conversion.ts)
+
 ## License
 
-MIT
+MIT
````
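The Environment Variables section asks only for the keys of providers you actually use, so a small startup guard can surface a missing configuration early. A sketch using the variable names visible in this diff; the guard itself is illustrative, not part of the package:

```typescript
// Fail fast when no provider key is configured.
const providerKeys = [
  'OPENAI_API_KEY',
  'ANTHROPIC_API_KEY',
  'XAI_API_KEY',
  'OPENROUTER_API_KEY'
];

if (!providerKeys.some(name => process.env[name])) {
  throw new Error(`Set at least one of: ${providerKeys.join(', ')}`);
}
```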