@just-every/ensemble 0.1.17 → 0.1.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -3,28 +3,7 @@
  [![npm version](https://badge.fury.io/js/@just-every%2Fensemble.svg)](https://www.npmjs.com/package/@just-every/ensemble)
  [![GitHub Actions](https://github.com/just-every/ensemble/workflows/Release/badge.svg)](https://github.com/just-every/ensemble/actions)
 
- A unified interface for interacting with multiple LLM providers including OpenAI, Anthropic Claude, Google Gemini, Deepseek, Grok, and OpenRouter.
-
- ## Why Use an Ensemble Approach?
-
- The ensemble pattern - rotating between multiple LLM providers dynamically - offers compelling advantages over relying on a single model. Research has shown that sampling multiple reasoning chains and using consensus answers can improve performance by double-digit margins on complex tasks. By automating this at runtime rather than prompt-engineering time, ensemble delivers more reliable and robust AI interactions.
-
- Beyond accuracy improvements, ensemble requests provide practical benefits for production systems. Different models carry unique training biases and stylistic patterns - rotating between them dilutes individual quirks and prevents conversations from getting "stuck" in one voice. The approach also ensures resilience: when one provider experiences an outage, quota limit, or latency spike, requests seamlessly route to alternatives. You can optimize costs by routing simple tasks to cheaper models while reserving premium models for complex reasoning. Need regex help? Route to a code-specialized model. Need emotional calibration? Use a dialogue expert. The ensemble gives you this granularity without complex conditional logic.
-
- Perhaps most importantly, the ensemble approach future-proofs your application. Model quality and pricing change weekly in the fast-moving LLM landscape. With ensemble, you can trial newcomers on a small percentage of traffic, compare real metrics, then scale up or roll back within minutes - all without changing your code.
-
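To make the rotation concrete, here is a minimal sketch built on the `request` API documented below; the round-robin helper and the model list are illustrative assumptions, while `fallbackModels` is the library's own `RequestOptions` field for outage resilience:

```typescript
import { request } from '@just-every/ensemble';

// Hypothetical round-robin rotation; the model list is an example, not a default.
const rotation = ['gpt-4o', 'claude-3.5-sonnet', 'gemini-2.0-flash'];
let next = 0;

async function rotatingRequest(prompt: string): Promise<string> {
  const model = rotation[next++ % rotation.length];
  let text = '';
  for await (const event of request(model, [
    { type: 'message', role: 'user', content: prompt }
  ], {
    // If the chosen provider is down or rate limited, try the others.
    fallbackModels: rotation.filter(m => m !== model)
  })) {
    if (event.type === 'text_delta') text += event.delta;
  }
  return text;
}
```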
- ## Features
-
- - **Multi-provider support**: Claude, OpenAI, Gemini, Deepseek, Grok, OpenRouter
- - **AsyncGenerator API**: Clean, native async iteration for streaming responses
- - **Simple interface**: Direct async generator pattern matches native LLM APIs
- - **Tool calling**: Function calling support where available
- - **Stream conversion**: Convert streaming events to conversation history for chaining
- - **Image processing**: Image-to-text and image utilities
- - **Cost tracking**: Token usage and cost monitoring
- - **Quota management**: Rate limiting and usage tracking
- - **Pluggable logging**: Configurable request/response logging
- - **Type safety**: Full TypeScript support
+ A unified interface for interacting with multiple LLM providers (OpenAI, Anthropic, Google, etc.) with streaming support, tool calling, and embeddings.
 
  ## Installation
 
@@ -32,753 +11,144 @@ Perhaps most importantly, the ensemble approach future-proofs your application.
  npm install @just-every/ensemble
  ```
 
- ### Migration from OpenAI SDK
-
- If you're currently using the OpenAI SDK, migration is simple:
-
- ```typescript
- // Before:
- import OpenAI from 'openai';
- const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
-
- // After:
- import OpenAIEnsemble from '@just-every/ensemble/openai-compat';
- const client = OpenAIEnsemble;
-
- // Your existing code works unchanged!
- const completion = await client.chat.completions.create({ /* ... */ });
- ```
-
  ## Quick Start
 
  ```typescript
  import { request } from '@just-every/ensemble';
 
- // Simple request with AsyncGenerator API
- const stream = request('claude-3-5-sonnet-20241022', [
-   { type: 'message', role: 'user', content: 'Hello, world!' }
- ]);
-
- // Process streaming events
- for await (const event of stream) {
-   if (event.type === 'message_delta') {
-     console.log(event.content);
-   } else if (event.type === 'message_complete') {
-     console.log('Request completed!');
-   } else if (event.type === 'error') {
-     console.error('Request failed:', event.error);
-   }
- }
-
- // With tools
- const toolStream = request('gpt-4o', [
-   { type: 'message', role: 'user', content: 'What is the weather?' }
- ], {
-   tools: [{
-     function: async ({ location }: { location: string }) => {
-       // Tool implementation
-       return `Weather in ${location}: Sunny, 72°F`;
-     },
-     definition: {
-       type: 'function',
-       function: {
-         name: 'get_weather',
-         description: 'Get current weather',
-         parameters: {
-           type: 'object',
-           properties: {
-             location: { type: 'string' }
-           },
-           required: ['location']
-         }
-       }
-     }
-   }]
- });
-
- // Process tool calls
- for await (const event of toolStream) {
-   if (event.type === 'tool_start') {
-     console.log('Tool called:', event.tool_calls[0].function.name);
-   } else if (event.type === 'message_delta') {
-     console.log(event.content);
-   }
- }
-
- // Early termination
- const earlyStream = request('claude-3-5-sonnet-20241022', [
-   { type: 'message', role: 'user', content: 'Count to 100' }
- ]);
-
- let count = 0;
- for await (const event of earlyStream) {
-   if (event.type === 'message_delta') {
-     count++;
-     if (count >= 10) break; // Stop after 10 events
+ // Simple streaming request
+ for await (const event of request('gpt-4o-mini', [
+   { type: 'message', role: 'user', content: 'Hello!' }
+ ])) {
+   if (event.type === 'text_delta') {
+     process.stdout.write(event.delta);
    }
  }
  ```
 
 
- ## API Reference
-
- ### Core Functions
+ ## Core Functions
 
- #### `request(model, messages, options?)`
+ ### `request(model, messages, options?)`
 
- Main function for making LLM requests with streaming responses and automatic tool execution.
-
- **Parameters:**
- - `model` (string): Model identifier (e.g., 'gpt-4o', 'claude-3.5-sonnet', 'gemini-2.0-flash')
- - `messages` (ResponseInput): Array of message objects in the conversation
- - `options` (RequestOptions): Optional configuration object
-
- **Returns:** `AsyncGenerator<EnsembleStreamEvent>` - An async generator that yields streaming events
+ Make streaming LLM requests with automatic tool execution.
 
  ```typescript
- interface RequestOptions {
-   agentId?: string;                // Identifier for logging/tracking
-   tools?: ToolFunction[];          // Array of tool definitions
-   toolChoice?: ToolChoice;         // Control tool selection behavior
-   maxToolCalls?: number;           // Max rounds of tool execution (default: 10, 0 = disabled)
-   processToolCall?: (toolCalls: ToolCall[]) => Promise<any>; // Custom tool handler
-   modelSettings?: ModelSettings;   // Temperature, maxTokens, etc.
-   modelClass?: ModelClassID;       // 'standard' | 'code' | 'reasoning' | 'monologue'
-   responseFormat?: ResponseFormat; // JSON mode or structured output
-   maxImageDimension?: number;      // Auto-resize images (default: provider-specific)
-   fallbackModels?: string[];       // Models to try if primary fails
- }
-
- // Stream event types
- type EnsembleStreamEvent =
-   | { type: 'text_delta', delta: string }
-   | { type: 'text', text: string }
-   | { type: 'message_delta', content: string }
-   | { type: 'message_complete', content: string }
-   | { type: 'tool_start', tool_calls: ToolCall[] }
-   | { type: 'cost_update', usage: TokenUsage }
-   | { type: 'stream_end', timestamp: string }
-   | { type: 'error', error: Error };
- ```
-
- #### `embed(text, options?)`
-
- Generate an embedding vector for the given text using any supported embedding model.
-
- **Parameters:**
- - `text` (string): Text to embed
- - `options` (object): Optional configuration
-   - `model` (string): Specific model to use (e.g., 'text-embedding-3-small')
-   - `modelClass` (ModelClassID): Model class to use (default: 'embedding')
-   - `agentId` (string): Agent identifier for tracking
-   - `opts` (EmbedOpts): Provider-specific embedding options
-
- **Returns:** `Promise<number[]>` - The embedding vector
-
- ```typescript
- // Simple embedding
- const embedding = await embed('Hello, world!');
- console.log(`Dimension: ${embedding.length}`);
-
- // With a specific model
- const queryEmbedding = await embed('Search query', {
-   model: 'text-embedding-3-large'
- });
-
- // With provider options (e.g., for Gemini)
- const docEmbedding = await embed('Document text', {
-   opts: { taskType: 'RETRIEVAL_DOCUMENT' }
- });
- ```
-
-
- ### Working with Models
-
- #### Model Selection
-
- ```typescript
- import { getModelFromClass, findModel, MODEL_REGISTRY } from '@just-every/ensemble';
-
- // Get the best model for a specific task type
- const codeModel = getModelFromClass('code');           // Best available code model
- const reasoningModel = getModelFromClass('reasoning'); // For complex reasoning tasks
-
- // Check if a model exists
- const modelInfo = findModel('gpt-4o');
- if (modelInfo) {
-   console.log(`Provider: ${modelInfo.provider}`);
-   console.log(`Input cost: $${modelInfo.inputCost}/million tokens`);
- }
-
- // List all available models
- for (const [modelName, info] of Object.entries(MODEL_REGISTRY)) {
-   console.log(`${modelName}: ${info.provider}`);
- }
- ```
-
- #### Model Classes
-
- - **standard**: General-purpose models for everyday tasks
- - **code**: Optimized for programming and technical tasks
- - **reasoning**: Advanced models for complex logical reasoning
- - **monologue**: Models supporting extended thinking/reasoning traces
-
- ### Message Types
-
- ```typescript
- // User/assistant messages
- interface TextMessage {
-   type: 'message';
-   role: 'user' | 'assistant' | 'developer';
-   content: string | MessageContent[];
-   status?: 'completed' | 'in_progress';
- }
-
- // Multi-modal content
- type MessageContent =
-   | { type: 'input_text', text: string }
-   | { type: 'input_image', image_url: string, detail?: 'auto' | 'low' | 'high' }
-   | { type: 'tool_use', id: string, name: string, arguments: any };
-
- // Tool-related messages
- interface FunctionCall {
-   type: 'function_call';
-   id: string;
-   name: string;
-   arguments: string;
- }
-
- interface FunctionCallOutput {
-   type: 'function_call_output';
-   id: string;
-   output: string;
- }
- ```
-
- ## Common Use Cases
-
- ### 1. Basic Conversations
-
- ```typescript
- import { request } from '@just-every/ensemble';
+ // Basic usage
+ const stream = request('claude-3.5-sonnet', [
+   { type: 'message', role: 'user', content: 'Explain quantum computing' }
+ ]);
 
- // Simple Q&A
- for await (const event of request('gpt-4o-mini', [
-   { type: 'message', role: 'user', content: 'Explain quantum computing in simple terms' }
- ])) {
+ for await (const event of stream) {
    if (event.type === 'text_delta') {
      process.stdout.write(event.delta);
+   } else if (event.type === 'cost_update') {
+     console.log(`Cost: $${event.usage.total_cost}`);
    }
  }
 
- // Multi-turn conversation
- const messages = [
-   { type: 'message', role: 'developer', content: 'You are a helpful coding assistant' },
-   { type: 'message', role: 'user', content: 'How do I center a div in CSS?' },
-   { type: 'message', role: 'assistant', content: 'Here are several ways...' },
-   { type: 'message', role: 'user', content: 'What about using flexbox?' }
- ];
-
- for await (const event of request('claude-3.5-sonnet', messages)) {
-   // Handle streaming response
- }
- ```
-
- ### 2. Tool Calling & Function Execution
-
- ```typescript
- // Define tools with TypeScript types
- interface WeatherParams {
-   city: string;
-   unit?: 'celsius' | 'fahrenheit';
- }
-
- const weatherTool: ToolFunction = {
-   function: async ({ city, unit = 'celsius' }: WeatherParams) => {
-     // A real implementation would call a weather API
-     const temp = unit === 'celsius' ? 22 : 72;
-     return `${temp}°${unit[0].toUpperCase()} in ${city}`;
-   },
+ // With tools
+ const tools = [{
+   function: async ({ city }) => `Weather in ${city}: Sunny, 72°F`,
    definition: {
      type: 'function',
      function: {
        name: 'get_weather',
-       description: 'Get current weather for a city',
+       description: 'Get weather for a city',
        parameters: {
          type: 'object',
          properties: {
-           city: { type: 'string', description: 'City name' },
-           unit: {
-             type: 'string',
-             enum: ['celsius', 'fahrenheit'],
-             description: 'Temperature unit'
-           }
+           city: { type: 'string' }
          },
          required: ['city']
        }
      }
    }
- };
-
- // Use with automatic execution
- for await (const event of request('gpt-4o', [
-   { type: 'message', role: 'user', content: 'What\'s the weather in Tokyo and New York?' }
- ], { tools: [weatherTool] })) {
-   if (event.type === 'tool_start') {
-     console.log('Calling tool:', event.tool_calls[0].function.name);
-   } else if (event.type === 'text_delta') {
-     process.stdout.write(event.delta);
-   }
- }
- ```
-
- ### 3. Model Selection Strategies
-
- ```typescript
- import { getModelFromClass, request } from '@just-every/ensemble';
-
- // Route based on task type
- async function intelligentRequest(task: string, messages: ResponseInput) {
-   let model: string;
-
-   if (task.includes('code') || task.includes('debug')) {
-     model = getModelFromClass('code');      // Best code model
-   } else if (task.includes('analyze') || task.includes('reasoning')) {
-     model = getModelFromClass('reasoning'); // Best reasoning model
-   } else {
-     model = getModelFromClass('standard');  // Cost-effective general model
-   }
-
-   console.log(`Using ${model} for ${task}`);
-
-   return request(model, messages, {
-     fallbackModels: ['gpt-4o-mini', 'claude-3-5-haiku'] // Fallback options
-   });
- }
-
- // Use model rotation for consensus
- async function consensusRequest(messages: ResponseInput) {
-   const models = ['gpt-4o', 'claude-3.5-sonnet', 'gemini-2.0-flash'];
-   const responses = [];
-
-   for (const model of models) {
-     const stream = request(model, messages);
-     const result = await convertStreamToMessages(stream);
-     responses.push(result.fullResponse);
-   }
-
-   // Analyze responses for consensus
-   return analyzeConsensus(responses);
- }
- ```
-
- ### 4. Structured Output & JSON Mode
-
- ```typescript
- // JSON mode for reliable parsing
- const jsonStream = request('gpt-4o', [
-   { type: 'message', role: 'user', content: 'List 3 programming languages with their pros/cons as JSON' }
- ], {
-   responseFormat: { type: 'json_object' }
- });
-
- let jsonContent = '';
- for await (const event of jsonStream) {
-   if (event.type === 'text_delta') {
-     jsonContent += event.delta;
-   }
- }
-
- const data = JSON.parse(jsonContent);
-
- // Structured output with schema validation
- const schema = {
-   type: 'object',
-   properties: {
-     name: { type: 'string' },
-     age: { type: 'number' },
-     skills: {
-       type: 'array',
-       items: { type: 'string' }
-     }
-   },
-   required: ['name', 'age', 'skills']
- };
-
- const structuredStream = request('gpt-4o', [
-   { type: 'message', role: 'user', content: 'Generate a developer profile' }
- ], {
-   responseFormat: {
-     type: 'json_schema',
-     json_schema: {
-       name: 'developer_profile',
-       schema: schema,
-       strict: true
-     }
-   }
- });
- ```
+ }];
 
- ### 5. Image Processing
-
- ```typescript
- // Analyze images with vision models
- const imageStream = request('gpt-4o', [
-   {
-     type: 'message',
-     role: 'user',
-     content: [
-       { type: 'input_text', text: 'What\'s in this image? Describe any text you see.' },
-       {
-         type: 'input_image',
-         image_url: 'data:image/jpeg;base64,...',
-         detail: 'high' // 'auto' | 'low' | 'high'
-       }
-     ]
-   }
- ], {
-   maxImageDimension: 2048 // Auto-resize large images
- });
-
- // Multiple images
- const comparison = request('claude-3.5-sonnet', [
-   {
-     type: 'message',
-     role: 'user',
-     content: [
-       { type: 'input_text', text: 'Compare these two designs:' },
-       { type: 'input_image', image_url: 'https://example.com/design1.png' },
-       { type: 'input_image', image_url: 'https://example.com/design2.png' }
-     ]
-   }
- ]);
+ const toolStream = request('gpt-4o', [
+   { type: 'message', role: 'user', content: 'What\'s the weather in Paris?' }
+ ], { tools });
  ```
 
- ### 6. Error Handling & Resilience
+ ### `embed(text, options?)`
 
- ```typescript
- import { isRateLimitError, isAuthenticationError } from '@just-every/ensemble';
-
- async function robustRequest(model: string, messages: ResponseInput, options?: RequestOptions) {
-   const maxRetries = 3;
-   let lastError;
-
-   for (let i = 0; i < maxRetries; i++) {
-     try {
-       const events = [];
-       for await (const event of request(model, messages, options)) {
-         if (event.type === 'error') {
-           throw event.error;
-         }
-         events.push(event);
-       }
-       return events;
-
-     } catch (error) {
-       lastError = error;
-
-       if (isAuthenticationError(error)) {
-         throw error; // Don't retry auth errors
-       }
-
-       if (isRateLimitError(error)) {
-         const waitTime = error.retryAfter || Math.pow(2, i) * 1000;
-         console.log(`Rate limited. Waiting ${waitTime}ms...`);
-         await new Promise(resolve => setTimeout(resolve, waitTime));
-         continue;
-       }
-
-       // Try a fallback model
-       if (options?.fallbackModels?.[i]) {
-         model = options.fallbackModels[i];
-         console.log(`Falling back to ${model}`);
-         continue;
-       }
-     }
-   }
-
-   throw lastError;
- }
- ```
-
- ## Utilities
-
- ### Cost & Usage Tracking
+ Generate embeddings for semantic search and RAG applications.
 
  ```typescript
- import { costTracker, quotaTracker } from '@just-every/ensemble';
-
- // Track costs across requests
- for await (const event of request('gpt-4o', messages)) {
-   if (event.type === 'cost_update') {
-     console.log(`Tokens: ${event.usage.input_tokens} in, ${event.usage.output_tokens} out`);
-     console.log(`Cost: $${event.usage.total_cost.toFixed(4)}`);
-   }
- }
+ // Simple embedding
+ const embedding = await embed('Hello, world!');
+ console.log(`Dimension: ${embedding.length}`); // e.g., 1536
 
- // Get cumulative costs
- const usage = costTracker.getAllUsage();
- for (const [model, stats] of Object.entries(usage)) {
-   console.log(`${model}: $${stats.total_cost.toFixed(2)} for ${stats.request_count} requests`);
- }
+ // With a specific model
+ const queryEmbedding = await embed('Search query', {
+   model: 'text-embedding-3-large'
+ });
 
- // Check quotas before making requests
- if (quotaTracker.canMakeRequest('gpt-4o', 'openai')) {
-   // Safe to proceed
- } else {
-   const resetTime = quotaTracker.getResetTime('openai');
-   console.log(`Quota exceeded. Resets at ${resetTime}`);
+ // Calculate similarity
+ function cosineSimilarity(a: number[], b: number[]): number {
+   const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
+   const normA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
+   const normB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
+   return dotProduct / (normA * normB);
  }
- ```
 
- ### Stream Conversion & Chaining
-
- ```typescript
- import { request, convertStreamToMessages, chainRequests, getModelFromClass } from '@just-every/ensemble';
-
- let messages = [
-   { type: 'message', role: 'developer', content: 'You are a helpful coding assistant' },
-   { type: 'message', role: 'user', content: 'How do I center a div in CSS?' },
- ];
-
- // Feed each model's response back in as conversation history
- messages = [...messages, ...(await convertStreamToMessages(request('claude-4-sonnet', messages))).messages];
- messages = [...messages, ...(await convertStreamToMessages(request(getModelFromClass('reasoning_mini'), messages))).messages];
-
- const result = await convertStreamToMessages(request('gemini-2.5-flash', messages));
- messages = [...messages, ...result.messages];
-
- console.log(result.messages);     // Messages produced by the final response
- console.log(result.fullResponse); // Just the assistant's response
-
- // Chain multiple models for multi-step tasks
- const analysis = await chainRequests(
-   [
-     { type: 'message', role: 'user', content: codeToAnalyze }
-   ],
-   [
-     {
-       model: getModelFromClass('code'),
-       systemPrompt: 'Analyze this code for bugs and security issues',
-     },
-     {
-       model: getModelFromClass('reasoning'),
-       systemPrompt: 'Prioritize the issues found and suggest fixes',
-     },
-     {
-       model: 'gpt-4.1-mini',
-       systemPrompt: 'Summarize the analysis in 3 bullet points',
-     }
- ]);
+ const similarity = cosineSimilarity(embedding1, embedding2);
  ```
 
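To illustrate the semantic-search use case this section mentions, here is a minimal sketch that ranks a made-up document list against a query, reusing `embed` and the `cosineSimilarity` helper defined above; in a real RAG pipeline the document embeddings would be precomputed and stored:

```typescript
import { embed } from '@just-every/ensemble';

// Hypothetical corpus; contents are illustrative only.
const docs = [
  'Ensemble rotates requests across LLM providers',
  'Flexbox is one way to center a div in CSS'
];
const docEmbeddings = await Promise.all(docs.map(doc => embed(doc)));

const queryEmbedding = await embed('How do I center a div?');
const ranked = docs
  .map((doc, i) => ({ doc, score: cosineSimilarity(queryEmbedding, docEmbeddings[i]) }))
  .sort((a, b) => b.score - a.score);

console.log(ranked[0].doc); // The most semantically similar document
```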
- ### Image Utilities
+ ### `chainRequests(messages, requests)`
 
- ```typescript
- import { resizeImageForModel, imageToText } from '@just-every/ensemble';
-
- // Auto-resize for specific model requirements
- const resized = await resizeImageForModel(
-   base64ImageData,
-   'gpt-4o', // Different models have different size limits
-   { maxDimension: 2048 }
- );
-
- // Extract text from images
- const extractedText = await imageToText(imageBuffer);
- console.log('Found text:', extractedText);
- ```
-
- ### Logging & Debugging
+ Chain multiple LLM calls, using the output of one as input to the next.
 
  ```typescript
- import { setEnsembleLogger, EnsembleLogger } from '@just-every/ensemble';
-
- // Production-ready logger example
- class ProductionLogger implements EnsembleLogger {
-   log_llm_request(agentId: string, providerName: string, model: string, requestData: unknown, timestamp?: Date): string {
-     const requestId = `req_${Date.now()}_${Math.random().toString(36).substr(2, 9)}`;
-
-     // Log to your monitoring system
-     logger.info('LLM Request', {
-       requestId,
-       agentId,
-       provider: providerName,
-       model,
-       timestamp,
-       // Be careful not to log sensitive data
-       messageCount: (requestData as any).messages?.length,
-       hasTools: !!(requestData as any).tools?.length
-     });
-
-     return requestId;
-   }
+ import { chainRequests } from '@just-every/ensemble';
 
-   log_llm_response(requestId: string | undefined, responseData: unknown, timestamp?: Date): void {
-     const response = responseData as any;
-
-     logger.info('LLM Response', {
-       requestId,
-       timestamp,
-       inputTokens: response.usage?.input_tokens,
-       outputTokens: response.usage?.output_tokens,
-       totalCost: response.usage?.total_cost,
-       cached: response.usage?.cache_creation_input_tokens > 0
-     });
-   }
-
-   log_llm_error(requestId: string | undefined, errorData: unknown, timestamp?: Date): void {
-     logger.error('LLM Error', {
-       requestId,
-       timestamp,
-       error: errorData,
-       // Include retry information if available
-       retryAfter: (errorData as any).retryAfter
-     });
-   }
- }
-
- // Enable logging globally
- setEnsembleLogger(new ProductionLogger());
-
- // Debug mode for development
- if (process.env.NODE_ENV === 'development') {
-   setEnsembleLogger({
-     log_llm_request: (agent, provider, model, data) => {
-       console.log(`[${new Date().toISOString()}] → ${provider}/${model}`);
-       return Date.now().toString();
+ const result = await chainRequests(
+   [{ type: 'message', role: 'user', content: 'Analyze this code for bugs: ...' }],
+   [
+     {
+       model: 'gpt-4o',
+       systemPrompt: 'You are a code reviewer. Find bugs and security issues.'
      },
-     log_llm_response: (id, data) => {
-       const response = data as any;
-       console.log(`[${new Date().toISOString()}] ${response.usage?.total_tokens} tokens`);
+     {
+       model: 'claude-3.5-sonnet',
+       systemPrompt: 'Prioritize the issues found and suggest fixes.'
      },
-     log_llm_error: (id, error) => {
-       console.error(`[${new Date().toISOString()}] ✗ Error:`, error);
+     {
+       model: 'gpt-4o-mini',
+       systemPrompt: 'Summarize the analysis in 3 bullet points.'
      }
-   });
- }
+   ]
+ );
+
+ console.log(result.fullResponse);
  ```
 
- ## Advanced Topics
+ ## Supported Providers
 
- ### OpenAI SDK Compatibility
+ - **OpenAI**: GPT-4o, GPT-4o-mini, o1-preview, o1-mini
+ - **Anthropic**: Claude 3.5 Sonnet, Claude 3.5 Haiku
+ - **Google**: Gemini 2.0 Flash, Gemini 1.5 Pro
+ - **DeepSeek**: DeepSeek Chat, DeepSeek Coder
+ - **xAI**: Grok 2, Grok Beta
+ - **OpenRouter**: Access to 100+ models
 
- Ensemble provides a drop-in replacement for the OpenAI SDK, allowing you to use any supported model with OpenAI's familiar API:
+ ## OpenAI SDK Compatibility
+
+ Drop-in replacement for the OpenAI SDK:
 
  ```typescript
+ // Instead of: import OpenAI from 'openai';
  import OpenAIEnsemble from '@just-every/ensemble/openai-compat';
- // Or named imports: import { chat, completions } from '@just-every/ensemble';
-
- // Replace the OpenAI client
- const openai = OpenAIEnsemble; // Instead of: new OpenAI({ apiKey: '...' })
-
- // Use exactly like the OpenAI SDK - but with any model!
- const completion = await openai.chat.completions.create({
-   model: 'claude-3.5-sonnet', // or 'gpt-4o', 'gemini-2.0-flash', etc.
-   messages: [
-     { role: 'system', content: 'You are a helpful assistant.' },
-     { role: 'user', content: 'Hello!' }
-   ],
-   temperature: 0.7
- });
-
- console.log(completion.choices[0].message.content);
 
- // Streaming
- const stream = await openai.chat.completions.create({
-   model: 'gpt-4o-mini',
-   messages: [{ role: 'user', content: 'Tell me a story' }],
+ const completion = await OpenAIEnsemble.chat.completions.create({
+   model: 'claude-3.5-sonnet', // Use any supported model!
+   messages: [{ role: 'user', content: 'Hello!' }],
    stream: true
  });
-
- for await (const chunk of stream) {
-   process.stdout.write(chunk.choices[0].delta.content || '');
- }
-
- // Legacy completions API also supported
- const legacyCompletion = await openai.completions.create({
-   model: 'deepseek-chat',
-   prompt: 'Once upon a time',
-   max_tokens: 100
- });
- ```
-
- This compatibility layer supports:
- - All chat.completions.create parameters (temperature, tools, response_format, etc.)
- - Streaming and non-streaming responses
- - Tool/function calling
- - Legacy completions.create API
- - Proper TypeScript types matching OpenAI's SDK
-
- ### Custom Model Providers
-
- ```typescript
- import { ModelProvider, registerExternalModel } from '@just-every/ensemble';
-
- // Register a custom model
- registerExternalModel({
-   id: 'my-custom-model',
-   provider: 'custom',
-   inputCost: 0.001,
-   outputCost: 0.002,
-   contextWindow: 8192,
-   maxOutput: 4096,
-   supportsTools: true,
-   supportsVision: false,
-   supportsStreaming: true
- });
-
- // Use your custom model
- const stream = request('my-custom-model', messages);
- ```
-
- ### Performance Optimization
-
- ```typescript
- // Batch processing with concurrency control
- async function batchProcess(items: string[], concurrency = 3) {
-   const results = [];
-   const queue = [...items];
-
-   async function worker() {
-     while (queue.length > 0) {
-       const item = queue.shift()!;
-       const stream = request('gpt-4o-mini', [
-         { type: 'message', role: 'user', content: `Process: ${item}` }
-       ]);
-
-       const result = await convertStreamToMessages(stream);
-       results.push({ item, result: result.fullResponse });
-     }
-   }
-
-   // Run workers concurrently
-   await Promise.all(Array(concurrency).fill(null).map(() => worker()));
-   return results;
- }
-
- // Stream multiple requests in parallel
- async function parallelStreaming(prompts: string[]) {
-   const streams = prompts.map(prompt =>
-     request('claude-3.5-haiku', [
-       { type: 'message', role: 'user', content: prompt }
-     ])
-   );
-
-   // Process all streams concurrently
-   const results = await Promise.all(
-     streams.map(stream => convertStreamToMessages(stream))
-   );
-
-   return results.map(r => r.fullResponse);
- }
  ```
 
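Note that the new snippet sets `stream: true` without reading the result; assuming the compat layer yields OpenAI-style chunks (as the removed streaming example above suggests), consuming it might look like:

```typescript
// Sketch: iterate the streamed completion returned above.
for await (const chunk of completion) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```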
  ## Environment Variables
 
- Set up API keys for the providers you want to use:
-
  ```bash
  ANTHROPIC_API_KEY=your_key_here
  OPENAI_API_KEY=your_key_here
@@ -788,6 +158,23 @@ XAI_API_KEY=your_key_here
  OPENROUTER_API_KEY=your_key_here
  ```
 
+ ## Documentation
+
+ - [Model Selection & Management](./docs/models.md)
+ - [Advanced Usage](./docs/advanced-usage.md)
+ - [Error Handling](./docs/error-handling.md)
+ - [OpenAI Compatibility](./docs/openai-compatibility.md)
+ - [Utility Functions](./docs/utilities.md)
+
+ ## Examples
+
+ See the [examples](./examples) directory for:
+ - [Basic usage](./examples/basic-request.ts)
+ - [Tool calling](./examples/tool-calling.ts)
+ - [Embeddings & semantic search](./examples/embeddings.ts)
+ - [Model rotation](./examples/model-rotation.ts)
+ - [Stream conversion](./examples/stream-conversion.ts)
+
  ## License
 
- MIT
+ MIT