@altsafe/aidirector 1.4.2 → 1.6.0

package/README.md CHANGED
@@ -1,14 +1,21 @@
- # hydra-aidirector - Client SDK
  # hydra-aidirector

- The official Node.js/TypeScript client for [Hydra](https://hydrai.dev).
+ The official Node.js/TypeScript SDK for [Hydra](https://hydrai.dev) — a high-performance AI API gateway.

- Hydra is a high-performance AI API gateway that provides:
- - 🔄 **Automatic Failover**: Never let an LLM outage break your app
- - ⚡ **God-Tier Caching**: Reduce costs and latency with smart response caching
- - 🛡️ **Self-Healing AI**: Auto-extract JSON, strip markdown, and repair malformed responses with `healingReport`
- - 🧠 **Thinking Mode**: Support for reasoning models like Gemini 2.0 Flash Thinking
- - 📊 **Detailed Usage**: Track token usage, latency, and costs per model
+ [![npm version](https://badge.fury.io/js/hydra-aidirector.svg)](https://www.npmjs.com/package/hydra-aidirector)
+ [![TypeScript](https://badges.frapsoft.com/typescript/code/typescript.svg?v=101)](https://www.typescriptlang.org/)
+
+ ## Why Hydra?
+
+ | Problem | Hydra Solution |
+ |---------|----------------|
+ | LLM outages break your app | 🔄 **Automatic Failover** – Seamless fallback between providers |
+ | High latency & costs | ⚡ **God-Tier Caching** – Hybrid Redis + DB cache with AI-directed scoping |
+ | Malformed JSON responses | 🛡️ **Self-Healing AI** – Auto-repair JSON, strip markdown, extract from prose |
+ | Schema compliance issues | ✅ **Strict JSON Mode** – Force models to conform to your schema or fail |
+ | No visibility into usage | 📊 **Detailed Analytics** – Track tokens, latency, and costs per model |
+
+ ---

  ## Installation

@@ -16,42 +23,86 @@ Hydra is a high-performance AI API gateway that provides:
  npm install hydra-aidirector
  # or
  pnpm add hydra-aidirector
+ # or
+ yarn add hydra-aidirector
  ```

+ ---
+
  ## Quick Start

  ```typescript
  import { Hydra } from 'hydra-aidirector';

- const ai = new Hydra({
-   secretKey: process.env.HYDRA_SECRET_KEY!,
+ const client = new Hydra({
+   secretKey: process.env.HYDRA_SECRET_KEY!, // hyd_sk_...
    baseUrl: 'https://your-instance.vercel.app',
  });

- // Generate content
- const result = await ai.generate({
+ // Basic generation
+ const result = await client.generate({
    chainId: 'my-chain',
-   prompt: 'Generate 5 user profiles',
+   prompt: 'Generate 5 user profiles as JSON',
  });

  if (result.success) {
-   console.log(result.data.valid);
+   console.log(result.data.valid); // Parsed, schema-validated objects
+ }
+ ```
+
+ ---
+
+ ## Core Features
+
+ ### 🔄 Automatic Failover
+ Define fallback chains in your dashboard. If Gemini fails, Hydra automatically tries OpenRouter, Claude, etc.
+
+ ### 🛡️ Self-Healing JSON
+ LLMs sometimes return broken JSON. Hydra extracts and repairs it automatically:
+
+ ```typescript
+ const result = await client.generate({ chainId: 'x', prompt: 'Get user' });
+
+ if (result.meta.recovered) {
+   console.log('JSON was malformed but healed!');
+   console.log(result.data.healingReport);
+   // [{ original: "{name: 'foo'", healed: { name: "foo" }, fixes: ["Added closing brace"] }]
  }
  ```

- ## Features
+ ### ✅ Strict JSON Mode vs Self-Healing
+
+ Hydra's self-healing JSON repair **works in both modes**. The difference is how the model behaves:
+
+ | Mode | Model Behavior | Healing Role |
+ |------|----------------|--------------|
+ | **Strict** (`strictJson: true`) | Model is **forced** to output pure JSON via native API constraints. Output is already clean. | Safety net — rarely needed since output is constrained. |
+ | **Non-Strict** (`strictJson: false`) | Model outputs best-effort JSON (may include markdown, prose, or broken syntax). | Primary mechanism — extracts and repairs JSON from messy output. |
+
+ ```typescript
+ // Strict mode - model constrained to pure JSON
+ await client.generate({
+   chainId: 'gemini-chain',
+   prompt: 'Extract invoice data',
+   schema: invoiceSchema,
+   strictJson: true, // No markdown, no explanations
+ });

- - 🔐 **HMAC Authentication** - Secure request signing
- - ⚡ **3-Step Architecture** - Token → Worker → Complete (minimizes costs)
- - 📎 **File Attachments** - Upload and process documents
- - 🔄 **Automatic Retries** - Exponential backoff on failures
- - 💾 **Smart Caching** - Two-tier (user/global) cache with AI-directed scoping
- - 🎯 **TypeScript** - Full type safety with comprehensive types
- - 🛑 **Request Cancellation** - Support for AbortSignal
- - 🪝 **Webhooks** - Async notification callbacks
+ // Non-strict mode - flexible output, Hydra heals as needed
+ await client.generate({
+   chainId: 'any-chain',
+   prompt: 'Generate a creative story with metadata',
+   schema: storySchema,
+   strictJson: false, // Allow model to be creative, Hydra extracts JSON
+ });
+ ```
+
+ ---

  ## Streaming (Recommended for Large Responses)

+ Process JSON objects as they arrive — perfect for UIs that need instant feedback:
+
  ```typescript
  await client.generateStream(
    {
@@ -73,8 +124,12 @@ await client.generateStream(
  );
  ```

+ ---
+
  ## Batch Generation

+ Process multiple prompts in parallel with automatic error handling:
+
  ```typescript
  const result = await client.generateBatch('my-chain', [
    { id: 'item1', prompt: 'Describe product A' },
@@ -85,54 +140,59 @@ const result = await client.generateBatch('my-chain', [
  console.log(`Processed ${result.summary.succeeded}/${result.summary.total}`);
  ```

- ## Request Cancellation
+ ---
+
+ ## Caching
+
+ ### Cache Scope
+ Control how responses are cached:

  ```typescript
- const controller = new AbortController();
+ // Global cache (shared across users - default)
+ await client.generate({ chainId: 'x', prompt: 'Facts', cacheScope: 'global' });

- // Cancel after 5 seconds
- setTimeout(() => controller.abort(), 5000);
+ // User-scoped cache (private to authenticated user)
+ await client.generate({ chainId: 'x', prompt: 'My profile', cacheScope: 'user' });

- try {
-   const result = await client.generate({
-     chainId: 'my-chain',
-     prompt: 'Long running prompt',
-     signal: controller.signal,
-   });
- } catch (error) {
-   if (error instanceof TimeoutError) {
-     console.log('Request was cancelled');
-   }
- }
+ // Skip cache entirely
+ await client.generate({ chainId: 'x', prompt: 'Random', cacheScope: 'skip' });
  ```

- ## Cache Control
+ ### Cache Quality
+ Control the trade-off between cache hit rate and precision:
+
+ | Level | Behavior |
+ |-------|----------|
+ | `STANDARD` | Balanced fuzzy matching. Good for most cases. |
+ | `HIGH` | Stricter matching. Higher quality hits, lower hit rate. |
+ | `MAX_EFFICIENCY` | Aggressive matching. Maximum cost savings, less precision. |

  ```typescript
- // Global cache (shared across users - default)
- const result = await client.generate({
+ await client.generate({
    chainId: 'my-chain',
-   prompt: 'Static content',
-   cacheScope: 'global',
+   prompt: 'Generate report',
+   cacheQuality: 'MAX_EFFICIENCY', // Maximize cache hits
  });
+ ```

- // User-scoped cache (private to user)
- const userResult = await client.generate({
-   chainId: 'my-chain',
-   prompt: 'Personalized content',
-   cacheScope: 'user',
- });
+ ### AI-Directed Caching
+ The AI can override cache scope by including a `_cache` directive in its output:

- // Skip cache entirely
- const freshResult = await client.generate({
-   chainId: 'my-chain',
-   prompt: 'Always fresh',
-   cacheScope: 'skip',
- });
+ ```json
+ {
+   "data": { "...": "..." },
+   "_cache": { "scope": "user" }
+ }
  ```

+ The directive is automatically stripped from your final response.
+
+ ---
+
  ## File Attachments

+ Upload documents for analysis. Hydra handles type detection and model compatibility:
+
  ```typescript
  import fs from 'fs';

@@ -149,10 +209,53 @@ const result = await client.generate({
  });
  ```

+ ---
+
+ ## Request Cancellation
+
+ Cancel long-running requests with `AbortSignal`:
+
+ ```typescript
+ const controller = new AbortController();
+ setTimeout(() => controller.abort(), 5000); // Cancel after 5s
+
+ try {
+   const result = await client.generate({
+     chainId: 'my-chain',
+     prompt: 'Long task',
+     signal: controller.signal,
+   });
+ } catch (error) {
+   if (error instanceof TimeoutError) {
+     console.log('Request was cancelled');
+   }
+ }
+ ```
+
+ ---
+
+ ## Thinking Mode
+
+ Enable reasoning models to show their thought process:
+
+ ```typescript
+ const result = await client.generate({
+   chainId: 'reasoning-chain',
+   prompt: 'Solve this complex problem step by step',
+   options: {
+     thinkingMode: true,
+   },
+ });
+ ```
+
+ ---
+
  ## Webhooks

+ Register callbacks for async notifications:
+
  ```typescript
- // Register a webhook
+ // Register
  await client.registerWebhook({
    requestId: 'req_123',
    url: 'https://your-domain.com/webhooks/hydra',
@@ -160,51 +263,44 @@ await client.registerWebhook({
    retryCount: 3,
  });

- // List webhooks
+ // Manage
  const webhooks = await client.listWebhooks();
-
- // Update webhook
  await client.updateWebhook('webhook_id', { retryCount: 5 });
-
- // Unregister webhook
  await client.unregisterWebhook('webhook_id');
  ```

- ## Thinking Mode (Reasoning Models)
+ ---

- ```typescript
- const result = await client.generate({
-   chainId: 'reasoning-chain',
-   prompt: 'Solve this complex problem step by step',
-   options: {
-     thinkingMode: true, // Shows model reasoning process
-   },
- });
- ```
+ ## Configuration Reference

- ## Configuration
+ ### Client Options

  | Option | Type | Default | Description |
  |--------|------|---------|-------------|
  | `secretKey` | `string` | **required** | Your API key (`hyd_sk_...`) |
  | `baseUrl` | `string` | `http://localhost:3000` | API base URL |
- | `timeout` | `number` | `600000` | Request timeout (10 min) |
+ | `timeout` | `number` | `600000` | Request timeout in ms (10 min) |
  | `maxRetries` | `number` | `3` | Max retry attempts |
  | `debug` | `boolean` | `false` | Enable debug logging |

- ## Generate Options
+ ### Generate Options

  | Option | Type | Default | Description |
  |--------|------|---------|-------------|
  | `chainId` | `string` | **required** | Fallback chain ID |
  | `prompt` | `string` | **required** | The prompt to send |
  | `schema` | `object` | - | JSON schema for validation |
- | `cacheScope` | `'global' \| 'user' \| 'skip'` | `'global'` | Cache behavior |
+ | `cacheScope` | `'global' \| 'user' \| 'skip'` | `'global'` | Cache sharing behavior |
+ | `cacheQuality` | `'STANDARD' \| 'HIGH' \| 'MAX_EFFICIENCY'` | `'STANDARD'` | Cache match precision |
+ | `strictJson` | `boolean` | `true` | Force strict JSON mode |
  | `signal` | `AbortSignal` | - | Cancellation signal |
- | `maxRetries` | `number` | Client setting | Override retries |
+ | `maxRetries` | `number` | Client default | Override retries |
  | `requestId` | `string` | Auto-generated | Custom request ID |
  | `files` | `FileAttachment[]` | - | File attachments |
- | `useOptimized` | `boolean` | `true` | Use 3-step flow |
+ | `useOptimized` | `boolean` | `true` | Use 3-step cost-saving flow |
+ | `noCache` | `boolean` | `false` | Skip cache entirely |
+
+ ---

  ## API Methods

@@ -223,80 +319,81 @@ const result = await client.generate({
  | `listWebhooks()` | List all webhooks |
  | `updateWebhook(id, updates)` | Modify webhook config |

+ ---
+
  ## Error Handling

  ```typescript
- import {
+ import {
    RateLimitError,
    TimeoutError,
    AuthenticationError,
    QuotaExceededError,
    WorkerError,
    FileProcessingError,
+   isRetryableError,
  } from 'hydra-aidirector';

  try {
-   const result = await client.generate({ ... });
+   const result = await client.generate({ chainId: 'x', prompt: 'y' });
  } catch (error) {
    if (error instanceof RateLimitError) {
      console.log(`Retry after ${error.retryAfterMs}ms`);
    } else if (error instanceof QuotaExceededError) {
      console.log(`Quota exceeded: ${error.used}/${error.limit} (${error.tier})`);
    } else if (error instanceof TimeoutError) {
-     console.log(`Request timed out after ${error.timeoutMs}ms`);
+     console.log(`Timed out after ${error.timeoutMs}ms`);
    } else if (error instanceof AuthenticationError) {
      console.log('Invalid API key');
    } else if (error instanceof WorkerError) {
-     console.log('Worker processing failed - will retry');
+     console.log('Worker failed - will retry');
    } else if (error instanceof FileProcessingError) {
      console.log(`File error: ${error.reason} - ${error.filename}`);
+   } else if (isRetryableError(error)) {
+     console.log('Transient error - safe to retry');
    }
  }
  ```

- ## Self-Healing Reports
+ ---

- When `hydra-aidirector` fixes a malformed JSON response, it includes a `healingReport` in the data object:
+ ## Pricing

- ```typescript
- const result = await client.generate({ ... });
+ **BYOK (Bring Your Own Key)** — You pay AI providers directly. Hydra charges only for API access:

- if (result.success && result.meta.recovered) {
-   console.log('JSON was malformed but healed!');
-   console.log(result.data.healingReport);
-   // [{ original: "{name: 'foo'", healed: {name: "foo"}, fixes: ["Added missing brace"] }]
- }
- ```
+ | Tier | Price/mo | Requests | Overage |
+ |------|----------|----------|---------|
+ | Free | $0 | 1,000 | Blocked |
+ | Starter | $9 | 25,000 | $0.50/1K |
+ | Pro | $29 | 100,000 | $0.40/1K |
+ | Scale | $79 | 500,000 | $0.30/1K |

- ## AI-Directed Caching
+ ---

- The AI can control caching by including a `_cache` directive in its JSON output:
+ ## TypeScript Support

- ```json
- {
-   "data": { ... },
-   "_cache": { "scope": "user" }
- }
- ```
+ Full type safety with comprehensive types:

- Scopes:
- - `global` - Share response across all users (default)
- - `user` - Cache per-user only
- - `skip` - Do not cache this response
+ ```typescript
+ import type {
+   GenerateOptions,
+   GenerateResult,
+   StreamCallbacks,
+   FileAttachment,
+   ChainInfo,
+   ModelInfo,
+ } from 'hydra-aidirector';
+ ```

- The directive is automatically stripped from the final response.
+ ---

- ## Pricing
+ ## Requirements

- BYOK (Bring Your Own Key) - You pay for AI costs directly to providers.
+ - Node.js 18+
+ - TypeScript 5+ (optional but recommended)

- | Tier | Price | Requests | Overage |
- |------|-------|----------|---------|
- | Free | $0 | 1K | Blocked |
- | Starter | $9 | 25K | $0.50/1K |
- | Pro | $29 | 100K | $0.40/1K |
- | Scale | $79 | 500K | $0.30/1K |
+ ---

  ## License

- MIT
+ MIT © [Hydra](https://hydrai.dev)
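
Note on the AI-Directed Caching text in the README diff: it says the `_cache` directive is stripped before the response reaches the caller, but the diff does not show how. The block below is only a minimal sketch of that idea. `CacheScope`, `SplitResult`, and `splitCacheDirective` are hypothetical names, not SDK exports; the three scope values are the ones listed for `cacheScope`.

```typescript
// Hypothetical illustration only: none of these names are exported by the SDK.
type CacheScope = 'global' | 'user' | 'skip';

interface SplitResult {
  payload: Record<string, unknown>; // model output without the directive
  scope: CacheScope | undefined;    // scope requested by the model, if valid
}

const VALID_SCOPES: readonly string[] = ['global', 'user', 'skip'];

function splitCacheDirective(output: Record<string, unknown>): SplitResult {
  // Separate `_cache` from the remaining keys without mutating the input.
  const { _cache, ...payload } = output;
  let scope: CacheScope | undefined;
  if (typeof _cache === 'object' && _cache !== null) {
    const candidate = (_cache as { scope?: unknown }).scope;
    // Ignore unknown scope values instead of trusting the model output.
    if (typeof candidate === 'string' && VALID_SCOPES.includes(candidate)) {
      scope = candidate as CacheScope;
    }
  }
  return { payload, scope };
}
```

Passing `{ data: {...}, _cache: { scope: 'user' } }` would yield a payload that contains only `data`, plus the scope `'user'`. An unrecognized scope is dropped, which keeps a malformed directive from changing cache behavior.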
package/dist/index.d.mts CHANGED
@@ -116,6 +116,18 @@ interface GenerateOptions {
       * Set to 0 to disable retries for idempotent-sensitive operations.
       */
      maxRetries?: number;
+     /**
+      * Override cache quality logic for this request.
+      * - STANDARD: Balance of speed and quality (default)
+      * - HIGH: Prefer better matches even if slightly slower
+      * - MAX_EFFICIENCY: Aggressively cache to reduce costs and latency
+      */
+     cacheQuality?: 'STANDARD' | 'HIGH' | 'MAX_EFFICIENCY';
+     /**
+      * Force strict JSON usage for providers that support it.
+      * If true, ensures the model outputs valid JSON conforming to the schema.
+      */
+     strictJson?: boolean;
      /**
       * Client-generated request ID for tracing and debugging.
       * If not provided, one will be generated automatically.
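
Note on the two fields added to `GenerateOptions`: as a rough sketch of how a caller might resolve them, the block below mirrors only those two fields in a local type and fills in the defaults listed in the README's Generate Options table (`'STANDARD'` and `true`). `NewGenerateFields` and `resolveNewFields` are hypothetical names. Whether the SDK applies these defaults on the client or on the server is not visible in this diff.

```typescript
// Local mirror of the two fields added in 1.6.0; not imported from the SDK.
type CacheQuality = 'STANDARD' | 'HIGH' | 'MAX_EFFICIENCY';

interface NewGenerateFields {
  cacheQuality?: CacheQuality;
  strictJson?: boolean;
}

// Hypothetical helper: applies the defaults documented in the README table.
function resolveNewFields(opts: NewGenerateFields): Required<NewGenerateFields> {
  return {
    // `??` keeps an explicit `false` or any explicit quality level.
    cacheQuality: opts.cacheQuality ?? 'STANDARD',
    strictJson: opts.strictJson ?? true,
  };
}
```

Using `??` rather than `||` matters for `strictJson`: an explicit `false` must survive, since that is how a caller opts into non-strict mode.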
package/dist/index.d.ts CHANGED
@@ -116,6 +116,18 @@ interface GenerateOptions {
       * Set to 0 to disable retries for idempotent-sensitive operations.
       */
      maxRetries?: number;
+     /**
+      * Override cache quality logic for this request.
+      * - STANDARD: Balance of speed and quality (default)
+      * - HIGH: Prefer better matches even if slightly slower
+      * - MAX_EFFICIENCY: Aggressively cache to reduce costs and latency
+      */
+     cacheQuality?: 'STANDARD' | 'HIGH' | 'MAX_EFFICIENCY';
+     /**
+      * Force strict JSON usage for providers that support it.
+      * If true, ensures the model outputs valid JSON conforming to the schema.
+      */
+     strictJson?: boolean;
      /**
       * Client-generated request ID for tracing and debugging.
       * If not provided, one will be generated automatically.
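
Note on the `maxRetries` context lines: the doc comment says a value of 0 disables retries, and the 1.4.2 README described retries as using exponential backoff. The diff does not show the actual schedule, so the following is only a sketch of what such a schedule could look like. `backoffDelays`, the 500 ms base delay, and the doubling factor are all assumptions.

```typescript
// Hypothetical helper: the base delay and doubling factor are assumptions,
// not values taken from the SDK.
function backoffDelays(maxRetries: number, baseMs = 500): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // Delay doubles on every retry: base, 2*base, 4*base, ...
    delays.push(baseMs * 2 ** attempt);
  }
  return delays;
}
```

With `maxRetries` set to 0 the loop never runs and the list is empty, which matches the documented "disable retries" behavior.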
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@altsafe/aidirector",
-   "version": "1.4.2",
+   "version": "1.6.0",
    "description": "Official TypeScript SDK for Hydra - Intelligent AI API Gateway with automatic failover, caching, and JSON extraction",
    "main": "./dist/index.js",
    "module": "./dist/index.mjs",
@@ -85,4 +85,4 @@
    "engines": {
      "node": ">=18.0.0"
    }
- }
+ }