npm - @altsafe/aidirector - Versions diffs - 1.4.2 → 1.6.0 - Mend

@altsafe/aidirector 1.4.2 → 1.6.0


Files changed (4)

package/README.md CHANGED

@@ -1,14 +1,21 @@
-# hydra-aidirector - Client SDK
 # hydra-aidirector
-The official Node.js/TypeScript client for [Hydra](https://hydrai.dev).
+The official Node.js/TypeScript SDK for [Hydra](https://hydrai.dev) — a high-performance AI API gateway.
-Hydra is a high-performance AI API gateway that provides:
-- 🔄 **Automatic Failover**: Never let an LLM outage break your app
-- ⚡ **God-Tier Caching**: Reduce costs and latency with smart response caching
-- 🛡️ **Self-Healing AI**: Auto-extract JSON, strip markdown, and repair malformed responses with `healingReport`
-- 🧠 **Thinking Mode**: Support for reasoning models like Gemini 2.0 Flash Thinking
-- 📊 **Detailed Usage**: Track token usage, latency, and costs per model
+[![npm version](https://badge.fury.io/js/hydra-aidirector.svg)](https://www.npmjs.com/package/hydra-aidirector)
+[![TypeScript](https://badges.frapsoft.com/typescript/code/typescript.svg?v=101)](https://www.typescriptlang.org/)
+## Why Hydra?
+| Problem | Hydra Solution |
+|---------|----------------|
+| LLM outages break your app | 🔄 **Automatic Failover** – Seamless fallback between providers |
+| High latency & costs | ⚡ **God-Tier Caching** – Hybrid Redis + DB cache with AI-directed scoping |
+| Malformed JSON responses | 🛡️ **Self-Healing AI** – Auto-repair JSON, strip markdown, extract from prose |
+| Schema compliance issues | ✅ **Strict JSON Mode** – Force models to conform to your schema or fail |
+| No visibility into usage | 📊 **Detailed Analytics** – Track tokens, latency, and costs per model |
+---
 ## Installation
@@ -16,42 +23,86 @@ Hydra is a high-performance AI API gateway that provides:
 npm install hydra-aidirector
 # or
 pnpm add hydra-aidirector
+# or
+yarn add hydra-aidirector
 ```
+---
 ## Quick Start
 ```typescript
 import { Hydra } from 'hydra-aidirector';
-const ai = new Hydra({
-  secretKey: process.env.HYDRA_SECRET_KEY!,
+const client = new Hydra({
+  secretKey: process.env.HYDRA_SECRET_KEY!, // hyd_sk_...
   baseUrl: 'https://your-instance.vercel.app',
 });
-// Generate content
-const result = await ai.generate({
+// Basic generation
+const result = await client.generate({
   chainId: 'my-chain',
-  prompt: 'Generate 5 user profiles',
+  prompt: 'Generate 5 user profiles as JSON',
 });
 if (result.success) {
-  console.log(result.data.valid);
+  console.log(result.data.valid); // Parsed, schema-validated objects
+}
+```
+---
+## Core Features
+### 🔄 Automatic Failover
+Define fallback chains in your dashboard. If Gemini fails, Hydra automatically tries OpenRouter, Claude, etc.
+### 🛡️ Self-Healing JSON
+LLMs sometimes return broken JSON. Hydra extracts and repairs it automatically:
+```typescript
+const result = await client.generate({ chainId: 'x', prompt: 'Get user' });
+if (result.meta.recovered) {
+  console.log('JSON was malformed but healed!');
+  console.log(result.data.healingReport);
+  // [{ original: "{name: 'foo'", healed: { name: "foo" }, fixes: ["Added closing brace"] }]
 }
 ```
-## Features
+### ✅ Strict JSON Mode vs Self-Healing
+Hydra's self-healing JSON repair **works in both modes**. The difference is how the model behaves:
+| Mode | Model Behavior | Healing Role |
+|------|----------------|--------------|
+| **Strict** (`strictJson: true`) | Model is **forced** to output pure JSON via native API constraints. Output is already clean. | Safety net — rarely needed since output is constrained. |
+| **Non-Strict** (`strictJson: false`) | Model outputs best-effort JSON (may include markdown, prose, or broken syntax). | Primary mechanism — extracts and repairs JSON from messy output. |
+```typescript
+// Strict mode - model constrained to pure JSON
+await client.generate({
+  chainId: 'gemini-chain',
+  prompt: 'Extract invoice data',
+  schema: invoiceSchema,
+  strictJson: true, // No markdown, no explanations
+});
-- 🔐 **HMAC Authentication** - Secure request signing
-- ⚡ **3-Step Architecture** - Token → Worker → Complete (minimizes costs)
-- 📎 **File Attachments** - Upload and process documents
-- 🔄 **Automatic Retries** - Exponential backoff on failures
-- 💾 **Smart Caching** - Two-tier (user/global) cache with AI-directed scoping
-- 🎯 **TypeScript** - Full type safety with comprehensive types
-- 🛑 **Request Cancellation** - Support for AbortSignal
-- 🪝 **Webhooks** - Async notification callbacks
+// Non-strict mode - flexible output, Hydra heals as needed
+await client.generate({
+  chainId: 'any-chain',
+  prompt: 'Generate a creative story with metadata',
+  schema: storySchema,
+  strictJson: false, // Allow model to be creative, Hydra extracts JSON
+});
+```
+---
 ## Streaming (Recommended for Large Responses)
+Process JSON objects as they arrive — perfect for UIs that need instant feedback:
 ```typescript
 await client.generateStream(
   {
@@ -73,8 +124,12 @@ await client.generateStream(
 );
 ```
+---
 ## Batch Generation
+Process multiple prompts in parallel with automatic error handling:
 ```typescript
 const result = await client.generateBatch('my-chain', [
   { id: 'item1', prompt: 'Describe product A' },
@@ -85,54 +140,59 @@ const result = await client.generateBatch('my-chain', [
 console.log(`Processed ${result.summary.succeeded}/${result.summary.total}`);
 ```
-## Request Cancellation
+---
+## Caching
+### Cache Scope
+Control how responses are cached:
 ```typescript
-const controller = new AbortController();
+// Global cache (shared across users - default)
+await client.generate({ chainId: 'x', prompt: 'Facts', cacheScope: 'global' });
-// Cancel after 5 seconds
-setTimeout(() => controller.abort(), 5000);
+// User-scoped cache (private to authenticated user)
+await client.generate({ chainId: 'x', prompt: 'My profile', cacheScope: 'user' });
-try {
-  const result = await client.generate({
-    chainId: 'my-chain',
-    prompt: 'Long running prompt',
-    signal: controller.signal,
-  });
-} catch (error) {
-  if (error instanceof TimeoutError) {
-    console.log('Request was cancelled');
-  }
-}
+// Skip cache entirely
+await client.generate({ chainId: 'x', prompt: 'Random', cacheScope: 'skip' });
 ```
-## Cache Control
+### Cache Quality
+Control the trade-off between cache hit rate and precision:
+| Level | Behavior |
+|-------|----------|
+| `STANDARD` | Balanced fuzzy matching. Good for most cases. |
+| `HIGH` | Stricter matching. Higher quality hits, lower hit rate. |
+| `MAX_EFFICIENCY` | Aggressive matching. Maximum cost savings, less precision. |
 ```typescript
-// Global cache (shared across users - default)
-const result = await client.generate({
+await client.generate({
   chainId: 'my-chain',
-  prompt: 'Static content',
-  cacheScope: 'global',
+  prompt: 'Generate report',
+  cacheQuality: 'MAX_EFFICIENCY', // Maximize cache hits
 });
+```
-// User-scoped cache (private to user)
-const userResult = await client.generate({
-  chainId: 'my-chain',
-  prompt: 'Personalized content',
-  cacheScope: 'user',
-});
+### AI-Directed Caching
+The AI can override cache scope by including a `_cache` directive in its output:
-// Skip cache entirely
-const freshResult = await client.generate({
-  chainId: 'my-chain',
-  prompt: 'Always fresh',
-  cacheScope: 'skip',
-});
+```json
+{
+  "data": { "...": "..." },
+  "_cache": { "scope": "user" }
+}
 ```
+The directive is automatically stripped from your final response.
+---
 ## File Attachments
+Upload documents for analysis. Hydra handles type detection and model compatibility:
 ```typescript
 import fs from 'fs';
@@ -149,10 +209,53 @@ const result = await client.generate({
 });
 ```
+---
+## Request Cancellation
+Cancel long-running requests with `AbortSignal`:
+```typescript
+const controller = new AbortController();
+setTimeout(() => controller.abort(), 5000); // Cancel after 5s
+try {
+  const result = await client.generate({
+    chainId: 'my-chain',
+    prompt: 'Long task',
+    signal: controller.signal,
+  });
+} catch (error) {
+  if (error instanceof TimeoutError) {
+    console.log('Request was cancelled');
+  }
+}
+```
+---
+## Thinking Mode
+Enable reasoning models to show their thought process:
+```typescript
+const result = await client.generate({
+  chainId: 'reasoning-chain',
+  prompt: 'Solve this complex problem step by step',
+  options: {
+    thinkingMode: true,
+  },
+});
+```
+---
 ## Webhooks
+Register callbacks for async notifications:
 ```typescript
-// Register a webhook
+// Register
 await client.registerWebhook({
   requestId: 'req_123',
   url: 'https://your-domain.com/webhooks/hydra',
@@ -160,51 +263,44 @@ await client.registerWebhook({
   retryCount: 3,
 });
-// List webhooks
+// Manage
 const webhooks = await client.listWebhooks();
-// Update webhook
 await client.updateWebhook('webhook_id', { retryCount: 5 });
-// Unregister webhook
 await client.unregisterWebhook('webhook_id');
 ```
-## Thinking Mode (Reasoning Models)
+---
-```typescript
-const result = await client.generate({
-  chainId: 'reasoning-chain',
-  prompt: 'Solve this complex problem step by step',
-  options: {
-    thinkingMode: true, // Shows model reasoning process
-  },
-});
-```
+## Configuration Reference
-## Configuration
+### Client Options
 | Option | Type | Default | Description |
 |--------|------|---------|-------------|
 | `secretKey` | `string` | **required** | Your API key (`hyd_sk_...`) |
 | `baseUrl` | `string` | `http://localhost:3000` | API base URL |
-| `timeout` | `number` | `600000` | Request timeout (10 min) |
+| `timeout` | `number` | `600000` | Request timeout in ms (10 min) |
 | `maxRetries` | `number` | `3` | Max retry attempts |
 | `debug` | `boolean` | `false` | Enable debug logging |
-## Generate Options
+### Generate Options
 | Option | Type | Default | Description |
 |--------|------|---------|-------------|
 | `chainId` | `string` | **required** | Fallback chain ID |
 | `prompt` | `string` | **required** | The prompt to send |
 | `schema` | `object` | - | JSON schema for validation |
-| `cacheScope` | `'global' \| 'user' \| 'skip'` | `'global'` | Cache behavior |
+| `cacheScope` | `'global' \| 'user' \| 'skip'` | `'global'` | Cache sharing behavior |
+| `cacheQuality` | `'STANDARD' \| 'HIGH' \| 'MAX_EFFICIENCY'` | `'STANDARD'` | Cache match precision |
+| `strictJson` | `boolean` | `true` | Force strict JSON mode |
 | `signal` | `AbortSignal` | - | Cancellation signal |
-| `maxRetries` | `number` | Client setting | Override retries |
+| `maxRetries` | `number` | Client default | Override retries |
 | `requestId` | `string` | Auto-generated | Custom request ID |
 | `files` | `FileAttachment[]` | - | File attachments |
-| `useOptimized` | `boolean` | `true` | Use 3-step flow |
+| `useOptimized` | `boolean` | `true` | Use 3-step cost-saving flow |
+| `noCache` | `boolean` | `false` | Skip cache entirely |
+---
 ## API Methods
@@ -223,80 +319,81 @@ const result = await client.generate({
 | `listWebhooks()` | List all webhooks |
 | `updateWebhook(id, updates)` | Modify webhook config |
+---
 ## Error Handling
 ```typescript
-import {
+import {
   RateLimitError,
   TimeoutError,
   AuthenticationError,
   QuotaExceededError,
   WorkerError,
   FileProcessingError,
+  isRetryableError,
 } from 'hydra-aidirector';
 try {
-  const result = await client.generate({ ... });
+  const result = await client.generate({ chainId: 'x', prompt: 'y' });
 } catch (error) {
   if (error instanceof RateLimitError) {
     console.log(`Retry after ${error.retryAfterMs}ms`);
   } else if (error instanceof QuotaExceededError) {
     console.log(`Quota exceeded: ${error.used}/${error.limit} (${error.tier})`);
   } else if (error instanceof TimeoutError) {
-    console.log(`Request timed out after ${error.timeoutMs}ms`);
+    console.log(`Timed out after ${error.timeoutMs}ms`);
   } else if (error instanceof AuthenticationError) {
     console.log('Invalid API key');
   } else if (error instanceof WorkerError) {
-    console.log('Worker processing failed - will retry');
+    console.log('Worker failed - will retry');
   } else if (error instanceof FileProcessingError) {
     console.log(`File error: ${error.reason} - ${error.filename}`);
+  } else if (isRetryableError(error)) {
+    console.log('Transient error - safe to retry');
   }
 }
 ```
-## Self-Healing Reports
+---
-When `hydra-aidirector` fixes a malformed JSON response, it includes a `healingReport` in the data object:
+## Pricing
-```typescript
-const result = await client.generate({ ... });
+**BYOK (Bring Your Own Key)** — You pay AI providers directly. Hydra charges only for API access:
-if (result.success && result.meta.recovered) {
-  console.log('JSON was malformed but healed!');
-  console.log(result.data.healingReport);
-  // [{ original: "{name: 'foo'", healed: {name: "foo"}, fixes: ["Added missing brace"] }]
-}
-```
+| Tier | Price/mo | Requests | Overage |
+|------|----------|----------|---------|
+| Free | $0 | 1,000 | Blocked |
+| Starter | $9 | 25,000 | $0.50/1K |
+| Pro | $29 | 100,000 | $0.40/1K |
+| Scale | $79 | 500,000 | $0.30/1K |
-## AI-Directed Caching
+---
-The AI can control caching by including a `_cache` directive in its JSON output:
+## TypeScript Support
-```json
-{
-  "data": { ... },
-  "_cache": { "scope": "user" }
-}
-```
+Full type safety with comprehensive types:
-Scopes:
-- `global` - Share response across all users (default)
-- `user` - Cache per-user only
-- `skip` - Do not cache this response
+```typescript
+import type {
+  GenerateOptions,
+  GenerateResult,
+  StreamCallbacks,
+  FileAttachment,
+  ChainInfo,
+  ModelInfo,
+} from 'hydra-aidirector';
+```
-The directive is automatically stripped from the final response.
+---
-## Pricing
+## Requirements
-BYOK (Bring Your Own Key) - You pay for AI costs directly to providers.
+- Node.js 18+
+- TypeScript 5+ (optional but recommended)
-| Tier | Price | Requests | Overage |
-|------|-------|----------|---------|
-| Free | $0 | 1K | Blocked |
-| Starter | $9 | 25K | $0.50/1K |
-| Pro | $29 | 100K | $0.40/1K |
-| Scale | $79 | 500K | $0.30/1K |
+---
 ## License
-MIT
+MIT © [Hydra](https://hydrai.dev)
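
The pricing table added in the 1.6.0 README lends itself to a quick worked calculation. The following is a minimal sketch, not part of the SDK: `estimateMonthlyCents` and the `TIERS` map are hypothetical names, the figures are copied from the table, and the sketch assumes overage is billed proportionally per request (the README does not say whether partial blocks of 1,000 are rounded up).

```typescript
// Tier data copied from the pricing table in the 1.6.0 README.
// Amounts are kept in integer cents to avoid floating-point drift.
type TierName = 'Free' | 'Starter' | 'Pro' | 'Scale';

interface Tier {
  priceCents: number; // monthly base price
  includedRequests: number; // requests covered by the base price
  overageCentsPer1K: number | null; // null means overage is blocked
}

const TIERS: Record<TierName, Tier> = {
  Free: { priceCents: 0, includedRequests: 1000, overageCentsPer1K: null },
  Starter: { priceCents: 900, includedRequests: 25000, overageCentsPer1K: 50 },
  Pro: { priceCents: 2900, includedRequests: 100000, overageCentsPer1K: 40 },
  Scale: { priceCents: 7900, includedRequests: 500000, overageCentsPer1K: 30 },
};

// ASSUMPTION: overage is prorated per request rather than rounded up
// to whole blocks of 1,000 requests.
function estimateMonthlyCents(tier: TierName, requests: number): number | null {
  const t = TIERS[tier];
  const extra = Math.max(0, requests - t.includedRequests);
  if (extra === 0) return t.priceCents;
  if (t.overageCentsPer1K === null) return null; // beyond quota: blocked, no price
  return t.priceCents + Math.round((extra * t.overageCentsPer1K) / 1000);
}
```

Under these assumptions, 30,000 requests on the Starter tier would come to 900 + 5 × 50 = 1,150 cents ($11.50), while exceeding the Free tier's quota yields `null` because overage is blocked rather than billed.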

package/dist/index.d.mts CHANGED

@@ -116,6 +116,18 @@ interface GenerateOptions {
      * Set to 0 to disable retries for idempotent-sensitive operations.
      */
     maxRetries?: number;
+    /**
+     * Override cache quality logic for this request.
+     * - STANDARD: Balance of speed and quality (default)
+     * - HIGH: Prefer better matches even if slightly slower
+     * - MAX_EFFICIENCY: Aggressively cache to reduce costs and latency
+     */
+    cacheQuality?: 'STANDARD' | 'HIGH' | 'MAX_EFFICIENCY';
+    /**
+     * Force strict JSON usage for providers that support it.
+     * If true, ensures the model outputs valid JSON conforming to the schema.
+     */
+    strictJson?: boolean;
     /**
      * Client-generated request ID for tracing and debugging.
      * If not provided, one will be generated automatically.
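
The hunk above adds two optional fields, `cacheQuality` and `strictJson`, to `GenerateOptions`. As a rough illustration of how a caller might reason about their effective values, here is a self-contained sketch. It mirrors the two fields in a local interface rather than importing the package, and `NewGenerateFields` and `resolveNewFields` are hypothetical names. The defaults (`'STANDARD'` and `true`) are taken from the README's Generate Options table; where the SDK actually applies them is not shown in this diff.

```typescript
// Local mirror of the two fields added to GenerateOptions in this diff.
type CacheQuality = 'STANDARD' | 'HIGH' | 'MAX_EFFICIENCY';

interface NewGenerateFields {
  cacheQuality?: CacheQuality;
  strictJson?: boolean;
}

// ASSUMPTION: defaults come from the README table; the real SDK may
// resolve them elsewhere (for example on the server).
function resolveNewFields(opts: NewGenerateFields): Required<NewGenerateFields> {
  return {
    cacheQuality: opts.cacheQuality ?? 'STANDARD',
    strictJson: opts.strictJson ?? true,
  };
}

// Example: opt out of strict JSON while keeping the default cache quality.
const effective = resolveNewFields({ strictJson: false });
```

Because both fields are optional, code written against 1.4.2 should keep type-checking against 1.6.0 without changes.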

package/dist/index.d.ts CHANGED

@@ -116,6 +116,18 @@ interface GenerateOptions {
      * Set to 0 to disable retries for idempotent-sensitive operations.
      */
     maxRetries?: number;
+    /**
+     * Override cache quality logic for this request.
+     * - STANDARD: Balance of speed and quality (default)
+     * - HIGH: Prefer better matches even if slightly slower
+     * - MAX_EFFICIENCY: Aggressively cache to reduce costs and latency
+     */
+    cacheQuality?: 'STANDARD' | 'HIGH' | 'MAX_EFFICIENCY';
+    /**
+     * Force strict JSON usage for providers that support it.
+     * If true, ensures the model outputs valid JSON conforming to the schema.
+     */
+    strictJson?: boolean;
     /**
      * Client-generated request ID for tracing and debugging.
      * If not provided, one will be generated automatically.
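
The context lines of this hunk note that `maxRetries` can be set to 0 to disable retries. The sketch below shows one way such a setting could drive a retry loop with exponential backoff. It is illustrative only: `backoffDelays` and `withRetries` are hypothetical helpers, the 500 ms base and 8,000 ms cap are made-up values, and the SDK's own retry schedule is not visible in this diff.

```typescript
// Hypothetical helper: the wait (in ms) before each retry attempt.
// maxRetries = 0 yields an empty schedule, i.e. no retries at all.
function backoffDelays(maxRetries: number, baseMs = 500, capMs = 8000): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(Math.min(capMs, baseMs * 2 ** attempt));
  }
  return delays;
}

// Runs `task`, retrying only errors that `isRetryable` accepts.
// The caller supplies the predicate; nothing here depends on the SDK.
async function withRetries<T>(
  task: () => Promise<T>,
  maxRetries: number,
  isRetryable: (err: unknown) => boolean,
): Promise<T> {
  const delays = backoffDelays(maxRetries);
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Give up when the schedule is exhausted or the error is permanent.
      if (attempt >= delays.length || !isRetryable(err)) throw err;
      await new Promise<void>((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}
```

With `maxRetries` set to 3 the schedule under these assumed values is 500, 1000, and 2000 ms; with 0 the first failure is rethrown immediately.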

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
     "name": "@altsafe/aidirector",
-    "version": "1.4.2",
+    "version": "1.6.0",
     "description": "Official TypeScript SDK for Hydra - Intelligent AI API Gateway with automatic failover, caching, and JSON extraction",
     "main": "./dist/index.js",
     "module": "./dist/index.mjs",
@@ -85,4 +85,4 @@
     "engines": {
         "node": ">=18.0.0"
     }
-}
+}