compress-lightreach 1.0.1 → 1.0.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +330 -80
- package/package.json +1 -1
package/README.md
CHANGED

@@ -1,25 +1,26 @@
 # Compress Light Reach
 
-**
+**AI cost management SDK with intelligent model routing, prompt compression, and real-time token tracking**
 
 [](https://badge.fury.io/js/compress-lightreach)
 [](https://nodejs.org/)
 [](https://opensource.org/licenses/MIT)
 
-Compress Light Reach is a Node.js/TypeScript
+Compress Light Reach is a Node.js/TypeScript SDK that provides intelligent model routing and prompt compression for LLM applications, reducing token usage and costs while maintaining quality.
 
 ## Features
 
-- **
-- **
+- **Intelligent Model Routing**: Automatically selects optimal model based on quality requirements (HLE) and available provider keys
+- **Token-aware Compression**: Replaces repeated substrings with shorter placeholders
+- **Dual Algorithms**:
   - Fast greedy (~99% optimal) for daily use
   - Optimal DP (O(n²)) for critical prompts
 - **Lossless**: Perfect decompression guaranteed
-- **Output
-- **Cloud API**: Uses Light Reach's cloud service for compression
-- **
+- **Output Compression**: Optional model output compression support
+- **Cloud API**: Uses Light Reach's cloud service for compression and routing
+- **Multi-provider Support**: OpenAI, Anthropic, Google, DeepSeek, Moonshot
 - **TypeScript**: Full TypeScript support with type definitions
-- **
+- **BYOK**: Provider API keys managed securely in dashboard (never passed through SDK)
 
 ## Installation
 
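As an aside on the "repeated substrings replaced with shorter placeholders" idea in the README's Features list, here is a rough standalone sketch of that style of compression. This is NOT the Light Reach implementation (which is token-aware and runs server-side); the function names, the `§n` placeholder scheme, and all parameter choices here are invented for illustration, and the sketch assumes the placeholder characters never occur in the input text.

```typescript
// HYPOTHETICAL greedy placeholder compression, for intuition only.
// Repeated fixed-length windows are swapped for short placeholders plus a
// dictionary; decompression replays the dictionary to recover the exact input.
function greedyCompress(
  text: string,
  minLen = 8,
  maxRounds = 16,
): { compressed: string; dictionary: Record<string, string> } {
  const dictionary: Record<string, string> = {};
  let compressed = text;

  for (let round = 0; round < maxRounds; round++) {
    // Count every window of length minLen in the current text.
    const counts = new Map<string, number>();
    for (let i = 0; i + minLen <= compressed.length; i++) {
      const sub = compressed.slice(i, i + minLen);
      counts.set(sub, (counts.get(sub) ?? 0) + 1);
    }

    // Pick the repeated window with the largest character payoff.
    let best: string | null = null;
    let bestGain = 0;
    for (const [sub, n] of counts) {
      const gain = (n - 1) * sub.length;
      if (n > 1 && gain > bestGain) {
        best = sub;
        bestGain = gain;
      }
    }
    if (best === null) break; // nothing repeats any more

    // Assumes "§<round>" never appears in the input, so replacement is lossless.
    const placeholder = `§${round}`;
    dictionary[placeholder] = best;
    compressed = compressed.split(best).join(placeholder);
  }

  return { compressed, dictionary };
}

function decompress(compressed: string, dictionary: Record<string, string>): string {
  let out = compressed;
  // Later placeholders may expand into earlier ones, so undo in reverse order.
  for (const key of Object.keys(dictionary).reverse()) {
    out = out.split(key).join(dictionary[key]);
  }
  return out;
}
```

The real SDK measures savings in tokens rather than characters and exposes this via `client.compress()` / `client.decompress()`; this sketch only shows why the scheme is lossless by construction.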
@@ -33,12 +34,12 @@ or
 yarn add compress-lightreach
 ```
 
-## Quick Start
+## Quick Start
 
 The SDK uses **intelligent model routing** and targets `POST /api/v2/complete`.
 
-- Authenticate with your **LightReach API key** (env var `PCOMPRESLR_API_KEY`)
-- Manage **provider keys** (OpenAI/Anthropic/Google) in the dashboard (BYOK)
+- Authenticate with your **LightReach API key** (env var `PCOMPRESLR_API_KEY` or `LIGHTREACH_API_KEY`)
+- Manage **provider keys** (OpenAI/Anthropic/Google/etc.) in the dashboard (BYOK)
 - System automatically selects optimal model based on your requirements
 
 ```typescript
@@ -51,7 +52,7 @@ const result = await client.complete({
     { role: 'system', content: 'You are a helpful assistant.' },
     { role: 'user', content: 'Explain quantum computing in simple terms.' },
   ],
-
+  desired_hle: 30, // Quality preference (0-40, where 40 is SOTA)
 });
 
 console.log(result.decompressed_response);
@@ -64,14 +65,14 @@ console.log(`Token savings: ${result.compression_stats.token_savings}`);
 ```typescript
 const result = await client.complete({
   messages: [{ role: 'user', content: 'Generate a long report...' }],
-
-
+  desired_hle: 25,
+  compress_output: true,
 });
 
 console.log(result.decompressed_response);
 ```
 
-### Intelligent Model Routing
+### Intelligent Model Routing
 
 The system automatically selects the optimal model based on quality requirements and your available provider keys:
 
@@ -82,14 +83,14 @@ const client = new PcompresslrAPIClient("your-lightreach-api-key");
 
 // Cross-provider optimization: system picks cheapest model meeting your quality bar
 const result = await client.complete({
-  messages: [{role: 'user', content: 'Explain quantum computing'}],
-
+  messages: [{ role: 'user', content: 'Explain quantum computing' }],
+  desired_hle: 30, // Quality preference (0-40, where 40 is SOTA)
 });
 
 // Check what was selected
-console.log(result.routing_info?.selected_model);
-console.log(result.routing_info?.selected_provider);
-console.log(result.routing_info?.model_hle);
+console.log(result.routing_info?.selected_model); // e.g., "gpt-4o-mini"
+console.log(result.routing_info?.selected_provider); // e.g., "openai"
+console.log(result.routing_info?.model_hle); // e.g., 32.5
 console.log(result.routing_info?.model_price_per_million); // e.g., 0.15
 ```
 
@@ -100,24 +101,24 @@ Optionally constrain to a specific provider:
 ```typescript
 // Only use OpenAI models, but pick the cheapest one meeting HLE 35
 const result = await client.complete({
-  messages: [{role: 'user', content: 'Write a poem'}],
-
-
+  messages: [{ role: 'user', content: 'Write a poem' }],
+  llm_provider: 'openai', // Optional: constrain to one provider
+  desired_hle: 35,
 });
 ```
 
 ### HLE Cascading with Admin Controls
 
-Admins can set quality **ceilings** via the dashboard (global or per-tag) to control costs. Your `
+Admins can set quality **ceilings** via the dashboard (global or per-tag) to control costs. Your `desired_hle` is a preference, but requests will error if they exceed the admin-set ceiling:
 
 ```typescript
 // Admin set global HLE ceiling to 30%
 // Requesting above the ceiling will error
 try {
   const result = await client.complete({
-    messages: [{role: 'user', content: 'Process payment'}],
-
-    tags: {env: 'production'},
+    messages: [{ role: 'user', content: 'Process payment' }],
+    desired_hle: 35, // ERROR: exceeds ceiling of 30
+    tags: { env: 'production' },
   });
 } catch (e) {
   console.error(e.message); // "Requested HLE 35% exceeds workspace maximum of 30%"
@@ -125,9 +126,9 @@ try {
 
 // Correct usage: request within ceiling
 const result = await client.complete({
-  messages: [{role: 'user', content: 'Process payment'}],
-
-  tags: {env: 'production'},
+  messages: [{ role: 'user', content: 'Process payment' }],
+  desired_hle: 25, // OK: below ceiling of 30
+  tags: { env: 'production' },
 });
 
 // Check if your HLE was lowered by admin ceiling
@@ -138,6 +139,61 @@ if (result.routing_info?.hle_clamped) {
 }
 ```
 
+### Using the LightReach Wrapper Class
+
+For a more ergonomic API with camelCase parameters, use the `LightReach` class:
+
+```typescript
+import { LightReach } from 'compress-lightreach';
+
+const client = new LightReach({
+  apiKey: 'your-lightreach-api-key',
+  defaultModel: 'gpt-4',
+  defaultProvider: 'openai',
+  useOptimal: false, // Use greedy algorithm by default
+});
+
+const result = await client.complete({
+  messages: [{ role: 'user', content: 'Hello!' }],
+  compress: true,
+  compressOutput: false,
+  compressionConfig: {
+    compressSystem: false,
+    compressUser: true,
+    compressAssistant: false,
+    compressOnlyLastNUser: 1,
+  },
+  temperature: 0.7,
+  maxTokens: 1000,
+  tags: { env: 'production' },
+});
+
+console.log(result.decompressed_response);
+```
+
+### Compression Only (No LLM Call)
+
+```typescript
+import { PcompresslrAPIClient } from 'compress-lightreach';
+
+const client = new PcompresslrAPIClient("your-lightreach-api-key");
+
+// Compress text without making an LLM call
+const compressed = await client.compress(
+  "Your text with repeated content here...",
+  "gpt-4",       // Model for tokenization
+  "greedy",      // Algorithm: 'greedy' or 'optimal'
+  { env: 'dev' } // Optional tags
+);
+
+console.log(compressed.llm_format);
+console.log(`Compression ratio: ${compressed.compression_ratio}`);
+
+// Decompress later
+const decompressed = await client.decompress(compressed.llm_format);
+console.log(decompressed.decompressed);
+```
+
 ### Command Line Interface
 
 ```bash
@@ -145,7 +201,6 @@ if (result.routing_info?.hle_clamped) {
 export PCOMPRESLR_API_KEY=your-api-key
 
 # Compress a prompt
-# Run the CLI directly (the published binary name is `pcompresslr`)
 npx pcompresslr "Your prompt with repeated text here..."
 
 # Use optimal algorithm only
@@ -161,67 +216,239 @@ npx pcompresslr "Your prompt here" --greedy-only
 
 Main API client for intelligent model routing and compression.
 
-#### Constructor
+#### Constructor
+
+```typescript
+new PcompresslrAPIClient(apiKey?: string, apiUrl?: string, timeout?: number)
+```
 
-
-- `
-- `
+**Parameters:**
+- `apiKey` (string, optional): LightReach API key. Falls back to `LIGHTREACH_API_KEY` or `PCOMPRESLR_API_KEY` env vars.
+- `apiUrl` (string, optional): Override base API URL. Falls back to `PCOMPRESLR_API_URL` env var. Default: `https://api.compress.lightreach.io`
+- `timeout` (number, optional): Request timeout in milliseconds. Default: `120000` (2 minutes)
 
 #### Methods
 
-##### `complete(request)
+##### `complete(request: CompleteV2Request): Promise<CompleteResponse>`
 
 Messages-first completion with intelligent routing (POST `/api/v2/complete`).
 
-**Parameters (CompleteV2Request):**
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+**Request Parameters (`CompleteV2Request`):**
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `messages` | `Message[]` | required | Conversation history with `role` and `content` |
+| `llm_provider` | `'openai' \| 'anthropic' \| 'google' \| 'deepseek' \| 'moonshot'` | — | Optional provider constraint. Omit for cross-provider optimization |
+| `desired_hle` | `number` | — | Quality preference (0-40, where 40 is SOTA). Must not exceed admin ceilings |
+| `compress` | `boolean` | `true` | Whether to compress messages |
+| `compress_output` | `boolean` | `false` | Whether to request compressed output from LLM |
+| `algorithm` | `'greedy' \| 'optimal'` | `'greedy'` | Compression algorithm |
+| `compression_config` | `object` | — | Per-role compression settings (see below) |
+| `temperature` | `number` | — | LLM temperature parameter |
+| `max_tokens` | `number` | — | Maximum tokens to generate |
+| `tags` | `Record<string, string>` | — | Tags for cost attribution and tag-level HLE ceilings |
+| `max_history_messages` | `number` | — | Limit conversation history length |
+
+**`compression_config` options:**
+
+```typescript
+{
+  compress_system?: boolean;    // default: false
+  compress_user?: boolean;      // default: true
+  compress_assistant?: boolean; // default: false
+  compress_only_last_n_user?: number | null; // default: 1
+}
+```
+
+**Response (`CompleteResponse`):**
+
+```typescript
+{
+  decompressed_response: string; // Final decompressed LLM response
+  compression_stats: {
+    original_size_chars: number;
+    compressed_size_chars: number;
+    original_tokens: number;
+    compressed_tokens: number;
+    compression_ratio: number;
+    token_savings: number;
+    token_savings_percent: number;
+    processing_time_ms?: number;
+  };
+  llm_stats: {
+    prompt_tokens: number;
+    completion_tokens: number;
+    total_tokens: number;
+  };
+  routing_info?: {
+    selected_model: string;    // Model chosen by system
+    selected_provider: string; // Provider chosen by system
+    selected_model_id: string;
+    model_hle: number;         // HLE score of selected model
+    model_price_per_million: number;
+    requested_hle: number | null;
+    effective_hle: number | null; // Effective HLE after admin ceilings
+    hle_source: 'request' | 'tag' | 'global' | 'none';
+    hle_clamped: boolean; // true if admin ceiling lowered your desired_hle
+  };
+  warnings?: string[];
+
+  // Convenience aliases
+  text?: string;              // Alias for decompressed_response
+  tokens_saved?: number;      // Alias for compression_stats.token_savings
+  tokens_used?: number;       // Alias for llm_stats.total_tokens
+  compression_ratio?: number; // Alias for compression_stats.compression_ratio
+}
+```
+
+##### `compress(prompt, model?, algorithm?, tags?): Promise<CompressResponse>`
 
 Compression-only (POST `/api/v1/compress`).
 
-
+**Parameters:**
+- `prompt` (string, required): Text to compress
+- `model` (string, optional): Model for tokenization. Default: `'gpt-4'`
+- `algorithm` (`'greedy' | 'optimal'`, optional): Compression algorithm. Default: `'greedy'`
+- `tags` (`Record<string, string>`, optional): Tags for attribution
+
+**Response (`CompressResponse`):**
+
+```typescript
+{
+  compressed: string;
+  dictionary: Record<string, string>;
+  llm_format: string;
+  compression_ratio: number;
+  original_size: number;
+  compressed_size: number;
+  processing_time_ms: number;
+  algorithm: string;
+}
+```
+
+##### `decompress(llmFormat): Promise<DecompressResponse>`
 
 Decompress an LLM-formatted compressed prompt (POST `/api/v1/decompress`).
 
-
+**Parameters:**
+- `llmFormat` (string, required): The `llm_format` string from a compress response
+
+**Response (`DecompressResponse`):**
+
+```typescript
+{
+  decompressed: string;
+  processing_time_ms: number;
+}
+```
+
+##### `healthCheck(): Promise<HealthCheckResponse>`
 
 Check API health status (GET `/health`).
 
+**Response:**
+
+```typescript
+{
+  status: string;
+  version?: string;
+}
+```
+
+### `LightReach` Class
+
+Convenience wrapper with camelCase parameters.
+
+#### Constructor
+
+```typescript
+new LightReach(options?: {
+  apiKey?: string;
+  apiUrl?: string;
+  defaultModel?: string;   // Default: 'gpt-4'
+  defaultProvider?: 'openai' | 'anthropic' | 'google'; // Default: 'openai'
+  useOptimal?: boolean;    // Default: false (use greedy)
+})
+```
+
+#### Methods
+
+##### `complete(options: CompleteOptions): Promise<CompleteResponse>`
+
+```typescript
+interface CompleteOptions {
+  messages: Message[];
+  model?: string;
+  provider?: 'openai' | 'anthropic' | 'google';
+  compress?: boolean;
+  compressionConfig?: {
+    compressSystem?: boolean;
+    compressUser?: boolean;
+    compressAssistant?: boolean;
+    compressOnlyLastNUser?: number | null;
+  };
+  compressOutput?: boolean;
+  useOptimal?: boolean;
+  temperature?: number;
+  maxTokens?: number;
+  tags?: Record<string, string>;
+  maxHistoryMessages?: number;
+}
+```
+
+##### `compress(text, options?): Promise<CompressResponse>`
+
+```typescript
+await client.compress(text, {
+  model?: string;
+  algorithm?: 'greedy' | 'optimal';
+  tags?: Record<string, string>;
+});
+```
+
+### Message Types
+
+```typescript
+type MessageRole = 'system' | 'developer' | 'user' | 'assistant';
+
+interface Message {
+  role: MessageRole;
+  content: string;
+}
+```
+
 ### Environment Variables
 
-
-
+| Variable | Description |
+|----------|-------------|
+| `PCOMPRESLR_API_KEY` | Your LightReach API key (primary) |
+| `LIGHTREACH_API_KEY` | Your LightReach API key (alternative) |
+| `PCOMPRESLR_API_URL` | Override the API base URL (advanced/testing) |
 
 ### Exceptions
 
-
-
-
-
+| Exception | Description |
+|-----------|-------------|
+| `PcompresslrAPIError` | Base exception class |
+| `APIKeyError` | Invalid or missing API key |
+| `RateLimitError` | Rate limit exceeded |
+| `APIRequestError` | General API errors (including routing failures) |
+
+```typescript
+import { APIKeyError, RateLimitError, APIRequestError } from 'compress-lightreach';
+
+try {
+  const result = await client.complete({ messages: [...] });
+} catch (error) {
+  if (error instanceof APIKeyError) {
+    console.error('Invalid API key');
+  } else if (error instanceof RateLimitError) {
+    console.error('Rate limited, please retry later');
+  } else if (error instanceof APIRequestError) {
+    console.error('API error:', error.message);
+  }
+}
+```
 
 ## How It Works
 
@@ -237,7 +464,7 @@ The library:
 
 ## Examples
 
-### Example 1:
+### Example 1: Complete with Compression
 
 ```typescript
 import { PcompresslrAPIClient } from 'compress-lightreach';
@@ -250,10 +477,9 @@ Write a story about a dog. The dog is very friendly.
 Write a story about a bird. The bird is very friendly.
 `;
 
-// One call handles compression, LLM request, and decompression
 const result = await client.complete({
   messages: [{ role: "user", content: prompt }],
-
+  desired_hle: 30,
 });
 
 console.log(result.decompressed_response);
@@ -262,22 +488,46 @@ console.log(`Token savings: ${result.compression_stats.token_savings} tokens`);
 console.log(`Compression ratio: ${(result.compression_stats.compression_ratio * 100).toFixed(2)}%`);
 ```
 
-### Example 2:
+### Example 2: Output Compression
 
 ```typescript
 import { PcompresslrAPIClient } from 'compress-lightreach';
 
 const client = new PcompresslrAPIClient("your-lightreach-api-key");
 
-// Complete with output compression - response is automatically decompressed
 const result = await client.complete({
   messages: [{ role: "user", content: "Generate a long report with repeated sections..." }],
-
-
+  desired_hle: 35,
+  compress_output: true,
 });
+
 console.log(result.decompressed_response);
 ```
 
+### Example 3: Multi-turn Conversation
+
+```typescript
+import { PcompresslrAPIClient } from 'compress-lightreach';
+
+const client = new PcompresslrAPIClient("your-lightreach-api-key");
+
+const result = await client.complete({
+  messages: [
+    { role: "system", content: "You are a helpful coding assistant." },
+    { role: "user", content: "How do I read a file in Python?" },
+    { role: "assistant", content: "You can use open() with a context manager..." },
+    { role: "user", content: "How about writing to a file?" },
+  ],
+  desired_hle: 30,
+  compression_config: {
+    compress_system: false,
+    compress_user: true,
+    compress_assistant: false,
+    compress_only_last_n_user: 2, // Only compress last 2 user messages
+  },
+});
+```
+
 ## Getting an API Key
 
 To use Compress Light Reach, you need an API key from [compress.lightreach.io](https://compress.lightreach.io).
@@ -289,7 +539,7 @@ To use Compress Light Reach, you need an API key from [compress.lightreach.io](h
 
 ## Security & Privacy
 
-**BYOK model:** Provider keys (OpenAI/Anthropic/Google) are managed in the dashboard and **never passed through this SDK**. The SDK only uses your LightReach API key for authentication with the service.
+**BYOK model:** Provider keys (OpenAI/Anthropic/Google/etc.) are managed in the dashboard and **never passed through this SDK**. The SDK only uses your LightReach API key for authentication with the service.
 
 ## Requirements
 
@@ -298,7 +548,7 @@ To use Compress Light Reach, you need an API key from [compress.lightreach.io](h
 
 ## License
 
-MIT License - see [LICENSE](
+MIT License - see [LICENSE](./LICENSE) file for details.
 
 ## Support
 
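The `routing_info` fields in the README above (`requested_hle`, `effective_hle`, `hle_source`, `hle_clamped`) imply a precedence order between the request's preference and the admin-set tag and global ceilings. The sketch below is speculative, not the LightReach service's actual logic: it assumes a tag-level ceiling overrides the global one, and it clamps out-of-range requests (the README also shows a variant that throws instead).

```typescript
// SPECULATIVE sketch of the HLE ceiling cascade implied by routing_info.
// All names and the clamping behaviour are assumptions for illustration.
type HleSource = 'request' | 'tag' | 'global' | 'none';

interface HleResolution {
  effectiveHle: number | null;
  hleSource: HleSource;
  hleClamped: boolean;
}

function resolveHle(
  requestedHle: number | null,
  tagCeiling: number | null,
  globalCeiling: number | null,
): HleResolution {
  // Tightest applicable admin ceiling; tag beats global in this sketch.
  const ceiling = tagCeiling ?? globalCeiling;
  const ceilingSource: HleSource =
    tagCeiling !== null ? 'tag' : globalCeiling !== null ? 'global' : 'none';

  if (requestedHle === null) {
    // No preference in the request: fall back to the admin ceiling, if any.
    return { effectiveHle: ceiling, hleSource: ceilingSource, hleClamped: false };
  }
  if (ceiling !== null && requestedHle > ceiling) {
    // Request exceeds the ceiling: lower it and flag hle_clamped.
    return { effectiveHle: ceiling, hleSource: ceilingSource, hleClamped: true };
  }
  return { effectiveHle: requestedHle, hleSource: 'request', hleClamped: false };
}
```

For example, `resolveHle(35, null, 30)` clamps to 30 with `hleSource: 'global'`, matching the "Requested HLE 35% exceeds workspace maximum of 30%" scenario in the README, except that the real service rejects the request outright.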
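Since the README's Exceptions section distinguishes `RateLimitError` from other failures, callers typically retry only that case with backoff. Below is a generic, self-contained sketch: the `RateLimitError` class is redefined locally so the snippet runs standalone (in real code you would import it from `compress-lightreach`), and the helper name and delay values are illustrative.

```typescript
// Local stand-in for the SDK's RateLimitError, so this sketch is self-contained.
class RateLimitError extends Error {}

// Retry a rate-limited async call with exponential backoff.
async function withRetries<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Only rate limits are retried; other errors propagate immediately.
      if (!(err instanceof RateLimitError) || attempt >= maxAttempts) throw err;
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** (attempt - 1)));
    }
  }
}
```

Usage would look like `await withRetries(() => client.complete({ messages }))`, where `client` is a `PcompresslrAPIClient` from the README.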
package/package.json
CHANGED