compress-lightreach 0.1.4 → 1.0.0

package/README.md CHANGED
@@ -10,11 +10,16 @@ Compress Light Reach is a Node.js/TypeScript library that intelligently compress
10
10
 
11
11
  ## Features
12
12
 
13
+ - **Token-aware compression**: Only replaces substrings >1 token with 1-token placeholders
14
+ - **Dual algorithms**:
15
+ - Fast greedy (~99% optimal) for daily use
16
+ - Optimal DP (O(n²)) for critical prompts
13
17
  - **Lossless**: Perfect decompression guaranteed
14
18
  - **Output compression**: Optional model output compression support
15
19
  - **Cloud API**: Uses Light Reach's cloud service for compression
16
20
  - **Model-aware**: Optimized for GPT-4, GPT-3.5-turbo, Claude, and more
17
21
  - **TypeScript**: Full TypeScript support with type definitions
22
+ - **Intelligent Routing**: Automatic model selection based on quality requirements
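The token-aware rule in the feature list (only substrings longer than one token are replaced, and each placeholder costs one token) can be illustrated with a toy calculation. This is a minimal sketch, not the library's algorithm: `countTokens` is a whitespace-splitting stand-in for a real model tokenizer, and the dictionary overhead (the placeholder plus the substring stored once) is an assumed cost model.

```typescript
// Stand-in tokenizer (assumption): real model tokenizers split text differently.
function countTokens(text: string): number {
  return text.split(/\s+/).filter((t) => t.length > 0).length;
}

// Net tokens saved by replacing every occurrence of `substring` with a
// 1-token placeholder. The dictionary cost is a hypothetical overhead model.
function estimateSavings(substring: string, occurrences: number): number {
  const length = countTokens(substring);
  if (length <= 1) {
    return 0; // a 1-token substring cannot get any shorter
  }
  const savedInBody = occurrences * (length - 1); // each occurrence shrinks to 1 token
  const dictionaryCost = length + 1; // substring stored once, plus its placeholder
  return savedInBody - dictionaryCost;
}
```

Under this cost model a replacement only pays off when the substring repeats often enough to cover its dictionary entry, which is why a negative result means the substring should be left alone.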
18
23
 
19
24
  ## Installation
20
25
 
@@ -28,50 +33,109 @@ or
28
33
  yarn add compress-lightreach
29
34
  ```
30
35
 
31
- ## Quick Start
36
+ ## Quick Start (v1.0.0)
32
37
 
33
- The `complete()` method is the only public interface for using Compress Light Reach. It handles compression, LLM call, and decompression in one request. Clients never see compressed prompts or dictionaries - everything is handled internally.
38
+ The SDK uses **intelligent model routing** and targets `POST /api/v2/complete`.
39
+
40
+ - Authenticate with your **LightReach API key** (env var `PCOMPRESLR_API_KEY`)
41
+ - Manage **provider keys** (OpenAI/Anthropic/Google) in the dashboard (BYOK)
42
+ - System automatically selects optimal model based on your requirements
34
43
 
35
44
  ```typescript
36
- import { Pcompresslr } from 'compress-lightreach';
37
-
38
- // Initialize compressor with LLM provider credentials
39
- const compressor = new Pcompresslr({
40
- model: "gpt-4",
41
- apiKey: "your-compress-api-key", // Or set PCOMPRESLR_API_KEY env var
42
- llmProvider: "openai", // or "anthropic"
43
- llmApiKey: "your-llm-api-key" // OpenAI or Anthropic API key
45
+ import { PcompresslrAPIClient } from 'compress-lightreach';
46
+
47
+ const client = new PcompresslrAPIClient("your-lightreach-api-key");
48
+
49
+ const result = await client.complete({
50
+ messages: [
51
+ { role: 'system', content: 'You are a helpful assistant.' },
52
+ { role: 'user', content: 'Explain quantum computing in simple terms.' },
53
+ ],
54
+ desiredHle: 30, // Quality preference (0-40, where 40 is SOTA)
44
55
  });
45
56
 
46
- // Complete a prompt (compresses, calls LLM, decompresses)
47
- const result = await compressor.complete("Your long prompt with repeated text here...");
57
+ console.log(result.decompressed_response);
58
+ console.log(`Selected: ${result.routing_info?.selected_model}`);
59
+ console.log(`Token savings: ${result.compression_stats.token_savings}`);
60
+ ```
61
+
62
+ ### With Output Compression
63
+
64
+ ```typescript
65
+ const result = await client.complete({
66
+ messages: [{ role: 'user', content: 'Generate a long report...' }],
67
+ desiredHle: 25,
68
+ compressOutput: true,
69
+ });
48
70
 
49
- // Get the final decompressed response
50
71
  console.log(result.decompressed_response);
72
+ ```
51
73
 
52
- // View compression statistics
53
- console.log(`Token savings: ${result.compression_stats.token_savings} tokens`);
54
- console.log(`Compression ratio: ${(result.compression_stats.compression_ratio * 100).toFixed(2)}%`);
74
+ ### Intelligent Model Routing (v1.0.0)
75
+
76
+ The system automatically selects the optimal model based on quality requirements and your available provider keys:
77
+
78
+ ```typescript
79
+ import { PcompresslrAPIClient } from 'compress-lightreach';
80
+
81
+ const client = new PcompresslrAPIClient("your-lightreach-api-key");
82
+
83
+ // Cross-provider optimization: system picks cheapest model meeting your quality bar
84
+ const result = await client.complete({
85
+ messages: [{role: 'user', content: 'Explain quantum computing'}],
86
+ desiredHle: 30, // Quality preference (0-40, where 40 is SOTA)
87
+ });
88
+
89
+ // Check what was selected
90
+ console.log(result.routing_info?.selected_model); // e.g., "gpt-4o-mini"
91
+ console.log(result.routing_info?.selected_provider); // e.g., "openai"
92
+ console.log(result.routing_info?.model_hle); // e.g., 32.5
93
+ console.log(result.routing_info?.model_price_per_million); // e.g., 0.15
55
94
  ```
56
95
 
57
- ### With Output Compression
96
+ ### Provider-Constrained Routing
97
+
98
+ Optionally constrain to a specific provider:
58
99
 
59
100
  ```typescript
60
- import { Pcompresslr } from 'compress-lightreach';
61
-
62
- const compressor = new Pcompresslr({
63
- model: "gpt-4",
64
- apiKey: "your-api-key",
65
- llmProvider: "openai",
66
- llmApiKey: "your-llm-api-key",
67
- compressOutput: true // Enable output compression
101
+ // Only use OpenAI models, but pick the cheapest one meeting HLE 35
102
+ const result = await client.complete({
103
+ messages: [{role: 'user', content: 'Write a poem'}],
104
+ llmProvider: 'openai', // Optional: constrain to one provider
105
+ desiredHle: 35,
68
106
  });
107
+ ```
69
108
 
70
- // Complete with output compression enabled
71
- const result = await compressor.complete("Generate a long report with repeated sections...");
109
+ ### HLE Cascading with Admin Controls
72
110
 
73
- // Response is automatically decompressed
74
- console.log(result.decompressed_response);
111
+ Admins can set quality **ceilings** via the dashboard (global or per-tag) to control costs. Your `desiredHle` is a preference, but requests will error if they exceed the admin-set ceiling:
112
+
113
+ ```typescript
114
+ // Admin set global HLE ceiling to 30%
115
+ // Requesting above the ceiling will error
116
+ try {
117
+ const result = await client.complete({
118
+ messages: [{role: 'user', content: 'Process payment'}],
119
+ desiredHle: 35, // ❌ ERROR: exceeds ceiling of 30
120
+ tags: {env: 'production'},
121
+ });
122
+ } catch (e) {
123
+ console.error(e.message); // "Requested HLE 35% exceeds workspace maximum of 30%"
124
+ }
125
+
126
+ // Correct usage: request within ceiling
127
+ const result = await client.complete({
128
+ messages: [{role: 'user', content: 'Process payment'}],
129
+ desiredHle: 25, // ✅ OK: below ceiling of 30
130
+ tags: {env: 'production'},
131
+ });
132
+
133
+ // Check if your HLE was lowered by admin ceiling
134
+ if (result.routing_info?.hle_clamped) {
135
+ console.log(`HLE lowered from ${result.routing_info.requested_hle} ` +
136
+ `to ${result.routing_info.effective_hle} ` +
137
+ `by ${result.routing_info.hle_source}-level ceiling`);
138
+ }
75
139
  ```
76
140
 
77
141
  ### Command Line Interface
@@ -81,50 +145,82 @@ console.log(result.decompressed_response);
81
145
  export PCOMPRESLR_API_KEY=your-api-key
82
146
 
83
147
  # Compress a prompt
84
- npx compress-lightreach "Your prompt with repeated text here..."
148
+ # Run the CLI directly (the published binary name is `pcompresslr`)
149
+ npx pcompresslr "Your prompt with repeated text here..."
85
150
 
86
151
  # Use optimal algorithm only
87
- npx compress-lightreach "Your prompt here" --optimal-only
152
+ npx pcompresslr "Your prompt here" --optimal-only
88
153
 
89
154
  # Use greedy algorithm only
90
- npx compress-lightreach "Your prompt here" --greedy-only
155
+ npx pcompresslr "Your prompt here" --greedy-only
91
156
  ```
92
157
 
93
158
  ## API Reference
94
159
 
95
- ### `Pcompresslr`
160
+ ### `PcompresslrAPIClient`
96
161
 
97
- Main interface class for prompt compression.
162
+ Main API client for intelligent model routing and compression.
98
163
 
99
- #### Constructor Options
164
+ #### Constructor Parameters
100
165
 
101
- - `model` (string): LLM model name for tokenization (default: `"gpt-4"`)
102
- - `apiKey` (string, optional): API key for authentication. If not provided, checks `PCOMPRESLR_API_KEY` env var.
103
- - `compressOutput` (boolean): If true, instruct the model to output in compressed format (default: `false`)
104
- - `useOptimal` (boolean): If true, use optimal DP algorithm, else use fast greedy (default: `false`)
105
- - `llmProvider` (string, optional): LLM provider ('openai' or 'anthropic') for caching API key
106
- - `llmApiKey` (string, optional): LLM provider API key for use with `complete()` method
166
+ - `apiKey` (string, optional): LightReach API key (or use `PCOMPRESLR_API_KEY` env var).
167
+ - `apiUrl` (string, optional): Override base API URL (advanced/testing).
168
+ - `timeout` (number): Request timeout in milliseconds (default: 120000).
107
169
 
108
170
  #### Methods
109
171
 
110
- - `complete(prompt, options?)`: Complete a prompt with compression, LLM call, and decompression in one request. Returns a Promise with `CompleteResponse` containing `decompressed_response`, `compression_stats`, `llm_stats`, and more. This is the only public method - clients never see compressed prompts or dictionaries.
111
- - `setLlmKey(provider, apiKey)`: Cache LLM provider API key for use with `complete()` method
112
- - `clearLlmKey()`: Clear cached LLM API key
113
- - `hasLlmKey()`: Check if LLM API key is cached
172
+ ##### `complete(request)`
114
173
 
115
- ### `PcompresslrAPIClient`
174
+ Messages-first completion with intelligent routing (POST `/api/v2/complete`).
116
175
 
117
- Low-level API client for direct interaction with the compression service. This is primarily for internal use. Most users should use `Pcompresslr.complete()` instead.
176
+ **Parameters (CompleteV2Request):**
177
+ - `messages` (required): Conversation history as array of objects with `role` and `content`
178
+ - `llmProvider` (optional): Provider constraint (`'openai'`, `'anthropic'`, `'google'`, etc.). Omit for cross-provider optimization.
179
+ - `desiredHle` (optional): Quality preference (0-40, where 40 is SOTA). Must not exceed admin's global/tag-level ceilings (request will error if it does).
180
+ - `tags` (optional): Object of tags for cost attribution and tag-level HLE ceilings
181
+ - `compress` (optional): Whether to compress messages (default: `true`)
182
+ - `compressOutput` (optional): Whether to request compressed output from LLM (default: `false`)
183
+ - `algorithm` (optional): Compression algorithm (`'greedy'` or `'optimal'`, default: `'greedy'`)
184
+ - `temperature` (optional): LLM temperature parameter
185
+ - `maxTokens` (optional): Maximum tokens to generate
186
+ - `compressionConfig` (optional): Per-role compression settings
187
+ - `maxHistoryMessages` (optional): Limit conversation history length
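The parameter names in this list are camelCase, while the `CompleteV2Request` typings published in the same package declare snake_case fields (`llm_provider`, `desired_hle`, `compress_output`, `max_tokens`, `max_history_messages`). If both spellings are in play in a codebase, a small adapter can keep call sites consistent. The sketch below is illustrative only: `ReadmeStyleOptions`, `WireRequest`, and `toWireRequest` are hypothetical names, and the field subset is chosen for brevity.

```typescript
interface ChatMessage {
  role: string;
  content: string;
}

// Hypothetical camelCase shape, named after the README parameter list.
interface ReadmeStyleOptions {
  messages: ChatMessage[];
  llmProvider?: string;
  desiredHle?: number;
  compressOutput?: boolean;
  maxTokens?: number;
  maxHistoryMessages?: number;
}

// snake_case subset mirroring the CompleteV2Request typings.
interface WireRequest {
  messages: ChatMessage[];
  llm_provider?: string;
  desired_hle?: number;
  compress_output?: boolean;
  max_tokens?: number;
  max_history_messages?: number;
}

// Copies only the fields that were actually provided.
function toWireRequest(options: ReadmeStyleOptions): WireRequest {
  const wire: WireRequest = { messages: options.messages };
  if (options.llmProvider !== undefined) wire.llm_provider = options.llmProvider;
  if (options.desiredHle !== undefined) wire.desired_hle = options.desiredHle;
  if (options.compressOutput !== undefined) wire.compress_output = options.compressOutput;
  if (options.maxTokens !== undefined) wire.max_tokens = options.maxTokens;
  if (options.maxHistoryMessages !== undefined) {
    wire.max_history_messages = options.maxHistoryMessages;
  }
  return wire;
}
```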
118
188
 
119
- #### Methods
189
+ **Response (CompleteResponse) includes:**
190
+ - `decompressed_response`: Final decompressed LLM response
191
+ - `routing_info`: Details about model selection:
192
+ - `selected_model`: Model chosen by system
193
+ - `selected_provider`: Provider chosen by system
194
+ - `model_hle`: HLE score of selected model
195
+ - `effective_hle`: Effective HLE after applying admin ceilings (min of desired/tag/global)
196
+ - `hle_source`: `'request'`, `'tag'`, `'global'`, or `'none'`
197
+ - `hle_clamped`: `true` if admin ceiling lowered your `desiredHle`
198
+ - `compression_stats`: Token savings statistics
199
+ - `llm_stats`: Token usage from the LLM
200
+ - `warnings`: Array of any warnings (including HLE ceiling notifications)
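The relationship between `effective_hle`, `hle_source`, and `hle_clamped` can be sketched as a "minimum of whatever is set" rule. This is an assumption about how the fields relate, based only on their descriptions above. The actual resolution happens server-side and is not part of this package, and the README also states that requests above a ceiling may be rejected rather than clamped. `resolveHle` and its parameters are hypothetical names.

```typescript
type HleSource = "request" | "tag" | "global" | "none";

interface HleResolution {
  requested_hle: number | null;
  effective_hle: number | null;
  hle_source: HleSource;
  hle_clamped: boolean;
}

// Picks the smallest of the requested HLE and any configured ceilings.
// Ties favor the earlier candidate (request, then tag, then global).
function resolveHle(
  requested: number | null,
  tagCeiling: number | null,
  globalCeiling: number | null,
): HleResolution {
  const candidates: { value: number; source: HleSource }[] = [];
  if (requested !== null) candidates.push({ value: requested, source: "request" });
  if (tagCeiling !== null) candidates.push({ value: tagCeiling, source: "tag" });
  if (globalCeiling !== null) candidates.push({ value: globalCeiling, source: "global" });

  if (candidates.length === 0) {
    return { requested_hle: null, effective_hle: null, hle_source: "none", hle_clamped: false };
  }
  const winner = candidates.reduce((best, c) => (c.value < best.value ? c : best));
  return {
    requested_hle: requested,
    effective_hle: winner.value,
    hle_source: winner.source,
    // Clamped only when a ceiling ended up below what the caller asked for.
    hle_clamped: requested !== null && winner.value < requested,
  };
}
```

For instance, a request for 35 under a global ceiling of 30 resolves to an effective value of 30 with source `'global'` in this model, while a request for 25 passes through unchanged with source `'request'`.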
201
+
202
+ ##### `compress(prompt, model, algorithm, tags)`
203
+
204
+ Compression-only (POST `/api/v1/compress`).
205
+
206
+ ##### `decompress(llmFormat)`
207
+
208
+ Decompress an LLM-formatted compressed prompt (POST `/api/v1/decompress`).
120
209
 
121
- - `healthCheck()`: Check API health status
210
+ ##### `healthCheck()`
211
+
212
+ Check API health status (GET `/health`).
213
+
214
+ ### Environment Variables
215
+
216
+ - `PCOMPRESLR_API_KEY` (or `LIGHTREACH_API_KEY`): Your LightReach API key.
217
+ - `PCOMPRESLR_API_URL`: Override the API base URL (advanced/testing).
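The key lookup order follows the constructor in `dist/api-client.js`: an explicit argument first, then `LIGHTREACH_API_KEY`, then `PCOMPRESLR_API_KEY`, with an error if the result is empty or blank. The sketch below restates that order as a standalone function. `resolveApiKey` is a hypothetical name, and the environment is passed in as a plain object so the example does not depend on Node typings.

```typescript
// Mirrors the lookup order used by the client constructor:
// explicit argument, then LIGHTREACH_API_KEY, then PCOMPRESLR_API_KEY.
function resolveApiKey(
  explicit: string | undefined,
  env: Record<string, string | undefined>,
): string {
  const key = explicit || env["LIGHTREACH_API_KEY"] || env["PCOMPRESLR_API_KEY"] || "";
  if (!key.trim()) {
    // The real client throws APIKeyError here; a plain Error keeps the sketch self-contained.
    throw new Error("API key is required");
  }
  return key;
}
```

In an application this would typically be called as `resolveApiKey(undefined, process.env)`. When both variables are set, `LIGHTREACH_API_KEY` takes precedence.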
122
218
 
123
219
  ### Exceptions
124
220
 
125
221
  - `APIKeyError`: Raised when API key is invalid or missing
126
222
  - `RateLimitError`: Raised when rate limit is exceeded
127
- - `APIRequestError`: Raised for general API errors
223
+ - `APIRequestError`: Raised for general API errors (including routing failures)
128
224
  - `PcompresslrAPIError`: Base exception class
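Because the specific exceptions share a base class, callers can branch on the most specific type first and fall back to the base. The sketch below uses local stand-in classes so it runs on its own. In real code these would be imported from `compress-lightreach`. The diff confirms that `APIRequestError` extends `PcompresslrAPIError`; the same parent for `APIKeyError` and `RateLimitError` is assumed here. `classifyError` is a hypothetical helper.

```typescript
// Local stand-ins for the package's exception classes (assumed hierarchy).
class PcompresslrAPIError extends Error {
  constructor(message: string) {
    super(message);
    this.name = new.target.name;
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class APIKeyError extends PcompresslrAPIError {}
class RateLimitError extends PcompresslrAPIError {}
class APIRequestError extends PcompresslrAPIError {}

type ErrorKind = "auth" | "rate-limit" | "request" | "api" | "unknown";

// Checks subclasses before the base class so the most specific label wins.
function classifyError(err: unknown): ErrorKind {
  if (err instanceof APIKeyError) return "auth";
  if (err instanceof RateLimitError) return "rate-limit";
  if (err instanceof APIRequestError) return "request";
  if (err instanceof PcompresslrAPIError) return "api";
  return "unknown";
}
```

A `catch` block around `client.complete(...)` could pass the caught value to `classifyError` and, for example, back off on `"rate-limit"` while surfacing `"auth"` immediately.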
129
225
 
130
226
  ## How It Works
@@ -135,22 +231,18 @@ The library:
135
231
  1. Identifies repeated substrings using efficient suffix array algorithms
136
232
  2. Calculates token savings for each potential replacement
137
233
  3. Selects optimal replacements that reduce total token count
138
- 4. Formats the result for easy LLM consumption
139
- 5. Provides perfect decompression
234
+ 4. Intelligently routes to the best model based on your quality requirements
235
+ 5. Formats the result for easy LLM consumption
236
+ 6. Provides perfect decompression
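The lossless property in the steps above can be shown with a toy round trip. This is a sketch under stated assumptions, not the library's implementation: the phrases to replace are supplied by the caller instead of being found with suffix arrays, each placeholder is a single Unicode private-use character, and inputs that already contain such characters are rejected. `compressWithPhrases` and `decompressPrompt` are hypothetical names.

```typescript
interface CompressedPrompt {
  body: string;
  dictionary: Record<string, string>; // placeholder -> original phrase
}

const PRIVATE_USE = /[\uE000-\uF8FF]/;

// Replaces each given phrase with a single private-use character.
function compressWithPhrases(text: string, phrases: string[]): CompressedPrompt {
  if (PRIVATE_USE.test(text) || phrases.some((p) => PRIVATE_USE.test(p))) {
    throw new Error("input already contains placeholder characters");
  }
  if (phrases.length > 0xf8ff - 0xe000 + 1) {
    throw new Error("too many phrases for the placeholder range");
  }
  const dictionary: Record<string, string> = {};
  let body = text;
  phrases.forEach((phrase, index) => {
    if (phrase.length === 0 || !body.includes(phrase)) return; // nothing to replace
    const placeholder = String.fromCharCode(0xe000 + index);
    body = body.split(phrase).join(placeholder);
    dictionary[placeholder] = phrase;
  });
  return { body, dictionary };
}

// Restores the original text by expanding every placeholder.
function decompressPrompt(compressed: CompressedPrompt): string {
  let text = compressed.body;
  for (const placeholder of Object.keys(compressed.dictionary)) {
    const phrase = compressed.dictionary[placeholder] ?? "";
    text = text.split(placeholder).join(phrase);
  }
  return text;
}
```

Since no phrase can contain a placeholder character, expanding the placeholders in any order reproduces the input exactly, which is the round-trip guarantee the list describes.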
140
237
 
141
238
  ## Examples
142
239
 
143
240
  ### Example 1: Using Complete Method (Recommended)
144
241
 
145
242
  ```typescript
146
- import { Pcompresslr } from 'compress-lightreach';
243
+ import { PcompresslrAPIClient } from 'compress-lightreach';
147
244
 
148
- const compressor = new Pcompresslr({
149
- model: "gpt-4",
150
- apiKey: "your-compress-api-key",
151
- llmProvider: "openai",
152
- llmApiKey: "your-openai-key"
153
- });
245
+ const client = new PcompresslrAPIClient("your-lightreach-api-key");
154
246
 
155
247
  const prompt = `
156
248
  Write a story about a cat. The cat is very friendly.
@@ -159,9 +251,13 @@ Write a story about a bird. The bird is very friendly.
159
251
  `;
160
252
 
161
253
  // One call handles compression, LLM request, and decompression
162
- const result = await compressor.complete(prompt);
254
+ const result = await client.complete({
255
+ messages: [{ role: "user", content: prompt }],
256
+ desiredHle: 30,
257
+ });
163
258
 
164
259
  console.log(result.decompressed_response);
260
+ console.log(`Model used: ${result.routing_info?.selected_model}`);
165
261
  console.log(`Token savings: ${result.compression_stats.token_savings} tokens`);
166
262
  console.log(`Compression ratio: ${(result.compression_stats.compression_ratio * 100).toFixed(2)}%`);
167
263
  ```
@@ -169,18 +265,16 @@ console.log(`Compression ratio: ${(result.compression_stats.compression_ratio *
169
265
  ### Example 2: Complete with Output Compression
170
266
 
171
267
  ```typescript
172
- import { Pcompresslr } from 'compress-lightreach';
268
+ import { PcompresslrAPIClient } from 'compress-lightreach';
173
269
 
174
- const compressor = new Pcompresslr({
175
- model: "gpt-4",
176
- apiKey: "your-compress-api-key",
177
- llmProvider: "openai",
178
- llmApiKey: "your-openai-key",
179
- compressOutput: true
180
- });
270
+ const client = new PcompresslrAPIClient("your-lightreach-api-key");
181
271
 
182
272
  // Complete with output compression - response is automatically decompressed
183
- const result = await compressor.complete("Generate a long report with repeated sections...");
273
+ const result = await client.complete({
274
+ messages: [{ role: "user", content: "Generate a long report with repeated sections..." }],
275
+ desiredHle: 35,
276
+ compressOutput: true
277
+ });
184
278
  console.log(result.decompressed_response);
185
279
  ```
186
280
 
@@ -195,13 +289,7 @@ To use Compress Light Reach, you need an API key from [compress.lightreach.io](h
195
289
 
196
290
  ## Security & Privacy
197
291
 
198
- **We do not store LLM API keys anywhere.** When you provide an LLM API key (OpenAI, Anthropic, etc.) to use with the `complete()` method, it is:
199
- - Only used in-memory for the duration of the API request
200
- - Never stored on disk, in databases, or in logs
201
- - Never transmitted to any third-party services
202
- - Immediately discarded after the request completes
203
-
204
- Your LLM API keys remain secure and private. Only your Compress Light Reach API key is stored for authentication with our compression service.
292
+ **BYOK model:** Provider keys (OpenAI/Anthropic/Google) are managed in the dashboard and **never passed through this SDK**. The SDK only uses your LightReach API key for authentication with the service.
205
293
 
206
294
  ## Requirements
207
295
 
@@ -221,4 +309,3 @@ MIT License - see [LICENSE](../LICENSE) file for details.
221
309
  ## Contributing
222
310
 
223
311
  Contributions are welcome! Please feel free to submit a Pull Request.
224
-
@@ -45,6 +45,51 @@ export interface CompleteResponse {
45
45
  total_tokens: number;
46
46
  };
47
47
  warnings?: string[];
48
+ routing_info?: {
49
+ selected_model: string;
50
+ selected_provider: string;
51
+ selected_model_id: string;
52
+ model_hle: number;
53
+ model_price_per_million: number;
54
+ requested_hle: number | null;
55
+ effective_hle: number | null;
56
+ hle_source: 'request' | 'tag' | 'global' | 'none';
57
+ hle_clamped: boolean;
58
+ };
59
+ text?: string;
60
+ tokens_saved?: number;
61
+ tokens_used?: number;
62
+ compression_ratio?: number;
63
+ cost_estimate?: number | null;
64
+ savings_estimate?: number | null;
65
+ }
66
+ export type MessageRole = 'system' | 'developer' | 'user' | 'assistant';
67
+ export interface Message {
68
+ role: MessageRole;
69
+ content: string;
70
+ }
71
+ export interface CompleteV2Request {
72
+ messages: Message[];
73
+ llm_provider?: 'openai' | 'anthropic' | 'google' | 'deepseek' | 'moonshot';
74
+ desired_hle?: number;
75
+ compress?: boolean;
76
+ compression_config?: {
77
+ compress_system?: boolean;
78
+ compress_user?: boolean;
79
+ compress_assistant?: boolean;
80
+ compress_only_last_n_user?: number | null;
81
+ };
82
+ compress_output?: boolean;
83
+ algorithm?: 'greedy' | 'optimal';
84
+ temperature?: number;
85
+ max_tokens?: number;
86
+ tags?: Record<string, string>;
87
+ max_history_messages?: number;
88
+ model?: string;
89
+ hle_target_percent?: number;
90
+ min_hle_score?: number;
91
+ auto_select_by_hle?: boolean;
92
+ same_provider_only?: boolean;
48
93
  }
49
94
  export interface HealthCheckResponse {
50
95
  status: string;
@@ -58,8 +103,36 @@ export declare class PcompresslrAPIClient {
58
103
  private session;
59
104
  constructor(apiKey?: string, apiUrl?: string, timeout?: number);
60
105
  private makeRequest;
61
- compress(prompt: string, model?: string, algorithm?: "greedy" | "optimal"): Promise<CompressResponse>;
106
+ compress(prompt: string, model?: string, algorithm?: "greedy" | "optimal", tags?: Record<string, string>): Promise<CompressResponse>;
62
107
  decompress(llmFormat: string): Promise<DecompressResponse>;
63
108
  healthCheck(): Promise<HealthCheckResponse>;
64
- complete(prompt: string, model: string, llmProvider: "openai" | "anthropic", llmApiKey: string, compress?: boolean, compressOutput?: boolean, algorithm?: "greedy" | "optimal", temperature?: number, maxTokens?: number): Promise<CompleteResponse>;
109
+ /**
110
+ * Messages-first complete with intelligent model selection (POST /api/v2/complete).
111
+ *
112
+ * v1.0.0: System automatically selects optimal model based on your provider keys,
113
+ * desired HLE, and admin's global/tag-level HLE ceilings.
114
+ *
115
+ * Provider API keys must be stored in your account (BYOK via dashboard).
116
+ *
117
+ * @example
118
+ * // Basic usage (cross-provider optimization)
119
+ * const response = await client.complete({
120
+ * messages: [{role: 'user', content: 'Hello'}],
121
+ * desired_hle: 30,
122
+ * });
123
+ *
124
+ * // Constrained to specific provider
125
+ * const response = await client.complete({
126
+ * messages: [{role: 'user', content: 'Hello'}],
127
+ * llm_provider: 'openai',
128
+ * desired_hle: 35,
129
+ * });
130
+ *
131
+ * // Access routing info
132
+ * console.log(response.routing_info?.selected_model);
133
+ * if (response.routing_info?.hle_clamped) {
134
+ * console.log('Admin ceiling lowered your desired HLE');
135
+ * }
136
+ */
137
+ complete(request: CompleteV2Request): Promise<CompleteResponse>;
65
138
  }
@@ -8,6 +8,7 @@ var __importDefault = (this && this.__importDefault) || function (mod) {
8
8
  Object.defineProperty(exports, "__esModule", { value: true });
9
9
  exports.PcompresslrAPIClient = exports.APIRequestError = exports.RateLimitError = exports.APIKeyError = exports.PcompresslrAPIError = void 0;
10
10
  const axios_1 = __importDefault(require("axios"));
11
+ const version_1 = require("./version");
11
12
  class PcompresslrAPIError extends Error {
12
13
  constructor(message) {
13
14
  super(message);
@@ -41,11 +42,11 @@ class APIRequestError extends PcompresslrAPIError {
41
42
  }
42
43
  exports.APIRequestError = APIRequestError;
43
44
  class PcompresslrAPIClient {
44
- constructor(apiKey, apiUrl, timeout = 10000 // 10 seconds in milliseconds
45
+ constructor(apiKey, apiUrl, timeout = 120000 // 2 minutes - complete() calls LLM which can take 30+ seconds
45
46
  ) {
46
47
  this.DEFAULT_API_URL = "https://api.compress.lightreach.io";
47
48
  // Get API key from parameter or environment
48
- this.apiKey = apiKey || process.env.PCOMPRESLR_API_KEY || '';
49
+ this.apiKey = apiKey || process.env.LIGHTREACH_API_KEY || process.env.PCOMPRESLR_API_KEY || '';
49
50
  if (!this.apiKey || !this.apiKey.trim()) {
50
51
  throw new APIKeyError("API key is required. Provide it as a parameter or set " +
51
52
  "PCOMPRESLR_API_KEY environment variable.");
@@ -60,9 +61,11 @@ class PcompresslrAPIClient {
60
61
  baseURL: this.apiUrl,
61
62
  timeout: this.timeout,
62
63
  headers: {
64
+ // Prefer standard Bearer auth, but keep X-API-Key for backward compatibility.
65
+ 'Authorization': `Bearer ${this.apiKey}`,
63
66
  'X-API-Key': this.apiKey,
64
67
  'Content-Type': 'application/json',
65
- 'User-Agent': 'compress-lightreach-nodejs/0.1.0'
68
+ 'User-Agent': `compress-lightreach-nodejs/${version_1.__version__}`
66
69
  }
67
70
  });
68
71
  // Add response interceptor for retry logic
@@ -136,12 +139,15 @@ class PcompresslrAPIClient {
136
139
  throw new APIRequestError(`Request failed: ${errorMessage}`);
137
140
  }
138
141
  }
139
- async compress(prompt, model = "gpt-4", algorithm = "greedy") {
142
+ async compress(prompt, model = "gpt-4", algorithm = "greedy", tags) {
140
143
  const data = {
141
144
  prompt,
142
145
  model,
143
146
  algorithm
144
147
  };
148
+ if (tags) {
149
+ data.tags = tags;
150
+ }
145
151
  return this.makeRequest("/api/v1/compress", data);
146
152
  }
147
153
  async decompress(llmFormat) {
@@ -166,23 +172,69 @@ class PcompresslrAPIClient {
166
172
  throw new APIRequestError(`Health check failed: ${errorMessage}`);
167
173
  }
168
174
  }
169
- async complete(prompt, model, llmProvider, llmApiKey, compress = true, compressOutput = false, algorithm = "greedy", temperature, maxTokens) {
170
- const data = {
171
- prompt,
172
- model,
173
- llm_provider: llmProvider,
174
- llm_api_key: llmApiKey,
175
- compress,
176
- compress_output: compressOutput,
177
- algorithm,
178
- };
179
- if (temperature !== undefined) {
180
- data.temperature = temperature;
175
+ /**
176
+ * Messages-first complete with intelligent model selection (POST /api/v2/complete).
177
+ *
178
+ * v1.0.0: System automatically selects optimal model based on your provider keys,
179
+ * desired HLE, and admin's global/tag-level HLE ceilings.
180
+ *
181
+ * Provider API keys must be stored in your account (BYOK via dashboard).
182
+ *
183
+ * @example
184
+ * // Basic usage (cross-provider optimization)
185
+ * const response = await client.complete({
186
+ * messages: [{role: 'user', content: 'Hello'}],
187
+ * desired_hle: 30,
188
+ * });
189
+ *
190
+ * // Constrained to specific provider
191
+ * const response = await client.complete({
192
+ * messages: [{role: 'user', content: 'Hello'}],
193
+ * llm_provider: 'openai',
194
+ * desired_hle: 35,
195
+ * });
196
+ *
197
+ * // Access routing info
198
+ * console.log(response.routing_info?.selected_model);
199
+ * if (response.routing_info?.hle_clamped) {
200
+ * console.log('Admin ceiling lowered your desired HLE');
201
+ * }
202
+ */
203
+ async complete(request) {
204
+ // Warn about deprecated parameters
205
+ if (request.model !== undefined) {
206
+ console.warn('[compress-lightreach v1.0.0] Parameter "model" is deprecated and will be ignored. ' +
207
+ 'The system now selects models automatically.');
181
208
  }
182
- if (maxTokens !== undefined) {
183
- data.max_tokens = maxTokens;
209
+ if (request.hle_target_percent !== undefined ||
210
+ request.min_hle_score !== undefined ||
211
+ request.auto_select_by_hle !== undefined ||
212
+ request.same_provider_only !== undefined) {
213
+ console.warn('[compress-lightreach v1.0.0] HLE parameters have changed. ' +
214
+ 'Use "desired_hle" and optional "llm_provider" instead.');
184
215
  }
185
- return this.makeRequest("/api/v1/complete", data);
216
+ const data = {
217
+ messages: request.messages,
218
+ compress: request.compress ?? true,
219
+ compress_output: request.compress_output ?? false,
220
+ algorithm: request.algorithm ?? 'greedy',
221
+ };
222
+ // v1.0.0 parameters
223
+ if (request.llm_provider !== undefined)
224
+ data.llm_provider = request.llm_provider;
225
+ if (request.desired_hle !== undefined)
226
+ data.desired_hle = request.desired_hle;
227
+ if (request.compression_config)
228
+ data.compression_config = request.compression_config;
229
+ if (request.temperature !== undefined)
230
+ data.temperature = request.temperature;
231
+ if (request.max_tokens !== undefined)
232
+ data.max_tokens = request.max_tokens;
233
+ if (request.tags !== undefined)
234
+ data.tags = request.tags;
235
+ if (request.max_history_messages !== undefined)
236
+ data.max_history_messages = request.max_history_messages;
237
+ return this.makeRequest("/api/v2/complete", data);
186
238
  }
187
239
  }
188
240
  exports.PcompresslrAPIClient = PcompresslrAPIClient;
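The payload construction in the new `complete()` can be restated in typed form: three fields always get defaults, optional fields are forwarded only when defined, and the deprecated `model` field is never sent. The sketch below covers a subset of the request fields. `SketchRequest` and `buildCompletePayload` are hypothetical names used for illustration, not exports of the package.

```typescript
type Role = "system" | "developer" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Subset of CompleteV2Request, for illustration only.
interface SketchRequest {
  messages: ChatMessage[];
  compress?: boolean;
  compress_output?: boolean;
  algorithm?: "greedy" | "optimal";
  llm_provider?: string;
  desired_hle?: number;
  tags?: Record<string, string>;
  model?: string; // deprecated: accepted but not forwarded
}

function buildCompletePayload(request: SketchRequest): Record<string, unknown> {
  // Defaults match the compiled client: compress on, output compression off, greedy.
  const data: Record<string, unknown> = {
    messages: request.messages,
    compress: request.compress ?? true,
    compress_output: request.compress_output ?? false,
    algorithm: request.algorithm ?? "greedy",
  };
  // Optional fields are added only when the caller set them.
  if (request.llm_provider !== undefined) data["llm_provider"] = request.llm_provider;
  if (request.desired_hle !== undefined) data["desired_hle"] = request.desired_hle;
  if (request.tags !== undefined) data["tags"] = request.tags;
  return data;
}
```

Passing `model: "gpt-4"` therefore has no effect on the outgoing body, which matches the deprecation warning printed by the client.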
package/dist/cli.js CHANGED
@@ -13,7 +13,7 @@ async function main() {
13
13
  console.log(' pcompresslr "hello world hello world hello world"');
14
14
  console.log(' pcompresslr "your prompt here" --greedy-only # Only greedy');
15
15
  console.log(' pcompresslr "your prompt here" --optimal-only # Only optimal');
16
- console.log("\nNote: Requires PCOMPRESLR_API_KEY environment variable or API key parameter");
16
+ console.log("\nNote: Requires PCOMPRESLR_API_KEY environment variable");
17
17
  process.exit(0);
18
18
  }
19
19
  let prompt = args.join(" ");
package/dist/core.d.ts CHANGED
@@ -1,47 +1,58 @@
1
1
  /**
2
- * Core Pcompresslr class for compressing prompts and optionally model outputs.
2
+ * SDK client for interacting with the LightReach/Compress API.
3
+ *
4
+ * v0.2.0 is a breaking release:
5
+ * - complete() is messages-first and targets POST /api/v2/complete
6
+ * - provider API keys are not accepted by this SDK (BYOK via dashboard)
3
7
  */
4
- import { CompleteResponse } from './api-client';
8
+ import { CompleteResponse, Message, CompressResponse } from './api-client';
5
9
  /**
6
- * Main interface for prompt compression with optional output compression.
7
- *
8
- * Usage:
9
- * // Complete a prompt (compresses, calls LLM, decompresses)
10
- * const compressor = new Pcompresslr({
11
- * model: "gpt-4",
12
- * apiKey: "your-key",
13
- * llmProvider: "openai",
14
- * llmApiKey: "your-llm-key"
15
- * });
16
- * const result = await compressor.complete("Your prompt here");
17
- * console.log(result.decompressed_response);
10
+ * User-facing compression config (camelCase).
18
11
  */
19
- export declare class Pcompresslr {
20
- private model;
21
- private compressOutput;
22
- private useOptimal;
23
- private llmProvider?;
24
- private llmApiKey?;
12
+ export interface CompressionConfig {
13
+ compressSystem?: boolean;
14
+ compressUser?: boolean;
15
+ compressAssistant?: boolean;
16
+ compressOnlyLastNUser?: number | null;
17
+ }
18
+ export interface CompleteOptions {
19
+ messages: Message[];
20
+ model?: string;
21
+ provider?: 'openai' | 'anthropic' | 'google';
22
+ compress?: boolean;
23
+ compressionConfig?: CompressionConfig;
24
+ compressOutput?: boolean;
25
+ useOptimal?: boolean;
26
+ hleTargetPercent?: number;
27
+ minHleScore?: number;
28
+ autoSelectByHle?: boolean;
29
+ sameProviderOnly?: boolean;
30
+ temperature?: number;
31
+ maxTokens?: number;
32
+ tags?: Record<string, string>;
33
+ maxHistoryMessages?: number;
34
+ }
35
+ export declare class LightReach {
25
36
  private apiClient;
37
+ private defaultModel;
38
+ private defaultProvider;
39
+ private useOptimal;
26
40
  constructor(options?: {
27
- model?: string;
28
41
  apiKey?: string;
29
- compressOutput?: boolean;
42
+ apiUrl?: string;
43
+ defaultModel?: string;
44
+ defaultProvider?: 'openai' | 'anthropic' | 'google';
30
45
  useOptimal?: boolean;
31
- llmProvider?: "openai" | "anthropic";
32
- llmApiKey?: string;
33
46
  });
34
- setLlmKey(provider: "openai" | "anthropic", apiKey: string): void;
35
- clearLlmKey(): void;
36
- hasLlmKey(): boolean;
37
- complete(prompt: string, options?: {
47
+ complete(options: CompleteOptions): Promise<CompleteResponse>;
48
+ /**
49
+ * Compress text without making an LLM call (POST /api/v1/compress).
50
+ */
51
+ compress(text: string, options?: {
38
52
  model?: string;
39
- llmProvider?: "openai" | "anthropic";
40
- llmApiKey?: string;
41
- compress?: boolean;
42
- compressOutput?: boolean;
43
- useOptimal?: boolean;
44
- temperature?: number;
45
- maxTokens?: number;
46
- }): Promise<CompleteResponse>;
53
+ algorithm?: 'greedy' | 'optimal';
54
+ tags?: Record<string, string>;
55
+ }): Promise<CompressResponse>;
56
+ }
57
+ export declare class Pcompresslr extends LightReach {
47
58
  }
package/dist/core.js CHANGED
@@ -1,152 +1,89 @@
1
1
  "use strict";
2
2
  /**
3
- * Core Pcompresslr class for compressing prompts and optionally model outputs.
3
+ * SDK client for interacting with the LightReach/Compress API.
4
+ *
5
+ * v0.2.0 is a breaking release:
6
+ * - complete() is messages-first and targets POST /api/v2/complete
7
+ * - provider API keys are not accepted by this SDK (BYOK via dashboard)
4
8
  */
5
9
  Object.defineProperty(exports, "__esModule", { value: true });
6
- exports.Pcompresslr = void 0;
10
+ exports.Pcompresslr = exports.LightReach = void 0;
7
11
  const api_client_1 = require("./api-client");
8
- /**
9
- * Main interface for prompt compression with optional output compression.
10
- *
11
- * Usage:
12
- * // Complete a prompt (compresses, calls LLM, decompresses)
13
- * const compressor = new Pcompresslr({
14
- * model: "gpt-4",
15
- * apiKey: "your-key",
16
- * llmProvider: "openai",
17
- * llmApiKey: "your-llm-key"
18
- * });
19
- * const result = await compressor.complete("Your prompt here");
20
- * console.log(result.decompressed_response);
21
- */
22
- class Pcompresslr {
12
+ class LightReach {
23
13
  constructor(options = {}) {
24
- /**
25
- * Initialize the compressor.
26
- *
27
- * @param options.model - LLM model name for tokenization (e.g., 'gpt-4', 'gpt-3.5-turbo')
28
- * @param options.apiKey - API key for authentication. If not provided, checks PCOMPRESLR_API_KEY env var.
29
- * @param options.compressOutput - If true, instruct the model to output in compressed format
30
- * @param options.useOptimal - If true, use optimal DP algorithm, else use fast greedy
31
- * @param options.llmProvider - LLM provider ('openai' or 'anthropic') - for caching LLM API key
32
- * @param options.llmApiKey - LLM provider API key - cached for use with complete() method
33
- *
34
- * Note:
35
- * When compressOutput=true, the input prompt size will increase due to
36
- * compression instructions. The benefit is that the model's output will
37
- * also be compressed, potentially saving tokens on the response.
38
- *
39
- * @throws APIKeyError - If API key is not provided and not found in environment
40
- */
41
- this.model = options.model || "gpt-4";
42
- this.compressOutput = options.compressOutput || false;
43
- this.useOptimal = options.useOptimal || false;
44
- this.llmProvider = options.llmProvider;
45
- this.llmApiKey = options.llmApiKey;
46
- // Initialize API client (uses production URL by default, can be overridden via PCOMPRESLR_API_URL env var)
47
- this.apiClient = new api_client_1.PcompresslrAPIClient(options.apiKey);
48
- }
49
- setLlmKey(provider, apiKey) {
50
- /**
51
- * Set LLM provider API key for use with complete() method.
52
- *
53
- * @param provider - LLM provider ('openai' or 'anthropic')
54
- * @param apiKey - LLM provider API key
55
- */
56
- if (provider !== "openai" && provider !== "anthropic") {
57
- throw new Error(`Unsupported provider: ${provider}. Must be 'openai' or 'anthropic'`);
58
- }
59
- this.llmProvider = provider;
60
- this.llmApiKey = apiKey;
61
- }
62
- clearLlmKey() {
63
- /**Clear cached LLM provider API key.*/
64
- this.llmProvider = undefined;
65
- this.llmApiKey = undefined;
66
- }
67
- hasLlmKey() {
68
- /**Check if LLM API key is set.*/
69
- return this.llmProvider !== undefined && this.llmApiKey !== undefined;
14
+ this.defaultModel = options.defaultModel ?? 'gpt-4';
15
+ this.defaultProvider = options.defaultProvider ?? 'openai';
16
+ this.useOptimal = options.useOptimal ?? false;
17
+ this.apiClient = new api_client_1.PcompresslrAPIClient(options.apiKey, options.apiUrl);
70
18
  }
71
- async complete(prompt, options) {
72
- /**
73
- * Complete a prompt with compression, LLM call, and decompression in one request.
74
- *
75
- * This is the only public method for using the compression service. Clients never
76
- * see compressed prompts or dictionaries - everything is handled internally.
77
- *
78
- * @param prompt - The prompt to complete
79
- * @param options.model - LLM model name. If undefined, uses the model from constructor.
80
- * @param options.llmProvider - LLM provider ('openai' or 'anthropic'). If undefined, uses cached provider or requires it.
81
- * @param options.llmApiKey - LLM provider API key. If undefined, uses cached key or requires it.
82
- * @param options.compress - Whether to compress the prompt (default: true)
83
- * @param options.compressOutput - Whether to request compressed output from LLM (default: false)
84
- * @param options.useOptimal - Whether to use optimal algorithm. If undefined, uses constructor setting.
85
- * @param options.temperature - LLM temperature setting (optional)
86
- * @param options.maxTokens - Maximum tokens to generate (optional)
87
- *
88
- * @returns Dictionary containing:
89
- * - decompressed_response: Final decompressed response
90
- * - compression_stats: Input compression statistics
91
- * - llm_stats: LLM usage statistics
92
- * - warnings: Any warnings
93
- *
94
- * @throws Error - If LLM provider/key not provided and not cached
95
- * @throws APIKeyError - If API key is invalid
96
- * @throws RateLimitError - If rate limit is exceeded
97
- * @throws APIRequestError - For other API errors
98
- */
99
- // Prompt is required for complete()
100
- if (!prompt) {
101
- throw new Error("prompt is required for complete(). Provide it as a parameter.");
102
- }
103
- // Use model from parameter or constructor
104
- const modelToUse = options?.model || this.model;
105
- // Get LLM provider - from parameter, cached, or error
106
- const providerToUse = options?.llmProvider || this.llmProvider;
107
- if (!providerToUse) {
108
- throw new Error("LLM provider is required. Provide it as a parameter or set it in constructor/setLlmKey().");
109
- }
110
- // Get LLM API key - from parameter, cached, or error
111
- const keyToUse = options?.llmApiKey || this.llmApiKey;
112
- if (!keyToUse) {
113
- throw new Error("LLM API key is required. Provide it as a parameter or set it in constructor/setLlmKey().");
114
- }
115
- // Cache the provider and key if provided
116
- if (options?.llmProvider && options?.llmApiKey) {
117
- this.llmProvider = options.llmProvider;
118
- this.llmApiKey = options.llmApiKey;
119
- }
120
- // Determine algorithm
121
- const useOptimalToUse = options?.useOptimal !== undefined
122
- ? options.useOptimal
123
- : this.useOptimal;
124
- const algorithm = useOptimalToUse ? "optimal" : "greedy";
125
- // Make the complete request
19
+ async complete(options) {
20
+ const algorithm = (options.useOptimal ?? this.useOptimal) ? 'optimal' : 'greedy';
21
+ const cfg = options.compressionConfig
22
+ ? {
23
+ compress_system: options.compressionConfig.compressSystem ?? false,
24
+ compress_user: options.compressionConfig.compressUser ?? true,
25
+ compress_assistant: options.compressionConfig.compressAssistant ?? false,
26
+ compress_only_last_n_user: options.compressionConfig.compressOnlyLastNUser ?? 1,
27
+ }
28
+ : undefined;
126
29
  try {
127
- const result = await this.apiClient.complete(prompt, modelToUse, providerToUse, keyToUse, options?.compress !== false, // default true
128
- options?.compressOutput || false, // default false
129
- algorithm, options?.temperature, options?.maxTokens);
130
- return result;
30
+ const resp = await this.apiClient.complete({
31
+ messages: options.messages,
32
+ model: options.model ?? this.defaultModel,
33
+ llm_provider: options.provider ?? this.defaultProvider,
34
+ compress: options.compress ?? true,
35
+ compression_config: cfg,
36
+ compress_output: options.compressOutput ?? false,
37
+ algorithm,
38
+ hle_target_percent: options.hleTargetPercent,
39
+ min_hle_score: options.minHleScore,
40
+ auto_select_by_hle: options.autoSelectByHle,
41
+ same_provider_only: options.sameProviderOnly,
42
+ temperature: options.temperature,
43
+ max_tokens: options.maxTokens,
44
+ tags: options.tags,
45
+ max_history_messages: options.maxHistoryMessages,
46
+ });
47
+ // Add helpful aliases to better match the Feature 0.6 spec without changing backend response.
48
+ // We do NOT fabricate cost estimates here since the API response does not include pricing data.
49
+ return {
50
+ ...resp,
51
+ text: resp.text ?? resp.decompressed_response,
52
+ tokens_saved: resp.tokens_saved ?? resp.compression_stats?.token_savings,
53
+ tokens_used: resp.tokens_used ?? resp.llm_stats?.total_tokens,
54
+ compression_ratio: resp.compression_ratio ?? resp.compression_stats?.compression_ratio,
55
+ cost_estimate: resp.cost_estimate ?? null,
56
+ savings_estimate: resp.savings_estimate ?? null,
57
+ };
131
58
  }
132
59
  catch (error) {
133
60
  if (error instanceof api_client_1.APIKeyError) {
134
61
  throw new api_client_1.APIKeyError(`${error.message}\n\n` +
135
- "To get an API key, visit https://compress.lightreach.io or " +
136
- "set the PCOMPRESLR_API_KEY environment variable.");
62
+ 'To get an API key, visit https://compress.lightreach.io or ' +
63
+ 'set the PCOMPRESLR_API_KEY environment variable.');
137
64
  }
138
65
  else if (error instanceof api_client_1.RateLimitError) {
139
66
  throw new api_client_1.RateLimitError(`${error.message}\n\n` +
140
67
  "You've exceeded your rate limit. Please wait before making more requests, " +
141
- "or upgrade your subscription plan.");
68
+ 'or upgrade your subscription plan.');
142
69
  }
143
70
  else if (error instanceof api_client_1.APIRequestError) {
144
71
  throw new api_client_1.APIRequestError(`${error.message}\n\n` +
145
- "If this problem persists, please check https://compress.lightreach.io/status " +
146
- "or contact support.");
72
+ 'If this problem persists, please check https://compress.lightreach.io/status ' +
73
+ 'or contact support.');
147
74
  }
148
75
  throw error;
149
76
  }
150
77
  }
78
+ /**
79
+ * Compress text without making an LLM call (POST /api/v1/compress).
80
+ */
81
+ async compress(text, options) {
82
+ return await this.apiClient.compress(text, options?.model ?? this.defaultModel, options?.algorithm ?? 'greedy', options?.tags);
83
+ }
84
+ }
85
+ exports.LightReach = LightReach;
86
+ // Backwards import name (API is still a breaking change vs v0.1.x)
87
+ class Pcompresslr extends LightReach {
151
88
  }
152
89
  exports.Pcompresslr = Pcompresslr;
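Note on the `dist/core.js` changes above: the new `complete(options)` does two mechanical things around the API call. It maps the camelCase `compressionConfig` onto a snake_case request body with defaults, and it adds alias fields to the response. The sketch below restates those two steps as standalone functions so the defaults can be checked without calling the service. The function names `toWireCompressionConfig` and `withAliases` are hypothetical and are not exported by the package; the field names and default values are copied from the diff, and the sample response values are made up.

```javascript
// Hypothetical helpers (NOT part of compress-lightreach). They restate logic
// visible in the dist/core.js diff so it can be run in isolation.

// camelCase compressionConfig -> snake_case request fields, using the
// defaults shown in the diff. Returns undefined when no config is given.
function toWireCompressionConfig(compressionConfig) {
  if (!compressionConfig) return undefined;
  return {
    compress_system: compressionConfig.compressSystem ?? false,
    compress_user: compressionConfig.compressUser ?? true,
    compress_assistant: compressionConfig.compressAssistant ?? false,
    compress_only_last_n_user: compressionConfig.compressOnlyLastNUser ?? 1,
  };
}

// Response aliases: keep a top-level field if the backend sent one, otherwise
// fall back to the nested stats. Cost fields default to null, not a guess.
function withAliases(resp) {
  return {
    ...resp,
    text: resp.text ?? resp.decompressed_response,
    tokens_saved: resp.tokens_saved ?? resp.compression_stats?.token_savings,
    tokens_used: resp.tokens_used ?? resp.llm_stats?.total_tokens,
    compression_ratio: resp.compression_ratio ?? resp.compression_stats?.compression_ratio,
    cost_estimate: resp.cost_estimate ?? null,
    savings_estimate: resp.savings_estimate ?? null,
  };
}

// Made-up response values, for illustration only.
const sample = withAliases({
  decompressed_response: 'hello',
  compression_stats: { token_savings: 12, compression_ratio: 0.8 },
  llm_stats: { total_tokens: 40 },
});
console.log(sample.text, sample.tokens_saved, sample.cost_estimate);
console.log(toWireCompressionConfig({ compressSystem: true }));
```

Because `??` falls back only on `null` or `undefined`, an explicit `compressUser: false` survives the mapping, whereas the `||` defaults in the removed 0.1.x constructor would have replaced any falsy value. Alias fields whose nested source is absent come out as `undefined`; only the two cost fields are normalized to `null`.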
package/dist/index.d.ts CHANGED
@@ -2,6 +2,7 @@
2
2
  * Compress Light Reach - Intelligent compression algorithms for LLM prompts.
3
3
  */
4
4
  export { __version__ } from './version';
5
- export { Pcompresslr } from './core';
5
+ export { LightReach, Pcompresslr } from './core';
6
+ export type { CompressionConfig, CompleteOptions } from './core';
6
7
  export { PcompresslrAPIClient, APIKeyError, RateLimitError, APIRequestError, PcompresslrAPIError, } from './api-client';
7
- export type { CompressResponse, DecompressResponse, CompleteResponse, HealthCheckResponse, } from './api-client';
8
+ export type { CompressResponse, DecompressResponse, CompleteResponse, HealthCheckResponse, Message, MessageRole, CompleteV2Request, } from './api-client';
package/dist/index.js CHANGED
@@ -3,11 +3,12 @@
3
3
  * Compress Light Reach - Intelligent compression algorithms for LLM prompts.
4
4
  */
5
5
  Object.defineProperty(exports, "__esModule", { value: true });
6
- exports.PcompresslrAPIError = exports.APIRequestError = exports.RateLimitError = exports.APIKeyError = exports.PcompresslrAPIClient = exports.Pcompresslr = exports.__version__ = void 0;
6
+ exports.PcompresslrAPIError = exports.APIRequestError = exports.RateLimitError = exports.APIKeyError = exports.PcompresslrAPIClient = exports.Pcompresslr = exports.LightReach = exports.__version__ = void 0;
7
7
  var version_1 = require("./version");
8
8
  Object.defineProperty(exports, "__version__", { enumerable: true, get: function () { return version_1.__version__; } });
9
- // Export main interface class
9
+ // Export main interface classes/types
10
10
  var core_1 = require("./core");
11
+ Object.defineProperty(exports, "LightReach", { enumerable: true, get: function () { return core_1.LightReach; } });
11
12
  Object.defineProperty(exports, "Pcompresslr", { enumerable: true, get: function () { return core_1.Pcompresslr; } });
12
13
  // Export API client and exceptions
13
14
  var api_client_1 = require("./api-client");
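Note on the two exported names: `dist/core.js` declares `Pcompresslr` as an empty subclass of `LightReach`, and `dist/index.js` exports both. A minimal sketch of what that pattern implies, using the stand-in classes `Base` and `LegacyName` (hypothetical, not the package's classes): the subclass inherits the constructor and methods unchanged, so the old import name still resolves but exposes the new options-object behavior.

```javascript
// Stand-in classes illustrating the empty-subclass alias pattern.
// These are NOT the package's classes; only the shape is borrowed.
class Base {
  constructor(options = {}) {
    this.useOptimal = options.useOptimal ?? false;
  }
  // Mirrors only the algorithm selection seen in the diff's complete():
  // a per-call option overrides the constructor default.
  pickAlgorithm(options) {
    return (options.useOptimal ?? this.useOptimal) ? 'optimal' : 'greedy';
  }
}

// Empty body: everything is inherited from Base.
class LegacyName extends Base {}

const client = new LegacyName({ useOptimal: true });
console.log(client instanceof Base);                       // true
console.log(client.pickAlgorithm({}));                     // optimal
console.log(client.pickAlgorithm({ useOptimal: false }));  // greedy
```

The last call shows that a per-call `useOptimal: false` overrides a constructor-level `true`, since `??` treats `false` as a provided value.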
package/dist/version.d.ts CHANGED
@@ -1,4 +1,4 @@
1
1
  /**
2
2
  * Version information for compress-lightreach package.
3
3
  */
4
- export declare const __version__ = "0.1.2";
4
+ export declare const __version__ = "1.0.0";
package/dist/version.js CHANGED
@@ -4,4 +4,4 @@
4
4
  */
5
5
  Object.defineProperty(exports, "__esModule", { value: true });
6
6
  exports.__version__ = void 0;
7
- exports.__version__ = "0.1.2";
7
+ exports.__version__ = "1.0.0";
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "compress-lightreach",
3
- "version": "0.1.4",
3
+ "version": "1.0.0",
4
4
  "description": "Intelligent compression algorithms for LLM prompts that reduce token usage",
5
5
  "main": "dist/index.js",
6
6
  "types": "dist/index.d.ts",
@@ -8,8 +8,7 @@
8
8
  "pcompresslr": "./dist/cli.js"
9
9
  },
10
10
  "scripts": {
11
- "increment-version": "node increment-version.js",
12
- "build": "npm run increment-version && tsc",
11
+ "build": "tsc",
13
12
  "prepublishOnly": "npm run build",
14
13
  "test": "jest",
15
14
  "test:watch": "jest --watch",