@aj-archipelago/cortex 0.0.11 → 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/LICENSE CHANGED
@@ -1,10 +1,6 @@
  MIT License

- <<<<<<< HEAD
- Copyright (c) 2022 Al Jazeera Media Network
- =======
- Copyright (c) 2023 aj-archipelago
- >>>>>>> b78a6667176ebfd09d2c7ed1549d8dc18691c344
+ Copyright (c) 2023 Al Jazeera Media Network

  Permission is hereby granted, free of charge, to any person obtaining a copy
  of this software and associated documentation files (the "Software"), to deal
package/README.md CHANGED
@@ -5,7 +5,7 @@ Modern AI models are transformational, but a number of complexities emerge when
  ## Features

  * Simple architecture to build custom functional endpoints (called `pathways`) that implement common NL AI tasks. Default pathways include chat, summarization, translation, paraphrasing, completion, spelling and grammar correction, entity extraction, sentiment analysis, and bias analysis.
- * Allows for building multi-model, multi-vendor, and model-agnostic pathways (choose the right model or combination of models for the job, implement redundancy) with built-in support for OpenAI GPT-3, GPT-3.5 (ChatGPT), and GPT-4 models - both from OpenAI directly and through Azure OpenAI - plus OpenAI Whisper, Azure Translator, and more.
+ * Allows for building multi-model, multi-tool, multi-vendor, and model-agnostic pathways (choose the right model or combination of models and tools for the job, implement redundancy) with built-in support for OpenAI GPT-3, GPT-3.5 (ChatGPT), and GPT-4 models - both from OpenAI directly and through Azure OpenAI - plus OpenAI Whisper, Azure Translator, LangChain.js, and more.
  * Easy, templatized prompt definition with flexible support for most prompt engineering techniques and strategies, ranging from simple single prompts to complex custom prompt chains with context continuity.
  * Built-in support for long-running, asynchronous operations with progress updates or streaming responses
  * Integrated context persistence: have your pathways "remember" whatever you want and use it on the next request to the model
@@ -15,7 +15,7 @@ Modern AI models are transformational, but a number of complexities emerge when
  * Caching of repeated queries to provide instant results and avoid excess requests to the underlying model in repetitive use cases (chat bots, unit tests, etc.)

  ## Installation
- In order to use Cortex, you must first have a working Node.js environment. The version of Node.js should be at least 14 or higher. After verifying that you have the correct version of Node.js installed, you can get the simplest form up and running with a couple of commands.
+ In order to use Cortex, you must first have a working Node.js environment. The version of Node.js should be 18 or higher (lower versions supported with some reduction in features). After verifying that you have the correct version of Node.js installed, you can get the simplest form up and running with a couple of commands.
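You can verify your Node.js version before installing:

```sh
# Should print v18.0.0 or newer
node --version
```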
  ## Quick Start
  ```sh
  git clone git@github.com:aj-archipelago/cortex.git
@@ -55,18 +55,20 @@ apolloClient.query({
  ## Cortex Pathways: Supercharged Prompts
  Pathways are a core concept in Cortex. Each pathway is a single JavaScript file that encapsulates the data and logic needed to define a functional API endpoint. When the client makes a request via the API, one or more pathways are executed and the result is sent back to the client. Pathways can be very simple:
  ```js
- module.exports = {
-     prompt: `{{text}}\n\nRewrite the above using British English spelling:`
+ export default {
+     prompt: `{{text}}\n\nRewrite the above using British English spelling:`
  }
  ```
  The real power of Cortex starts to show as the pathways get more complex. This pathway, for example, uses a three-part sequential prompt to ensure that specific people and place names are correctly translated:
  ```js
- prompt:
-     [
-         `{{{text}}}\nCopy the names of all people and places exactly from this document in the language above:\n`,
-         `Original Language:\n{{{previousResult}}}\n\n{{to}}:\n`,
-         `Entities in the document:\n\n{{{previousResult}}}\n\nDocument:\n{{{text}}}\nRewrite the document in {{to}}. If the document is already in {{to}}, copy it exactly below:\n`
-     ]
+ export default {
+     prompt:
+         [
+             `{{{text}}}\nCopy the names of all people and places exactly from this document in the language above:\n`,
+             `Original Language:\n{{{previousResult}}}\n\n{{to}}:\n`,
+             `Entities in the document:\n\n{{{previousResult}}}\n\nDocument:\n{{{text}}}\nRewrite the document in {{to}}. If the document is already in {{to}}, copy it exactly below:\n`
+         ]
+ }
  ```
  Cortex pathway prompt enhancements include:
  * **Templatized prompt definition**: Pathways allow for easy and flexible prompt definition using Handlebars templating. This makes it simple to create and modify prompts using variables and context from the application as well as extensible internal functions provided by Cortex.
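For instance, a minimal sketch of a templatized pathway (hypothetical, but built from the `{{text}}` variable and the `inputParameters` pattern shown elsewhere in this README):

```js
// Handlebars interpolates input parameters into the prompt before the model is
// called; `tone` here is a hypothetical parameter with a default of `formal`.
export default {
    prompt: `{{{text}}}\n\nRewrite the text above in a {{tone}} tone:`,
    inputParameters: {
        tone: `formal`,
    },
}
```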
@@ -109,7 +111,7 @@ If you look closely at the examples above, you'll notice embedded parameters lik
  ### Parameters
  Pathways support an arbitrary number of input parameters. These are defined in the pathway like this:
  ```js
- module.exports = {
+ export default {
      prompt:
          [
              `{{{chatContext}}}\n\n{{{text}}}\n\nGiven the information above, create a short summary of the conversation to date making sure to include all of the personal details about the user that you encounter:\n\n`,
@@ -161,7 +163,7 @@ A core function of Cortex is dealing with token limited interfaces. To this end,

  Cortex provides built-in functions to turn loosely formatted text output from the model API calls into structured objects for return to the application. Specifically, Cortex provides parsers for numbered lists of strings and numbered lists of objects. These are used in pathways like this:
  ```js
- module.exports = {
+ export default {
      temperature: 0,
      prompt: `{{text}}\n\nList the top {{count}} entities and their definitions for the above in the format {{format}}:`,
      format: `(name: definition)`,
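The shape of that transformation looks roughly like this (a hypothetical sketch for illustration only; Cortex's actual parser functions are not shown in this diff):

```js
// Turns model output like "1. (Paris: capital of France)" into
// [{ name: "Paris", definition: "capital of France" }]
const parseNumberedObjectList = (text) =>
    text.split('\n')
        .map((line) => line.match(/^\s*\d+\.\s*\(?([^:]+):\s*(.+?)\)?\s*$/))
        .filter(Boolean)
        .map(([, name, definition]) => ({ name: name.trim(), definition: definition.trim() }));
```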
@@ -179,47 +181,131 @@ The resolver property defines the function that processes the input and returns

  The core pathway `summary.js` below is implemented using custom pathway logic and a custom resolver to effectively target a specific summary length:
  ```js
- const { semanticTruncate } = require('../graphql/chunker');
- const { PathwayResolver } = require('../graphql/pathwayResolver');
- module.exports = {
-     prompt: `{{{text}}}\n\nWrite a summary of the above text:\n\n`,
+ // summary.js
+ // Text summarization module with custom resolver
+ // This module exports a prompt that takes an input text and generates a summary using a custom resolver.
+
+ // Import required modules
+ import { semanticTruncate } from '../graphql/chunker.js';
+ import { PathwayResolver } from '../graphql/pathwayResolver.js';
+
+ export default {
+     // The main prompt function that takes the input text and asks to generate a summary.
+     prompt: `{{{text}}}\n\nWrite a summary of the above text. If the text is in a language other than English, make sure the summary is written in the same language:\n\n`,
+
+     // Define input parameters for the prompt, such as the target length of the summary.
      inputParameters: {
-         targetLength: 500,
+         targetLength: 0,
      },
+
+     // Custom resolver to generate summaries by reprompting if they are too long or too short.
      resolver: async (parent, args, contextValue, info) => {
          const { config, pathway, requestState } = contextValue;
          const originalTargetLength = args.targetLength;
-         const errorMargin = 0.2;
+
+         // If targetLength is not provided, execute the prompt once and return the result.
+         if (originalTargetLength === 0) {
+             let pathwayResolver = new PathwayResolver({ config, pathway, args, requestState });
+             return await pathwayResolver.resolve(args);
+         }
+
+         const errorMargin = 0.1;
          const lowTargetLength = originalTargetLength * (1 - errorMargin);
          const targetWords = Math.round(originalTargetLength / 6.6);
-         // if the text is shorter than the summary length, just return the text
+
+         // If the text is shorter than the summary length, just return the text.
          if (args.text.length <= originalTargetLength) {
              return args.text;
          }
+
          const MAX_ITERATIONS = 5;
          let summary = '';
-         let bestSummary = '';
-         let pathwayResolver = new PathwayResolver({ config, pathway, requestState });
-         // modify the prompt to be words-based instead of characters-based
-         pathwayResolver.pathwayPrompt = `{{{text}}}\n\nWrite a summary of the above text in exactly ${targetWords} words:\n\n`
+         let pathwayResolver = new PathwayResolver({ config, pathway, args, requestState });
+
+         // Modify the prompt to be words-based instead of characters-based.
+         pathwayResolver.pathwayPrompt = `Write a summary of all of the text below. If the text is in a language other than English, make sure the summary is written in the same language. Your summary should be ${targetWords} words in length.\n\nText:\n\n{{{text}}}\n\nSummary:\n\n`
+
          let i = 0;
-         // reprompt if summary is too long or too short
-         while (((summary.length > originalTargetLength) || (summary.length < lowTargetLength)) && i < MAX_ITERATIONS) {
+         // Make sure it's long enough to start
+         while ((summary.length < lowTargetLength) && i < MAX_ITERATIONS) {
              summary = await pathwayResolver.resolve(args);
              i++;
          }
-         // if the summary is still too long, truncate it
+
+         // If it's too long, it could be because the input text was chunked
+         // and now we have all the chunks together. We can summarize that
+         // to get a comprehensive summary.
+         if (summary.length > originalTargetLength) {
+             pathwayResolver.pathwayPrompt = `Write a summary of all of the text below. If the text is in a language other than English, make sure the summary is written in the same language. Your summary should be ${targetWords} words in length.\n\nText:\n\n${summary}\n\nSummary:\n\n`
+             summary = await pathwayResolver.resolve(args);
+             i++;
+
+             // Now make sure it's not too long
+             while ((summary.length > originalTargetLength) && i < MAX_ITERATIONS) {
+                 pathwayResolver.pathwayPrompt = `${summary}\n\nIs that less than ${targetWords} words long? If not, try again using a length of no more than ${targetWords} words.\n\n`;
+                 summary = await pathwayResolver.resolve(args);
+                 i++;
+             }
+         }
+
+         // If the summary is still too long, truncate it.
          if (summary.length > originalTargetLength) {
              return semanticTruncate(summary, originalTargetLength);
          } else {
              return summary;
          }
      }
- }
+ };
  ```
+ ### LangChain.js Support
+ The ability to define a custom resolver function in Cortex pathways gives Cortex the flexibility to cleanly incorporate alternate pipelines and technology stacks into the execution of a pathway. LangChain.js (https://github.com/hwchase17/langchainjs) is a popular, well-supported mechanism for wiring together models, tools, and logic to achieve some amazing results. We have developed specific functionality to support LangChain in the Cortex prompt execution framework and will continue to build features to fully integrate it with Cortex prompt execution contexts.
+
+ Below is an example pathway integrating with one of the example agents from the LangChain docs. Note the seamless integration of Cortex's configuration and GraphQL/REST interface logic.
+ ```js
+ // lc_test.js
+ // LangChain Cortex integration test
+
+ // Import required modules
+ import { OpenAI } from "langchain/llms";
+ import { initializeAgentExecutor } from "langchain/agents";
+ import { SerpAPI, Calculator } from "langchain/tools";
+
+ export default {
+
+     // Implement custom logic and interaction with Cortex
+     // in custom resolver.
+
+     resolver: async (parent, args, contextValue, info) => {
+
+         const { config } = contextValue;
+         const openAIApiKey = config.get('openaiApiKey');
+         const serpApiKey = config.get('serpApiKey');
+
+         const model = new OpenAI({ openAIApiKey: openAIApiKey, temperature: 0 });
+         const tools = [new SerpAPI(serpApiKey), new Calculator()];
+
+         const executor = await initializeAgentExecutor(
+             tools,
+             model,
+             "zero-shot-react-description"
+         );
+
+         console.log(`====================`);
+         console.log("Loaded langchain agent.");
+         const input = args.text;
+         console.log(`Executing with input "${input}"...`);
+         const result = await executor.call({ input });
+         console.log(`Got output ${result.output}`);
+         console.log(`====================`);
+
+         return result?.output;
+     },
+ };
+ ```
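(Note: for `config.get('serpApiKey')` to resolve, a corresponding `serpApiKey` setting is assumed to exist in your Cortex configuration; it is not part of the default config shown below.)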
+
  ### Building and Loading Pathways

- Pathways are loaded from modules in the `pathways` directory. The pathways are built and loaded to the `config` object using the `buildPathways` function. The `buildPathways` function loads the base pathway, the core pathways, and any custom pathways. It then creates a new object that contains all the pathways and adds it to the pathways property of the config object. The order of loading means that custom pathways will always override any core pathways that Cortext provides. While pathways are designed to be self-contained, you can override some pathway properties - including whether they're even available at all - in the `pathways` section of the config file.
+ Pathways are loaded from modules in the `pathways` directory. The pathways are built and loaded to the `config` object using the `buildPathways` function. The `buildPathways` function loads the base pathway, the core pathways, and any custom pathways. It then creates a new object that contains all the pathways and adds it to the pathways property of the config object. The order of loading means that custom pathways will always override any core pathways that Cortex provides. While pathways are designed to be self-contained, you can override some pathway properties - including whether they're even available at all - in the `pathways` section of the config file.
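For example, a config file might override core pathway properties like this (a hypothetical sketch; `targetLength` mirrors the pathway parameter shown above, and the exact key for disabling a pathway may differ in your Cortex version):

```js
{
    "pathways": {
        "summary": { "targetLength": 1000 },
        "sentiment": { "enabled": false }
    }
}
```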

  ## Core (Default) Pathways

@@ -236,7 +322,84 @@ Below are the default pathways provided with Cortex. These can be used as is, ov
  - `translate`: Translates text from one language to another
  ## Extensibility

- Cortex is designed to be highly extensible. This allows you to customize the API to fit your needs. You can add new features, modify existing features, and even add integrations with other APIs and models.
+ Cortex is designed to be highly extensible. This allows you to customize the API to fit your needs. You can add new features, modify existing features, and even add integrations with other APIs and models. Here's an example of what an extended project might look like:
+
+ ### Cortex Internal Implementation
+
+ - **config**
+   - default.json
+ - package-lock.json
+ - package.json
+ - **pathways**
+   - chat_code.js
+   - chat_context.js
+   - chat_persist.js
+   - expand_story.js
+   - ...a whole bunch of custom pathways
+   - translate_gpt4.js
+   - translate_turbo.js
+ - start.js
+
+ Where `default.json` holds all of your specific configuration:
+ ```js
+ {
+     "defaultModelName": "oai-gpturbo",
+     "models": {
+         "oai-td3": {
+             "type": "OPENAI-COMPLETION",
+             "url": "https://api.openai.com/v1/completions",
+             "headers": {
+                 "Authorization": "Bearer {{OPENAI_API_KEY}}",
+                 "Content-Type": "application/json"
+             },
+             "params": {
+                 "model": "text-davinci-003"
+             },
+             "requestsPerSecond": 10,
+             "maxTokenLength": 4096
+         },
+         "oai-gpturbo": {
+             "type": "OPENAI-CHAT",
+             "url": "https://api.openai.com/v1/chat/completions",
+             "headers": {
+                 "Authorization": "Bearer {{OPENAI_API_KEY}}",
+                 "Content-Type": "application/json"
+             },
+             "params": {
+                 "model": "gpt-3.5-turbo"
+             },
+             "requestsPerSecond": 10,
+ "maxTokenLength": 8192
+         },
+         "oai-gpt4": {
+             "type": "OPENAI-CHAT",
+             "url": "https://api.openai.com/v1/chat/completions",
+             "headers": {
+                 "Authorization": "Bearer {{OPENAI_API_KEY}}",
+                 "Content-Type": "application/json"
+             },
+             "params": {
+                 "model": "gpt-4"
+             },
+             "requestsPerSecond": 10,
+             "maxTokenLength": 8192
+         }
+     },
+     "enableCache": false,
+     "enableRestEndpoints": false
+ }
+ ```
+
+ ...and `start.js` is really simple:
+ ```js
+ import cortex from '@aj-archipelago/cortex';
+
+ (async () => {
+     const { startServer } = await cortex();
+     startServer && startServer();
+ })();
+ ```
+
  ## Configuration
  Configuration of Cortex is done via a [convict](https://github.com/mozilla/node-convict/tree/master) object called `config`. The `config` object is built by combining the default values and any values specified in a configuration file or environment variables. The environment variables take precedence over the values in the configuration file. Below are the configurable properties and their defaults:
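For example, a key referenced as `{{OPENAI_API_KEY}}` in the model config above can be supplied at launch time, overriding any value in the configuration file:

```sh
OPENAI_API_KEY=sk-... node start.js
```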
@@ -272,7 +435,7 @@ If you encounter any issues while using Cortex, there are a few things you can d
  If you would like to contribute to Cortex, there are two ways to do so. You can submit issues to the Cortex GitHub repository or submit pull requests with your proposed changes.

  ## License
- Cortex is released under the MIT License. See [LICENSE](https://github.com/ALJAZEERAPLUS/cortex/blob/main/LICENSE) for more details.
+ Cortex is released under the MIT License. See [LICENSE](https://github.com/aj-archipelago/cortex/blob/main/LICENSE) for more details.

  ## API Reference
  Detailed documentation on Cortex's API can be found at the `/graphql` endpoint of your project. Examples of queries and responses can also be found in the Cortex documentation, along with tips for getting the most out of Cortex.
@@ -280,5 +443,6 @@ Detailed documentation on Cortex's API can be found in the /graphql endpoint of
  ## Roadmap
  Cortex is a constantly evolving project, and the following features are coming soon:

+ * Prompt execution context preservation between calls (to enable interactive, multi-call integrations with LangChain and other technologies)
  * Model-specific cache key optimizations to increase hit rate and reduce cache size
  * Structured analytics and reporting on AI API call frequency, cost, cache hit rate, etc.
package/SECURITY.md ADDED
@@ -0,0 +1,22 @@
+ # Security Policy
+
+ ## Supported Versions
+
+ We take the security of our project seriously. The table below shows the versions of Cortex currently supported with security updates.
+
+ | Version | Supported          |
+ | ------- | ------------------ |
+ | 1.x.x   | :white_check_mark: |
+
+ ## Reporting a Vulnerability
+
+ If you have discovered a security vulnerability in Cortex, please follow these steps to report it:
+
+ 1. **Do not** create a public GitHub issue, as this might expose the vulnerability to others.
+ 2. Please follow the GitHub process for [Privately Reporting a Security Vulnerability](https://docs.github.com/en/code-security/security-advisories/guidance-on-reporting-and-writing/privately-reporting-a-security-vulnerability).
+
+ ## Disclosure Policy
+
+ Cortex follows responsible disclosure practices. Once a vulnerability is confirmed and a fix is developed, we will release a security update and publicly disclose the vulnerability. We will credit the reporter of the vulnerability in the disclosure, unless the reporter wishes to remain anonymous.
+
+ We appreciate your help in keeping Cortex secure and your responsible disclosure of any security vulnerabilities you discover.
package/graphql/pathwayResolver.js CHANGED
@@ -151,7 +151,7 @@ class PathwayResolver {
      }
      const encoded = encode(text);
      if (!this.useInputChunking || encoded.length <= chunkTokenLength) { // no chunking, return as is
-         if (encoded.length >= chunkTokenLength) {
+         if (encoded.length > 0 && encoded.length >= chunkTokenLength) {
              const warnText = `Truncating long input text. Text length: ${text.length}`;
              this.warnings.push(warnText);
              console.warn(warnText);
@@ -189,7 +189,7 @@
      // the token ratio is the ratio of the total prompt to the result text - both have to be included
      // in computing the max token length
      const promptRatio = this.pathwayPrompter.plugin.getPromptTokenRatio();
-     let chunkMaxTokenLength = promptRatio * this.pathwayPrompter.plugin.getModelMaxTokenLength() - maxPromptTokenLength;
+     let chunkMaxTokenLength = promptRatio * this.pathwayPrompter.plugin.getModelMaxTokenLength() - maxPromptTokenLength - 1;

      // if we have to deal with prompts that have both text input
      // and previous result, we need to split the maxChunkToken in half
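As a worked example (illustrative numbers only): with a model max token length of 4096, a prompt ratio of 0.5, and a max prompt token length of 100, the chunk budget works out to 0.5 × 4096 − 100 − 1 = 1947 tokens.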
@@ -0,0 +1,23 @@
+ // localModelPlugin.js
+ import ModelPlugin from './modelPlugin.js';
+ import { execFileSync } from 'child_process';
+
+ class LocalModelPlugin extends ModelPlugin {
+     constructor(config, pathway) {
+         super(config, pathway);
+     }
+
+     async execute(text, parameters, prompt, pathwayResolver) {
+         const { modelPromptText } = this.getCompiledPrompt(text, parameters, prompt);
+
+         try {
+             // NOTE: executablePath is assumed to be supplied by the surrounding
+             // configuration; it is not defined in this file as published.
+             const result = execFileSync(executablePath, [modelPromptText], { encoding: 'utf8' });
+             return result;
+         } catch (error) {
+             console.error('Error running local model:', error);
+             throw error;
+         }
+     }
+ }
+
+ export default LocalModelPlugin;
@@ -12,14 +12,14 @@ class OpenAIChatPlugin extends ModelPlugin {
      const { stream } = parameters;

      // Define the model's max token length
-     const modelMaxTokenLength = this.getModelMaxTokenLength() * this.getPromptTokenRatio();
+     const modelTargetTokenLength = this.getModelMaxTokenLength() * this.getPromptTokenRatio();

      let requestMessages = modelPromptMessages || [{ "role": "user", "content": modelPromptText }];

      // Check if the token length exceeds the model's max token length
-     if (tokenLength > modelMaxTokenLength) {
+     if (tokenLength > modelTargetTokenLength) {
          // Remove older messages until the token length is within the model's limit
-         requestMessages = this.truncateMessagesToTargetLength(requestMessages, modelMaxTokenLength);
+         requestMessages = this.truncateMessagesToTargetLength(requestMessages, modelTargetTokenLength);
      }

      const requestParameters = {
@@ -13,19 +13,22 @@ class OpenAICompletionPlugin extends ModelPlugin {
      let { modelPromptMessages, modelPromptText, tokenLength } = this.getCompiledPrompt(text, parameters, prompt);
      const { stream } = parameters;
      let modelPromptMessagesML = '';
-     const modelMaxTokenLength = this.getModelMaxTokenLength();
+     // Define the model's target token length, reserving headroom for the response
+     const modelTargetTokenLength = this.getModelMaxTokenLength() * this.getPromptTokenRatio();
      let requestParameters = {};

      if (modelPromptMessages) {
-         const requestMessages = this.truncateMessagesToTargetLength(modelPromptMessages, modelMaxTokenLength - 1);
+         const minMsg = [{ role: "system", content: "" }];
+         const addAssistantTokens = encode(this.messagesToChatML(minMsg, true).replace(this.messagesToChatML(minMsg, false), '')).length;
+         const requestMessages = this.truncateMessagesToTargetLength(modelPromptMessages, (modelTargetTokenLength - addAssistantTokens));
          modelPromptMessagesML = this.messagesToChatML(requestMessages);
          tokenLength = encode(modelPromptMessagesML).length;

-         if (tokenLength >= modelMaxTokenLength) {
-             throw new Error(`The maximum number of tokens for this model is ${modelMaxTokenLength}. Please reduce the number of messages in the prompt.`);
+         if (tokenLength > modelTargetTokenLength) {
+             throw new Error(`Input is too long at ${tokenLength} tokens. The target token length for this pathway is ${modelTargetTokenLength} tokens because the response is expected to take up the rest of the model's max tokens (${this.getModelMaxTokenLength()}). You must reduce the size of the prompt to continue.`);
          }

-         const max_tokens = modelMaxTokenLength - tokenLength - 1;
+         const max_tokens = this.getModelMaxTokenLength() - tokenLength;

          requestParameters = {
              prompt: modelPromptMessagesML,
@@ -38,11 +41,11 @@ class OpenAICompletionPlugin extends ModelPlugin {
              stream
          };
      } else {
-         if (tokenLength >= modelMaxTokenLength) {
-             throw new Error(`The maximum number of tokens for this model is ${modelMaxTokenLength}. Please reduce the length of the prompt.`);
+         if (tokenLength > modelTargetTokenLength) {
+             throw new Error(`Input is too long at ${tokenLength} tokens. The target token length for this pathway is ${modelTargetTokenLength} tokens because the response is expected to take up the rest of the ${this.getModelMaxTokenLength()} tokens that the model can handle. You must reduce the size of the prompt to continue.`);
          }

-         const max_tokens = modelMaxTokenLength - tokenLength - 1;
+         const max_tokens = this.getModelMaxTokenLength() - tokenLength;

          requestParameters = {
              prompt: modelPromptText,
@@ -1,4 +1,4 @@
- // OpenAICompletionPlugin.js
+ // openAiWhisperPlugin.js
  import ModelPlugin from './modelPlugin.js';

  import FormData from 'form-data';
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@aj-archipelago/cortex",
-   "version": "0.0.11",
+   "version": "1.0.1",
    "description": "Cortex is a GraphQL API for AI. It provides a simple, extensible interface for using AI services from OpenAI, Azure and others.",
    "repository": {
      "type": "git",