@tyvm/knowhow 0.0.68 → 0.0.70
- package/docs/shell-commands.md +174 -0
- package/package.json +2 -2
- package/src/agents/base/base.ts +1 -3
- package/src/agents/developer/developer.ts +21 -16
- package/src/agents/tools/agentCall.ts +4 -2
- package/src/agents/tools/fileSearch.ts +5 -1
- package/src/agents/tools/list.ts +41 -37
- package/src/agents/tools/startAgentTask.ts +131 -22
- package/src/chat/CliChatService.ts +57 -11
- package/src/chat/modules/AgentModule.ts +72 -12
- package/src/chat/modules/CustomCommandsModule.ts +79 -0
- package/src/chat/modules/InternalChatModule.ts +11 -1
- package/src/chat/modules/ShellCommandModule.ts +96 -0
- package/src/chat/modules/index.ts +1 -0
- package/src/chat/types.ts +14 -2
- package/src/chat.ts +16 -13
- package/src/cli.ts +16 -6
- package/src/clients/anthropic.ts +88 -91
- package/src/clients/gemini.ts +495 -94
- package/src/clients/index.ts +125 -0
- package/src/clients/knowhow.ts +81 -0
- package/src/clients/openai.ts +256 -145
- package/src/clients/pricing/anthropic.ts +90 -0
- package/src/clients/pricing/google.ts +65 -0
- package/src/clients/pricing/index.ts +4 -0
- package/src/clients/pricing/openai.ts +134 -0
- package/src/clients/pricing/xai.ts +62 -0
- package/src/clients/types.ts +170 -1
- package/src/clients/xai.ts +275 -46
- package/src/config.ts +61 -15
- package/src/embeddings.ts +9 -1
- package/src/microphone.ts +15 -16
- package/src/migrations.ts +151 -0
- package/src/plugins/AgentsMdPlugin.ts +118 -0
- package/src/plugins/PluginBase.ts +8 -0
- package/src/plugins/downloader/downloader.ts +5 -6
- package/src/plugins/embedding.ts +10 -8
- package/src/plugins/exec.ts +70 -0
- package/src/plugins/github.ts +120 -74
- package/src/plugins/language.ts +11 -13
- package/src/plugins/plugins.ts +25 -4
- package/src/plugins/tmux.ts +132 -0
- package/src/plugins/types.ts +1 -0
- package/src/plugins/vim.ts +14 -1
- package/src/server/index.ts +2 -0
- package/src/services/AgentSyncFs.ts +417 -0
- package/src/services/{AgentSynchronization.ts → AgentSyncKnowhowWeb.ts} +2 -2
- package/src/services/EventService.ts +0 -1
- package/src/services/KnowhowClient.ts +106 -0
- package/src/services/index.ts +4 -2
- package/src/types.ts +57 -4
- package/src/worker.ts +25 -2
- package/tests/manual/modalities/README.md +157 -0
- package/tests/manual/modalities/google.modalities.test.ts +335 -0
- package/tests/manual/modalities/openai.modalities.test.ts +329 -0
- package/tests/manual/modalities/streaming.test.ts +260 -0
- package/tests/manual/modalities/xai.modalities.test.ts +307 -0
- package/tests/plugins/language/languagePlugin-content-triggers.test.ts +5 -5
- package/tests/plugins/language/languagePlugin-integration.test.ts +1 -1
- package/tests/plugins/language/languagePlugin.test.ts +17 -8
- package/ts_build/package.json +2 -2
- package/ts_build/src/agents/base/base.js +1 -1
- package/ts_build/src/agents/base/base.js.map +1 -1
- package/ts_build/src/agents/developer/developer.js +21 -15
- package/ts_build/src/agents/developer/developer.js.map +1 -1
- package/ts_build/src/agents/tools/agentCall.js +4 -2
- package/ts_build/src/agents/tools/agentCall.js.map +1 -1
- package/ts_build/src/agents/tools/executeScript/index.d.ts +1 -1
- package/ts_build/src/agents/tools/fileSearch.js +2 -1
- package/ts_build/src/agents/tools/fileSearch.js.map +1 -1
- package/ts_build/src/agents/tools/github/index.d.ts +1 -1
- package/ts_build/src/agents/tools/list.js +41 -37
- package/ts_build/src/agents/tools/list.js.map +1 -1
- package/ts_build/src/agents/tools/startAgentTask.d.ts +2 -1
- package/ts_build/src/agents/tools/startAgentTask.js +118 -17
- package/ts_build/src/agents/tools/startAgentTask.js.map +1 -1
- package/ts_build/src/chat/CliChatService.d.ts +4 -0
- package/ts_build/src/chat/CliChatService.js +39 -5
- package/ts_build/src/chat/CliChatService.js.map +1 -1
- package/ts_build/src/chat/modules/AgentModule.d.ts +4 -1
- package/ts_build/src/chat/modules/AgentModule.js +49 -11
- package/ts_build/src/chat/modules/AgentModule.js.map +1 -1
- package/ts_build/src/chat/modules/CustomCommandsModule.d.ts +9 -0
- package/ts_build/src/chat/modules/CustomCommandsModule.js +58 -0
- package/ts_build/src/chat/modules/CustomCommandsModule.js.map +1 -0
- package/ts_build/src/chat/modules/InternalChatModule.d.ts +2 -0
- package/ts_build/src/chat/modules/InternalChatModule.js +10 -0
- package/ts_build/src/chat/modules/InternalChatModule.js.map +1 -1
- package/ts_build/src/chat/modules/ShellCommandModule.d.ts +8 -0
- package/ts_build/src/chat/modules/ShellCommandModule.js +83 -0
- package/ts_build/src/chat/modules/ShellCommandModule.js.map +1 -0
- package/ts_build/src/chat/modules/index.d.ts +1 -0
- package/ts_build/src/chat/modules/index.js +3 -1
- package/ts_build/src/chat/modules/index.js.map +1 -1
- package/ts_build/src/chat/types.d.ts +11 -1
- package/ts_build/src/chat.js +16 -13
- package/ts_build/src/chat.js.map +1 -1
- package/ts_build/src/cli.js +10 -3
- package/ts_build/src/cli.js.map +1 -1
- package/ts_build/src/clients/anthropic.d.ts +6 -1
- package/ts_build/src/clients/anthropic.js +47 -92
- package/ts_build/src/clients/anthropic.js.map +1 -1
- package/ts_build/src/clients/gemini.d.ts +81 -2
- package/ts_build/src/clients/gemini.js +362 -79
- package/ts_build/src/clients/gemini.js.map +1 -1
- package/ts_build/src/clients/index.d.ts +9 -1
- package/ts_build/src/clients/index.js +65 -0
- package/ts_build/src/clients/index.js.map +1 -1
- package/ts_build/src/clients/knowhow.d.ts +9 -1
- package/ts_build/src/clients/knowhow.js +43 -0
- package/ts_build/src/clients/knowhow.js.map +1 -1
- package/ts_build/src/clients/openai.d.ts +9 -1
- package/ts_build/src/clients/openai.js +201 -133
- package/ts_build/src/clients/openai.js.map +1 -1
- package/ts_build/src/clients/pricing/anthropic.d.ts +17 -0
- package/ts_build/src/clients/pricing/anthropic.js +93 -0
- package/ts_build/src/clients/pricing/anthropic.js.map +1 -0
- package/ts_build/src/clients/pricing/google.d.ts +73 -0
- package/ts_build/src/clients/pricing/google.js +68 -0
- package/ts_build/src/clients/pricing/google.js.map +1 -0
- package/ts_build/src/clients/pricing/index.d.ts +4 -0
- package/ts_build/src/clients/pricing/index.js +14 -0
- package/ts_build/src/clients/pricing/index.js.map +1 -0
- package/ts_build/src/clients/pricing/openai.d.ts +7 -0
- package/ts_build/src/clients/pricing/openai.js +137 -0
- package/ts_build/src/clients/pricing/openai.js.map +1 -0
- package/ts_build/src/clients/pricing/xai.d.ts +26 -0
- package/ts_build/src/clients/pricing/xai.js +59 -0
- package/ts_build/src/clients/pricing/xai.js.map +1 -0
- package/ts_build/src/clients/types.d.ts +135 -0
- package/ts_build/src/clients/xai.d.ts +9 -1
- package/ts_build/src/clients/xai.js +178 -46
- package/ts_build/src/clients/xai.js.map +1 -1
- package/ts_build/src/config.d.ts +1 -0
- package/ts_build/src/config.js +45 -16
- package/ts_build/src/config.js.map +1 -1
- package/ts_build/src/embeddings.js +8 -1
- package/ts_build/src/embeddings.js.map +1 -1
- package/ts_build/src/microphone.js +7 -9
- package/ts_build/src/microphone.js.map +1 -1
- package/ts_build/src/migrations.d.ts +17 -0
- package/ts_build/src/migrations.js +86 -0
- package/ts_build/src/migrations.js.map +1 -0
- package/ts_build/src/plugins/AgentsMdPlugin.d.ts +13 -0
- package/ts_build/src/plugins/AgentsMdPlugin.js +118 -0
- package/ts_build/src/plugins/AgentsMdPlugin.js.map +1 -0
- package/ts_build/src/plugins/PluginBase.d.ts +1 -0
- package/ts_build/src/plugins/PluginBase.js +3 -0
- package/ts_build/src/plugins/PluginBase.js.map +1 -1
- package/ts_build/src/plugins/downloader/downloader.js +5 -5
- package/ts_build/src/plugins/downloader/downloader.js.map +1 -1
- package/ts_build/src/plugins/embedding.js +9 -8
- package/ts_build/src/plugins/embedding.js.map +1 -1
- package/ts_build/src/plugins/exec.d.ts +10 -0
- package/ts_build/src/plugins/exec.js +56 -0
- package/ts_build/src/plugins/exec.js.map +1 -0
- package/ts_build/src/plugins/github.js +93 -51
- package/ts_build/src/plugins/github.js.map +1 -1
- package/ts_build/src/plugins/language.js +14 -11
- package/ts_build/src/plugins/language.js.map +1 -1
- package/ts_build/src/plugins/plugins.d.ts +1 -0
- package/ts_build/src/plugins/plugins.js +19 -1
- package/ts_build/src/plugins/plugins.js.map +1 -1
- package/ts_build/src/plugins/tmux.d.ts +14 -0
- package/ts_build/src/plugins/tmux.js +108 -0
- package/ts_build/src/plugins/tmux.js.map +1 -0
- package/ts_build/src/plugins/types.d.ts +1 -0
- package/ts_build/src/plugins/vim.js +11 -1
- package/ts_build/src/plugins/vim.js.map +1 -1
- package/ts_build/src/server/index.js.map +1 -1
- package/ts_build/src/services/AgentSyncFs.d.ts +34 -0
- package/ts_build/src/services/AgentSyncFs.js +325 -0
- package/ts_build/src/services/AgentSyncFs.js.map +1 -0
- package/ts_build/src/services/AgentSyncKnowhowWeb.d.ts +29 -0
- package/ts_build/src/services/AgentSyncKnowhowWeb.js +178 -0
- package/ts_build/src/services/AgentSyncKnowhowWeb.js.map +1 -0
- package/ts_build/src/services/AgentSynchronization.d.ts +1 -1
- package/ts_build/src/services/AgentSynchronization.js +3 -3
- package/ts_build/src/services/AgentSynchronization.js.map +1 -1
- package/ts_build/src/services/EventService.js.map +1 -1
- package/ts_build/src/services/KnowhowClient.d.ts +9 -1
- package/ts_build/src/services/KnowhowClient.js +58 -0
- package/ts_build/src/services/KnowhowClient.js.map +1 -1
- package/ts_build/src/services/index.d.ts +2 -1
- package/ts_build/src/services/index.js +2 -1
- package/ts_build/src/services/index.js.map +1 -1
- package/ts_build/src/types.d.ts +26 -1
- package/ts_build/src/types.js +45 -4
- package/ts_build/src/types.js.map +1 -1
- package/ts_build/src/utils/PersistentInputManager.d.ts +28 -0
- package/ts_build/src/utils/PersistentInputManager.js +293 -0
- package/ts_build/src/utils/PersistentInputManager.js.map +1 -0
- package/ts_build/src/worker.js +11 -2
- package/ts_build/src/worker.js.map +1 -1
- package/ts_build/tests/manual/modalities/google.modalities.test.d.ts +1 -0
- package/ts_build/tests/manual/modalities/google.modalities.test.js +252 -0
- package/ts_build/tests/manual/modalities/google.modalities.test.js.map +1 -0
- package/ts_build/tests/manual/modalities/openai.modalities.test.d.ts +1 -0
- package/ts_build/tests/manual/modalities/openai.modalities.test.js +252 -0
- package/ts_build/tests/manual/modalities/openai.modalities.test.js.map +1 -0
- package/ts_build/tests/manual/modalities/streaming.test.d.ts +1 -0
- package/ts_build/tests/manual/modalities/streaming.test.js +206 -0
- package/ts_build/tests/manual/modalities/streaming.test.js.map +1 -0
- package/ts_build/tests/manual/modalities/xai.modalities.test.d.ts +1 -0
- package/ts_build/tests/manual/modalities/xai.modalities.test.js +226 -0
- package/ts_build/tests/manual/modalities/xai.modalities.test.js.map +1 -0
- package/ts_build/tests/manual/persistent-input-test.d.ts +1 -0
- package/ts_build/tests/manual/persistent-input-test.js +35 -0
- package/ts_build/tests/manual/persistent-input-test.js.map +1 -0
- package/ts_build/tests/plugins/language/languagePlugin-content-triggers.test.js +5 -5
- package/ts_build/tests/plugins/language/languagePlugin-content-triggers.test.js.map +1 -1
- package/ts_build/tests/plugins/language/languagePlugin-integration.test.js +1 -1
- package/ts_build/tests/plugins/language/languagePlugin-integration.test.js.map +1 -1
- package/ts_build/tests/plugins/language/languagePlugin.test.js +17 -7
- package/ts_build/tests/plugins/language/languagePlugin.test.js.map +1 -1
@@ -7,6 +7,20 @@ import {
   CompletionResponse,
   EmbeddingOptions,
   EmbeddingResponse,
+  AudioTranscriptionOptions,
+  AudioTranscriptionResponse,
+  AudioGenerationOptions,
+  AudioGenerationResponse,
+  ImageGenerationOptions,
+  ImageGenerationResponse,
+  VideoGenerationOptions,
+  VideoGenerationResponse,
+  VideoStatusOptions,
+  VideoStatusResponse,
+  FileUploadOptions,
+  FileUploadResponse,
+  FileDownloadOptions,
+  FileDownloadResponse,
 } from "../clients";
 import { Config } from "../types";
 
@@ -202,6 +216,98 @@ export class KnowhowSimpleClient {
     });
   }
 
+  async createAudioTranscription(options: AudioTranscriptionOptions) {
+    await this.checkJwt();
+    const formData = new FormData();
+    // options.file can be a Buffer, ReadStream, Blob, or File
+    if (Buffer.isBuffer(options.file)) {
+      formData.append("file", new Blob([options.file]), options["fileName"] || "audio.mp3");
+    } else {
+      formData.append("file", options.file);
+    }
+    if (options.model) formData.append("model", options.model);
+    if (options.language) formData.append("language", options.language);
+    if (options.prompt) formData.append("prompt", options.prompt);
+    if (options.response_format) formData.append("response_format", options.response_format);
+    if (options.temperature != null) formData.append("temperature", String(options.temperature));
+
+    return axios.post<AudioTranscriptionResponse>(
+      `${this.baseUrl}/api/proxy/v1/audio/transcriptions`,
+      formData,
+      { headers: { ...this.headers } }
+    );
+  }
+
+  async createAudioGeneration(options: AudioGenerationOptions) {
+    await this.checkJwt();
+    return axios.post<AudioGenerationResponse>(
+      `${this.baseUrl}/api/proxy/v1/audio/generations`,
+      options,
+      { headers: this.headers }
+    );
+  }
+
+  async createImageGeneration(options: ImageGenerationOptions) {
+    await this.checkJwt();
+    return axios.post<ImageGenerationResponse>(
+      `${this.baseUrl}/api/proxy/v1/images/generations`,
+      options,
+      { headers: this.headers }
+    );
+  }
+
+  async createVideoGeneration(options: VideoGenerationOptions) {
+    await this.checkJwt();
+    return axios.post<VideoGenerationResponse>(
+      `${this.baseUrl}/api/proxy/v1/videos/generations`,
+      options,
+      { headers: this.headers }
+    );
+  }
+
+  async getVideoStatus(options: VideoStatusOptions) {
+    await this.checkJwt();
+    const { jobId, ...rest } = options;
+    return axios.get<VideoStatusResponse>(
+      `${this.baseUrl}/api/proxy/v1/videos/${jobId}/status`,
+      { headers: this.headers, params: rest }
+    );
+  }
+
+  async downloadVideo(options: FileDownloadOptions) {
+    await this.checkJwt();
+    const { fileId } = options;
+    return axios.get<ArrayBuffer>(
+      `${this.baseUrl}/api/proxy/v1/videos/${fileId}/content`,
+      { headers: this.headers, responseType: "arraybuffer" }
+    );
+  }
+
+  async uploadFile(options: FileUploadOptions) {
+    await this.checkJwt();
+    // Send as JSON with base64-encoded data
+    const body = {
+      data: options.data.toString("base64"),
+      mimeType: options.mimeType,
+      fileName: options.fileName,
+      displayName: options.displayName,
+    };
+    return axios.post<FileUploadResponse>(
+      `${this.baseUrl}/api/proxy/v1/files`,
+      body,
+      { headers: this.headers }
+    );
+  }
+
+  async downloadFile(options: FileDownloadOptions) {
+    await this.checkJwt();
+    const { fileId } = options;
+    return axios.get<ArrayBuffer>(
+      `${this.baseUrl}/api/proxy/v1/files/${fileId}/content`,
+      { headers: this.headers, responseType: "arraybuffer" }
+    );
+  }
+
   async createChatTask(request: CreateMessageTaskRequest) {
     await this.checkJwt();
     return axios.post<CreateMessageTaskResponse>(
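The `getVideoStatus` method added in this hunk splits `jobId` from the remaining options: the ID becomes part of the request path and everything else is forwarded as query parameters. A minimal sketch of that split, kept free of axios so it runs on its own. `buildVideoStatusRequest`, the `StatusOptions` type, the example base URL, and the `includeProgress` option are illustrative assumptions, not part of the package:

```typescript
// Hypothetical stand-in for VideoStatusOptions: only `jobId` is confirmed by the diff.
interface StatusOptions {
  jobId: string;
  [key: string]: string | number | boolean;
}

// Mirrors the destructuring in getVideoStatus: jobId goes into the path,
// the remaining keys are returned separately as query parameters.
function buildVideoStatusRequest(baseUrl: string, options: StatusOptions) {
  const { jobId, ...rest } = options;
  return {
    url: `${baseUrl}/api/proxy/v1/videos/${jobId}/status`,
    params: rest,
  };
}

const request = buildVideoStatusRequest("https://knowhow.example", {
  jobId: "job-123",
  includeProgress: true, // hypothetical extra option, for illustration only
});
console.log(request.url, request.params);
```

As in the diff, `jobId` is interpolated into the path without encoding, so this sketch assumes IDs that are already URL-safe.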
package/src/services/index.ts
CHANGED
@@ -10,7 +10,8 @@ import { S3Service } from "./S3";
 import { ToolsService } from "./Tools";
 import { PluginService } from "../plugins/plugins";
 import { DockerService } from "./DockerService";
-import {
+import { AgentSyncKnowhowWeb } from "./AgentSyncKnowhowWeb";
+import { AgentSyncFs } from "./AgentSyncFs";
 import { SessionManager } from "./SessionManager";
 import { TaskRegistry } from "./TaskRegistry";
 
@@ -24,7 +25,8 @@ export * from "./LazyToolsService";
 export * as MCP from "./Mcp";
 export * from "./EmbeddingService";
 export * from "./DockerService";
-export * from "./
+export * from "./AgentSyncKnowhowWeb";
+export * from "./AgentSyncFs";
 export * from "./SessionManager";
 export * from "./TaskRegistry";
 export { Clients } from "../clients";
package/src/types.ts
CHANGED
@@ -46,7 +46,8 @@ export type Config = {
   embedSources: EmbedSource[];
   embeddingModel: string;
 
-  plugins: string[];
+  plugins: { enabled: string[]; disabled: string[] };
+
   modules: string[];
 
   agents: Assistant[];
@@ -140,6 +141,7 @@ export type Language = {
     events: string[];
     sources: IDatasource[];
     context?: string;
+    handled?: boolean;
   };
 };
 
@@ -177,6 +179,8 @@ export const Models = {
     Grok3MiniFastBeta: "grok-3-mini-fast-beta",
     Grok21212: "grok-2-1212",
     Grok2Vision1212: "grok-2-vision-1212",
+    GrokImagineImage: "grok-imagine-image",
+    GrokImagineVideo: "grok-imagine-video",
   },
   openai: {
     GPT_5_2: "gpt-5.2",
@@ -203,6 +207,14 @@ export const Models = {
     o1_Mini: "o1-mini-2024-09-12",
     GPT_4o_Mini_Search: "gpt-4o-mini-search-preview-2025-03-11",
     GPT_4o_Search: "gpt-4o-search-preview-2025-03-11",
+
+    TTS_1: "tts-1",
+    Whisper_1: "whisper-1",
+    DALL_E_3: "dall-e-3",
+    DALL_E_2: "dall-e-2",
+    Sora: "sora",
+    Sora_2: "sora-2",
+    Sora_2_Pro: "sora-2-pro",
     // Computer_Use: "computer-use-preview-2025-03-11",
     // Codex_Mini: "codex-mini-latest",
   },
@@ -212,14 +224,17 @@ export const Models = {
     Gemini_25_Pro_Preview: "gemini-2.5-pro-preview-05-06",
     Gemini_20_Flash: "gemini-2.0-flash",
     Gemini_20_Flash_Preview_Image_Generation:
-      "gemini-2.0-flash-
+      "gemini-2.0-flash-exp-image-generation",
     Gemini_20_Flash_Lite: "gemini-2.0-flash-lite",
     Gemini_15_Flash: "gemini-1.5-flash",
     Gemini_15_Flash_8B: "gemini-1.5-flash-8b",
     Gemini_15_Pro: "gemini-1.5-pro",
-    Imagen_3: "imagen-
+    Imagen_3: "imagen-4.0-generate-001",
     Veo_2: "veo-2.0-generate-001",
+    Veo_3_1: "veo-3.1-generate-preview",
     Gemini_20_Flash_Live: "gemini-2.0-flash-live-001",
+    Gemini_25_Flash_TTS: "gemini-2.5-flash-preview-tts",
+    Gemini_20_Flash_TTS: "gemini-2.0-flash-preview-tts",
   },
 };
 
@@ -234,6 +249,20 @@ export const EmbeddingModels = {
   },
 };
 
+export function getEnabledPlugins(
+  plugins: Config["plugins"] | undefined
+): string[] {
+  if (!plugins) return [];
+  return plugins.enabled ?? [];
+}
+
+export function getDisabledPlugins(
+  plugins: Config["plugins"] | undefined
+): string[] {
+  if (!plugins) return [];
+  return plugins.disabled ?? [];
+}
+
 export const Providers = Object.keys(Models).reduce((obj, key) => {
   obj[key] = key;
   return obj;
@@ -275,6 +304,30 @@ export const GoogleImageModels = [
   Models.google.Imagen_3,
 ];
 
-export const
+export const OpenAiImageModels = [
+  Models.openai.DALL_E_3,
+  Models.openai.DALL_E_2,
+];
+
+export const OpenAiVideoModels = [
+  Models.openai.Sora,
+  Models.openai.Sora_2,
+  Models.openai.Sora_2_Pro,
+];
+
+export const OpenAiTTSModels = [Models.openai.TTS_1];
+
+export const OpenAiTranscriptionModels = [Models.openai.Whisper_1];
+
+export const XaiImageModels = [Models.xai.GrokImagineImage];
+
+export const XaiVideoModels = [Models.xai.GrokImagineVideo];
+
+export const GoogleTTSModels = [
+  Models.google.Gemini_25_Flash_TTS,
+  Models.google.Gemini_20_Flash_TTS,
+];
+
+export const GoogleVideoModels = [Models.google.Veo_2, Models.google.Veo_3_1];
 
 export const GoogleEmbeddingModels = [EmbeddingModels.google.Gemini_Embedding];
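The change of `Config["plugins"]` from `string[]` to `{ enabled, disabled }` means configs written for the old shape need converting before `getEnabledPlugins` can read them. The file list includes a new `migrations.ts`, but its contents are not shown in this diff, so the `normalizePlugins` helper below is only a hypothetical sketch of one way to accept both shapes. `getEnabledPlugins` repeats the logic from the hunk above, and the plugin names are placeholders:

```typescript
type PluginsConfig = { enabled: string[]; disabled: string[] };

// Hypothetical helper (not from the package): accepts the legacy string[]
// form or the new object form and always returns the new shape.
function normalizePlugins(
  plugins: string[] | Partial<PluginsConfig> | undefined
): PluginsConfig {
  if (!plugins) return { enabled: [], disabled: [] };
  if (Array.isArray(plugins)) return { enabled: [...plugins], disabled: [] };
  return { enabled: plugins.enabled ?? [], disabled: plugins.disabled ?? [] };
}

// Same logic as getEnabledPlugins in the diff.
function getEnabledPlugins(plugins: PluginsConfig | undefined): string[] {
  if (!plugins) return [];
  return plugins.enabled ?? [];
}

// Placeholder plugin names, for illustration only.
const legacy = normalizePlugins(["vim", "github"]);
const current = normalizePlugins({ enabled: ["tmux"], disabled: ["vim"] });
console.log(getEnabledPlugins(legacy), getEnabledPlugins(current));
```

Treating a legacy array as "all enabled, none disabled" is a design assumption of this sketch; the package's own migration may behave differently.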
package/src/worker.ts
CHANGED
@@ -306,14 +306,37 @@ export async function worker(options?: {
       // Get the allowedPorts configuration
       const allowedPorts = config.worker?.tunnel?.allowedPorts || [];
 
+      // Create URL rewriter callback that returns the hostname (without protocol)
+      // The tunnel package will add the protocol based on the useHttps config
+      // This receives port and metadata from the tunnel request
+      const urlRewriter = (port: number, metadata?: any) => {
+        const workerId = metadata?.workerId;
+        const secret = metadata?.secret;
+
+        // Build the hostname/domain (without protocol) based on metadata
+        // The tunnel handler will add the protocol using the useHttps config
+        // Examples:
+        // - secret-p3000.worker.example.com
+        // - workerId-p3000.worker.example.com
+        const subdomain = secret
+          ? `${secret}-p${port}`
+          : `${workerId}-p${port}`;
+
+        // Return just the hostname - the tunnel package should add the protocol
+        // based on the useHttps configuration passed below
+        const replacementUrl = `${subdomain}.${tunnelDomain}`;
+        return replacementUrl;
+      };
+
       // Initialize tunnel handler with the tunnel-specific WebSocket
+      // Pass useHttps flag so the tunnel package can add the correct protocol
       tunnelHandler = createTunnelHandler(tunnelConnection!, {
         allowedPorts,
         maxConcurrentStreams:
           config.worker?.tunnel?.maxConcurrentStreams || 50,
+        tunnelUseHttps: tunnelUseHttps,
         localHost: tunnelLocalHost,
-
-        tunnelUseHttps,
+        urlRewriter,
         enableUrlRewriting:
           config.worker?.tunnel?.enableUrlRewriting !== false,
         portMapping,
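The `urlRewriter` callback in this hunk returns a bare hostname and leaves the protocol to the tunnel handler. A standalone sketch of that division of labor follows. `buildTunnelHost` mirrors the callback's logic, while `withProtocol` is an assumed stand-in for whatever the tunnel package does with `tunnelUseHttps` (its real behavior is not shown in this diff); the domain is the placeholder used in the code comments:

```typescript
type TunnelMetadata = { workerId?: string; secret?: string };

// Mirrors the urlRewriter callback: prefer the secret, fall back to the worker ID.
function buildTunnelHost(
  port: number,
  tunnelDomain: string,
  metadata?: TunnelMetadata
): string {
  const subdomain = metadata?.secret
    ? `${metadata.secret}-p${port}`
    : `${metadata?.workerId}-p${port}`;
  return `${subdomain}.${tunnelDomain}`;
}

// Assumption: the tunnel handler prepends the scheme based on its useHttps flag.
function withProtocol(host: string, useHttps: boolean): string {
  return `${useHttps ? "https" : "http"}://${host}`;
}

const host = buildTunnelHost(3000, "worker.example.com", { workerId: "w1" });
console.log(withProtocol(host, true));
```

As in the original callback, a request carrying neither `secret` nor `workerId` produces a host that starts with `undefined-`, so callers may want to guard against missing metadata.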
@@ -0,0 +1,157 @@
+# Modalities Manual Tests
+
+Manual integration tests for audio, image, vision, and video generation across all supported AI providers.
+
+## Overview
+
+These tests exercise every output/input modality supported by each provider:
+
+| Modality             | OpenAI       | Google             | XAI          |
+|----------------------|--------------|--------------------|--------------|
+| Audio Generation     | TTS-1 / TTS-1-HD | Gemini 2.0 Flash TTS | ❌ Not supported |
+| Audio Transcription  | Whisper-1    | ❌ Not supported    | ❌ Not supported |
+| Image Generation     | DALL-E 3     | Imagen 3 / Gemini 2.0 Flash | Aurora |
+| Vision (image input) | GPT-4o       | Gemini 2.0 Flash   | Grok-2-Vision |
+| Video Generation     | Sora (stub)  | Veo 2              | ❌ Not supported |
+
+---
+
+## Prerequisites
+
+Set the appropriate API keys in your environment before running tests:
+
+```bash
+export OPENAI_KEY="sk-..."
+export GEMINI_API_KEY="AIza..."
+export XAI_API_KEY="xai-..."
+```
+
+---
+
+## Running the Tests
+
+### Run all modality tests
+
+```bash
+npx jest tests/manual/modalities --testTimeout=300000 --runInBand
+```
+
+> **`--runInBand`** is recommended so tests within a file run sequentially (some tests depend on outputs from prior tests in the same file).
+
+### Run a single provider
+
+```bash
+# OpenAI only
+npx jest tests/manual/modalities/openai.modalities.test.ts --testTimeout=120000 --runInBand
+
+# Google only
+npx jest tests/manual/modalities/google.modalities.test.ts --testTimeout=300000 --runInBand
+
+# XAI only
+npx jest tests/manual/modalities/xai.modalities.test.ts --testTimeout=120000 --runInBand
+```
+
+### Run a single test by name
+
+```bash
+npx jest tests/manual/modalities/openai.modalities.test.ts \
+  --testNamePattern="DALL-E 3" \
+  --testTimeout=60000
+```
+
+---
+
+## Output Files
+
+All generated artifacts are saved under `tests/manual/modalities/outputs/<provider>/` so they can be reviewed after the tests run.
+
+### OpenAI (`outputs/openai/`)
+
+| File | Test | Description |
+|------|------|-------------|
+| `tts-output.mp3` | Test 1 | TTS-1 generated speech audio |
+| `transcription.json` | Test 2 | Whisper transcription of the TTS audio |
+| `dalle3-output.png` | Test 3 | DALL-E 3 generated image |
+| `dalle3-output-url.txt` | Test 3 | URL fallback if b64_json not returned |
+| `vision-description.txt` | Test 4 | GPT-4o description of the DALL-E image |
+
+### Google (`outputs/google/`)
+
+| File | Test | Description |
+|------|------|-------------|
+| `tts-output.wav` | Test 1 | Gemini 2.0 Flash TTS audio |
+| `gemini-flash-image.png` | Test 2 | Gemini 2.0 Flash inline image |
+| `gemini-flash-image-url.txt` | Test 2 | URL fallback |
+| `imagen3-output.png` | Test 3 | Imagen 3 generated image |
+| `imagen3-output-url.txt` | Test 3 | URL fallback |
+| `vision-description.txt` | Test 4 | Gemini description of the Imagen 3 image |
+| `veo2-output.mp4` | Test 5 | Veo 2 generated video |
+| `veo2-output-url.txt` | Test 5 | URL fallback |
+
+### XAI (`outputs/xai/`)
+
+| File | Test | Description |
+|------|------|-------------|
+| `aurora-output.png` | Test 1 | Aurora generated image |
+| `aurora-output-url.txt` | Test 1 | URL fallback |
+| `vision-description.txt` | Test 2 | Grok-2-Vision description of Aurora image |
+
+---
+
+## Test Dependency Order
+
+Several tests depend on output from a previous test **within the same file**. Always run tests sequentially (`--runInBand`):
+
+- **OpenAI**: Test 2 (Whisper) needs `tts-output.mp3` from Test 1
+- **OpenAI**: Test 4 (Vision) needs `dalle3-output.png` from Test 3
+- **Google**: Test 4 (Vision) needs `imagen3-output.png` from Test 3
+- **XAI**: Test 2 (Vision) needs `aurora-output.png` from Test 1
+
+If a dependency file is missing, the dependent test will be skipped with a clear log message.
+
+---
+
+## Provider Notes
+
+### OpenAI
+
+- **TTS models**: `tts-1` (faster, lower quality) and `tts-1-hd` (higher quality). Voices: `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`.
+- **Whisper**: Supports `mp3`, `mp4`, `mpeg`, `mpga`, `m4a`, `wav`, `webm`. Response formats: `json`, `text`, `srt`, `verbose_json`, `vtt`.
+- **DALL-E 3**: Sizes `1024x1024`, `1792x1024`, `1024x1792`. Quality: `standard` or `hd`.
+- **Sora**: Not yet implemented in the client. See Test 5 for the intended API.
+
+### Google
+
+- **Gemini TTS**: Uses `gemini-2.0-flash-preview-tts` model. Supports multi-speaker synthesis.
+- **Gemini 2.0 Flash image**: Native inline image generation using the `gemini-2.0-flash-preview-image-generation` model.
+- **Imagen 3**: High-quality image generation via the Vertex-style Gemini API.
+- **Veo 2**: Video generation via `veo-2.0-generate-001`. Generation is asynchronous and polls for completion — allow up to 5 minutes.
+
+### XAI
+
+- **Aurora**: XAI's image generation model. Uses the OpenAI-compatible `/images/generations` endpoint.
+- **Grok-2-Vision**: Vision model for image understanding (`grok-2-vision-1212`).
+- **Audio**: XAI does not support audio generation or transcription. Tests 3 and 4 verify these throw errors.
+- **Video**: XAI has no public video generation API yet. Test 5 is a documented placeholder.
+
+---
+
+## Adding New Tests
+
+1. Create a new file: `tests/manual/modalities/<provider>.modalities.test.ts`
+2. Save all outputs to `path.join(__dirname, "outputs", "<provider>", "<filename>")`
+3. Guard each test with an API key check and skip gracefully if not set
+4. Add `--runInBand` if tests depend on each other's outputs
+5. Update this README with the new provider and output files
+
+---
+
+## Troubleshooting
+
+**Tests time out**: Increase `--testTimeout`. Video generation (Veo 2) can take 2–5 minutes.
+
+**"API key not set" skip**: Export the relevant environment variable before running.
+
+**Dependency file missing**: Run tests in order with `--runInBand` rather than in parallel.
+
+**TypeScript errors**: Run `npm run compile` from `packages/knowhow/` to check for type issues.
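Step 3 of "Adding New Tests" and the rule in "Test Dependency Order" both come down to deciding, before a test body runs, whether it should be skipped. A minimal sketch of such a guard, written as a pure function so it runs without Jest or real keys. `skipReason` and its parameters are hypothetical names, not helpers taken from the test files in this release:

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical guard: returns a human-readable reason to skip,
// or null when the test may run.
function skipReason(
  env: Env,
  keyName: string,
  dependency?: { file: string; exists: boolean }
): string | null {
  if (!env[keyName]) return `${keyName} not set`;
  if (dependency && !dependency.exists) {
    return `missing dependency file ${dependency.file}`;
  }
  return null;
}

// Placeholder environment; a real test would pass process.env.
const env: Env = { OPENAI_KEY: "sk-..." };
console.log(skipReason(env, "OPENAI_KEY", { file: "tts-output.mp3", exists: false }));
console.log(skipReason(env, "XAI_API_KEY"));
```

In an actual test file the `exists` flag would typically come from `fs.existsSync` on the output path, and a non-null reason would be logged before returning early from the test.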