@tyvm/knowhow 0.0.26 → 0.0.28

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CONFIG.md CHANGED
@@ -88,6 +88,44 @@ Here is an overview of examples from various `knowhow.json` configuration files
  ],
  ```
 
+ ## Expose Tools to the Knowhow Site via Workers
+ After you've connected via `knowhow login`, you can expose tools from your local machine to the Knowhow site.
+ Run `knowhow worker` to generate a block like the one below and edit it to include the tools you want to expose.
+ Then run `knowhow worker` again to connect.
+ ```json
+ "worker": {
+   "allowedTools": [
+     "embeddingSearch",
+     "finalAnswer",
+     "callPlugin",
+     "readFile",
+     "readBlocks",
+     "patchFile",
+     "lintFile",
+     "textSearch",
+     "fileSearch",
+     "writeFileChunk",
+     "createAiCompletion",
+     "listAllModels",
+     "listAllProviders",
+     "getPullRequest",
+     "getPullRequestBuildStatuses",
+     "getRunLogs",
+     "getPullRequestBuildFailureLogs",
+     "addLanguageTerm",
+     "getAllLanguageTerms",
+     "lookupLanguageTerm",
+     "mcp_0_puppeteer_navigate",
+     "mcp_0_puppeteer_screenshot",
+     "mcp_0_puppeteer_click",
+     "mcp_0_puppeteer_fill",
+     "mcp_0_puppeteer_select",
+     "mcp_0_puppeteer_hover",
+     "mcp_0_puppeteer_evaluate"
+   ]
+ },
+ ```
+
  ## Custom Models Via LMS Studio
  ```json
  "modelProviders": [
package/PROCESSING.md ADDED
@@ -0,0 +1,429 @@
+ # Media Processing with Knowhow
+
+ This guide covers how to process audio, video, and PDF files using Knowhow's configuration system. Knowhow automatically converts media files to text for AI processing, supports pipeline chaining where outputs become inputs, and allows assignment of processing tasks to specific agents.
+
+ ---
+
+ ## Supported File Types
+
+ Knowhow's `convertToText()` function automatically handles these file types:
+
+ **Audio Files:**
+ - `.mp3` - MP3 audio files
+ - `.wav` - WAV audio files
+ - `.m4a` - M4A audio files
+ - `.mpga` - MPEG audio files
+
+ **Video Files:**
+ - `.mp4` - MP4 video files
+ - `.webm` - WebM video files
+ - `.mov` - QuickTime video files
+ - `.mpeg` - MPEG video files
+
+ **Document Files:**
+ - `.pdf` - PDF documents
+
+ All other file types are processed as plain text files.
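+
+ Dispatch is based purely on the file extension. Here is a minimal sketch of that routing, assuming hypothetical converter helpers (the names and signatures below are illustrative, not Knowhow's actual internals):
+
+ ```typescript
+ import * as path from "path";
+
+ // Stand-in converter type for the illustrative helpers passed in below.
+ type Converter = (filePath: string) => Promise<string>;
+
+ const AUDIO_EXTS = new Set([".mp3", ".wav", ".m4a", ".mpga"]);
+ const VIDEO_EXTS = new Set([".mp4", ".webm", ".mov", ".mpeg"]);
+
+ async function convertToTextSketch(
+   filePath: string,
+   convertAudioToText: Converter, // transcribe audio
+   convertVideoToText: Converter, // transcript + keyframe analysis
+   convertPdfToText: Converter,   // pdf-parse extraction
+   readPlainText: Converter       // everything else
+ ): Promise<string> {
+   const ext = path.extname(filePath).toLowerCase();
+   if (AUDIO_EXTS.has(ext)) return convertAudioToText(filePath);
+   if (VIDEO_EXTS.has(ext)) return convertVideoToText(filePath);
+   if (ext === ".pdf") return convertPdfToText(filePath);
+   return readPlainText(filePath);
+ }
+ ```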
+
+ ---
+
+ ## Audio Processing
+
+ ### Basic Audio Transcription
+
+ ```json
+ {
+   "sources": [
+     {
+       "input": "./recordings/**/*.mp3",
+       "output": "./transcripts/",
+       "prompt": "BasicTranscript"
+     }
+   ]
+ }
+ ```
+
+ Audio files are automatically (see the sketch after this list):
+ 1. Split into 30-second chunks by default (configurable via the `chunkTime` parameter)
+ 2. Transcribed using speech-to-text via `Downloader.transcribeChunks()`
+ 3. Saved as `transcript.json` in a folder named after the audio file
+ 4. Formatted with timestamps: `[0:30s] transcript text [30:60s] more text`
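+
+ A rough sketch of that flow follows. The `chunk` and `transcribeChunks` shapes are assumptions made for illustration; only the cache location and timestamp format mirror the description above:
+
+ ```typescript
+ import * as fs from "fs/promises";
+ import * as path from "path";
+
+ // Assumed shapes for the Downloader helpers mentioned above (illustrative only).
+ interface AudioTools {
+   chunk(filePath: string, chunkTimeSeconds: number): Promise<string[]>; // paths of chunk files
+   transcribeChunks(chunkPaths: string[]): Promise<string[]>;            // one transcript per chunk
+ }
+
+ async function transcribeAudioSketch(
+   filePath: string,
+   tools: AudioTools,
+   chunkTime = 30 // seconds, mirroring the default above
+ ): Promise<string> {
+   const chunks = await tools.chunk(filePath, chunkTime);
+   const texts = await tools.transcribeChunks(chunks);
+
+   // Cache the raw transcript as {dir}/{filename}/transcript.json
+   const outDir = path.join(path.dirname(filePath), path.parse(filePath).name);
+   await fs.mkdir(outDir, { recursive: true });
+   await fs.writeFile(path.join(outDir, "transcript.json"), JSON.stringify(texts, null, 2));
+
+   // Combine per-chunk text with [start:end s] stamps, e.g. "[0:30s] ... [30:60s] ..."
+   return texts
+     .map((text, i) => `[${i * chunkTime}:${(i + 1) * chunkTime}s] ${text}`)
+     .join(" ");
+ }
+ ```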
+
+ ### Audio with Agent Assignment
+
+ ```json
+ {
+   "sources": [
+     {
+       "input": "./meetings/**/*.wav",
+       "output": "./meeting-notes/",
+       "prompt": "MeetingNotes",
+       "agent": "patcher"
+     }
+   ]
+ }
+ ```
+
+ ### Meeting Recording Pipeline
+
+ ```json
+ {
+   "sources": [
+     {
+       "input": "./meetings/**/*.m4a",
+       "output": "./meetings/transcripts/",
+       "prompt": "MeetingTranscriber"
+     },
+     {
+       "input": "./meetings/transcripts/**/*.mdx",
+       "output": "./meetings/summaries/",
+       "prompt": "MeetingSummarizer"
+     },
+     {
+       "input": "./meetings/summaries/**/*.mdx",
+       "output": "./meetings/action-items.txt",
+       "prompt": "ActionItemExtractor"
+     }
+   ]
+ }
+ ```
+
+ This pipeline demonstrates chaining:
+ 1. Audio files → transcripts (multi-output to directory)
+ 2. Transcripts → summaries (multi-output to directory)
+ 3. Summaries → single action items file (single output)
+
+ ---
+
+ ## Video Processing
+
+ ### Basic Video Processing
+
+ ```json
+ {
+   "sources": [
+     {
+       "input": "./videos/**/*.mp4",
+       "output": "./video-analysis/",
+       "prompt": "VideoAnalyzer"
+     }
+   ]
+ }
+ ```
+
+ Video files are processed by:
+ 1. Extracting and transcribing audio (same process as audio files)
+ 2. Extracting keyframes at regular intervals using `Downloader.extractKeyframes()`
+ 3. Analyzing visual content of each keyframe
+ 4. Combining transcript and visual analysis with timestamps
+
+ The output of `convertVideoToText()` is formatted per chunk like this:
+ ```
+ Chunk: (1/10):
+ Start Timestamp: [0s]
+ Visual: description of keyframe
+ Audio: transcribed audio
+ End Timestamp: [30s]
+ ```
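+
+ A small sketch of how per-chunk visual descriptions and audio transcripts could be stitched into that layout (the `VideoChunk` shape is an assumption for illustration, not Knowhow's actual type):
+
+ ```typescript
+ // Assumed per-chunk analysis shape (illustrative only).
+ interface VideoChunk {
+   startSeconds: number;
+   endSeconds: number;
+   visual: string; // description of the extracted keyframe
+   audio: string;  // transcript of this chunk's audio track
+ }
+
+ function formatVideoChunks(chunks: VideoChunk[]): string {
+   return chunks
+     .map(
+       (c, i) =>
+         `Chunk: (${i + 1}/${chunks.length}):\n` +
+         `Start Timestamp: [${c.startSeconds}s]\n` +
+         `Visual: ${c.visual}\n` +
+         `Audio: ${c.audio}\n` +
+         `End Timestamp: [${c.endSeconds}s]`
+     )
+     .join("\n\n");
+ }
+ ```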
+
+ ### Video Content Organization
+
+ ```json
+ {
+   "sources": [
+     {
+       "input": "./raw-videos/**/*.webm",
+       "output": "./organized-videos/",
+       "prompt": "VideoOrganizer",
+       "agent": "patcher"
+     }
+   ]
+ }
+ ```
+
+ This example assigns video processing to the "patcher" agent.
+
+ ---
+
+ ## PDF Processing
+
+ ### Document Analysis
+
+ ```json
+ {
+   "sources": [
+     {
+       "input": "./documents/**/*.pdf",
+       "output": "./document-summaries/",
+       "prompt": "DocumentSummarizer"
+     }
+   ]
+ }
+ ```
+
+ PDF files are processed by (see the sketch after this list):
+ 1. Reading the file using `fs.readFileSync()`
+ 2. Extracting text content using the `pdf-parse` library
+ 3. Returning the extracted text via `data.text`
+ 4. Applying the specified prompt for analysis
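+
+ Those steps map onto only a few lines of code. A minimal sketch using `pdf-parse` (error handling omitted; this mirrors the steps above rather than reproducing Knowhow's exact source):
+
+ ```typescript
+ import * as fs from "fs";
+ import pdfParse from "pdf-parse";
+
+ async function convertPdfToTextSketch(filePath: string): Promise<string> {
+   // 1. Read the PDF into memory
+   const buffer = fs.readFileSync(filePath);
+
+   // 2-3. Extract the text content and return data.text
+   //      (4. the caller then applies the configured prompt to this text)
+   const data = await pdfParse(buffer);
+   return data.text;
+ }
+ ```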
+
+ ### Multi-Document Research Pipeline
+
+ ```json
+ {
+   "sources": [
+     {
+       "input": "./research-papers/**/*.pdf",
+       "output": "./paper-summaries/",
+       "prompt": "AcademicSummarizer"
+     },
+     {
+       "input": "./paper-summaries/**/*.mdx",
+       "output": "./research-synthesis.md",
+       "prompt": "ResearchSynthesizer"
+     }
+   ]
+ }
+ ```
+
+ ---
+
+ ## Agent Assignment
+
+ ### Assigning Processing to Specific Agents
+
+ ```json
+ {
+   "sources": [
+     {
+       "input": "./meetings/**/*.mov",
+       "output": "./meeting-notes/",
+       "prompt": "MeetingNotes",
+       "agent": "patcher"
+     }
+   ]
+ }
+ ```
+
+ The `agent` parameter assigns processing to a specific agent defined in your configuration. The agent receives the file content and prompt, then processes it according to its instructions and capabilities.
+
+ ### Multi-Agent Pipeline
+
+ ```json
+ {
+   "sources": [
+     {
+       "input": "./interviews/**/*.wav",
+       "output": "./interview-transcripts/",
+       "prompt": "InterviewTranscriber",
+       "agent": "transcriber"
+     },
+     {
+       "input": "./interview-transcripts/**/*.mdx",
+       "output": "./interview-insights/",
+       "prompt": "InsightExtractor",
+       "agent": "analyst"
+     }
+   ]
+ }
+ ```
+
+ This assigns transcription to a "transcriber" agent and analysis to an "analyst" agent, each specialized for their respective tasks.
+
+ ---
+
+ ## Embedding Generation from Processed Media
+
+ ### Audio Embeddings
+
+ ```json
+ {
+   "embedSources": [
+     {
+       "input": "./podcasts/**/*.mp3",
+       "output": ".knowhow/embeddings/podcasts.json",
+       "chunkSize": 2000,
+       "prompt": "PodcastEmbeddingExplainer"
+     }
+   ]
+ }
+ ```
+
+ Audio files are automatically transcribed using `convertAudioToText()`, then chunked and embedded for semantic search.
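+
+ The converted text is split according to `chunkSize` before embedding. A simple sketch of that chunking step, assuming `chunkSize` is a character count (an assumption for illustration, not a documented guarantee):
+
+ ```typescript
+ // Split converted text into roughly chunkSize-character pieces for embedding.
+ function chunkText(text: string, chunkSize: number): string[] {
+   const chunks: string[] = [];
+   for (let i = 0; i < text.length; i += chunkSize) {
+     chunks.push(text.slice(i, i + chunkSize));
+   }
+   return chunks;
+ }
+
+ // Example: a podcast transcript chunked at the configured size of 2000.
+ const pieces = chunkText("...long transcript from convertAudioToText()...", 2000);
+ console.log(`${pieces.length} chunk(s) ready to embed`);
+ ```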
+
+ ### Video Embeddings
+
+ ```json
+ {
+   "embedSources": [
+     {
+       "input": "./tutorials/**/*.mp4",
+       "output": ".knowhow/embeddings/tutorials.json",
+       "chunkSize": 1500
+     }
+   ]
+ }
+ ```
+
+ Video files are processed using `convertVideoToText()` to extract both visual and audio content, then embedded for search.
+
+ ### PDF Document Embeddings
+
+ ```json
+ {
+   "embedSources": [
+     {
+       "input": "./documentation/**/*.pdf",
+       "output": ".knowhow/embeddings/docs.json",
+       "chunkSize": 2000,
+       "prompt": "DocumentChunker"
+     }
+   ]
+ }
+ ```
+
+ ---
+
+ ## Pipeline Chaining Examples
+
+ ### Complete Media Processing Workflow
+
+ ```json
+ {
+   "sources": [
+     {
+       "input": "./raw-content/**/*.{mp4,mp3,pdf}",
+       "output": "./processed-content/",
+       "prompt": "ContentProcessor"
+     },
+     {
+       "input": "./processed-content/**/*.mdx",
+       "output": "./content-categories/",
+       "prompt": "ContentCategorizer"
+     },
+     {
+       "input": "./content-categories/**/*.mdx",
+       "output": "./final-report.md",
+       "prompt": "ReportGenerator"
+     }
+   ]
+ }
+ ```
+
+ This three-stage pipeline:
+ 1. Processes mixed media files (video, audio, PDF) into text summaries
+ 2. Categorizes the processed content
+ 3. Generates a final consolidated report
+
+ ### Meeting-to-Tasks Pipeline
+
+ ```json
+ {
+   "sources": [
+     {
+       "input": "./weekly-meetings/**/*.mov",
+       "output": "./meeting-transcripts/",
+       "prompt": "MeetingTranscriber"
+     },
+     {
+       "input": "./meeting-transcripts/**/*.mdx",
+       "output": "./extracted-tasks/",
+       "prompt": "TaskExtractor"
+     },
+     {
+       "input": "./extracted-tasks/**/*.mdx",
+       "output": "./project-tasks.json",
+       "prompt": "TaskConsolidator",
+       "agent": "patcher"
+     }
+   ]
+ }
+ ```
+
+ ---
+
+ ## Processing Workflow Details
+
+ ### Transcript Caching and Reuse
+
+ The system caches transcription and analysis results for audio/video processing (a sketch of the cache check follows these lists):
+
+ **Audio Processing (`processAudio()`):**
+ - Transcripts saved as `transcript.json` in `{dir}/{filename}/transcript.json`
+ - Uses the `reusePreviousTranscript` parameter (default: true)
+ - Checks whether a transcript exists with `fileExists()` before reprocessing
+ - Enables fast re-processing with different prompts
+
+ **Video Processing (`processVideo()`):**
+ - Audio transcripts cached the same way as audio files
+ - Video analysis cached as `video.json` in `{dir}/{filename}/video.json`
+ - Keyframes extracted using `Downloader.extractKeyframes()`
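+
+ Here is a minimal sketch of that cache check. The `fileExists` helper and the `transcribe` callback are stand-ins; only the path layout and the `reusePreviousTranscript` default mirror the behavior described above:
+
+ ```typescript
+ import * as fs from "fs/promises";
+ import * as path from "path";
+
+ async function fileExists(p: string): Promise<boolean> {
+   return fs.access(p).then(() => true).catch(() => false);
+ }
+
+ async function loadOrCreateTranscript(
+   filePath: string,
+   transcribe: (file: string) => Promise<string>,
+   reusePreviousTranscript = true // default mirrors the description above
+ ): Promise<string> {
+   // {dir}/{filename}/transcript.json  (video analysis uses video.json the same way)
+   const cachePath = path.join(
+     path.dirname(filePath),
+     path.parse(filePath).name,
+     "transcript.json"
+   );
+
+   if (reusePreviousTranscript && (await fileExists(cachePath))) {
+     return fs.readFile(cachePath, "utf-8"); // reuse the cached transcript
+   }
+
+   const transcript = await transcribe(filePath);
+   await fs.mkdir(path.dirname(cachePath), { recursive: true });
+   await fs.writeFile(cachePath, transcript);
+   return transcript;
+ }
+ ```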
+
+ ### Chunking Behavior
+
+ **Audio Files:**
+ - Default chunk time: 30 seconds (configurable via `chunkTime` parameter)
+ - Uses `Downloader.chunk()` to split audio files
+ - Each chunk transcribed separately then combined with timestamps
+
+ **Video Files:**
+ - Same 30-second default chunking for audio track
+ - Keyframes extracted at chunk intervals
+ - Visual and audio analysis combined per chunk
+
+ ### Output Structure
+
+ The `output` value determines whether results are written per input file or combined into a single file (a sketch of this resolution follows these lists).
+
+ **Multi-Output (directory ending with `/`):**
+ - Creates one output file per input file
+ - Preserves directory structure relative to input pattern
+ - Uses `outputExt` (default: "mdx") for file extensions
+ - Uses `outputName` or original filename
+
+ **Single Output (specific filename):**
+ - Combines all input files into one output
+ - Useful for reports, summaries, and consolidated documents
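+
+ A sketch of how these two modes could resolve output paths. The `outputExt` and `outputName` names come from above; the resolution logic itself is an illustration, not the exact implementation (in particular, it does not show how directory structure is preserved):
+
+ ```typescript
+ import * as path from "path";
+
+ interface OutputOptions {
+   outputExt?: string;  // default: "mdx"
+   outputName?: string; // overrides the original base name
+ }
+
+ // Returns the path a given input file's result would be written to.
+ function resolveOutputPath(
+   inputFile: string,
+   output: string,
+   opts: OutputOptions = {}
+ ): string {
+   const isMultiOutput = output.endsWith("/");
+   if (!isMultiOutput) {
+     return output; // single output: every input contributes to this one file
+   }
+   const ext = opts.outputExt ?? "mdx";
+   const base = opts.outputName ?? path.parse(inputFile).name;
+   return path.join(output, `${base}.${ext}`);
+ }
+
+ // resolveOutputPath("./meetings/standup.m4a", "./meetings/transcripts/")
+ //   -> "meetings/transcripts/standup.mdx"
+ ```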
+
+ ---
+
+ ## Troubleshooting
+
+ ### Common Issues
+
+ **Audio/Video Processing Fails:**
+ - Ensure ffmpeg is installed and accessible (required by Downloader)
+ - Check file permissions and disk space
+ - Verify audio/video files aren't corrupted
+ - Check that `Downloader.chunk()` and `Downloader.transcribeChunks()` are working
+
+ **PDF Processing Fails:**
+ - Some PDFs may have restrictions or encryption
+ - Scanned PDFs without OCR won't extract text properly with `pdf-parse`
+ - Large PDFs may cause memory issues when loading with `fs.readFileSync()`
+ - Ensure the PDF file isn't corrupted
+
+ **Pipeline Chaining Issues:**
+ - Ensure the output directory of one stage matches the input pattern of the next stage
+ - Check that intermediate files are being created successfully
+ - Verify file extensions match between stages (default: .mdx)
+
+ **Agent Assignment Problems:**
+ - Ensure the specified agent exists in your configuration
+ - Check that the agent has appropriate permissions and tools
+ - Verify the agent can handle the specific prompt and content type
+
+ ### Performance Optimization
+
+ **Large File Processing:**
+ - Audio/video files are automatically chunked (30s default) for efficient processing
+ - Consider adjusting `chunkTime` parameter for very long recordings
+ - Transcript caching avoids reprocessing unchanged files
+
+ **Batch Processing:**
+ - Process files in smaller batches if memory becomes an issue
+ - Use specific file patterns rather than overly broad glob patterns
+ - Consider processing different file types in separate pipeline stages
+
+ **File System Considerations:**
+ - Transcripts create subdirectories: `{filename}/transcript.json`
+ - Video processing creates: `{filename}/video.json`
+ - Ensure sufficient disk space for intermediate files