@redaksjon/protokoll 0.0.15 → 0.2.0

package/guide/feedback.md CHANGED
@@ -175,7 +175,7 @@ For comprehensive corrections, the system typically:
175
175
 
176
176
  ### Example: Complete Term Correction
177
177
 
178
- When you say: *"WCMP should be WCNP - Walmart's Native Cloud Platform"*
178
+ When you provide feedback: *"WCMP should be WCNP - Walmart's Native Cloud Platform"*
179
179
 
180
180
  The model executes:
181
181
 
package/guide/index.md CHANGED
@@ -21,7 +21,8 @@ Protokoll transforms audio recordings into intelligent, context-enhanced transcr
21
21
  - [**Context Commands**](./context-commands.md): CLI for managing entities
22
22
  - [**Routing**](./routing.md): Intelligent note routing
23
23
  - [**Reasoning**](./reasoning.md): Reasoning model integration
24
- - [**Transcript Actions**](./action.md): Post-processing commands (combine, etc.)
24
+ - [**Transcript Listing**](./transcript-listing.md): Browse, search, and filter transcripts
25
+ - [**Transcript Actions**](./action.md): Edit, combine, and manage transcripts
25
26
  - [**Feedback**](./feedback.md): Intelligent feedback for corrections
26
27
 
27
28
  ### AI Integration
@@ -97,6 +98,21 @@ protokoll context status
97
98
  protokoll context search <query>
98
99
  ```
99
100
 
101
+ ### Transcript Management
102
+
103
+ ```bash
104
+ # List transcripts with search and filtering
105
+ protokoll transcript list <directory>
106
+ protokoll transcript list ~/notes --search "kubernetes"
107
+ protokoll transcript list ~/notes --start-date 2026-01-01 --limit 25
108
+
109
+ # Compare raw vs enhanced
110
+ protokoll transcript compare <file>
111
+
112
+ # Show transcript info
113
+ protokoll transcript info <file>
114
+ ```
115
+
100
116
  ### Transcript Actions
101
117
 
102
118
  ```bash
@@ -142,7 +158,7 @@ protokoll feedback --help-me
142
158
 
143
159
  ### MCP / AI Integration
144
160
 
145
- Instead of CLI commands, you can talk to an AI assistant:
161
+ Instead of CLI commands, you can use natural language with an AI assistant:
146
162
 
147
163
  ```
148
164
  "Can you transcribe ~/Downloads/meeting.m4a?"
@@ -189,6 +205,8 @@ If you're an AI helping someone use Protokoll:
189
205
  3. **Interactive Learning**: Asks questions, remembers answers
190
206
  4. **Self-Reflection**: Reports on tool effectiveness (enabled by default)
191
207
  5. **Full Preservation**: Not a summarizer - keeps all content
208
+ 6. **Smart Projects**: AI-assisted project configuration with phonetic variants
209
+ 7. **Proactive Phonetic**: Integration with Observasjon for improved transcription accuracy
192
210
 
193
211
  ## Current Defaults
194
212
 
@@ -199,5 +217,50 @@ If you're an AI helping someone use Protokoll:
199
217
  | Transcription Model | `whisper-1` |
200
218
  | Self-Reflection | `true` (enabled) |
201
219
  | Interactive Mode | `true` (enabled, use `--batch` to disable) |
220
+ | Smart Projects | `true` (enabled) |
202
221
  | Output Structure | `month` |
203
222
 
223
+ ## Integration with Observasjon
224
+
225
+ Protokoll's project data is automatically used by [Observasjon](https://github.com/redaksjon/observasjon) to improve transcription accuracy through **Proactive Phonetic Enhancement**:
226
+
227
+ ### How It Works
228
+
229
+ 1. **You define projects in Protokoll** with names and phonetic variations:
230
+ ```bash
231
+ protokoll project add
232
+ # Name: "Observasjon"
233
+ # Sounds like: "observation", "observashun"
234
+ ```
235
+
236
+ 2. **Observasjon automatically detects your projects** from `~/.protokoll/context/projects/`
237
+
238
+ 3. **Project names are sent during transcription** (not after) so Whisper gets them right from the start
239
+
240
+ 4. **Results**: "Observasjon" transcribed correctly instead of "observation"
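
One way to picture step 3 is as a small vocabulary hint assembled before the audio is sent. The sketch below is an assumption about the mechanism, not Observasjon's actual code: `ProjectEntry`, `buildVocabularyHint`, and the 200-character budget are hypothetical names and values. OpenAI's Whisper transcription endpoint does accept a free-text `prompt` field that can bias spelling, which is the kind of field a hint like this might fill.

```typescript
// Hypothetical shape of a project entry; field names are assumptions.
interface ProjectEntry {
  name: string;
  soundsLike: string[];
}

// Build a short comma-separated hint from the correct spellings only.
// The sounds-like variants are deliberately left out, since they are the
// spellings we want the transcriber to avoid.
function buildVocabularyHint(projects: ProjectEntry[], maxChars = 200): string {
  const seenKeys: string[] = [];
  const parts: string[] = [];
  let used = 0;
  for (const project of projects) {
    const trimmed = project.name.trim();
    const key = trimmed.toLowerCase();
    // Skip empty names and case-insensitive duplicates.
    if (trimmed === "" || seenKeys.indexOf(key) !== -1) continue;
    // Each entry after the first also costs two characters for ", ".
    const cost = trimmed.length + (parts.length > 0 ? 2 : 0);
    if (used + cost > maxChars) break;
    seenKeys.push(key);
    parts.push(trimmed);
    used += cost;
  }
  return parts.join(", ");
}

const vocabularyHint = buildVocabularyHint([
  { name: "Observasjon", soundsLike: ["observation", "observashun"] },
  { name: "Protokoll", soundsLike: [] },
  { name: "observasjon", soundsLike: [] }, // duplicate, ignored
]);
console.log(vocabularyHint); // prints "Observasjon, Protokoll"
```

Keeping the hint short is a design guess: a compact list of correct spellings is easier to reason about than a long one, and the budget can be tuned.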
241
+
242
+ ### Benefits
243
+
244
+ - **Better accuracy**: Project names spelled correctly in initial transcription
245
+ - **Lower cost**: Fix names during transcription, not in post-processing
246
+ - **No configuration**: Works automatically if you use both tools
247
+
248
+ ### Smart Projects Feature
249
+
250
+ The Smart Projects feature (enabled by default) helps you configure projects with AI assistance:
251
+
252
+ ```bash
253
+ # Interactive mode (review AI suggestions)
254
+ protokoll project add
255
+
256
+ # Non-interactive mode (trust AI suggestions)
257
+ protokoll project add --name "My Project" --yes
258
+
259
+ # AI automatically suggests:
260
+ # - Phonetic variations for the project name
261
+ # - Classification signals (trigger phrases)
262
+ # - Common mishearings to watch for
263
+ ```
264
+
265
+ This makes it easy to set up proactive phonetic enhancement without manually thinking through all the ways Whisper might mishear your project names. Use `--yes` for automation or when you trust the AI completely.
266
+
@@ -172,12 +172,218 @@ type ClarificationType =
172
172
  | 'general';
173
173
  ```
174
174
 
175
+ ## Session Tracking & Progress
176
+
177
+ ### Per-File Progress Monitoring
178
+
179
+ Every interactive prompt now shows your progress:
180
+
181
+ ```
182
+ [File: recording1.m4a] [Prompts: 5]
183
+ (Type 'S' to skip remaining prompts for this file)
184
+ ```
185
+
186
+ This helps you understand:
187
+ - Which file you're currently working on
188
+ - How many questions you've answered for this file
189
+ - The option to skip ahead
190
+
191
+ ### Skip Rest of File
192
+
193
+ When you've had enough prompts for a particular file:
194
+
195
+ ```bash
196
+ > S # or "skip"
197
+
198
+ [Skipping remaining prompts for this file...]
199
+ ```
200
+
201
+ **When to use:**
202
+ - File has too many unknown terms
203
+ - You want to process it later
204
+ - You've answered enough questions already
205
+
206
+ The file will still be transcribed fully - you just won't get more prompts for it.
207
+
208
+ ### Session Summary Report
209
+
210
+ At the end of every interactive session, you get a comprehensive summary:
211
+
212
+ ```
213
+ ═══════════════════════════════════════════════════════════
214
+ INTERACTIVE SESSION SUMMARY
215
+ ═══════════════════════════════════════════════════════════
216
+
217
+ Duration: 12m 34s
218
+ Total prompts answered: 18
219
+
220
+ ────────────────────────────────────────────────────────────
221
+ FILES PROCESSED
222
+ ────────────────────────────────────────────────────────────
223
+
224
+ 1. /recordings/meeting1.m4a
225
+ Prompts answered: 8
226
+ Status: Completed
227
+ Transcript: ~/notes/2026/01/2026-01-18_Meeting_Notes.md
228
+ Audio moved to: ~/archive/2026/01/meeting1.m4a
229
+
230
+ 2. /recordings/ideas.m4a
231
+ Prompts answered: 5
232
+ Status: SKIPPED (user requested)
233
+ Transcript: ~/notes/2026/01/2026-01-18_Quick_Ideas.md
234
+
235
+ 3. /recordings/project-update.m4a
236
+ Prompts answered: 5
237
+ Status: Completed
238
+ Transcript: ~/notes/2026/01/2026-01-18_Project_Update.md
239
+
240
+ ────────────────────────────────────────────────────────────
241
+ CHANGES MADE
242
+ ────────────────────────────────────────────────────────────
243
+
244
+ ✓ Terms added (5):
245
+ - Kubernetes
246
+ - Docker
247
+ - GraphQL
248
+ - React
249
+ - PostgreSQL
250
+
251
+ ✓ Terms updated (2):
252
+ - AWS
253
+ - API
254
+
255
+ ✓ Projects added (2):
256
+ - Project Alpha
257
+ - Client Beta
258
+
259
+ ✓ Aliases created (3):
260
+ - "K8s" → "Kubernetes"
261
+ - "Postgres" → "PostgreSQL"
262
+ - "Chronology" → "Kronologi"
263
+
264
+ ✓ People added (1):
265
+ - Priya Sharma
266
+
267
+ ═══════════════════════════════════════════════════════════
268
+ ```
269
+
270
+ **Summary includes:**
271
+ - Session duration and total prompts
272
+ - Each file with prompt counts, status, and output paths
273
+ - All changes made to context (terms, projects, people, aliases)
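
The bookkeeping behind such a summary can be sketched as a small tracker. This is a minimal illustration under stated assumptions, not Protokoll's implementation: `SessionTracker`, `TrackedFile`, and their method names are hypothetical. The sample data reuses the prompt counts from the report above.

```typescript
type TrackedStatus = "completed" | "skipped";

interface TrackedFile {
  path: string;
  promptsAnswered: number;
  status: TrackedStatus;
}

// Hypothetical tracker: one record per file, totals computed on demand.
class SessionTracker {
  private readonly files: TrackedFile[] = [];
  private current: TrackedFile | undefined;

  startFile(path: string): void {
    this.current = { path, promptsAnswered: 0, status: "completed" };
    this.files.push(this.current);
  }

  recordPrompt(): void {
    if (this.current) this.current.promptsAnswered += 1;
  }

  // Mirrors typing 'S': the file stays in the report but is marked skipped.
  skipRest(): void {
    if (this.current) this.current.status = "skipped";
  }

  summary(): { totalPrompts: number; files: TrackedFile[] } {
    const totalPrompts = this.files.reduce((sum, f) => sum + f.promptsAnswered, 0);
    // Return copies so callers cannot mutate the tracker's records.
    return { totalPrompts, files: this.files.map((f) => ({ ...f })) };
  }
}

const tracker = new SessionTracker();
tracker.startFile("/recordings/meeting1.m4a");
for (let i = 0; i < 8; i++) tracker.recordPrompt();
tracker.startFile("/recordings/ideas.m4a");
for (let i = 0; i < 5; i++) tracker.recordPrompt();
tracker.skipRest();
tracker.startFile("/recordings/project-update.m4a");
for (let i = 0; i < 5; i++) tracker.recordPrompt();

const sessionReport = tracker.summary();
console.log(sessionReport.totalPrompts); // prints 18
```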
274
+
275
+ ### Mid-Session Stop
276
+
277
+ Press `Ctrl+C` at any time to stop and see the summary:
278
+
279
+ ```bash
280
+ ^C
281
+ [Session interrupted by user]
282
+
283
+ ═══════════════════════════════════════════════════════════
284
+ INTERACTIVE SESSION SUMMARY
285
+ ═══════════════════════════════════════════════════════════
286
+ ...
287
+ ```
288
+
289
+ All progress is saved automatically, so you can resume processing later.
290
+
291
+ ## Streamlined Learning Flow
292
+
293
+ ### Smart Similarity Matching
294
+
295
+ Protokoll automatically detects similar existing terms:
296
+
297
+ ```
298
+ ────────────────────────────────────────────────────────────
299
+ [Unknown: "Chronology"]
300
+ ────────────────────────────────────────────────────────────
301
+
302
+ Found similar term(s): Kronologi
303
+ Is "Chronology" the same as "Kronologi"? (Y/N): Y
304
+
305
+ info: Added alias "Chronology" → "Kronologi"
306
+ ```
307
+
308
+ This prevents duplicate entries for terms with similar spellings.
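
The guide does not say how similarity is measured. As an illustration only, the sketch below uses normalized Levenshtein (edit) distance, a common choice for near-duplicate spellings; `editDistance`, `findSimilarTerms`, and the 0.6 threshold are hypothetical and may differ from what Protokoll actually does.

```typescript
// Classic dynamic-programming Levenshtein distance, keeping two rows in memory.
function editDistance(a: string, b: string): number {
  let previous: number[] = [];
  for (let j = 0; j <= b.length; j++) previous.push(j);
  for (let i = 1; i <= a.length; i++) {
    const current: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      current.push(
        Math.min(
          previous[j]! + 1, // deletion
          current[j - 1]! + 1, // insertion
          previous[j - 1]! + cost // substitution or match
        )
      );
    }
    previous = current;
  }
  return previous[b.length]!;
}

// Return known terms whose case-insensitive similarity to `heard`
// meets the threshold (1 = identical, 0 = nothing in common).
function findSimilarTerms(heard: string, known: string[], threshold = 0.6): string[] {
  const needle = heard.toLowerCase();
  return known.filter((term) => {
    const candidate = term.toLowerCase();
    const longest = Math.max(needle.length, candidate.length);
    if (longest === 0) return false;
    return 1 - editDistance(needle, candidate) / longest >= threshold;
  });
}

const similarTerms = findSimilarTerms("Chronology", ["Kronologi", "Kubernetes", "PostgreSQL"]);
console.log(similarTerms); // prints [ 'Kronologi' ]
```

"Chronology" and "Kronologi" are three edits apart over ten characters, so they clear the threshold while unrelated terms do not. A phonetic comparison would be another reasonable approach.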
309
+
310
+ ### Automated Content Analysis
311
+
312
+ Instead of answering multiple questions manually, provide documentation:
313
+
314
+ ```
315
+ ────────────────────────────────────────────────────────────
316
+ [Unknown: "Cursor"]
317
+ ────────────────────────────────────────────────────────────
318
+
319
+ [How should I learn about this?]
320
+ Options:
321
+ 1. Provide a file path
322
+ 2. Provide a URL
323
+ 3. Paste text directly
324
+ 4. Enter details manually
325
+
326
+ Enter 1-4, or paste path/URL directly: https://cursor.com
327
+
328
+ Fetching and analyzing...
329
+
330
+ ────────────────────────────────────────────────────────────
331
+ [Analysis Results]
332
+ Type: TERM
333
+ Name: Cursor
334
+ Description: AI-powered code editor built on VS Code
335
+ Topics: ai, code-editor, vscode, development
336
+ Confidence: high
337
+ ────────────────────────────────────────────────────────────
338
+
339
+ Use this? (Y/N): Y
340
+
341
+ Which project(s)?
342
+ 1. FjellGrunn
343
+ 2. Redaksjon
344
+ 3. Grunnverk
345
+
346
+ Enter numbers: 2
347
+
348
+ info: Added term "Cursor" to Redaksjon
349
+ ```
350
+
351
+ **Three input methods (besides manual entry):**
352
+
353
+ 1. **File path**: `~/docs/project-info.md`
354
+ 2. **URL**: `https://kubernetes.io/docs`
355
+ 3. **Paste text**: Multi-line input
356
+
357
+ AI automatically extracts:
358
+ - Entity type (Project/Term)
359
+ - Correct name
360
+ - Description
361
+ - Topics
362
+ - Acronym expansions (if any)
363
+
364
+ **95% automated** - just point to documentation!
365
+
366
+ ### Clean Project Selection
367
+
368
+ Project lists show only names:
369
+
370
+ ```
371
+ Which project(s) is this related to?
372
+ 1. FjellGrunn
373
+ 2. Grunnverk
374
+ 3. Redaksjon
375
+ N. Create new project
376
+ ```
377
+
378
+ Much cleaner than long descriptions!
379
+
175
380
  ## Best Practices
176
381
 
177
382
  1. **Start with interactive mode**: Build context quickly
178
- 2. **Review session files**: See what was learned
179
- 3. **Switch to default**: Once context is established
180
- 4. **Periodic interactive runs**: Catch new names
383
+ 2. **Use URL/file analysis**: Point to docs instead of typing
384
+ 3. **Skip when overwhelmed**: Use 'S' to skip rest of file
385
+ 4. **Review session summary**: See what was learned
386
+ 5. **Periodic interactive runs**: Catch new names and terms
181
387
 
182
388
  ## Troubleshooting
183
389
 
@@ -189,12 +395,21 @@ type ClarificationType =
189
395
 
190
396
  ### Too Many Prompts
191
397
 
192
- 1. Add more context entries
193
- 2. Run with `--batch` for known content
194
- 3. Add sounds_like mappings
398
+ 1. Use 'S' to skip rest of current file
399
+ 2. Add more context entries beforehand
400
+ 3. Run with `--batch` for known content
401
+ 4. Use URL/file analysis for bulk additions
402
+
403
+ ### Session Summary Not Showing
404
+
405
+ 1. Summary prints automatically at end of session
406
+ 2. Press `Ctrl+C` to trigger summary mid-session
407
+ 3. Check that interactive mode is enabled
195
408
 
196
- ### Prompts Timing Out
409
+ ### Analysis Failing
197
410
 
198
- 1. Increase timeout in config
199
- 2. Run without interactive mode and review manually
411
+ 1. Ensure `OPENAI_API_KEY` is set
412
+ 2. Check URL/file is accessible
413
+ 3. Verify file format is supported (.md, .txt, .yaml, etc.)
414
+ 4. Fall back to manual entry if needed
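
The first and third checks can be run before any analysis is attempted. The sketch below is hypothetical: `preflightProblems` is not a Protokoll function, and the extension list contains only the three formats the guide names, leaving its "etc." open. The environment is passed in as a plain object so the function stays easy to test.

```typescript
// Only the extensions named in the guide; other supported formats are unknown here.
const NAMED_EXTENSIONS = [".md", ".txt", ".yaml"];

// Return a list of human-readable problems; an empty list means "looks fine".
function preflightProblems(
  source: string,
  env: Record<string, string | undefined>
): string[] {
  const problems: string[] = [];
  if (!env["OPENAI_API_KEY"]) {
    problems.push("OPENAI_API_KEY is not set");
  }
  // URLs are not checked by extension; only local paths are.
  const isUrl = /^https?:\/\//i.test(source);
  if (!isUrl) {
    const dot = source.lastIndexOf(".");
    const extension = dot >= 0 ? source.slice(dot).toLowerCase() : "";
    if (NAMED_EXTENSIONS.indexOf(extension) === -1) {
      problems.push(
        'extension "' + extension + '" is not one of ' + NAMED_EXTENSIONS.join(", ")
      );
    }
  }
  return problems;
}

console.log(preflightProblems("~/docs/project-info.md", {}));
// prints [ 'OPENAI_API_KEY is not set' ]
```

Whether a URL or file is actually reachable (the second check) still has to be verified at fetch time, so it is left out of this sketch.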
200
415
 
@@ -10,8 +10,8 @@ Traditional CLI usage requires you to:
10
10
  3. Remember command syntax
11
11
  4. Run commands manually
12
12
 
13
- With MCP, you can simply say:
14
- - "Hey, can you transcribe this recording?"
13
+ With MCP, you can use natural language:
14
+ - "Can you transcribe this recording?"
15
15
  - "Add Sanjay Gupta as a person - Whisper keeps mishearing it as 'San Jay Grouper'"
16
16
  - "This transcript should be in the Quantum Readiness project"
17
17
 
@@ -109,7 +109,7 @@ The AI will automatically use `client-alpha/.protokoll` configuration.
109
109
  └── recording.m4a # ← Where does this go?
110
110
  ```
111
111
 
112
- The AI will ask: *"I found 2 projects in your configuration. Which project is this recording for?"*
112
+ The AI will ask: *I found 2 projects in your configuration. Which project is this recording for?*
113
113
 
114
114
  **Scenario 3: No Configuration Found**
115
115
 
@@ -151,74 +151,148 @@ If no `.protokoll` directory exists in the hierarchy, the AI will:
151
151
  | Tool | Description |
152
152
  |------|-------------|
153
153
  | `protokoll_add_person` | Add a new person to context |
154
- | `protokoll_add_project` | Add a new project |
155
- | `protokoll_add_term` | Add a technical term |
154
+ | `protokoll_add_project` | Add a new project with smart assistance |
155
+ | `protokoll_add_term` | Add a technical term with smart assistance |
156
156
  | `protokoll_add_company` | Add a company |
157
- | `protokoll_add_ignored_term` | Add a term to ignore |
158
157
  | `protokoll_delete_entity` | Remove an entity |
159
158
 
159
+ ### Entity Editing Tools
160
+
161
+ | Tool | Description |
162
+ |------|-------------|
163
+ | `protokoll_edit_person` | Edit person: add sounds_like variants, change role/company, etc. |
164
+ | `protokoll_edit_term` | Edit term: add sounds_like variants, topics, projects, domain, etc. |
165
+ | `protokoll_edit_project` | Edit project: add sounds_like variants, phrases, routing config, etc. |
166
+ | `protokoll_update_term` | Regenerate term metadata from source URL/file (uses LLM) |
167
+ | `protokoll_update_project` | Regenerate project metadata from source URL/file (uses LLM) |
168
+ | `protokoll_merge_terms` | Merge duplicate terms into one |
169
+ | `protokoll_suggest_project_metadata` | Generate project suggestions without creating |
170
+ | `protokoll_suggest_term_metadata` | Generate term suggestions without creating |
171
+
160
172
  ### Transcript Tools
161
173
 
162
174
  | Tool | Description |
163
175
  |------|-------------|
176
+ | `protokoll_list_transcripts` | List/search transcripts with pagination and filtering |
177
+ | `protokoll_read_transcript` | Read a transcript and parse its metadata |
164
178
  | `protokoll_edit_transcript` | Change transcript title or project |
165
179
  | `protokoll_combine_transcripts` | Merge multiple transcripts |
166
180
  | `protokoll_provide_feedback` | Correct transcription errors |
167
181
 
182
+ #### protokoll_list_transcripts
183
+
184
+ Browse and search your transcript library with pagination, sorting, and filtering.
185
+
186
+ **Parameters:**
187
+ - `directory` (required): Directory to search (recursive)
188
+ - `limit` (optional): Max results to return (default: 50)
189
+ - `offset` (optional): Skip N results for pagination (default: 0)
190
+ - `sortBy` (optional): Sort by "date" (default), "filename", or "title"
191
+ - `startDate` (optional): Filter from date (YYYY-MM-DD)
192
+ - `endDate` (optional): Filter to date (YYYY-MM-DD)
193
+ - `search` (optional): Search text in filename and content
194
+
195
+ **Returns:**
196
+ - `transcripts[]`: Array of transcript items with:
197
+ - `path`: Full file path
198
+ - `filename`: File name
199
+ - `date`: Date from filename (YYYY-MM-DD)
200
+ - `time`: Time from filename if present (HH:MM)
201
+ - `title`: Extracted from `# heading`
202
+ - `hasRawTranscript`: Whether raw Whisper output exists
203
+ - `pagination`: Total count, limit, offset, hasMore, nextOffset
204
+ - `filters`: Applied filters
205
+
206
+ **Example:**
207
+ ```typescript
208
+ // List recent transcripts
209
+ const result = await use_mcp_tool('protokoll_list_transcripts', {
210
+ directory: '/Users/me/notes',
211
+ limit: 10
212
+ });
213
+
214
+ // Search for specific content
215
+ const meetings = await use_mcp_tool('protokoll_list_transcripts', {
216
+ directory: '/Users/me/notes',
217
+ search: 'kubernetes deployment',
218
+ startDate: '2026-01-01'
219
+ });
220
+ ```
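
To make the relationship between `limit`, `offset`, `hasMore`, and `nextOffset` concrete, here is a minimal in-memory sketch. It is not the tool's implementation: `listTranscripts`, the `SketchTranscript` shape, newest-first ordering for "date", and `null` as the final `nextOffset` are all assumptions, and search covers only filename and title because file content is not modeled.

```typescript
interface SketchTranscript {
  path: string;
  filename: string;
  date: string; // YYYY-MM-DD
  title: string;
}

interface SketchListOptions {
  limit?: number;
  offset?: number;
  startDate?: string;
  endDate?: string;
  search?: string;
}

function listTranscripts(items: SketchTranscript[], options: SketchListOptions = {}) {
  const { limit = 50, offset = 0, startDate, endDate, search } = options;
  const needle = search === undefined ? undefined : search.toLowerCase();
  const filtered = items
    // ISO dates compare correctly as plain strings.
    .filter((t) => startDate === undefined || t.date >= startDate)
    .filter((t) => endDate === undefined || t.date <= endDate)
    .filter(
      (t) =>
        needle === undefined ||
        t.filename.toLowerCase().indexOf(needle) !== -1 ||
        t.title.toLowerCase().indexOf(needle) !== -1
    )
    // Newest first; filter() already returned a copy, so the input is untouched.
    .sort((a, b) => b.date.localeCompare(a.date));
  const page = filtered.slice(offset, offset + limit);
  const hasMore = offset + page.length < filtered.length;
  const nextOffset: number | null = hasMore ? offset + page.length : null;
  return {
    transcripts: page,
    pagination: { total: filtered.length, limit, offset, hasMore, nextOffset },
  };
}

const sampleTranscripts: SketchTranscript[] = [
  { path: "/notes/2026-01-18-meeting.md", filename: "2026-01-18-meeting.md", date: "2026-01-18", title: "Meeting Notes" },
  { path: "/notes/2025-12-30-ideas.md", filename: "2025-12-30-ideas.md", date: "2025-12-30", title: "Quick Ideas" },
  { path: "/notes/2026-01-20-kubernetes.md", filename: "2026-01-20-kubernetes.md", date: "2026-01-20", title: "Kubernetes Deployment" },
];

const firstPage = listTranscripts(sampleTranscripts, { limit: 1, startDate: "2026-01-01" });
console.log(firstPage.pagination.hasMore, firstPage.pagination.nextOffset); // prints true 1
```

A caller would keep requesting pages, passing the previous `nextOffset` as the new `offset`, until `hasMore` is false.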
221
+
168
222
  ## Example Conversations
169
223
 
170
224
  ### Basic Transcription
171
225
 
172
- **You**: "Can you transcribe the meeting recording at ~/Downloads/standup.m4a?"
226
+ **User**: Can you transcribe the meeting recording at ~/Downloads/standup.m4a?
173
227
 
174
228
  **AI**: *Calls `protokoll_discover_config` to find configuration*
175
229
  *Calls `protokoll_suggest_project` to determine project*
176
230
 
177
- "I found your Protokoll configuration at `~/.protokoll`. You have 3 projects configured. Based on the file location, I'm not sure which project this belongs to. Is this for:
231
+ I found your Protokoll configuration at `~/.protokoll`. You have 3 projects configured. Based on the file location, I'm not sure which project this belongs to. Is this for:
178
232
  1. Client Alpha
179
233
  2. Internal Notes
180
- 3. Personal"
234
+ 3. Personal
181
235
 
182
- **You**: "It's for Client Alpha"
236
+ **User**: It's for Client Alpha
183
237
 
184
238
  **AI**: *Calls `protokoll_process_audio` with projectId: "client-alpha"*
185
239
 
186
- "Done! I've transcribed the standup recording. The transcript is saved at `~/notes/client-alpha/2026-01/standup-a3f2b1c8.md`.
240
+ Done! I've transcribed the standup recording. The transcript is saved at `~/notes/client-alpha/2026-01/standup-a3f2b1c8.md`.
187
241
 
188
242
  Key highlights:
189
243
  - Discussed sprint progress with Sarah and Marcus
190
244
  - Mentioned the API integration deadline
191
245
  - Applied 4 name corrections using your context
192
246
 
193
- Would you like me to make any changes to the transcript?"
247
+ Would you like me to make any changes to the transcript?
194
248
 
195
249
  ### Adding Context
196
250
 
197
- **You**: "Whisper keeps mishearing 'Priya Sharma' as 'pre a Sharma'. Can you add her to my context?"
251
+ **User**: Whisper keeps mishearing 'Priya Sharma' as 'pre a Sharma'. Can you add her to my context?
198
252
 
199
253
  **AI**: *Calls `protokoll_add_person`*
200
254
 
201
- "I've added Priya Sharma to your context with the phonetic variant 'pre a'. Future transcripts will automatically correct this. Would you like to add any other information about Priya (role, company, etc.)?"
255
+ I've added Priya Sharma to your context with the phonetic variant 'pre a'. Future transcripts will automatically correct this. Would you like to add any other information about Priya (role, company, etc.)?
202
256
 
203
257
  ### Providing Feedback
204
258
 
205
- **You**: "In the transcript at ~/notes/meeting.md, 'WCMP' should be 'WCNP' and this should be in the Infrastructure project"
259
+ **User**: In the transcript at ~/notes/meeting.md, 'WCMP' should be 'WCNP' and this should be in the Infrastructure project
206
260
 
207
261
  **AI**: *Calls `protokoll_provide_feedback`*
208
262
 
209
- "I've made the corrections:
263
+ I've made the corrections:
210
264
  1. Changed 'WCMP' to 'WCNP' (2 occurrences)
211
265
  2. Moved the transcript to the Infrastructure project at `~/notes/infrastructure/meeting.md`
212
266
 
213
- I also noticed 'WCNP' isn't in your terms. Would you like me to add it so future transcripts get it right?"
267
+ I also noticed 'WCNP' isn't in your terms. Would you like me to add it so future transcripts get it right?
214
268
 
215
269
  ### Combining Transcripts
216
270
 
217
- **You**: "Can you combine all the meeting parts in ~/notes/client-alpha/2026-01/ into a single transcript?"
271
+ **User**: Can you combine all the meeting parts in ~/notes/client-alpha/2026-01/ into a single transcript?
218
272
 
219
273
  **AI**: *Lists files, then calls `protokoll_combine_transcripts`*
220
274
 
221
- "I found 3 meeting parts. Combined them into 'Full Meeting Notes' at `~/notes/client-alpha/2026-01/full-meeting-notes.md`. The original part files have been removed."
275
+ I found 3 meeting parts. Combined them into 'Full Meeting Notes' at `~/notes/client-alpha/2026-01/full-meeting-notes.md`. The original part files have been removed.
276
+
277
+ ### Editing Entities
278
+
279
+ **User**: I need to add "Cartesian Time" as a sounds_like variant for the "cardigantime" term
280
+
281
+ **AI**: *Uses the `edit_entity` prompt or calls `protokoll_edit_term` directly*
282
+
283
+ I'll add that sounds_like variant for you.
284
+
285
+ *Calls `protokoll_edit_term` with:*
286
+ - id: "cardigantime"
287
+ - add_sounds_like: ["Cartesian Time"]
288
+
289
+ Done! I've added "Cartesian Time" as a phonetic variant for "cardigantime". Future transcripts will recognize this variant and correct it automatically.
290
+
291
+ **User**: Can you also change Priya's company to "new-corp"?
292
+
293
+ **AI**: *Calls `protokoll_edit_person`*
294
+
295
+ Updated Priya's company from "acme-corp" to "new-corp".
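
The effect of an `add_sounds_like` edit can be sketched as a merge that appends new variants and skips ones already present. The `id` and `add_sounds_like` names come from the conversation above; the `TermEntity` shape, the `addSoundsLike` helper, and the case-insensitive deduplication are assumptions made for illustration.

```typescript
// Hypothetical shape of a stored term.
interface TermEntity {
  id: string;
  name: string;
  sounds_like: string[];
}

// Return a copy of the term with new variants appended.
// Blank entries and case-insensitive duplicates are skipped.
function addSoundsLike(term: TermEntity, additions: string[]): TermEntity {
  const merged = term.sounds_like.slice();
  for (const variant of additions) {
    const cleaned = variant.trim();
    const exists = merged.some((v) => v.toLowerCase() === cleaned.toLowerCase());
    if (cleaned !== "" && !exists) merged.push(cleaned);
  }
  return { ...term, sounds_like: merged };
}

const originalTerm: TermEntity = {
  id: "cardigantime",
  name: "cardigantime",
  sounds_like: ["cardigan time"],
};
const editedTerm = addSoundsLike(originalTerm, ["Cartesian Time", "Cardigan Time"]);
console.log(editedTerm.sounds_like); // prints [ 'cardigan time', 'Cartesian Time' ]
```

Returning a copy rather than mutating the input keeps the original entity available if the edit needs to be reviewed or undone.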
222
296
 
223
297
  ## Best Practices
224
298
 
@@ -289,7 +363,7 @@ This is expected behavior. The AI needs clarification when:
289
363
  - File location doesn't clearly belong to one project
290
364
  - Multiple projects could apply
291
365
 
292
- Just tell it which project to use.
366
+ Simply specify which project to use.
293
367
 
294
368
  ### "API key not set"
295
369