@redaksjon/protokoll 0.0.14 → 0.1.0

@@ -0,0 +1,371 @@
1
+ # Transcript Listing and Search
2
+
3
+ ## Overview
4
+
5
+ Protokoll provides tools for browsing, searching, and filtering your transcript library. The `protokoll transcript list` command and the `protokoll_list_transcripts` MCP tool let you navigate large transcript collections with pagination, date filtering, and full-text search.
6
+
7
+ ## CLI Command
8
+
9
+ ### Basic Usage
10
+
11
+ ```bash
12
+ protokoll transcript list <directory>
13
+ ```
14
+
15
+ ### Options
16
+
17
+ | Option | Description | Default |
18
+ |--------|-------------|---------|
19
+ | `--limit <number>` | Maximum results to return | 50 |
20
+ | `--offset <number>` | Results to skip (pagination) | 0 |
21
+ | `--sort-by <field>` | Sort by: date, filename, title | date |
22
+ | `--start-date <YYYY-MM-DD>` | Filter from this date | none |
23
+ | `--end-date <YYYY-MM-DD>` | Filter to this date | none |
24
+ | `--search <text>` | Search filename, content, and entity metadata | none |
25
+
26
+ ### Examples
27
+
28
+ #### List Recent Transcripts
29
+
30
+ ```bash
31
+ # Default: 50 most recent, sorted by date
32
+ protokoll transcript list ~/notes
33
+ ```
34
+
35
+ Example output (abbreviated; the counts shown are illustrative):
36
+ ```
37
+ 📂 Transcripts in: ~/notes
38
+ 📊 Showing 1-3 of 45 total
39
+
40
+ ✅ 2026-01-18 14:30 - Meeting with Priya about Q1 Planning
41
+ 2026-01-18-1430_Meeting_with_Priya.md
42
+
43
+ ✅ 2026-01-17 - Quick Ideas for New Feature
44
+ 2026-01-17_Quick_Ideas.md
45
+
46
+ 2026-01-16 09:15 - Sprint Planning Session
47
+ 2026-01-16-0915_Sprint_Planning.md
48
+
49
+ 💡 More results available. Use --offset 50 to see the next page.
50
+ ```
51
+
52
+ #### Search Transcripts
53
+
54
+ ```bash
55
+ # Find all transcripts mentioning "kubernetes"
56
+ protokoll transcript list ~/notes --search "kubernetes"
57
+ ```
58
+
59
+ Searches in:
60
+ - Filename
61
+ - Transcript content (full text)
62
+ - Entity metadata
63
+
64
+ #### Filter by Date Range
65
+
66
+ ```bash
67
+ # January 2026 transcripts only
68
+ protokoll transcript list ~/notes \
69
+ --start-date 2026-01-01 \
70
+ --end-date 2026-01-31
71
+ ```
72
+
73
+ #### Pagination
74
+
75
+ ```bash
76
+ # First 25
77
+ protokoll transcript list ~/notes --limit 25
78
+
79
+ # Next 25
80
+ protokoll transcript list ~/notes --limit 25 --offset 25
81
+
82
+ # Third page
83
+ protokoll transcript list ~/notes --limit 25 --offset 50
84
+ ```
85
+
86
+ #### Sort Options
87
+
88
+ ```bash
89
+ # Sort by title alphabetically
90
+ protokoll transcript list ~/notes --sort-by title
91
+
92
+ # Sort by filename
93
+ protokoll transcript list ~/notes --sort-by filename
94
+
95
+ # Sort by date (default, newest first)
96
+ protokoll transcript list ~/notes --sort-by date
97
+ ```
98
+
99
+ #### Combined Filtering
100
+
101
+ ```bash
102
+ # Find meetings about Kubernetes in January
103
+ protokoll transcript list ~/notes \
104
+ --search "kubernetes meeting" \
105
+ --start-date 2026-01-01 \
106
+ --end-date 2026-01-31 \
107
+ --sort-by date \
108
+ --limit 20
109
+ ```
110
+
111
+ ## MCP Tool
112
+
113
+ ### protokoll_list_transcripts
114
+
115
+ AI assistants can browse your transcript library using this tool.
116
+
117
+ **Parameters:**
118
+
119
+ ```typescript
120
+ {
121
+ directory: string; // Required: directory to search
122
+ limit?: number; // Default: 50
123
+ offset?: number; // Default: 0
124
+ sortBy?: 'date' | 'filename' | 'title'; // Default: 'date'
125
+ startDate?: string; // Format: YYYY-MM-DD
126
+ endDate?: string; // Format: YYYY-MM-DD
127
+ search?: string; // Search term
128
+ }
129
+ ```
130
+
131
+ **Returns:**
132
+
133
+ ```typescript
134
+ {
135
+ directory: string;
136
+ transcripts: Array<{
137
+ path: string;
138
+ filename: string;
139
+ date: string; // YYYY-MM-DD
140
+ time?: string; // HH:MM if present in filename
141
+ title: string; // Extracted from # heading
142
+ hasRawTranscript: boolean; // Has raw Whisper output
143
+ entities?: { // Entity references (if present)
144
+ people?: Array<{ id: string; name: string }>;
145
+ projects?: Array<{ id: string; name: string }>;
146
+ terms?: Array<{ id: string; name: string }>;
147
+ companies?: Array<{ id: string; name: string }>;
148
+ };
149
+ }>;
150
+ pagination: {
151
+ total: number;
152
+ limit: number;
153
+ offset: number;
154
+ hasMore: boolean;
155
+ nextOffset: number | null;
156
+ };
157
+ filters: {
158
+ sortBy: string;
159
+ startDate?: string;
160
+ endDate?: string;
161
+ search?: string;
162
+ };
163
+ }
164
+ ```
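The `hasMore` and `nextOffset` fields in the result make it possible to walk a whole library page by page. Below is a minimal sketch, assuming a caller-supplied `fetchPage` function. That function is hypothetical: it stands in for however your client actually invokes `protokoll_list_transcripts`, and `collectAll` is likewise an illustrative name, not part of Protokoll.

```typescript
// Only the fields used here, copied from the return type documented above.
interface Pagination {
  total: number;
  limit: number;
  offset: number;
  hasMore: boolean;
  nextOffset: number | null;
}

interface ListResult<T> {
  transcripts: T[];
  pagination: Pagination;
}

// Hypothetical: wraps one call to the list tool for a given offset and limit.
type FetchPage<T> = (offset: number, limit: number) => Promise<ListResult<T>>;

// Follows nextOffset until the tool reports that no more pages exist.
async function collectAll<T>(fetchPage: FetchPage<T>, limit = 50): Promise<T[]> {
  const all: T[] = [];
  let offset: number | null = 0;
  while (offset !== null) {
    const page: ListResult<T> = await fetchPage(offset, limit);
    all.push(...page.transcripts);
    offset = page.pagination.hasMore ? page.pagination.nextOffset : null;
  }
  return all;
}
```

Collecting everything in memory is only reasonable for modest libraries; for very large ones, processing each page inside the loop keeps memory use flat.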
165
+
166
+ ### Example: AI Assistant Usage
167
+
168
+ **User**: "Show me all my transcripts from last week that mention Kubernetes"
169
+
170
+ **AI**:
171
+ ```typescript
172
+ const lastWeek = calculateLastWeekDates(); // Helper function
173
+
174
+ const result = await use_mcp_tool('protokoll_list_transcripts', {
175
+ directory: '/Users/me/notes',
176
+ search: 'kubernetes',
177
+ startDate: lastWeek.start, // '2026-01-11'
178
+ endDate: lastWeek.end, // '2026-01-18'
179
+ sortBy: 'date'
180
+ });
181
+
182
+ // AI can now analyze the transcripts and respond
183
+ ```
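The `calculateLastWeekDates()` helper in the example above is not defined in this guide. One possible sketch follows, assuming "last week" means the seven days ending today and that dates are computed in UTC; both are assumptions, chosen to match the `'2026-01-11'` and `'2026-01-18'` comments in the example.

```typescript
// Formats a Date as YYYY-MM-DD using its UTC calendar date.
function formatDate(d: Date): string {
  return d.toISOString().slice(0, 10);
}

// Assumption: "last week" = from 7 days before `today` up to `today` (UTC).
function calculateLastWeekDates(today: Date = new Date()): { start: string; end: string } {
  // Drop the time of day so only the calendar date matters.
  const end = new Date(Date.UTC(today.getUTCFullYear(), today.getUTCMonth(), today.getUTCDate()));
  const start = new Date(end);
  // setUTCDate handles month and year rollover for zero or negative day numbers.
  start.setUTCDate(end.getUTCDate() - 7);
  return { start: formatDate(start), end: formatDate(end) };
}
```

Passing `today` as a parameter keeps the helper deterministic, which makes it easy to test.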
184
+
185
+ ## Display Format
186
+
187
+ ### Status Indicator
188
+
189
+ - ✅ = Has raw Whisper transcript (enables comparison with `protokoll transcript compare`)
190
+ - (blank) = No raw transcript available
191
+
192
+ ### Date and Time
193
+
194
+ - **Date**: Extracted from filename (YYYY-MM-DD)
195
+ - **Time**: Extracted if present in filename (HH:MM format)
196
+ - Falls back to file creation time if not in filename
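The filename convention seen in the examples (`2026-01-18-1430_Meeting_with_Priya.md`, `2026-01-17_Quick_Ideas.md`) can be parsed with a single regular expression. This is an illustrative sketch only: `parseFilenameStamp` is a hypothetical name, and the pattern is inferred from the sample filenames rather than taken from Protokoll's source.

```typescript
interface FilenameStamp {
  date: string;  // YYYY-MM-DD
  time?: string; // HH:MM, only when the filename carries a -HHMM suffix
}

// Returns null when the filename does not start with a YYYY-MM-DD date.
function parseFilenameStamp(filename: string): FilenameStamp | null {
  const match = /^(\d{4}-\d{2}-\d{2})(?:-(\d{2})(\d{2}))?/.exec(filename);
  if (!match) return null;
  const date = match[1];
  const hh = match[2];
  const mm = match[3];
  if (date === undefined) return null;
  if (hh !== undefined && mm !== undefined) {
    return { date, time: `${hh}:${mm}` };
  }
  return { date };
}
```

A caller would apply the file-creation-time fallback described above whenever this function returns `null`.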
197
+
198
+ ### Title Extraction
199
+
200
+ Protokoll extracts the title from the first `# heading` in the markdown file. If no heading exists, it uses the first line of content, truncated to 100 characters.
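The rule above can be sketched in a few lines. This is an illustration under stated assumptions, not Protokoll's implementation: `extractTitle` is a hypothetical name, "first line of content" is taken to mean the first non-blank line, and headings inside fenced code blocks are not treated specially.

```typescript
// Returns the text of the first level-1 heading, or the first non-blank
// line truncated to maxLength characters when no such heading exists.
function extractTitle(markdown: string, maxLength = 100): string {
  const lines = markdown.split(/\r?\n/);
  // A level-1 heading is a single '#' followed by whitespace and text.
  const heading = lines.find((line) => /^#\s+\S/.test(line));
  if (heading !== undefined) {
    return heading.replace(/^#\s+/, "").trim();
  }
  const firstContent = lines.find((line) => line.trim().length > 0) ?? "";
  return firstContent.trim().slice(0, maxLength);
}
```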
201
+
202
+ ### Entity Metadata
203
+
204
+ When a transcript includes entity references in the footer, those are returned in the `entities` field:
205
+
206
+ ```json
207
+ {
208
+ "entities": {
209
+ "people": [
210
+ { "id": "priya-sharma", "name": "Priya Sharma" }
211
+ ],
212
+ "terms": [
213
+ { "id": "kubernetes", "name": "Kubernetes" },
214
+ { "id": "docker", "name": "Docker" }
215
+ ]
216
+ }
217
+ }
218
+ ```
219
+
220
+ This enables queries such as:
221
+ - "Show all transcripts that mention Priya"
222
+ - "Find discussions about Kubernetes"
223
+ - "List Project Alpha transcripts"
224
+
225
+ ## Performance
226
+
227
+ ### Recursive Search
228
+
229
+ The list command searches recursively through all subdirectories:
230
+
231
+ ```
232
+ ~/notes/
233
+ ├── 2026/
234
+ │ ├── 01/
235
+ │ │ ├── file1.md ← Found
236
+ │ │ └── file2.md ← Found
237
+ │ └── 02/
238
+ │ └── file3.md ← Found
239
+ └── archive/
240
+ └── old.md ← Found
241
+ ```
242
+
243
+ ### Exclusions
244
+
245
+ Automatically excludes:
246
+ - `**/node_modules/**`
247
+ - `**/.git/**`
248
+ - `**/.transcript/**` (raw transcript storage)
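To make the exclusion patterns concrete, the check below treats a file as excluded when any of its parent directories is named `node_modules`, `.git`, or `.transcript`. It is a simplified sketch with hypothetical names (`EXCLUDED_DIRS`, `isExcluded`); Protokoll itself may use a glob library rather than this segment test.

```typescript
// Directory names taken from the exclusion patterns listed above.
const EXCLUDED_DIRS = new Set(["node_modules", ".git", ".transcript"]);

// True when the file sits anywhere beneath an excluded directory.
function isExcluded(relativePath: string): boolean {
  // Accept both POSIX and Windows separators.
  const segments = relativePath.split(/[\\/]/);
  // The last segment is the file name itself, so only the directories count.
  return segments.slice(0, -1).some((segment) => EXCLUDED_DIRS.has(segment));
}
```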
249
+
250
+ ### Search Performance
251
+
252
+ - Text search scans both filename and content
253
+ - Entity metadata is parsed only when needed
254
+ - Results are paginated to avoid memory issues with large collections
255
+
256
+ ## Common Use Cases
257
+
258
+ ### Find Transcripts by Person
259
+
260
+ ```bash
261
+ # All transcripts mentioning Priya Sharma
262
+ protokoll transcript list ~/notes --search "Priya Sharma"
263
+ ```
264
+
265
+ The search will find matches in:
266
+ 1. Filename
267
+ 2. Transcript content
268
+ 3. Entity References section (if present)
269
+
270
+ ### Find Transcripts by Project
271
+
272
+ ```bash
273
+ # All transcripts about Project Alpha
274
+ protokoll transcript list ~/notes --search "Project Alpha"
275
+ ```
276
+
277
+ ### Browse by Date Range
278
+
279
+ ```bash
280
+ # Q1 2026 transcripts
281
+ protokoll transcript list ~/notes \
282
+ --start-date 2026-01-01 \
283
+ --end-date 2026-03-31
284
+ ```
285
+
286
+ ### Recent Activity
287
+
288
+ ```bash
289
+ # Last 10 transcripts
290
+ protokoll transcript list ~/notes --limit 10
291
+ ```
292
+
293
+ ### Build a Knowledge Index
294
+
295
+ ```bash
296
+ # Export all transcripts with entity metadata for indexing
297
+ protokoll transcript list ~/notes --limit 1000 > transcripts-index.json
298
+ ```
299
+
300
+ When the results are available as structured data (the shape returned by `protokoll_list_transcripts`), the entity references let you build:
301
+ - Person-to-transcript mappings
302
+ - Project knowledge bases
303
+ - Term frequency analysis
304
+ - Cross-reference graphs
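As an example of the first item, a person-to-transcript mapping can be derived from the `transcripts` array in a single pass. The sketch below assumes the result shape documented for `protokoll_list_transcripts`; `indexByPerson` is a hypothetical helper name.

```typescript
interface EntityRef {
  id: string;
  name: string;
}

// Only the fields this index needs from each transcript entry.
interface TranscriptEntry {
  path: string;
  entities?: { people?: EntityRef[] };
}

// Maps each person id to the paths of the transcripts that reference it.
function indexByPerson(transcripts: TranscriptEntry[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const transcript of transcripts) {
    // Transcripts without entity metadata simply contribute nothing.
    for (const person of transcript.entities?.people ?? []) {
      const paths = index.get(person.id) ?? [];
      paths.push(transcript.path);
      index.set(person.id, paths);
    }
  }
  return index;
}
```

The same pattern applies to `projects`, `terms`, and `companies` by swapping the field that is read.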
305
+
306
+ ## Integration with Other Tools
307
+
308
+ ### With Feedback Command
309
+
310
+ ```bash
311
+ # Find transcript, then provide feedback
312
+ protokoll transcript list ~/notes --search "meeting with John"
313
+ # Note the path, then:
314
+ protokoll feedback --file ~/notes/2026/01/meeting.md "John Smith should be John Doe"
315
+ ```
316
+
317
+ ### With Action Command
318
+
319
+ ```bash
320
+ # Find transcripts to combine
321
+ protokoll transcript list ~/notes --search "sprint planning" --start-date 2026-01-15
322
+ # Then combine them:
323
+ protokoll action --combine "path1.md
324
+ path2.md
325
+ path3.md" --title "Complete Sprint Planning"
326
+ ```
327
+
328
+ ### With Context Commands
329
+
330
+ Search results show which entities are referenced, helping you understand what context you already have:
331
+
332
+ ```bash
333
+ protokoll transcript list ~/notes --search "kubernetes"
334
+ # See that many transcripts reference it
335
+ # Check if term is already in context:
336
+ protokoll context list terms
337
+ ```
338
+
339
+ ## Troubleshooting
340
+
341
+ ### No Results Found
342
+
343
+ Check:
344
+ 1. Directory path is correct
345
+ 2. Markdown files (`.md`) exist in directory
346
+ 3. Search term spelling
347
+ 4. Date range isn't too narrow
348
+
349
+ ### Slow Performance
350
+
351
+ For very large directories (10,000+ files):
352
+ - Use more specific date ranges
353
+ - Search in specific subdirectories
354
+ - Reduce limit to smaller batches
355
+
356
+ ### Entity Metadata Not Showing
357
+
358
+ Entity metadata only appears if:
359
+ 1. Transcript was processed with entity tracking enabled (recent feature)
360
+ 2. Entities were actually referenced during processing
361
+ 3. Transcript includes the "## Entity References" footer
362
+
363
+ Older transcripts won't have entity metadata until reprocessed.
364
+
365
+ ## Future Enhancements
366
+
367
+ Planned improvements:
368
+ - **Entity-specific filters**: `--person priya-sharma`, `--project alpha`
369
+ - **Export formats**: JSON, CSV, SQLite database
370
+ - **Fuzzy search**: Typo-tolerant searching
371
+ - **Smart suggestions**: "Transcripts like this one"
package/guide/action.md CHANGED
@@ -1,16 +1,278 @@
1
- # Transcript Actions
1
+ # Transcript Management
2
2
 
3
- Protokoll includes the `action` command for editing and combining existing transcripts. These capabilities help you organize, merge, and manage your transcript library after the initial transcription.
3
+ Protokoll provides comprehensive tools for managing your transcript library: listing, searching, editing, and combining transcripts.
4
4
 
5
- ## Overview
5
+ ## List Transcripts
6
6
 
7
- The `action` command provides two modes:
7
+ Browse, search, and filter your transcript library with pagination.
8
+
9
+ ### Basic Usage
10
+
11
+ ```bash
12
+ protokoll transcript list <directory>
13
+ ```
14
+
15
+ ### Examples
16
+
17
+ ```bash
18
+ # List recent transcripts (default: 50 most recent, sorted by date)
19
+ protokoll transcript list ~/notes
20
+
21
+ # Search for transcripts containing specific text
22
+ protokoll transcript list ~/notes --search "kubernetes"
23
+
24
+ # Filter by date range
25
+ protokoll transcript list ~/notes --start-date 2026-01-01 --end-date 2026-01-31
26
+
27
+ # Sort by filename or title
28
+ protokoll transcript list ~/notes --sort-by title
29
+
30
+ # Pagination
31
+ protokoll transcript list ~/notes --limit 25 --offset 50
32
+
33
+ # Combine multiple filters
34
+ protokoll transcript list ~/notes \
35
+ --search "meeting" \
36
+ --start-date 2026-01-01 \
37
+ --sort-by date \
38
+ --limit 20
39
+ ```
40
+
41
+ ### Output Example
42
+
43
+ ```
44
+ 📂 Transcripts in: ~/notes
45
+ 🔍 Search: "kubernetes"
46
+ 📊 Showing 1-3 of 12 total
47
+
48
+ ✅ 2026-01-18 14:30 - Setting up Kubernetes Cluster
49
+ 2026-01-18-1430_Kubernetes_Setup.md
50
+
51
+ ✅ 2026-01-17 - Kubernetes Deployment Notes
52
+ 2026-01-17_K8s_Deploy.md
53
+
54
+ 2026-01-15 09:00 - DevOps Discussion with Kubernetes
55
+ 2026-01-15-0900_DevOps_Talk.md
56
+
57
+ 💡 More results available. Use --offset 20 to see the next page.
58
+ ```
59
+
60
+ ### Options
61
+
62
+ | Option | Description | Default |
63
+ |--------|-------------|---------|
64
+ | `--limit <number>` | Max results to return | 50 |
65
+ | `--offset <number>` | Results to skip (pagination) | 0 |
66
+ | `--sort-by <field>` | Sort by: date, filename, title | date |
67
+ | `--start-date <YYYY-MM-DD>` | Filter from this date | none |
68
+ | `--end-date <YYYY-MM-DD>` | Filter to this date | none |
69
+ | `--search <text>` | Search filename, content, and entity metadata | none |
70
+
71
+ ### Display Format
72
+
73
+ Each transcript shows:
74
+ - **Status indicator**: ✅ = has raw Whisper output, blank = no raw data
75
+ - **Date**: Extracted from filename (YYYY-MM-DD)
76
+ - **Time**: If present in filename (HH:MM)
77
+ - **Title**: Extracted from `# heading` in the document
78
+ - **Filename**: The actual file name
79
+
80
+ ### Features
81
+
82
+ - **Recursive search**: Finds transcripts in all subdirectories
83
+ - **Fast text search**: Searches both filename and content
84
+ - **Flexible sorting**: Sort by date (newest first), filename, or title
85
+ - **Date filtering**: Focus on specific time periods
86
+ - **Pagination**: Handle large libraries efficiently
87
+
88
+ ### Entity Metadata in Results
89
+
90
+ The list command returns entity metadata for each transcript (when available):
91
+
92
+ ```json
93
+ {
94
+ "transcripts": [
95
+ {
96
+ "path": "/notes/2026-01-18_Meeting.md",
97
+ "date": "2026-01-18",
98
+ "title": "Meeting with Priya",
99
+ "entities": {
100
+ "people": [
101
+ { "id": "priya-sharma", "name": "Priya Sharma" }
102
+ ],
103
+ "projects": [
104
+ { "id": "project-alpha", "name": "Project Alpha" }
105
+ ],
106
+ "terms": [
107
+ { "id": "kubernetes", "name": "Kubernetes" }
108
+ ]
109
+ }
110
+ }
111
+ ]
112
+ }
113
+ ```
114
+
115
+ This enables:
116
+ - Querying transcripts by entity ("show all transcripts mentioning Priya")
117
+ - Building knowledge graphs
118
+ - Cross-referencing content
119
+ - Automated indexing
120
+
121
+ ### MCP Tool
122
+
123
+ Also available as `protokoll_list_transcripts` for AI assistants to browse your transcript library programmatically.
124
+
125
+ ### Querying by Entity
126
+
127
+ Find transcripts that reference specific entities:
128
+
129
+ ```bash
130
+ # Search for transcripts mentioning a person
131
+ protokoll transcript list ~/notes --search "Priya Sharma"
132
+
133
+ # Find all transcripts about a project
134
+ protokoll transcript list ~/notes --search "Project Alpha"
135
+
136
+ # Combine with date filters
137
+ protokoll transcript list ~/notes \
138
+ --search "kubernetes" \
139
+ --start-date 2026-01-01 \
140
+ --end-date 2026-01-31
141
+ ```
142
+
143
+ **Programmatic Access:** Use `parseEntityMetadata()` from `util/metadata.ts` to extract entity references from transcript content.
144
+
145
+ ## Entity Metadata in Transcripts
146
+
147
+ ### Overview
148
+
149
+ Transcripts processed with entity tracking include structured entity metadata at the bottom of the file. This metadata supports querying and knowledge management across your library.
150
+
151
+ ### Format
152
+
153
+ ```markdown
154
+ ---
155
+
156
+ ## Entity References
157
+
158
+ <!-- Machine-readable entity metadata for indexing and querying -->
159
+
160
+ ### People
161
+
162
+ - `priya-sharma`: Priya Sharma
163
+ - `john-smith`: John Smith
164
+
165
+ ### Projects
166
+
167
+ - `project-alpha`: Project Alpha
168
+
169
+ ### Terms
170
+
171
+ - `kubernetes`: Kubernetes
172
+ - `docker`: Docker
173
+ ```
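Because the footer is plain Markdown, it is easy to read back. The sketch below shows one way to parse the format above; it is not the implementation of `parseEntityMetadata()`, and the name `parseEntityFooter` and its return shape are assumptions made for illustration.

```typescript
interface EntityRef {
  id: string;
  name: string;
}

// Keys are lowercased section names, e.g. "people", "projects", "terms".
type EntityMap = Record<string, EntityRef[]>;

// Returns undefined when the transcript has no "## Entity References" footer.
function parseEntityFooter(markdown: string): EntityMap | undefined {
  const start = markdown.indexOf("## Entity References");
  if (start === -1) return undefined;

  const entities: EntityMap = {};
  let section: string | undefined;
  for (const line of markdown.slice(start).split(/\r?\n/)) {
    // "### People" style headings open a new section.
    const heading = /^###\s+(.+)$/.exec(line);
    if (heading) {
      section = (heading[1] ?? "").trim().toLowerCase();
      continue;
    }
    // List items look like: - `priya-sharma`: Priya Sharma
    const item = /^-\s+`([^`]+)`:\s*(.+)$/.exec(line);
    if (item && section !== undefined) {
      const list = entities[section] ?? [];
      list.push({ id: item[1] ?? "", name: (item[2] ?? "").trim() });
      entities[section] = list;
    }
  }
  return entities;
}
```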
174
+
175
+ ### What Gets Tracked
176
+
177
+ Entities are automatically recorded when:
178
+ - **lookup_person** finds a person
179
+ - **lookup_project** matches a project
180
+ - **verify_spelling** corrects a term
181
+ - Any tool references an entity from your context
182
+
183
+ ### Benefits
184
+
185
+ 1. **Find all transcripts mentioning a person:**
186
+ ```bash
187
+ protokoll transcript list ~/notes --search "priya-sharma"
188
+ ```
189
+
190
+ 2. **Find all discussions about a technology:**
191
+ ```bash
192
+ protokoll transcript list ~/notes --search "kubernetes"
193
+ ```
194
+
195
+ 3. **Build knowledge graphs:** Connect people, projects, and topics
196
+
197
+ 4. **Create indexes:** Programmatically build searchable databases
198
+
199
+ 5. **Cross-reference:** Find related content based on shared entities
200
+
201
+ ### Example Transcript with Metadata
202
+
203
+ ```markdown
204
+ # Meeting Notes
205
+
206
+ ## Metadata
207
+
208
+ **Date**: January 18, 2026
209
+ **Project**: Project Alpha
210
+
211
+ ---
212
+
213
+ Discussion content here...
214
+
215
+ ---
216
+
217
+ ## Entity References
218
+
219
+ ### People
220
+
221
+ - `priya-sharma`: Priya Sharma
222
+
223
+ ### Projects
224
+
225
+ - `project-alpha`: Project Alpha
226
+
227
+ ### Terms
228
+
229
+ - `kubernetes`: Kubernetes
230
+ ```
231
+
232
+ ### Programmatic Usage
233
+
234
+ ```typescript
235
+ import { parseEntityMetadata } from '@/util/metadata';
236
+ import { readFile } from 'fs/promises';
237
+
238
+ const content = await readFile('transcript.md', 'utf-8');
239
+ const entities = parseEntityMetadata(content);
240
+
241
+ // Query by entity
242
+ if (entities?.people) {
243
+ const hasPriya = entities.people.some(p => p.id === 'priya-sharma');
244
+ console.log('Priya mentioned:', hasPriya);
245
+ }
246
+ ```
247
+
248
+ ### Integration with List Command
249
+
250
+ The `protokoll transcript list` command automatically includes entity metadata in results, enabling queries like:
251
+
252
+ ```typescript
253
+ const results = await listTranscripts({
254
+ directory: '/Users/me/notes', // absolute path; `~` is shell shorthand and may not be expanded here
255
+ search: 'kubernetes'
256
+ });
257
+
258
+ // Filter to only transcripts that mention Priya
259
+ const priyaKubernetes = results.transcripts.filter(t =>
260
+ t.entities?.people?.some(p => p.id === 'priya-sharma') &&
261
+ t.entities?.terms?.some(term => term.id === 'kubernetes')
262
+ );
263
+ ```
264
+
265
+ ## Edit & Combine
266
+
267
+ The `action` command provides editing and combining capabilities:
8
268
 
9
269
  | Mode | Usage | Description |
10
270
  |------|-------|-------------|
11
271
  | **Edit** | `protokoll action [options] <file>` | Edit a single transcript (title, project) |
12
272
  | **Combine** | `protokoll action --combine "<files>"` | Merge multiple transcripts into one |
13
273
 
274
+ ### Edit a Single Transcript
275
+
14
276
  ## Edit Mode
15
277
 
16
278
  Edit a single transcript to change its title and/or project.