@redaksjon/protokoll 0.0.14 → 0.1.0

@@ -0,0 +1,416 @@
1
+ # Entity Metadata in Transcripts
2
+
3
+ ## Overview
4
+
5
+ Every transcript generated by Protokoll includes structured entity metadata in a footer section. This machine-readable metadata records every entity (person, project, term, or company) that was referenced or used during transcript processing, so transcripts can be queried and indexed by entity.
6
+
7
+ ## Format
8
+
9
+ Entity metadata appears at the bottom of each transcript in this format:
10
+
11
+ ```markdown
12
+ ---
13
+
14
+ ## Entity References
15
+
16
+ <!-- Machine-readable entity metadata for indexing and querying -->
17
+
18
+ ### People
19
+
20
+ - `priya-sharma`: Priya Sharma
21
+ - `john-smith`: John Smith
22
+
23
+ ### Projects
24
+
25
+ - `project-alpha`: Project Alpha
26
+ - `client-beta`: Client Beta
27
+
28
+ ### Terms
29
+
30
+ - `kubernetes`: Kubernetes
31
+ - `docker`: Docker
32
+ - `graphql`: GraphQL
33
+
34
+ ### Companies
35
+
36
+ - `acme-corp`: Acme Corp
37
+ ```
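As a rough illustration of the format above, the footer could be rendered from a structured object along these lines. This is a minimal sketch, not Protokoll's actual writer: `EntityRef`, `EntityMetadata`, and `formatEntityMetadata` are hypothetical names introduced here, and skipping empty sections is an assumption.

```typescript
// Hypothetical shapes for illustration; the package's real types may differ.
interface EntityRef {
  id: string;
  name: string;
}

interface EntityMetadata {
  people?: EntityRef[];
  projects?: EntityRef[];
  terms?: EntityRef[];
  companies?: EntityRef[];
}

// Section headings in the order used by the documented format.
const SECTIONS: Array<[keyof EntityMetadata, string]> = [
  ['people', 'People'],
  ['projects', 'Projects'],
  ['terms', 'Terms'],
  ['companies', 'Companies'],
];

// Render the footer; returns an empty string when there is nothing to record.
function formatEntityMetadata(meta: EntityMetadata): string {
  const blocks: string[] = [];
  for (const [key, heading] of SECTIONS) {
    const items = meta[key] ?? [];
    if (items.length === 0) continue; // assumption: empty sections are omitted
    const lines = items.map((e) => `- \`${e.id}\`: ${e.name}`);
    blocks.push(`### ${heading}\n\n${lines.join('\n')}`);
  }
  if (blocks.length === 0) return '';
  return [
    '---',
    '## Entity References',
    '<!-- Machine-readable entity metadata for indexing and querying -->',
    ...blocks,
  ].join('\n\n') + '\n';
}
```

Each list item keeps the ID in backticks and the display name after the colon, which is the part a parser relies on.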
38
+
39
+ ## How Entities Are Tracked
40
+
41
+ Entities are automatically tracked during the transcription process whenever:
42
+
43
+ 1. **lookup_person tool** finds a person in your context
44
+ 2. **lookup_project tool** matches a project
45
+ 3. **verify_spelling tool** corrects a term
46
+ 4. **route_note tool** routes to a project
47
+
48
+ Every successful entity lookup adds that entity to the transcript's metadata.
49
+
50
+ ## Example Transcript
51
+
52
+ Here's a complete example showing how entity metadata appears in context:
53
+
54
+ ```markdown
55
+ # Meeting with Priya about Kubernetes Deployment
56
+
57
+ ## Metadata
58
+
59
+ **Date**: January 18, 2026
60
+ **Time**: 02:30 PM
61
+
62
+ **Project**: Project Alpha
63
+ **Project ID**: `project-alpha`
64
+
65
+ ### Routing
66
+
67
+ **Destination**: ~/work/project-alpha/notes
68
+ **Confidence**: 95.0%
69
+
70
+ **Tags**: `kubernetes`, `deployment`, `infrastructure`
71
+
72
+ **Duration**: 15m 30s
73
+
74
+ ---
75
+
76
+ ## Content
77
+
78
+ We discussed the Kubernetes deployment strategy for Project Alpha with Priya Sharma.
79
+ Key points:
80
+
81
+ - Use Docker containers for microservices
82
+ - Deploy to AWS EKS cluster
83
+ - Set up CI/CD with GitHub Actions
84
+
85
+ Priya will follow up with the infrastructure team at Acme Corp next week.
86
+
87
+ ---
88
+
89
+ ## Entity References
90
+
91
+ <!-- Machine-readable entity metadata for indexing and querying -->
92
+
93
+ ### People
94
+
95
+ - `priya-sharma`: Priya Sharma
96
+
97
+ ### Projects
98
+
99
+ - `project-alpha`: Project Alpha
100
+
101
+ ### Terms
102
+
103
+ - `kubernetes`: Kubernetes
104
+ - `docker`: Docker
105
+ - `aws`: AWS
106
+ - `eks`: EKS
107
+
108
+ ### Companies
109
+
110
+ - `acme-corp`: Acme Corp
111
+ ```
112
+
113
+ ## Benefits
114
+
115
+ ### 1. Queryable Knowledge Base
116
+
117
+ Find all transcripts that mention a specific entity:
118
+
119
+ ```bash
120
+ # All transcripts mentioning Priya
121
+ protokoll transcript list ~/notes --search "priya-sharma"
122
+
123
+ # All Project Alpha transcripts
124
+ protokoll transcript list ~/notes --search "project-alpha"
125
+
126
+ # All transcripts discussing Kubernetes
127
+ protokoll transcript list ~/notes --search "kubernetes"
128
+ ```
129
+
130
+ ### 2. Knowledge Graphs
131
+
132
+ Build relationships between entities:
133
+
134
+ ```typescript
+ // Parse all transcripts
+ const transcripts = await listTranscripts({ directory: '~/notes' });
+
+ // Build person-to-project mapping (person ID -> set of project IDs)
+ const personProjects = new Map<string, Set<string>>();
+ for (const t of transcripts.transcripts) {
+   for (const person of t.entities?.people || []) {
+     for (const project of t.entities?.projects || []) {
+       // person.id worked on project.id
+       const projects = personProjects.get(person.id) ?? new Set<string>();
+       projects.add(project.id);
+       personProjects.set(person.id, projects);
+     }
+   }
+ }
+ ```
149
+
150
+ ### 3. Context Discovery
151
+
152
+ See which entities appear most frequently in your transcripts:
153
+
154
+ ```typescript
155
+ // Count entity references
156
+ const entityCounts = countEntities(transcripts);
157
+ // Result: { 'kubernetes': 24, 'docker': 18, 'priya-sharma': 12, ... }
158
+ ```
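`countEntities` is not defined in this document. One possible shape, assuming each transcript exposes an optional `entities` object as in the other examples, is sketched below. The interfaces are hypothetical stand-ins for the real `listTranscripts()` result, and the sketch counts each transcript at most once per entity, which is a design assumption rather than documented behavior.

```typescript
// Hypothetical shapes; the real listTranscripts() result may differ.
interface EntityRef { id: string; name: string; }
interface TranscriptEntities {
  people?: EntityRef[];
  projects?: EntityRef[];
  terms?: EntityRef[];
  companies?: EntityRef[];
}
interface TranscriptInfo { path: string; entities?: TranscriptEntities; }

// Count how many transcripts reference each entity ID.
function countEntities(transcripts: TranscriptInfo[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const t of transcripts) {
    const e: TranscriptEntities = t.entities ?? {};
    const refs = [
      ...(e.people ?? []),
      ...(e.projects ?? []),
      ...(e.terms ?? []),
      ...(e.companies ?? []),
    ];
    const seen: Record<string, boolean> = {};
    for (const ref of refs) {
      if (seen[ref.id]) continue; // count each transcript once per entity
      seen[ref.id] = true;
      counts[ref.id] = (counts[ref.id] ?? 0) + 1;
    }
  }
  return counts;
}
```

Transcripts without an `entities` field (for example, ones created before this feature) simply contribute nothing to the counts.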
159
+
160
+ ### 4. Automated Indexing
161
+
162
+ Build search indexes for fast lookup:
163
+
164
+ ```typescript
+ // Create inverted index
+ const index = {
+   'priya-sharma': ['transcript1.md', 'transcript5.md', 'transcript12.md'],
+   'kubernetes': ['transcript2.md', 'transcript5.md', 'transcript8.md'],
+   // ...one entry per entity ID
+ };
+ ```
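An index like the one above could be built from parsed transcripts roughly as follows. This is a sketch under assumptions: `TranscriptInfo` and `buildEntityIndex` are hypothetical names, and `path` stands in for whatever file identifier the real transcript objects carry.

```typescript
// Hypothetical shapes; the real transcript objects may differ.
interface EntityRef { id: string; name: string; }
interface TranscriptEntities {
  people?: EntityRef[];
  projects?: EntityRef[];
  terms?: EntityRef[];
  companies?: EntityRef[];
}
interface TranscriptInfo { path: string; entities?: TranscriptEntities; }

// Map each entity ID to the list of transcript paths that reference it.
function buildEntityIndex(transcripts: TranscriptInfo[]): Record<string, string[]> {
  const index: Record<string, string[]> = {};
  for (const t of transcripts) {
    const e: TranscriptEntities = t.entities ?? {};
    const refs = [
      ...(e.people ?? []),
      ...(e.projects ?? []),
      ...(e.terms ?? []),
      ...(e.companies ?? []),
    ];
    for (const ref of refs) {
      const paths = index[ref.id] ?? (index[ref.id] = []);
      // Avoid listing the same file twice for one entity.
      if (paths.indexOf(t.path) === -1) paths.push(t.path);
    }
  }
  return index;
}
```

Looking up `index['kubernetes']` then returns the matching files without rescanning every transcript.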
172
+
173
+ ### 5. Cross-Referencing
174
+
175
+ Find related transcripts based on shared entities:
176
+
177
+ ```typescript
178
+ // Find transcripts that share entities with a given transcript
179
+ const related = findRelatedTranscripts(transcript1, allTranscripts);
180
+ // Returns transcripts mentioning same people/projects/terms
181
+ ```
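`findRelatedTranscripts` is likewise not defined here. A minimal sketch, assuming relatedness simply means "shares at least one entity ID" and ranking by the number of shared IDs, might look like this. The types and the ranking rule are illustrative assumptions, not Protokoll's API.

```typescript
// Hypothetical shapes; the real transcript objects may differ.
interface EntityRef { id: string; name: string; }
interface TranscriptEntities {
  people?: EntityRef[];
  projects?: EntityRef[];
  terms?: EntityRef[];
  companies?: EntityRef[];
}
interface TranscriptInfo { path: string; entities?: TranscriptEntities; }

// Collect the unique entity IDs referenced by one transcript.
function entityIds(t: TranscriptInfo): string[] {
  const e: TranscriptEntities = t.entities ?? {};
  return [
    ...(e.people ?? []),
    ...(e.projects ?? []),
    ...(e.terms ?? []),
    ...(e.companies ?? []),
  ]
    .map((ref) => ref.id)
    .filter((id, i, arr) => arr.indexOf(id) === i);
}

// Return other transcripts that share entities with `target`, most overlap first.
function findRelatedTranscripts(
  target: TranscriptInfo,
  all: TranscriptInfo[],
): Array<{ path: string; shared: string[] }> {
  const targetIds = entityIds(target);
  return all
    .filter((t) => t.path !== target.path)
    .map((t) => ({
      path: t.path,
      shared: entityIds(t).filter((id) => targetIds.indexOf(id) !== -1),
    }))
    .filter((r) => r.shared.length > 0)
    .sort((a, b) => b.shared.length - a.shared.length);
}
```

Returning the shared IDs alongside each path makes it easy to show why two transcripts were considered related.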
182
+
183
+ ## Programmatic Access
184
+
185
+ ### Reading Entity Metadata
186
+
187
+ ```typescript
+ import { parseEntityMetadata } from '@/util/metadata';
+ import { readFile } from 'fs/promises';
+
+ const content = await readFile('transcript.md', 'utf-8');
+ const entities = parseEntityMetadata(content);
+
+ if (entities?.people) {
+   console.log('People mentioned:', entities.people.map(p => p.name));
+ }
+ ```
198
+
199
+ ### Querying Transcripts
200
+
201
+ ```typescript
+ import { listTranscripts } from '@/cli/transcript';
+
+ // Find all transcripts about a person
+ const results = await listTranscripts({
+   directory: '~/notes',
+   search: 'priya-sharma',
+   sortBy: 'date',
+   limit: 100
+ });
+
+ // Filter by entity
+ const priyaTranscripts = results.transcripts.filter(t =>
+   t.entities?.people?.some(p => p.id === 'priya-sharma')
+ );
+ ```
217
+
218
+ ### Building Custom Tools
219
+
220
+ ```typescript
+ // Example: Find collaboration patterns
+ function findCollaborations(transcripts) {
+   const collaborations = new Map();
+
+   for (const t of transcripts) {
+     const people = t.entities?.people || [];
+     if (people.length < 2) continue;
+
+     // Record all pairs of people appearing together
+     for (let i = 0; i < people.length; i++) {
+       for (let j = i + 1; j < people.length; j++) {
+         const key = [people[i].id, people[j].id].sort().join(':');
+         collaborations.set(key, (collaborations.get(key) || 0) + 1);
+       }
+     }
+   }
+
+   return collaborations;
+ }
+ ```
241
+
242
+ ## Entity Types
243
+
244
+ ### People
245
+
246
+ Referenced when:
247
+ - lookup_person tool finds them in context
248
+ - Name correction happens
249
+ - The interactive wizard creates a new person
250
+
251
+ **Structure:**
252
+ ```typescript
+ {
+   id: string;      // e.g., "priya-sharma"
+   name: string;    // e.g., "Priya Sharma"
+   type: 'person';
+ }
+ ```
259
+
260
+ ### Projects
261
+
262
+ Referenced when:
263
+ - lookup_project tool matches a project
264
+ - route_note tool routes to a project
265
+ - Transcript is explicitly assigned to a project
266
+
267
+ **Structure:**
268
+ ```typescript
+ {
+   id: string;      // e.g., "project-alpha"
+   name: string;    // e.g., "Project Alpha"
+   type: 'project';
+ }
+ ```
275
+
276
+ ### Terms
277
+
278
+ Referenced when:
279
+ - verify_spelling tool corrects a technical term
280
+ - Term is looked up in context
281
+ - The interactive wizard creates a new term
282
+
283
+ **Structure:**
284
+ ```typescript
+ {
+   id: string;      // e.g., "kubernetes"
+   name: string;    // e.g., "Kubernetes"
+   type: 'term';
+ }
+ ```
291
+
292
+ ### Companies
293
+
294
+ Referenced when:
295
+ - Company is mentioned and found in context
296
+ - Person's company is looked up
297
+ - The interactive wizard creates a new company
298
+
299
+ **Structure:**
300
+ ```typescript
+ {
+   id: string;      // e.g., "acme-corp"
+   name: string;    // e.g., "Acme Corp"
+   type: 'company';
+ }
+ ```
307
+
308
+ ## Advanced Use Cases
309
+
310
+ ### Building a Personal CRM
311
+
312
+ Track all interactions with people:
313
+
314
+ ```bash
315
+ # Find all conversations with Priya
316
+ protokoll transcript list ~/notes --search "priya-sharma"
317
+
318
+ # Export for analysis
319
+ protokoll transcript list ~/notes \
320
+ --search "priya-sharma" \
321
+ --limit 1000 > priya-transcripts.json
322
+ ```
323
+
324
+ ### Project Knowledge Base
325
+
326
+ Collect all knowledge about a project:
327
+
328
+ ```bash
329
+ # All Project Alpha transcripts
330
+ protokoll transcript list ~/notes --search "project-alpha"
331
+
332
+ # Combine them into project documentation
333
+ protokoll action --combine "$(protokoll transcript list ~/notes --search 'project-alpha' --limit 100 | grep '\.md' | awk '{print $NF}')"
334
+ ```
335
+
336
+ ### Technology Research
337
+
338
+ Track discussions about specific technologies:
339
+
340
+ ```bash
341
+ # All Kubernetes discussions
342
+ protokoll transcript list ~/notes --search "kubernetes"
343
+
344
+ # Time-based analysis
345
+ protokoll transcript list ~/notes --search "kubernetes" --start-date 2026-01-01
346
+ protokoll transcript list ~/notes --search "kubernetes" --start-date 2025-01-01 --end-date 2025-12-31
347
+ ```
348
+
349
+ ### Meeting Participant Analysis
350
+
351
+ See who you've met with most:
352
+
353
+ ```bash
354
+ # Export all transcripts with entity data
355
+ protokoll transcript list ~/notes --limit 10000 > all-transcripts.json
356
+
357
+ # Analyze in your favorite tool
358
+ jq -r '.transcripts[].entities.people[]?.name' all-transcripts.json | sort | uniq -c | sort -rn
359
+ ```
360
+
361
+ ## Migration Notes
362
+
363
+ ### Existing Transcripts
364
+
365
+ Transcripts created before this feature won't have entity metadata. To add it:
366
+
367
+ 1. **Reprocess**: Re-run through Protokoll (will regenerate with metadata)
368
+ 2. **Use feedback**: `protokoll feedback` can add entities to existing transcripts
369
+ 3. **Manual addition**: Add the section manually following the format above
370
+
371
+ ### Backward Compatibility
372
+
373
+ The entity metadata section is optional and won't break:
374
+ - Existing tools that read transcripts
375
+ - Markdown renderers
376
+ - Search tools
377
+ - Version control systems
378
+
379
+ It's just additional structured data at the end.
380
+
381
+ ## Best Practices
382
+
383
+ 1. **Regular Reprocessing**: Periodically reprocess old transcripts to add entity metadata
384
+ 2. **Use Search**: Leverage entity IDs (e.g., `priya-sharma`) for precise searches
385
+ 3. **Export for Analysis**: Use `--limit 10000` to export large datasets
386
+ 4. **Build Indexes**: Create custom indexes for frequent queries
387
+ 5. **Cross-Reference**: Use entity data to find related transcripts
388
+
389
+ ## Technical Details
390
+
391
+ ### Storage Format
392
+
393
+ Entity metadata is stored as Markdown at the end of each transcript file:
394
+ - **Human-readable**: You can read and edit it
395
+ - **Machine-parseable**: Structured format for programmatic access
396
+ - **Version-controllable**: Works with Git
397
+ - **Future-proof**: Plain text, no proprietary format
398
+
399
+ ### Parsing Implementation
400
+
401
+ The `parseEntityMetadata()` function:
402
+ 1. Finds the "## Entity References" section
403
+ 2. Extracts each entity type section (People, Projects, Terms, Companies)
404
+ 3. Parses list items in format: `` - `entity-id`: Entity Name ``
405
+ 4. Returns a structured object, or `undefined` if no entities are found
406
+
407
+ Available in: `@/util/metadata`
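The four steps above can be sketched as follows. This is an illustrative reimplementation under assumptions, not the code shipped in `@/util/metadata`: the name `parseEntitySection`, the regular expressions, and the choice to search from the last `## Entity References` heading are all hypothetical.

```typescript
// Hypothetical shapes mirroring the entity structures documented above.
type EntityType = 'person' | 'project' | 'term' | 'company';
interface EntityRef { id: string; name: string; type: EntityType; }
interface EntityMetadata {
  people?: EntityRef[];
  projects?: EntityRef[];
  terms?: EntityRef[];
  companies?: EntityRef[];
}
interface SectionInfo { key: keyof EntityMetadata; type: EntityType; }

// Map each "### Heading" to the result key and entity type it represents.
const HEADINGS: Record<string, SectionInfo> = {
  People: { key: 'people', type: 'person' },
  Projects: { key: 'projects', type: 'project' },
  Terms: { key: 'terms', type: 'term' },
  Companies: { key: 'companies', type: 'company' },
};

function parseEntitySection(content: string): EntityMetadata | undefined {
  // Step 1: find the footer (assumption: the last occurrence is the real one).
  const start = content.lastIndexOf('## Entity References');
  if (start === -1) return undefined;

  const result: EntityMetadata = {};
  let current: SectionInfo | undefined;
  let found = false;

  for (const line of content.slice(start).split(/\r?\n/)) {
    // Step 2: track which entity type section we are in.
    const heading = /^###\s+(.+?)\s*$/.exec(line);
    if (heading) {
      current = HEADINGS[heading[1]]; // unknown headings reset to undefined
      continue;
    }
    // Step 3: parse list items of the form "- `entity-id`: Entity Name".
    const item = /^\s*-\s+`([^`]+)`:\s*(.+?)\s*$/.exec(line);
    if (item && current) {
      const list = result[current.key] ?? (result[current.key] = []);
      list.push({ id: item[1], name: item[2], type: current.type });
      found = true;
    }
  }
  // Step 4: structured object, or undefined when nothing was parsed.
  return found ? result : undefined;
}
```

List items under an unrecognized heading are ignored, so hand-edited footers with extra sections do not break parsing in this sketch.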
408
+
409
+ ## Future Enhancements
410
+
411
+ Planned improvements:
412
+ - **Entity-specific list filters**: `--entity-type person --entity-id priya-sharma`
413
+ - **Relationship queries**: "Find transcripts where Person X and Project Y appear together"
414
+ - **Timeline views**: See entity mentions over time
415
+ - **Export to database**: SQLite/PostgreSQL with entity tables and relationships
416
+ - **Graph visualization**: Visual knowledge graph from entity relationships
package/docs/examples.md CHANGED
@@ -146,6 +146,27 @@ protokoll --input-directory ~/backlog \
146
146
  protokoll --input-directory ~/backlog --batch
147
147
  ```
148
148
 
149
+ ## Scenario 20: Non-Interactive Project Creation
150
+
151
+ Add projects automatically without confirming AI suggestions.
152
+
153
+ ```bash
154
+ # Trust AI suggestions completely
155
+ protokoll project add --name "FjellGrunn" --yes
156
+
157
+ # With a source URL
158
+ protokoll project add https://github.com/myorg/myproject --name "My Project" --yes
159
+
160
+ # With local README
161
+ protokoll project add /path/to/README.md --name "Documentation" --yes
162
+ ```
163
+
164
+ Result:
165
+ - AI generates phonetic variants automatically
166
+ - Trigger phrases generated without prompts
167
+ - Topics and description extracted (if source provided)
168
+ - Project saved immediately with all AI suggestions
169
+
149
170
  ## Scenario 8: Quality Review
150
171
 
151
172
  Review transcription quality after processing.