@redaksjon/protokoll 0.1.0 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -36,6 +36,15 @@ npm run lint
  npm run lint:fix # Auto-fix issues
  ```
 
+ ## Recent Fixes
+
+ ### Tag Deduplication (2026-01-19)
+
+ Tags in transcript metadata are now automatically deduplicated. Previously, when the same entity (e.g., "xenocline") was identified through multiple classification signals (e.g., both as an explicit phrase and as an associated project), it appeared multiple times in the tags array. Tags are now deduplicated with a `Set`, so each tag appears only once regardless of how many signals identify it.
+
+ **Changed**: `src/util/metadata.ts` - `extractTagsFromSignals()` function
+ **Tests**: `tests/util/metadata.test.ts` - Added deduplication test case
+
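The Set-based deduplication described in this entry can be sketched in a few lines of TypeScript. This is only an illustration, not the code in `src/util/metadata.ts`: the helper name `dedupeTags` is hypothetical, and keeping the first-seen order is an assumption that follows from how `Set` iterates.

```typescript
// Hypothetical helper for illustration; the real extractTagsFromSignals()
// in the package may be structured differently.
function dedupeTags(tags: string[]): string[] {
  // A Set stores each distinct value once and iterates in insertion order,
  // so the first occurrence of every tag is kept in its original position.
  return Array.from(new Set(tags));
}

const dedupedExample = dedupeTags(["xenocline", "work", "xenocline"]);
console.log(dedupedExample); // [ 'xenocline', 'work' ]
```

Note that `Set` compares strings exactly, so in this sketch "Xenocline" and "xenocline" would count as two different tags.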
  ## Project Structure
 
  ```
package/guide/index.md CHANGED
@@ -25,6 +25,7 @@ Protokoll transforms audio recordings into intelligent, context-enhanced transcr
  - [**Transcript Actions**](./action.md): Edit, combine, and manage transcripts
  - [**Feedback**](./feedback.md): Intelligent feedback for corrections
 
+
  ### AI Integration
  - [**MCP Integration**](./mcp-integration.md): Use Protokoll through AI assistants
 
@@ -152,16 +152,23 @@ If no `.protokoll` directory exists in the hierarchy, the AI will:
  |------|-------------|
  | `protokoll_add_person` | Add a new person to context |
  | `protokoll_add_project` | Add a new project with smart assistance |
- | `protokoll_suggest_project_metadata` | Generate project suggestions without creating |
- | `protokoll_update_project` | Update project by regenerating from source URL/file |
  | `protokoll_add_term` | Add a technical term with smart assistance |
- | `protokoll_suggest_term_metadata` | Generate term suggestions without creating |
- | `protokoll_update_term` | Update term by regenerating from source URL/file |
- | `protokoll_merge_terms` | Merge duplicate terms into one |
  | `protokoll_add_company` | Add a company |
- | `protokoll_add_ignored_term` | Add a term to ignore |
  | `protokoll_delete_entity` | Remove an entity |
 
+ ### Entity Editing Tools
+
+ | Tool | Description |
+ |------|-------------|
+ | `protokoll_edit_person` | Edit person: add sounds_like variants, change role/company, etc. |
+ | `protokoll_edit_term` | Edit term: add sounds_like variants, topics, projects, domain, etc. |
+ | `protokoll_edit_project` | Edit project: add sounds_like variants, phrases, routing config, etc. |
+ | `protokoll_update_term` | Regenerate term metadata from source URL/file (uses LLM) |
+ | `protokoll_update_project` | Regenerate project metadata from source URL/file (uses LLM) |
+ | `protokoll_merge_terms` | Merge duplicate terms into one |
+ | `protokoll_suggest_project_metadata` | Generate project suggestions without creating |
+ | `protokoll_suggest_term_metadata` | Generate term suggestions without creating |
+
  ### Transcript Tools
 
  | Tool | Description |
@@ -267,6 +274,26 @@ I also noticed 'WCNP' isn't in your terms. Would you like me to add it so future
 
  I found 3 meeting parts. Combined them into 'Full Meeting Notes' at `~/notes/client-alpha/2026-01/full-meeting-notes.md`. The original part files have been removed.
 
+ ### Editing Entities
+
+ **User**: I need to add "Cartesian Time" as a sounds_like variant for the "cardigantime" term
+
+ **AI**: *Uses the edit_entity prompt or calls `protokoll_edit_term` directly*
+
+ I'll add that sounds_like variant for you.
+
+ *Calls `protokoll_edit_term` with:*
+ - id: "cardigantime"
+ - add_sounds_like: ["Cartesian Time"]
+
+ Done! I've added "Cartesian Time" as a phonetic variant for "cardigantime". Future transcripts will recognize this variant and correct it automatically.
+
+ **User**: Can you also change Priya's company to "new-corp"?
+
+ **AI**: *Calls `protokoll_edit_person`*
+
+ Updated Priya's company from "acme-corp" to "new-corp".
+
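To make the first exchange above concrete, here is a rough TypeScript sketch of what applying an `add_sounds_like` edit to a term could look like. The `id` and `add_sounds_like` argument names come from the example conversation; the `Term` and `EditTermArgs` shapes, the `applyTermEdit` function, and the pre-existing "Cardigan Time" variant are all hypothetical and are not taken from Protokoll's source.

```typescript
// Hypothetical shapes for illustration only; not Protokoll's actual types.
interface Term {
  id: string;
  sounds_like: string[];
}

interface EditTermArgs {
  id: string;
  add_sounds_like?: string[];
}

// Returns an updated copy of the term with the new variants appended,
// skipping any variant that is already present.
function applyTermEdit(term: Term, args: EditTermArgs): Term {
  if (term.id !== args.id) {
    throw new Error(`Edit targets "${args.id}" but term is "${term.id}"`);
  }
  const additions = args.add_sounds_like ?? [];
  const merged = Array.from(new Set(term.sounds_like.concat(additions)));
  return { id: term.id, sounds_like: merged };
}

const cardigantime: Term = { id: "cardigantime", sounds_like: ["Cardigan Time"] };
const editedTerm = applyTermEdit(cardigantime, {
  id: "cardigantime",
  add_sounds_like: ["Cartesian Time"],
});
console.log(editedTerm.sounds_like); // [ 'Cardigan Time', 'Cartesian Time' ]
```

Returning a copy rather than mutating the input is a design choice of this sketch; it makes a repeated edit with the same variant a no-op that is easy to check.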
  ## Best Practices
 
  ### 1. Set Up Project Configurations
package/guide/routing.md CHANGED
@@ -66,6 +66,198 @@ routing:
  active: true
  ```
 
+ ## Classification Fields Reference
+
+ Classification determines HOW Protokoll matches transcripts to projects. Each field serves a specific purpose in the routing algorithm.
+
+ ### explicit_phrases (Trigger Phrases)
+
+ **Weight:** 90% (highest confidence)
+
+ **Purpose:** Phrases that definitively indicate this project
+
+ **When to use:** For unique project-specific phrases that rarely appear elsewhere
+
+ ```yaml
+ classification:
+   explicit_phrases:
+     - "quarterly planning"
+     - "Q1 planning meeting"
+     - "roadmap review session"
+ ```
+
+ **Example:** If a transcript contains "This is a Q1 planning meeting", it routes to the Quarterly Planning project with 90% confidence.
+
+ **Manage:**
+ ```bash
+ # CLI
+ protokoll project edit quarterly-planning \
+   --add-phrase "Q2 planning" \
+   --remove-phrase "old phrase"
+
+ # MCP
+ use_mcp_tool('protokoll_edit_project', {
+   id: 'quarterly-planning',
+   add_explicit_phrases: ['Q2 planning'],
+   remove_explicit_phrases: ['old phrase']
+ });
+ ```
+
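As a rough illustration of how a trigger phrase might be detected, the TypeScript sketch below checks a transcript against the `explicit_phrases` list from the YAML above. The function name `matchExplicitPhrases` is hypothetical, and the matching rule (case-insensitive substring search) is an assumption; the guide does not say how Protokoll actually compares phrases.

```typescript
// Assumption: a phrase "matches" when it occurs anywhere in the transcript,
// ignoring case. Protokoll's real matching logic may differ.
function matchExplicitPhrases(transcript: string, phrases: string[]): string[] {
  const haystack = transcript.toLowerCase();
  return phrases.filter((phrase) => haystack.indexOf(phrase.toLowerCase()) !== -1);
}

const triggerPhrases = [
  "quarterly planning",
  "Q1 planning meeting",
  "roadmap review session",
];

const phraseHits = matchExplicitPhrases("This is a Q1 planning meeting", triggerPhrases);
console.log(phraseHits); // [ 'Q1 planning meeting' ]
```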
+ ### topics
+
+ **Weight:** 30% (lower confidence)
+
+ **Purpose:** Theme keywords that suggest (but don't guarantee) this project
+
+ **When to use:** For broad topic categories that help classify when combined with other signals
+
+ ```yaml
+ classification:
+   topics:
+     - roadmap
+     - budget
+     - planning
+     - strategy
+ ```
+
+ **Example:** Transcript mentioning "roadmap" and "budget" gets 60% confidence (30% + 30%).
+
+ **Manage:**
+ ```bash
+ # CLI
+ protokoll project edit quarterly-planning \
+   --add-topic okrs \
+   --add-topic goals \
+   --remove-topic old-topic
+
+ # MCP
+ use_mcp_tool('protokoll_edit_project', {
+   id: 'quarterly-planning',
+   add_topics: ['okrs', 'goals'],
+   remove_topics: ['old-topic']
+ });
+ ```
+
+ ### associated_people
+
+ **Weight:** 60% (medium-high confidence)
+
+ **Purpose:** Person IDs that indicate this project when mentioned
+
+ **When to use:** When specific people are strongly tied to a project
+
+ ```yaml
+ classification:
+   associated_people:
+     - priya-sharma # Acme point of contact
+     - john-smith # Project lead
+ ```
+
+ **Example:** Transcript mentioning "Priya" routes to this project with 60% confidence.
+
+ **Important:** Only associate people who STRONGLY indicate the project. If someone appears in many projects, don't associate them.
+
+ **Manage:**
+ ```bash
+ # CLI
+ protokoll project edit client-alpha \
+   --add-person sarah-chen \
+   --remove-person old-contact
+
+ # MCP
+ use_mcp_tool('protokoll_edit_project', {
+   id: 'client-alpha',
+   add_associated_people: ['sarah-chen'],
+   remove_associated_people: ['old-contact']
+ });
+ ```
+
+ ### associated_companies
+
+ **Weight:** 60% (medium-high confidence)
+
+ **Purpose:** Company IDs that indicate this project when mentioned
+
+ **When to use:** For client projects or when company name definitively routes to a project
+
+ ```yaml
+ classification:
+   associated_companies:
+     - acme-corp
+     - beta-industries
+ ```
+
+ **Example:** Transcript mentioning "Acme Corp" routes to this project with 60% confidence.
+
+ **Manage:**
+ ```bash
+ # CLI
+ protokoll project edit client-work \
+   --add-company acme-corp \
+   --add-company beta-industries
+
+ # MCP
+ use_mcp_tool('protokoll_edit_project', {
+   id: 'client-work',
+   add_associated_companies: ['acme-corp', 'beta-industries']
+ });
+ ```
+
+ ### context_type
+
+ **Weight:** Modifier (affects routing decisions)
+
+ **Purpose:** Nature of content for this project
+
+ **Options:**
+ - `work` - Professional/business content
+ - `personal` - Personal notes and ideas
+ - `mixed` - Contains both work and personal
+
+ ```yaml
+ classification:
+   context_type: work
+ ```
+
+ **Manage:**
+ ```bash
+ # CLI
+ protokoll project edit my-project --context-type mixed
+
+ # MCP
+ use_mcp_tool('protokoll_edit_project', {
+   id: 'my-project',
+   contextType: 'mixed'
+ });
+ ```
+
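Taken together, the weights listed in this reference can be read as an additive score per project. The TypeScript sketch below works through that arithmetic under stated assumptions: the weights (90, 30, 60, 60) and the "30% + 30% = 60%" example come from the text above, while the signal type names, the cap at 100%, the tie-breaking rule, and the `confidence` and `pickProject` functions are hypothetical. `context_type` is left out because the guide describes it as a modifier without giving a number.

```typescript
// Signal type names are illustrative; weights are the percentages
// stated in the field reference above.
type SignalType =
  | "explicit_phrase"
  | "topic"
  | "associated_person"
  | "associated_company";

const WEIGHTS: Record<SignalType, number> = {
  explicit_phrase: 90,
  topic: 30,
  associated_person: 60,
  associated_company: 60,
};

interface Signal {
  type: SignalType;
  value: string;
}

// Assumption: weights add up, and the total is capped at 100%.
function confidence(signals: Signal[]): number {
  const total = signals.reduce((sum, s) => sum + WEIGHTS[s.type], 0);
  return Math.min(100, total);
}

// Picks the project id with the highest score; on a tie the first one wins
// (the tie rule is an assumption of this sketch).
function pickProject(candidates: Record<string, Signal[]>): string | undefined {
  let best: string | undefined;
  let bestScore = 0;
  for (const id of Object.keys(candidates)) {
    const score = confidence(candidates[id]);
    if (score > bestScore) {
      best = id;
      bestScore = score;
    }
  }
  return best;
}

const topicSignals: Signal[] = [
  { type: "topic", value: "roadmap" },
  { type: "topic", value: "budget" },
];
console.log(confidence(topicSignals)); // 60

console.log(
  pickProject({
    "client-alpha": topicSignals,
    "quarterly-planning": [{ type: "explicit_phrase", value: "Q1 planning meeting" }],
  })
); // quarterly-planning
```

Using whole percentages instead of fractions such as 0.3 keeps the sums exact and avoids floating-point rounding in comparisons.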
+ ## Viewing Classification
+
+ Use `project show` to see all classification fields:
+
+ ```bash
+ protokoll project show quarterly-planning
+ ```
+
+ Output includes:
+ - Context Type
+ - Trigger Phrases (explicit_phrases)
+ - Topics
+ - Associated People (if any)
+ - Associated Companies (if any)
+ - Relationships (if configured)
+
+ ### Auto Tags
+
+ The `auto_tags` field defines tags that are automatically added to transcripts routed to this project. These tags are combined with the tags extracted from classification signals, and the combined list is deduplicated before it is written to the transcript metadata.
+
+ ```yaml
+ routing:
+   auto_tags:
+     - work
+     - internal
+ ```
+
  ### Another Example
 
  ```yaml
@@ -173,6 +365,29 @@ Each signal contributes to a confidence score:
 
  The project with highest confidence wins.
 
+ ## Tags
+
+ Tags are automatically generated from classification signals and added to transcript metadata:
+
+ - **Extracted from signals**: Each classification signal's value (except `context_type`) becomes a tag
+ - **Project auto_tags**: Tags defined in `routing.auto_tags` are added
+ - **Automatic deduplication**: If the same tag appears from multiple sources (e.g., both as an `explicit_phrase` and as an `associated_project`), it appears only once in the final transcript
+
+ Example: If "xenocline" is detected as both an explicit phrase and a project name, the transcript metadata will show:
+
+ ```yaml
+ tags:
+   - xenocline
+ ```
+
+ Not:
+
+ ```yaml
+ tags:
+   - xenocline
+   - xenocline
+ ```
+
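The three rules in this section (signal values become tags, `context_type` is skipped, `auto_tags` are merged in, duplicates are dropped) can be sketched as one small TypeScript function. The function name `buildTags`, the `ClassificationSignal` shape, and the choice to list signal-derived tags before `auto_tags` are assumptions for illustration; only the rules themselves come from the text above.

```typescript
// Hypothetical signal shape; the type strings mirror the names used in the text.
interface ClassificationSignal {
  type: string;
  value: string;
}

function buildTags(signals: ClassificationSignal[], autoTags: string[]): string[] {
  // Rule 1: every signal value becomes a tag, except context_type signals.
  const fromSignals = signals
    .filter((signal) => signal.type !== "context_type")
    .map((signal) => signal.value);
  // Rules 2 and 3: append the project's auto_tags, then drop duplicates.
  // Set keeps the first occurrence, so signal-derived tags come first.
  return Array.from(new Set(fromSignals.concat(autoTags)));
}

const transcriptTags = buildTags(
  [
    { type: "explicit_phrase", value: "xenocline" },
    { type: "associated_project", value: "xenocline" },
    { type: "context_type", value: "work" },
  ],
  ["internal"]
);
console.log(transcriptTags); // [ 'xenocline', 'internal' ]
```

In this sketch the `context_type` value "work" is dropped as a signal, but the same string would still appear if the project listed it under `routing.auto_tags`.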
  ## API
 
  ### RoutingInstance
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@redaksjon/protokoll",
-   "version": "0.1.0",
+   "version": "0.3.0",
    "description": "Focused audio transcription with intelligent context integration",
    "main": "dist/main.js",
    "type": "module",