@redaksjon/protokoll 1.0.2 → 1.0.8

@@ -0,0 +1,301 @@
1
+ # Protokoll MCP Resources
2
+
3
+ Protokoll exposes several types of resources through its MCP server, providing AI assistants with read-only access to transcripts, audio files, context entities, and configuration.
4
+
5
+ ## Resource Types
6
+
7
+ ### 1. Audio Resources
8
+
9
+ #### Inbound Audio Files
10
+ **URI Template**: `protokoll://audio/inbound?directory={directory}`
11
+
12
+ Lists audio files waiting to be processed in the input directory.
13
+
14
+ **Response Format**:
15
+ ```json
16
+ {
17
+ "directory": "/absolute/path/to/recordings",
18
+ "count": 5,
19
+ "totalSize": 52428800,
20
+ "files": [
21
+ {
22
+ "filename": "recording-2026-01-29.m4a",
23
+ "path": "/absolute/path/to/recordings/recording-2026-01-29.m4a",
24
+ "size": 10485760,
25
+ "sizeHuman": "10.00 MB",
26
+ "modified": "2026-01-29T20:15:30.000Z",
27
+ "extension": "m4a"
28
+ }
29
+ ],
30
+ "supportedExtensions": ["mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm", "qta"]
31
+ }
32
+ ```
33
+
34
+ **Use Cases**:
35
+ - Check if there are audio files ready to process
36
+ - Determine which files to transcribe next
37
+ - Monitor the inbound queue
38
+
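A minimal sketch of how a client might act on this listing. The interfaces mirror the response format above; `pickNextFile` and its oldest-first policy are hypothetical and not part of Protokoll.

```typescript
// Shape of the listing, copied from the response format above.
interface AudioFileEntry {
  filename: string;
  path: string;
  size: number;
  sizeHuman: string;
  modified: string; // ISO 8601 timestamp
  extension: string;
}

interface AudioListing {
  directory: string;
  count: number;
  totalSize: number;
  files: AudioFileEntry[];
  supportedExtensions: string[];
}

// Hypothetical policy: transcribe the oldest recording first.
function pickNextFile(listing: AudioListing): AudioFileEntry | undefined {
  return [...listing.files].sort(
    (a, b) => Date.parse(a.modified) - Date.parse(b.modified)
  )[0];
}

// Made-up sample data in the documented shape.
const sampleListing: AudioListing = {
  directory: "/absolute/path/to/recordings",
  count: 2,
  totalSize: 20971520,
  files: [
    {
      filename: "recording-2026-01-29.m4a",
      path: "/absolute/path/to/recordings/recording-2026-01-29.m4a",
      size: 10485760,
      sizeHuman: "10.00 MB",
      modified: "2026-01-29T20:15:30.000Z",
      extension: "m4a",
    },
    {
      filename: "recording-2026-01-28.m4a",
      path: "/absolute/path/to/recordings/recording-2026-01-28.m4a",
      size: 10485760,
      sizeHuman: "10.00 MB",
      modified: "2026-01-28T09:00:00.000Z",
      extension: "m4a",
    },
  ],
  supportedExtensions: ["m4a"],
};

const nextFile = pickNextFile(sampleListing);
```

Copying the array before sorting keeps the original listing order intact for any later display.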
39
+ #### Processed Audio Files
40
+ **URI Template**: `protokoll://audio/processed?directory={directory}`
41
+
42
+ Lists audio files that have been processed and moved to the processed directory.
43
+
44
+ **Response Format**: Same as inbound audio files
45
+
46
+ **Use Cases**:
47
+ - Track processing history
48
+ - Verify files were successfully processed
49
+ - Clean up old processed files
50
+
51
+ ### 2. Transcript Resources
52
+
53
+ #### Individual Transcript
54
+ **URI Template**: `protokoll://transcript/{path}`
55
+
56
+ Reads a specific transcript file.
57
+
58
+ **Response Format**: Raw markdown content of the transcript
59
+
60
+ **Use Cases**:
61
+ - Review transcript content
62
+ - Extract information from transcripts
63
+ - Analyze transcript metadata
64
+
65
+ #### Transcripts List
66
+ **URI Template**: `protokoll://transcripts?directory={directory}&startDate={date}&endDate={date}&limit={n}&offset={n}`
67
+
68
+ Lists transcripts in a directory with filtering and pagination.
69
+
70
+ **Query Parameters**:
71
+ - `directory` (required): Directory to search for transcripts
72
+ - `startDate` (optional): Filter transcripts from this date onwards (YYYY-MM-DD)
73
+ - `endDate` (optional): Filter transcripts up to this date (YYYY-MM-DD)
74
+ - `limit` (optional): Maximum number of results (default: 50)
75
+ - `offset` (optional): Number of results to skip (default: 0)
76
+
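The parameters above could be assembled into a URI as sketched below. `sketchTranscriptsListUri` and `encodeValue` are hypothetical helpers, not functions exported by Protokoll, and the choice to percent-encode values while leaving `/` readable is an assumption about what the server accepts.

```typescript
interface TranscriptsQuery {
  directory: string;   // required
  startDate?: string;  // YYYY-MM-DD
  endDate?: string;    // YYYY-MM-DD
  limit?: number;
  offset?: number;
}

// Assumption: values are percent-encoded, but slashes stay readable.
function encodeValue(value: string): string {
  return encodeURIComponent(value).replace(/%2F/g, "/");
}

// Hypothetical helper: omits any optional parameter that was not supplied.
function sketchTranscriptsListUri(query: TranscriptsQuery): string {
  const pairs: Array<[string, string | number | undefined]> = [
    ["directory", query.directory],
    ["startDate", query.startDate],
    ["endDate", query.endDate],
    ["limit", query.limit],
    ["offset", query.offset],
  ];
  const search = pairs
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => `${key}=${encodeValue(String(value))}`)
    .join("&");
  return `protokoll://transcripts?${search}`;
}

const januaryUri = sketchTranscriptsListUri({
  directory: "/path/to/transcripts",
  startDate: "2026-01-01",
  endDate: "2026-01-31",
  limit: 10,
});
```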
77
+ **Response Format**:
78
+ ```json
79
+ {
80
+ "directory": "/path/to/transcripts",
81
+ "transcripts": [
82
+ {
83
+ "uri": "protokoll://transcript/path/to/file.md",
84
+ "path": "/absolute/path/to/file.md",
85
+ "filename": "2026-01-29-1015-meeting-notes.md",
86
+ "date": "2026-01-29",
87
+ "time": "10:15",
88
+ "title": "Meeting Notes"
89
+ }
90
+ ],
91
+ "pagination": {
92
+ "total": 100,
93
+ "limit": 50,
94
+ "offset": 0,
95
+ "hasMore": true
96
+ },
97
+ "filters": {
98
+ "startDate": "2026-01-01",
99
+ "endDate": "2026-01-31"
100
+ }
101
+ }
102
+ ```
103
+
104
+ **Use Cases**:
105
+ - Browse recent transcripts
106
+ - Search for transcripts by date range
107
+ - Paginate through large transcript collections
108
+
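To walk a large collection, a client can derive the next `offset` from the `pagination` object. The sketch below assumes `hasMore` is true exactly when more results exist past `offset + limit`; `nextOffset` and `fakePage` are illustrative names, and `fakePage` stands in for an actual resource read.

```typescript
interface Pagination {
  total: number;
  limit: number;
  offset: number;
  hasMore: boolean;
}

// Returns the offset of the next page, or null once the last page was read.
function nextOffset(pagination: Pagination): number | null {
  return pagination.hasMore ? pagination.offset + pagination.limit : null;
}

// Stand-in for reading the transcripts list resource with a given limit/offset.
function fakePage(total: number, limit: number, offset: number): Pagination {
  return { total, limit, offset, hasMore: offset + limit < total };
}

// Visit every page of a 100-item collection, 50 items at a time.
const visitedOffsets: number[] = [];
let cursor: number | null = 0;
while (cursor !== null) {
  visitedOffsets.push(cursor);
  cursor = nextOffset(fakePage(100, 50, cursor));
}
```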
109
+ ### 3. Context Entity Resources
110
+
111
+ #### Individual Entity
112
+ **URI Template**: `protokoll://entity/{type}/{id}`
113
+
114
+ Reads a specific context entity (person, project, term, company, or ignored term).
115
+
116
+ **Entity Types**:
117
+ - `person`: People mentioned in transcripts
118
+ - `project`: Projects that affect routing and classification
119
+ - `term`: Domain-specific terminology and acronyms
120
+ - `company`: Organizations referenced in notes
121
+ - `ignored`: Terms that are explicitly ignored
122
+
123
+ **Response Format**: YAML representation of the entity
124
+
125
+ **Example** (person):
126
+ ```yaml
127
+ id: john-smith
128
+ name: John Smith
129
+ type: person
130
+ firstName: John
131
+ lastName: Smith
132
+ company: acme-corp
133
+ role: Engineering Manager
134
+ sounds_like:
135
+ - jon smith
136
+ - john smyth
137
+ context: Works on the infrastructure team
138
+ ```
139
+
140
+ **Use Cases**:
141
+ - Look up entity details
142
+ - Check phonetic variants for name correction
143
+ - Review entity relationships
144
+
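As an illustration of the name-correction use case, the sketch below matches a transcribed name against `name` and `sounds_like`. It assumes the YAML has already been parsed into an object (no YAML library is used here), and `resolveHeardName` is a hypothetical helper using exact, case-insensitive matching rather than whatever matching Protokoll performs internally.

```typescript
// Subset of the entity fields shown in the YAML example above.
interface PersonEntity {
  id: string;
  name: string;
  sounds_like?: string[];
}

// Hypothetical lookup: exact, case-insensitive match on name or a phonetic variant.
function resolveHeardName(
  heard: string,
  people: PersonEntity[]
): PersonEntity | undefined {
  const needle = heard.trim().toLowerCase();
  return people.find(
    (person) =>
      person.name.toLowerCase() === needle ||
      (person.sounds_like ?? []).some((variant) => variant.toLowerCase() === needle)
  );
}

const knownPeople: PersonEntity[] = [
  { id: "john-smith", name: "John Smith", sounds_like: ["jon smith", "john smyth"] },
];

const corrected = resolveHeardName("Jon Smith", knownPeople);
```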
145
+ #### Entities List
146
+ **URI Template**: `protokoll://entities/{type}`
147
+
148
+ Lists all entities of a given type.
149
+
150
+ **Response Format**:
151
+ ```json
152
+ {
153
+ "entityType": "person",
154
+ "count": 25,
155
+ "entities": [
156
+ {
157
+ "uri": "protokoll://entity/person/john-smith",
158
+ "id": "john-smith",
159
+ "name": "John Smith",
160
+ "company": "acme-corp",
161
+ "role": "Engineering Manager"
162
+ }
163
+ ]
164
+ }
165
+ ```
166
+
167
+ **Use Cases**:
168
+ - Browse all people, projects, terms, or companies
169
+ - Build entity indexes
170
+ - Discover available context
171
+
172
+ ### 4. Configuration Resource
173
+
174
+ **URI Template**: `protokoll://config` or `protokoll://config/{path}`
175
+
176
+ Provides information about the Protokoll configuration.
177
+
178
+ **Response Format**:
179
+ ```json
180
+ {
181
+ "hasContext": true,
182
+ "discoveredDirectories": [
183
+ {
184
+ "path": "/home/user/project/.protokoll",
185
+ "level": 0,
186
+ "isPrimary": true
187
+ },
188
+ {
189
+ "path": "/home/user/.protokoll",
190
+ "level": 1,
191
+ "isPrimary": false
192
+ }
193
+ ],
194
+ "entityCounts": {
195
+ "projects": 5,
196
+ "people": 12,
197
+ "terms": 8,
198
+ "companies": 3,
199
+ "ignored": 2
200
+ },
201
+ "config": {
202
+ "outputDirectory": "~/notes",
203
+ "outputStructure": "month",
204
+ "model": "gpt-5.2",
205
+ "smartAssistance": {
206
+ "enabled": true,
207
+ "phoneticModel": "gpt-5-nano",
208
+ "analysisModel": "gpt-5-mini"
209
+ }
210
+ },
211
+ "resourceUris": {
212
+ "projects": "protokoll://entities/project",
213
+ "people": "protokoll://entities/person",
214
+ "terms": "protokoll://entities/term",
215
+ "companies": "protokoll://entities/company"
216
+ }
217
+ }
218
+ ```
219
+
220
+ **Use Cases**:
221
+ - Understand the current configuration
222
+ - Discover available context directories
223
+ - Check entity counts before querying
224
+
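The "check entity counts before querying" use case might look like the following sketch. It relies only on the `entityCounts` and `resourceUris` fields shown above; `nonEmptyEntityListUris` is a hypothetical helper, and it assumes both objects share the same keys.

```typescript
// Only the two fields of the config response that this sketch needs.
interface ConfigSummary {
  entityCounts: Record<string, number>;
  resourceUris: Record<string, string>;
}

// Returns the entity list URIs whose entity count is greater than zero.
function nonEmptyEntityListUris(config: ConfigSummary): string[] {
  return Object.entries(config.resourceUris)
    .filter(([key]) => (config.entityCounts[key] ?? 0) > 0)
    .map(([, uri]) => uri);
}

// Made-up counts; the URIs are the ones from the example response.
const sampleConfig: ConfigSummary = {
  entityCounts: { projects: 5, people: 12, terms: 0, companies: 3 },
  resourceUris: {
    projects: "protokoll://entities/project",
    people: "protokoll://entities/person",
    terms: "protokoll://entities/term",
    companies: "protokoll://entities/company",
  },
};

const worthReading = nonEmptyEntityListUris(sampleConfig);
```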
225
+ ## Dynamic Resources
226
+
227
+ When you call `resources/list`, Protokoll returns a list of dynamic resources based on the current context:
228
+
229
+ 1. **Current Configuration**: Link to the active configuration
230
+ 2. **Inbound Audio Files**: Link to audio files waiting to be processed
231
+ 3. **Processed Audio Files**: Link to processed audio files (if configured)
232
+ 4. **Entity Lists**: Links to all entity types with counts
233
+ 5. **Recent Transcripts**: Link to the 10 most recent transcripts
234
+
235
+ ## Usage Examples
236
+
237
+ ### Check for Audio Files to Process
238
+
239
+ ```typescript
240
+ // List resources to find inbound audio
241
+ const resources = await client.listResources();
242
+ const inboundAudio = resources.resources.find(r =>
243
+ r.name === 'Inbound Audio Files'
244
+ );
245
+
246
+ // Read the inbound audio resource (assumes an MCP SDK client: params object in, contents[] out)
247
+ const audioList = await client.readResource({ uri: inboundAudio.uri });
248
+ const data = JSON.parse(audioList.contents[0].text);
249
+
250
+ console.log(`Found ${data.count} audio files to process`);
251
+ data.files.forEach(file => {
252
+ console.log(`- ${file.filename} (${file.sizeHuman})`);
253
+ });
254
+ ```
255
+
256
+ ### Browse Recent Transcripts
257
+
258
+ ```typescript
259
+ // Get recent transcripts
260
+ const resources = await client.listResources();
261
+ const recentTranscripts = resources.resources.find(r =>
262
+ r.name === 'Recent Transcripts'
263
+ );
264
+
265
+ const transcriptList = await client.readResource({ uri: recentTranscripts.uri });
266
+ const data = JSON.parse(transcriptList.contents[0].text);
267
+
268
+ // Read the most recent transcript
269
+ const latest = data.transcripts[0];
270
+ const transcript = await client.readResource({ uri: latest.uri });
271
+ console.log(transcript.contents[0].text);
272
+ ```
273
+
274
+ ### Explore Context Entities
275
+
276
+ ```typescript
277
+ // List all projects
278
+ const projectsUri = 'protokoll://entities/project';
279
+ const projectsList = await client.readResource({ uri: projectsUri });
280
+ const projects = JSON.parse(projectsList.contents[0].text);
281
+
282
+ // Read details for a specific project
283
+ const project = projects.entities[0];
284
+ const projectDetails = await client.readResource({ uri: project.uri });
285
+ console.log(projectDetails.contents[0].text); // YAML format
286
+ ```
287
+
288
+ ## Resource Discovery Workflow
289
+
290
+ 1. **Start with `resources/list`**: Get dynamic resources for the current context
291
+ 2. **Check configuration**: Read the config resource to understand the setup
292
+ 3. **Explore entities**: Use entity list resources to discover available context
293
+ 4. **Access specific data**: Use individual resource URIs to read specific items
294
+
295
+ ## Notes
296
+
297
+ - All audio and transcript `path` values are absolute
298
+ - File sizes are provided in both bytes and human-readable format
299
+ - Transcripts are sorted by date (newest first)
300
+ - Entity lists include URIs for easy navigation
301
+ - Resources are read-only; use tools for modifications
@@ -0,0 +1,278 @@
1
+ # MCP Resources Implementation Summary
2
+
3
+ ## Overview
4
+
5
+ This document describes the implementation of MCP resources for Protokoll, which give AI assistants discoverable, queryable access to audio files, transcripts, and context entities.
6
+
7
+ ## What Was Implemented
8
+
9
+ ### 1. Audio Resources
10
+
11
+ #### Inbound Audio Resource
12
+ - **URI**: `protokoll://audio/inbound?directory={directory}`
13
+ - **Purpose**: Lists audio files waiting to be processed
14
+ - **Features**:
15
+ - Automatic directory discovery from config
16
+ - File metadata (size, modified date, extension)
17
+ - Human-readable file sizes
18
+ - Sorted by modification time (newest first)
19
+ - Supports all configured audio extensions
20
+
21
+ #### Processed Audio Resource
22
+ - **URI**: `protokoll://audio/processed?directory={directory}`
23
+ - **Purpose**: Lists audio files that have been processed
24
+ - **Features**: Same as inbound audio resource
25
+
26
+ ### 2. Enhanced Dynamic Resources
27
+
28
+ The `resources/list` endpoint now returns dynamic resources based on the current context:
29
+
30
+ 1. **Current Configuration**: Active Protokoll configuration
31
+ 2. **Inbound Audio Files**: Audio files ready to process
32
+ 3. **Processed Audio Files**: Previously processed audio files
33
+ 4. **Entity Lists**: All entity types (projects, people, terms, companies) with counts
34
+ 5. **Recent Transcripts**: 10 most recent transcripts in the output directory
35
+
36
+ This makes resources discoverable without requiring the client to know specific URIs.
37
+
38
+ ### 3. Type System Updates
39
+
40
+ #### New Resource Types
41
+ - `audio-inbound`: Inbound audio files
42
+ - `audio-processed`: Processed audio files
43
+
44
+ #### New URI Types
45
+ - `AudioInboundUri`: Parsed inbound audio URI
46
+ - `AudioProcessedUri`: Parsed processed audio URI
47
+
48
+ ### 4. URI Parser Enhancements
49
+
50
+ Added support for parsing audio URIs:
51
+ - `protokoll://audio/inbound`
52
+ - `protokoll://audio/inbound?directory=/path`
53
+ - `protokoll://audio/processed`
54
+ - `protokoll://audio/processed?directory=/path`
55
+
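The four URI shapes listed above could be parsed along these lines. This is a sketch, not the package's `parseAudioUri()`: the `ParsedAudioUri` shape and the name `parseAudioUriSketch` are assumptions, since the fields of `AudioInboundUri` and `AudioProcessedUri` are not shown in this document.

```typescript
// Hypothetical parsed shape; the real interfaces may carry different fields.
interface ParsedAudioUri {
  resourceType: "audio-inbound" | "audio-processed";
  directory?: string;
}

function parseAudioUriSketch(uri: string): ParsedAudioUri {
  const match = /^protokoll:\/\/audio\/([^?]+)(?:\?(.*))?$/.exec(uri);
  if (!match) {
    throw new Error(`Not an audio URI: ${uri}`);
  }
  const [, kind, query = ""] = match;
  if (kind !== "inbound" && kind !== "processed") {
    throw new Error(`Invalid audio type: ${kind}`);
  }
  const resourceType = kind === "inbound" ? "audio-inbound" : "audio-processed";

  // Pick the optional directory parameter out of the query string.
  const prefix = "directory=";
  const pair = query.split("&").find((part) => part.startsWith(prefix));
  if (pair === undefined) {
    return { resourceType };
  }
  return { resourceType, directory: decodeURIComponent(pair.slice(prefix.length)) };
}

const withoutDirectory = parseAudioUriSketch("protokoll://audio/inbound");
const withDirectory = parseAudioUriSketch("protokoll://audio/processed?directory=/path");
```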
56
+ ### 5. URI Builder Functions
57
+
58
+ New builder functions:
59
+ - `buildAudioInboundUri(directory?: string)`
60
+ - `buildAudioProcessedUri(directory?: string)`
61
+
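One plausible way to implement builders with these signatures is sketched here. The function name `buildAudioUriSketch` and the encoding rule (percent-encode the value but keep `/` readable, matching the `?directory=/path` examples) are assumptions; the package's own builders may encode differently.

```typescript
// Assumption: percent-encode the directory, but leave slashes as-is.
function encodeDirectory(directory: string): string {
  return encodeURIComponent(directory).replace(/%2F/g, "/");
}

// Builds protokoll://audio/{kind}, appending ?directory=... only when given.
function buildAudioUriSketch(
  kind: "inbound" | "processed",
  directory?: string
): string {
  const base = `protokoll://audio/${kind}`;
  return directory ? `${base}?directory=${encodeDirectory(directory)}` : base;
}

const inboundDefault = buildAudioUriSketch("inbound");
const processedOverride = buildAudioUriSketch("processed", "/path");
const inboundWithSpace = buildAudioUriSketch("inbound", "/my recordings");
```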
62
+ ### 6. Resource Reader Functions
63
+
64
+ New reader functions:
65
+ - `readAudioInboundResource(directory?: string)`
66
+ - `readAudioProcessedResource(directory?: string)`
67
+ - `listAudioFiles(directory: string)` (helper)
68
+ - `formatBytes(bytes: number)` (utility)
69
+
70
+ ## File Changes
71
+
72
+ ### Modified Files
73
+
74
+ 1. **src/mcp/resources.ts**
75
+ - Added audio resource templates
76
+ - Enhanced `getDynamicResources()` to include audio and entity list resources
77
+ - Added `listAudioFiles()` helper function
78
+ - Added `readAudioInboundResource()` and `readAudioProcessedResource()`
79
+ - Added `formatBytes()` utility function
80
+ - Updated `handleReadResource()` to handle audio resources
81
+
82
+ 2. **src/mcp/types.ts**
83
+ - Added `audio-inbound` and `audio-processed` to `ResourceType`
84
+ - Added `AudioInboundUri` interface
85
+ - Added `AudioProcessedUri` interface
86
+
87
+ 3. **src/mcp/uri.ts**
88
+ - Added imports for new URI types
89
+ - Added `parseAudioUri()` function
90
+ - Updated `parseUri()` to handle audio URIs
91
+ - Added `buildAudioInboundUri()` function
92
+ - Added `buildAudioProcessedUri()` function
93
+ - Updated `getResourceType()` to recognize audio URIs
94
+
95
+ 4. **tests/mcp/uri.test.ts**
96
+ - Added tests for parsing audio URIs (5 tests)
97
+ - Added tests for building audio URIs (4 tests)
98
+ - Added tests for `getResourceType()` with audio URIs (2 tests)
99
+ - Total: 39 tests (all passing)
100
+
101
+ ### New Files
102
+
103
+ 1. **docs/MCP_RESOURCES.md**
104
+ - Comprehensive documentation of all resource types
105
+ - Usage examples
106
+ - Response format specifications
107
+ - Resource discovery workflow
108
+
109
+ 2. **docs/MCP_RESOURCES_IMPLEMENTATION.md** (this file)
110
+ - Implementation summary
111
+ - Technical details
112
+ - Design decisions
113
+
114
+ ## Design Decisions
115
+
116
+ ### 1. Audio File Discovery
117
+
118
+ Audio files are discovered by:
119
+ 1. Reading the directory specified in the URI parameter
120
+ 2. Falling back to the config's `inputDirectory` or `processedDirectory`
121
+ 3. Filtering files by extension using `DEFAULT_AUDIO_EXTENSIONS`
122
+ 4. Sorting by modification time (newest first)
123
+
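The discovery steps above can be sketched as pure functions over already-read directory entries. Everything here is illustrative: `FileInfo`, `resolveDirectory`, and `selectAudioFiles` are hypothetical names, the extension list is copied from the example response rather than from `DEFAULT_AUDIO_EXTENSIONS`, and case-insensitive extension matching is an assumption.

```typescript
interface FileInfo {
  filename: string;
  modifiedMs: number; // modification time in milliseconds since the epoch
}

// Extensions as shown in the example response; the real constant may differ.
const SKETCH_AUDIO_EXTENSIONS = ["mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm", "qta"];

// Steps 1-2: prefer the URI parameter, otherwise fall back to the configured directory.
function resolveDirectory(fromUri: string | undefined, fromConfig: string): string {
  return fromUri ?? fromConfig;
}

// Steps 3-4: keep audio files only, newest first. The input array is not mutated.
function selectAudioFiles(
  files: FileInfo[],
  extensions: string[] = SKETCH_AUDIO_EXTENSIONS
): FileInfo[] {
  return files
    .filter((file) => {
      const dot = file.filename.lastIndexOf(".");
      const ext = dot >= 0 ? file.filename.slice(dot + 1).toLowerCase() : "";
      return extensions.includes(ext);
    })
    .sort((a, b) => b.modifiedMs - a.modifiedMs);
}

const selected = selectAudioFiles([
  { filename: "a.m4a", modifiedMs: 100 },
  { filename: "notes.txt", modifiedMs: 300 },
  { filename: "b.WAV", modifiedMs: 200 },
]);
```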
124
+ ### 2. Dynamic Resource Generation
125
+
126
+ Dynamic resources are generated when `resources/list` is called:
127
+ - Provides immediate visibility into available data
128
+ - Includes entity counts to help clients decide whether to query
129
+ - Uses the current context to determine what's available
130
+ - Returns empty list if no context is found (graceful degradation)
131
+
132
+ ### 3. Error Handling
133
+
134
+ - Returns empty array if directory doesn't exist (ENOENT)
135
+ - Throws error if context is not available
136
+ - Validates URI format and throws descriptive errors
137
+ - Handles missing optional parameters gracefully
138
+
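The ENOENT rule could be implemented roughly as follows. The directory reader is injected so the sketch runs without touching the file system; `listOrEmpty` and `fakeReader` are hypothetical, and rethrowing every non-ENOENT error is an assumption about the intended behavior.

```typescript
type DirReader = (directory: string) => string[];

// Treats a missing directory as "no files"; any other failure propagates.
function listOrEmpty(readDir: DirReader, directory: string): string[] {
  try {
    return readDir(directory);
  } catch (error) {
    const code = (error as { code?: string } | null)?.code;
    if (code === "ENOENT") {
      return [];
    }
    throw error;
  }
}

// Fake reader that mimics Node-style errors carrying a `code` property.
const fakeReader: DirReader = (directory) => {
  if (directory === "/missing") {
    throw Object.assign(new Error("no such directory"), { code: "ENOENT" });
  }
  if (directory === "/forbidden") {
    throw Object.assign(new Error("permission denied"), { code: "EACCES" });
  }
  return ["recording.m4a"];
};

const missingResult = listOrEmpty(fakeReader, "/missing");
const presentResult = listOrEmpty(fakeReader, "/recordings");
```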
139
+ ### 4. File Size Formatting
140
+
141
+ Human-readable file sizes are provided alongside raw bytes:
142
+ - Makes it easier for AI assistants to communicate with users
143
+ - Uses 1024-based units (B, KB, MB, GB); for example, 10485760 bytes is reported as "10.00 MB"
144
+ - Rounds to 2 decimal places
145
+
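A formatter consistent with these rules and with the `"10.00 MB"` example for 10485760 bytes might look like this. It is a sketch named `formatBytesSketch`, not the package's `formatBytes()`; in particular, showing two decimals for plain byte counts is an assumption.

```typescript
// Divides by 1024 until the value fits the unit, capped at GB.
function formatBytesSketch(bytes: number): string {
  const units = ["B", "KB", "MB", "GB"];
  let value = bytes;
  let unitIndex = 0;
  while (value >= 1024 && unitIndex < units.length - 1) {
    value /= 1024;
    unitIndex += 1;
  }
  return `${value.toFixed(2)} ${units[unitIndex]}`;
}

const tenMegabytes = formatBytesSketch(10485760);
const oneAndAHalfKilobytes = formatBytesSketch(1536);
```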
146
+ ### 5. URI Structure
147
+
148
+ Audio URIs follow the pattern:
149
+ - `protokoll://audio/{type}?directory={path}`
150
+ - Type is either `inbound` or `processed`
151
+ - Directory parameter is optional (falls back to config)
152
+
153
+ This structure:
154
+ - Groups related resources under `/audio`
155
+ - Clearly distinguishes between inbound and processed
156
+ - Allows directory override without requiring config changes
157
+
158
+ ## Testing
159
+
160
+ ### Test Coverage
161
+
162
+ - 39 URI tests (all passing)
163
+ - 100% coverage of URI parsing and building functions
164
+ - Tests for:
165
+ - Parsing audio URIs with and without directory
166
+ - Building audio URIs with and without directory
167
+ - Resource type detection for audio URIs
168
+ - Invalid audio type handling
169
+
170
+ ### Manual Testing Checklist
171
+
172
+ - [ ] List resources returns audio resources when context is available
173
+ - [ ] Read inbound audio resource returns correct file list
174
+ - [ ] Read processed audio resource returns correct file list
175
+ - [ ] Audio files are sorted by modification time
176
+ - [ ] File sizes are formatted correctly
177
+ - [ ] Directory parameter override works
178
+ - [ ] Fallback to config directories works
179
+ - [ ] Empty directory returns empty file list
180
+ - [ ] Non-existent directory returns empty file list
181
+ - [ ] Invalid audio type throws error
182
+
183
+ ## Integration Points
184
+
185
+ ### 1. Context System
186
+
187
+ Resources integrate with the context system to:
188
+ - Discover input and output directories
189
+ - Load entity data for entity list resources
190
+ - Determine if context is available
191
+
192
+ ### 2. Configuration
193
+
194
+ Resources use configuration to:
195
+ - Get default input directory (`inputDirectory`)
196
+ - Get processed directory (`processedDirectory`)
197
+ - Get output directory for transcripts (`outputDirectory`)
198
+
199
+ ### 3. Audio Tools
200
+
201
+ Audio resources complement the audio tools:
202
+ - Tools process audio files
203
+ - Resources discover and list audio files
204
+ - Together they provide complete audio workflow visibility
205
+
206
+ ## Future Enhancements
207
+
208
+ ### Potential Additions
209
+
210
+ 1. **Audio File Metadata Resource**
211
+ - `protokoll://audio/file/{path}`
212
+ - Detailed metadata for a specific audio file
213
+ - Duration, format, bitrate, etc.
214
+
215
+ 2. **Transcript Search Resource**
216
+ - `protokoll://transcripts/search?query={text}`
217
+ - Full-text search across transcripts
218
+ - Returns matching transcripts with context
219
+
220
+ 3. **Entity Search Resource**
221
+ - `protokoll://entities/search?query={text}`
222
+ - Search across all entity types
223
+ - Returns matching entities with relevance scores
224
+
225
+ 4. **Recent Activity Resource**
226
+ - `protokoll://activity/recent?limit={n}`
227
+ - Recent transcriptions, entity changes, etc.
228
+ - Provides activity timeline
229
+
230
+ 5. **Statistics Resource**
231
+ - `protokoll://stats`
232
+ - Aggregate statistics (total transcripts, audio files, entities)
233
+ - Processing history and trends
234
+
235
+ ### Subscription Support
236
+
237
+ Currently, resources have `subscribe: false` in the server capabilities. Future versions could support:
238
+ - Notifications when new audio files appear
239
+ - Notifications when transcripts are created
240
+ - Notifications when entities are modified
241
+
242
+ This would enable real-time monitoring and reactive workflows.
243
+
244
+ ## Performance Considerations
245
+
246
+ ### Current Implementation
247
+
248
+ - File listing uses `readdir()` with `withFileTypes: true` for efficiency
249
+ - Stat calls are made only for matching audio files
250
+ - Sorting is done in-memory (acceptable for typical directory sizes)
251
+ - No caching (resources are read fresh each time)
252
+
253
+ ### Scalability
254
+
255
+ For large directories (1000+ files):
256
+ - Consider adding pagination to audio resources
257
+ - Consider adding file count limits
258
+ - Consider caching with TTL
259
+ - Consider background indexing for search
260
+
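As one illustration of the "caching with TTL" option, the sketch below caches a loaded value per key and reloads it after the TTL expires. `TtlCache` is hypothetical and not part of Protokoll; the clock is injectable so that expiry can be exercised without waiting.

```typescript
class TtlCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value while it is fresh; otherwise calls load() and stores the result.
  getOrLoad(key: string, load: () => T): T {
    const time = this.now();
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > time) {
      return hit.value;
    }
    const value = load();
    this.entries.set(key, { value, expiresAt: time + this.ttlMs });
    return value;
  }
}

// Demo with a fake clock: the listing is loaded once per TTL window.
let fakeTime = 0;
let loadCount = 0;
const listingCache = new TtlCache<string[]>(1000, () => fakeTime);
const loadListing = (): string[] => {
  loadCount += 1;
  return ["recording.m4a"];
};

listingCache.getOrLoad("/recordings", loadListing); // miss: loads
listingCache.getOrLoad("/recordings", loadListing); // hit: served from cache
fakeTime = 1500;
listingCache.getOrLoad("/recordings", loadListing); // expired: loads again
```

A short TTL trades freshness for fewer directory scans, so newly arrived files may not appear until the entry expires.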
261
+ ## Conclusion
262
+
263
+ This implementation provides comprehensive resource discovery for Protokoll's MCP server. AI assistants can now:
264
+
265
+ 1. **Discover available data** through dynamic resources
266
+ 2. **Query audio files** waiting to be processed or already processed
267
+ 3. **Browse transcripts** with filtering and pagination
268
+ 4. **Explore context entities** to understand available knowledge
269
+ 5. **Access configuration** to understand the current setup
270
+
271
+ The implementation follows MCP best practices:
272
+ - Resources are read-only (modifications use tools)
273
+ - URIs are structured and predictable
274
+ - Response formats are consistent and well-documented
275
+ - Error handling is graceful and informative
276
+ - Resource and URI types are declared explicitly in `src/mcp/types.ts`
277
+
278
+ This foundation lets AI assistants discover, query, and process audio files and transcripts without knowing specific URIs in advance.