@aborruso/ckan-mcp-server 0.4.6 → 0.4.8

package/PRD.md CHANGED
@@ -2,35 +2,39 @@
2
2
 
3
3
  ## CKAN MCP Server
4
4
 
5
- **Version**: 1.0.0
6
- **Last Updated**: 2026-01-08
5
+ **Version**: 0.4.7
6
+ **Last Updated**: 2026-01-10
7
7
  **Author**: onData
8
- **Status**: Implemented
8
+ **Status**: Production
9
9
 
10
10
  ---
11
11
 
12
12
  ## 1. Executive Summary
13
13
 
14
- CKAN MCP Server è un server Model Context Protocol (MCP) che permette a AI agent (come Claude Desktop) di interagire con oltre 500 portali di dati aperti basati su CKAN in tutto il mondo. Il server espone strumenti MCP per ricercare dataset, esplorare organizzazioni, interrogare dati tabulari e accedere a metadati completi.
14
+ CKAN MCP Server is a Model Context Protocol (MCP) server that enables AI agents (like Claude Desktop) to interact with over 500 CKAN-based open data portals worldwide. The server exposes MCP tools to search datasets, explore organizations, query tabular data, and access complete metadata.
15
15
 
16
16
  ### 1.1 Problem Statement
17
17
 
18
- Gli AI agent non hanno un modo nativo per:
19
- - Scoprire e cercare dataset negli open data portal
20
- - Interrogare metadati strutturati di dataset governativi
21
- - Eseguire query su dati tabulari pubblicati su portali CKAN
22
- - Esplorare organizzazioni pubbliche e la loro produzione di dati aperti
18
+ AI agents lack native capabilities to:
19
+ - Discover and search datasets in open data portals
20
+ - Query structured metadata from government datasets
21
+ - Execute queries on tabular data published on CKAN portals
22
+ - Explore public organizations and their open data production
23
23
 
24
24
  ### 1.2 Solution
25
25
 
26
- Un server MCP che espone tool per interagire con le API CKAN v3, permettendo agli AI agent di:
27
- - Cercare dataset con query Solr avanzate
28
- - Ottenere metadati completi di dataset e risorse
29
- - Esplorare organizzazioni e gruppi
30
- - Interrogare il DataStore con filtri e sorting
31
- - Analizzare statistiche tramite faceting
26
+ An MCP server that exposes tools to interact with CKAN API v3, enabling AI agents to:
27
+ - Search datasets with advanced Solr queries and relevance ranking
28
+ - Get complete metadata for datasets and resources
29
+ - Explore organizations, groups, and tags
30
+ - Query DataStore with filters, sorting, and SQL queries
31
+ - Analyze statistics through faceting
32
+ - Global deployment on Cloudflare Workers for worldwide edge access
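Reviewer note: the capabilities listed above all reduce to calls against CKAN Action API v3 endpoints of the form `{server_url}/api/3/action/{action}`. Here is a minimal TypeScript sketch of that URL construction. `buildActionUrl` is a hypothetical name used for illustration and is not taken from the package source.

```typescript
// Hypothetical helper (not from the package source): builds a CKAN Action
// API v3 URL following the pattern {server_url}/api/3/action/{action}.
function buildActionUrl(serverUrl: string, action: string): string {
  // Strip one or more trailing slashes so the joined path contains no "//".
  const base = serverUrl.replace(/\/+$/, "");
  return `${base}/api/3/action/${action}`;
}

// Example with the CKAN demo portal and the status_show action.
const statusUrl = buildActionUrl("https://demo.ckan.org/", "status_show");
```

Normalizing the base URL first means callers can pass a portal address with or without a trailing slash.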
32
33
 
33
- **Distribution Strategy**: Pubblicazione su npm registry per installazione semplice e universale (come PyPI per Python), permettendo a chiunque di installare il server con un singolo comando senza bisogno di clonare repository o compilare codice.
34
+ **Distribution Strategy**: Multi-platform deployment:
35
+ - **npm registry**: Global installation with `npm install -g @aborruso/ckan-mcp-server`
36
+ - **Cloudflare Workers**: Global edge deployment (https://ckan-mcp-server.andy-pr.workers.dev)
37
+ - **Self-hosted**: HTTP server mode for custom infrastructure
34
38
 
35
39
  ---
36
40
 
@@ -38,16 +42,16 @@ Un server MCP che espone tool per interagire con le API CKAN v3, permettendo agl
38
42
 
39
43
  ### 2.1 Primary Users
40
44
 
41
- - **Data Scientist & Analyst**: Ricerca e analisi di dataset pubblici
42
- - **Civic Hacker & Developer**: Sviluppo applicazioni su dati aperti
43
- - **Researcher & Journalist**: Investigazione e analisi dati governativi
44
- - **Public Administration**: Esplorazione cataloghi dati aperti
45
+ - **Data Scientist & Analyst**: Research and analysis of public datasets
46
+ - **Civic Hacker & Developer**: Application development on open data
47
+ - **Researcher & Journalist**: Investigation and analysis of government data
48
+ - **Public Administration**: Exploration of open data catalogs
45
49
 
46
50
  ### 2.2 AI Agent Use Cases
47
51
 
48
- - **Claude Desktop**: Integrazione nativa tramite configurazione MCP
49
- - **Altri client MCP**: Qualsiasi client compatibile con MCP protocol
50
- - **Automazioni**: Script e workflow che necessitano accesso a CKAN
52
+ - **Claude Desktop**: Native integration via MCP configuration
53
+ - **Other MCP clients**: Any MCP protocol-compatible client
54
+ - **Automation**: Scripts and workflows requiring CKAN access
51
55
 
52
56
  ---
53
57
 
@@ -57,101 +61,135 @@ Un server MCP che espone tool per interagire con le API CKAN v3, permettendo agl
57
61
 
58
62
  #### FR-1: Dataset Search
59
63
  - **Priority**: High
60
- - **Description**: Cercare dataset su qualsiasi server CKAN usando sintassi Solr
64
+ - **Description**: Search datasets on any CKAN server using Solr syntax
61
65
  - **Acceptance Criteria**:
62
- - Supporto query full-text (q parameter)
63
- - Filtri avanzati (fq parameter)
64
- - Faceting per statistiche (organization, tags, formats)
65
- - Paginazione (start/rows)
66
- - Ordinamento (sort parameter)
67
- - Output in formato Markdown o JSON
66
+ - Full-text query support (q parameter)
67
+ - Advanced filters (fq parameter)
68
+ - Faceting for statistics (organization, tags, formats)
69
+ - Pagination (start/rows)
70
+ - Sorting (sort parameter)
71
+ - Output in Markdown or JSON format
68
72
  - **Implementation Status**: ✅ Implemented (`ckan_package_search`)
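Reviewer note: the acceptance criteria above map onto CKAN `package_search` parameters (`q`, `fq`, `start`, `rows`, `sort`, `facet.field`). The sketch below shows one way to assemble them into a query string. The `PackageSearchInput` shape and the function name are assumptions for illustration, and the defaults (`*:*`, 0, 10) follow the tool specification in section 5.1 of the PRD.

```typescript
// Hypothetical input shape mirroring the parameters named in FR-1.
interface PackageSearchInput {
  q?: string;             // full-text Solr query
  fq?: string;            // filter query
  start?: number;         // pagination offset
  rows?: number;          // results per page
  sort?: string;          // e.g. "metadata_modified desc"
  facetFields?: string[]; // e.g. ["organization", "tags"]
}

function buildPackageSearchQuery(input: PackageSearchInput): string {
  const pairs: Array<[string, string]> = [
    ["q", input.q ?? "*:*"],
    ["start", String(input.start ?? 0)],
    ["rows", String(input.rows ?? 10)],
  ];
  if (input.fq) pairs.push(["fq", input.fq]);
  if (input.sort) pairs.push(["sort", input.sort]);
  if (input.facetFields && input.facetFields.length > 0) {
    // CKAN expects facet.field as a JSON-encoded list of field names.
    pairs.push(["facet.field", JSON.stringify(input.facetFields)]);
  }
  return pairs
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
}

// Example: CSV datasets matching "popolazione", faceted by organization.
const searchQuery = buildPackageSearchQuery({
  q: "popolazione",
  fq: "res_format:CSV",
  rows: 5,
  facetFields: ["organization"],
});
```

Setting `rows: 0` together with `facetFields` yields facet counts only, which is the pattern the PRD uses for statistics.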
69
73
 
70
74
  #### FR-2: Dataset Details
71
75
  - **Priority**: High
72
- - **Description**: Ottenere metadati completi di un dataset specifico
76
+ - **Description**: Get complete metadata for a specific dataset
73
77
  - **Acceptance Criteria**:
74
- - Ricerca per ID o name
75
- - Metadati base (title, description, author, license)
76
- - Lista risorse con dettagli (format, size, URL, DataStore status)
77
- - Organizzazione e tag
78
- - Campi extra custom
79
- - Tracking statistics opzionale
78
+ - Search by ID or name
79
+ - Basic metadata (title, description, author, license)
80
+ - Resource list with details (format, size, URL, DataStore status)
81
+ - Organization and tags
82
+ - Custom extra fields
83
+ - Optional tracking statistics
80
84
  - **Implementation Status**: ✅ Implemented (`ckan_package_show`)
81
85
 
82
86
  #### FR-3: Organization Discovery
83
87
  - **Priority**: Medium
84
- - **Description**: Esplorare organizzazioni che pubblicano dataset
88
+ - **Description**: Explore organizations publishing datasets
85
89
  - **Acceptance Criteria**:
86
- - Lista tutte le organizzazioni (con/senza dettagli completi)
87
- - Ricerca per pattern nel nome
88
- - Ordinamento e paginazione
89
- - Conteggio dataset per organizzazione
90
- - Dettagli completi organizzazione con lista dataset
90
+ - List all organizations (with/without full details)
91
+ - Search by name pattern
92
+ - Sorting and pagination
93
+ - Dataset count per organization
94
+ - Complete organization details with dataset list
91
95
  - **Implementation Status**: ✅ Implemented (`ckan_organization_list`, `ckan_organization_show`, `ckan_organization_search`)
92
96
 
93
97
  #### FR-4: DataStore Query
94
- - **Priority**: Medium
95
- - **Description**: Interrogare dati tabulari nel CKAN DataStore
98
+ - **Priority**: High
99
+ - **Description**: Query tabular data in CKAN DataStore with standard queries and SQL
96
100
  - **Acceptance Criteria**:
97
- - Query per resource_id
98
- - Filtri chiave-valore
101
+ - Query by resource_id
102
+ - Key-value filters
99
103
  - Full-text search (q parameter)
100
- - Ordinamento
101
- - Selezione campi specifici
102
- - Paginazione (limit/offset)
103
- - Valori distinct
104
- - **Implementation Status**: ✅ Implemented (`ckan_datastore_search`)
104
+ - Sorting and field selection
105
+ - Pagination (limit/offset)
106
+ - Distinct values
107
+ - SQL queries with SELECT, WHERE, JOIN, GROUP BY
108
+ - **Implementation Status**: ✅ Implemented (`ckan_datastore_search`, `ckan_datastore_search_sql`)
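Reviewer note: for the SQL criterion above, the DataStore addresses each resource as a PostgreSQL table named after its resource ID, and the statement is passed in the `sql` parameter of `datastore_search_sql`. The sketch below builds a simple GROUP BY query. All helper names and the placeholder resource ID are hypothetical, and this is not the package's own implementation.

```typescript
// Hypothetical helpers for illustration; names are not from the package source.

// Quotes an identifier for PostgreSQL (the DataStore backend), doubling any
// embedded double quotes so the name cannot break out of the quoting.
function quoteIdentifier(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Builds a GROUP BY query that counts rows per distinct value of one column.
function buildCountByColumnSql(resourceId: string, column: string, limit: number = 10): string {
  // Fall back to 10 for non-finite limits; otherwise use a non-negative integer.
  const safeLimit = Number.isFinite(limit) ? Math.max(0, Math.trunc(limit)) : 10;
  const table = quoteIdentifier(resourceId);
  const col = quoteIdentifier(column);
  return `SELECT ${col}, COUNT(*) AS n FROM ${table} GROUP BY ${col} ORDER BY n DESC LIMIT ${safeLimit}`;
}

// The SQL text is sent in the `sql` parameter of the datastore_search_sql action.
function buildSqlQueryString(sql: string): string {
  return `sql=${encodeURIComponent(sql)}`;
}

// Placeholder resource ID; the "anno" column is borrowed from the PRD's sort example.
const exampleSql = buildCountByColumnSql("00000000-0000-0000-0000-000000000000", "anno");
const exampleQueryString = buildSqlQueryString(exampleSql);
```

Quoting both identifiers matters because resource IDs contain hyphens, which PostgreSQL does not accept in an unquoted identifier.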
105
109
 
106
- #### FR-5: Server Status Check
107
- - **Priority**: Low
108
- - **Description**: Verificare disponibilità e versione di un server CKAN
110
+ #### FR-5: Tag Management
111
+ - **Priority**: Medium
112
+ - **Description**: Explore available tags in CKAN portals
109
113
  - **Acceptance Criteria**:
110
- - Verifica connessione server
111
- - Informazioni versione CKAN
112
- - Site title e URL
113
- - **Implementation Status**: ✅ Implemented (`ckan_status_show`)
114
+ - List all tags with dataset count
115
+ - Search by name pattern
116
+ - Pagination and sorting
117
+ - Faceting with vocabularies
118
+ - **Implementation Status**: ✅ Implemented (`ckan_tag_list`)
119
+
120
+ #### FR-6: Group Management
121
+ - **Priority**: Medium
122
+ - **Description**: Explore thematic groups in CKAN portals
123
+ - **Acceptance Criteria**:
124
+ - List all groups
125
+ - Search by pattern
126
+ - Group details with included datasets
127
+ - Sorting and pagination
128
+ - **Implementation Status**: ✅ Implemented (`ckan_group_list`, `ckan_group_show`, `ckan_group_search`)
129
+
130
+ #### FR-7: AI-Powered Dataset Discovery
131
+ - **Priority**: High
132
+ - **Description**: Search datasets with AI-based relevance ranking
133
+ - **Acceptance Criteria**:
134
+ - Natural language queries
135
+ - Scoring based on title/description/tags match
136
+ - Automatic relevance ranking
137
+ - Output with score visibility
138
+ - **Implementation Status**: ✅ Implemented (`ckan_find_relevant_datasets`)
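Reviewer note: the diff states only that scoring is based on title, description, and tag matches. It does not show the weights or the matching rule. The sketch below is one plausible reading using case-insensitive substring matching. The weights (3/2/1), type names, and function names are all assumptions, not the behavior of `ckan_find_relevant_datasets`.

```typescript
// Minimal dataset shape; CKAN stores the description in the `notes` field.
interface DatasetSummary {
  name: string;
  title: string;
  notes: string;
  tags: string[];
}

// Illustrative weights only: the package's real weights are not shown in this diff.
const FIELD_WEIGHTS = { title: 3, tags: 2, notes: 1 };

// Adds a field's weight once for every query term found in that field.
function scoreDataset(query: string, dataset: DatasetSummary): number {
  const terms = query.toLowerCase().split(/\s+/).filter((term) => term.length > 0);
  const title = dataset.title.toLowerCase();
  const notes = dataset.notes.toLowerCase();
  const tags = dataset.tags.map((tag) => tag.toLowerCase());
  let score = 0;
  for (const term of terms) {
    if (title.includes(term)) score += FIELD_WEIGHTS.title;
    if (tags.some((tag) => tag.includes(term))) score += FIELD_WEIGHTS.tags;
    if (notes.includes(term)) score += FIELD_WEIGHTS.notes;
  }
  return score;
}

// Drops non-matching datasets and sorts the rest by descending score.
function rankDatasets(query: string, datasets: DatasetSummary[]) {
  return datasets
    .map((dataset) => ({ dataset, score: scoreDataset(query, dataset) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score);
}

// Made-up sample records, for illustration only.
const sampleDatasets: DatasetSummary[] = [
  { name: "air-quality", title: "Air quality", notes: "Sensor readings", tags: ["environment"] },
  { name: "municipal-budget", title: "Municipal budget", notes: "Spending by municipality", tags: ["budget"] },
  { name: "resident-population", title: "Resident population", notes: "Population figures by municipality", tags: ["population", "demography"] },
];
const ranked = rankDatasets("population municipality", sampleDatasets);
```

Returning the score with each dataset matches the "output with score visibility" criterion, since the caller can show why one result outranks another.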
114
139
 
115
- #### FR-6: Package List
140
+ #### FR-8: Server Status Check
116
141
  - **Priority**: Low
117
- - **Description**: Lista semplice di tutti i dataset (solo nomi)
142
+ - **Description**: Check availability and version of a CKAN server
118
143
  - **Acceptance Criteria**:
119
- - Lista nomi dataset
120
- - Paginazione
121
- - **Implementation Status**: Not implemented (menzionato nel README ma non presente nel codice)
144
+ - Server connection verification
145
+ - CKAN version information
146
+ - Site title and URL
147
+ - **Implementation Status**: ✅ Implemented (`ckan_status_show`)
122
148
 
123
149
  ### 3.2 Non-Functional Requirements
124
150
 
125
151
  #### NFR-1: Performance
126
- - **Response Time**: Timeout HTTP a 30 secondi
127
- - **Throughput**: Limitato dalle API CKAN del server remoto
128
- - **Scalability**: Stateless, può gestire richieste multiple in parallelo
152
+ - **Response Time**: HTTP timeout at 30 seconds
153
+ - **Throughput**: Limited by remote CKAN server APIs
154
+ - **Scalability**:
155
+ - Stateless, can handle multiple parallel requests
156
+ - Cloudflare Workers: global edge deployment with cold start < 60ms
157
+ - Workers free tier: 100,000 requests/day
158
+ - **Bundle Size**: ~420KB (135KB gzipped)
129
159
 
130
160
  #### NFR-2: Reliability
131
161
  - **Error Handling**:
132
- - Gestione errori HTTP (404, 500, timeout)
133
- - Validazione input con Zod strict schemas
134
- - Messaggi errore descrittivi
135
- - **Availability**: Dipende dalla disponibilità dei server CKAN remoti
162
+ - HTTP error management (404, 500, timeout)
163
+ - Input validation with Zod strict schemas
164
+ - Descriptive error messages
165
+ - **Availability**: Depends on remote CKAN server availability
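Reviewer note: the reliability requirements above combine HTTP status handling with validation of the CKAN response envelope (`success`, `result`, `error`). The sketch below shows that check as a pure function. The helper names and the error message wording are assumptions for illustration. The package itself validates inputs with Zod, which is left out here to keep the example dependency-free.

```typescript
// CKAN Action API response envelope; helper names below are hypothetical.
interface CkanEnvelope<T> {
  success: boolean;
  result?: T;
  error?: { message?: string; __type?: string };
}

// Maps an HTTP status code to a descriptive message (wording is illustrative).
function describeHttpStatus(status: number): string {
  if (status === 404) return "Not found: check the server URL and the dataset or resource ID";
  if (status >= 500) return `CKAN server error (HTTP ${status}): the portal may be temporarily unavailable`;
  return `Unexpected HTTP status ${status}`;
}

// Returns the `result` payload, or throws a descriptive error when the HTTP
// status is not 2xx or the body does not report success: true.
function unwrapResult<T>(status: number, body: unknown): T {
  if (status < 200 || status >= 300) {
    throw new Error(describeHttpStatus(status));
  }
  const envelope = body as CkanEnvelope<T> | null;
  if (typeof envelope !== "object" || envelope === null || envelope.success !== true) {
    const detail = envelope?.error?.message ?? "response did not report success: true";
    throw new Error(`CKAN API error: ${detail}`);
  }
  return envelope.result as T;
}
```

Keeping this logic separate from the HTTP call makes it testable without a live portal and reusable under both the Node.js and Workers runtimes.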
136
166
 
137
167
  #### NFR-3: Usability
138
168
  - **Output Format**:
139
- - Markdown per leggibilità umana (default)
140
- - JSON per elaborazione programmatica
141
- - **Character Limit**: Troncamento automatico a 50.000 caratteri
142
- - **Documentation**: README completo con esempi
169
+ - Markdown for human readability (default)
170
+ - JSON for programmatic processing
171
+ - **Character Limit**: Automatic truncation at 50,000 characters
172
+ - **Documentation**:
173
+ - Complete README with examples
174
+ - EXAMPLES.md with advanced use cases
175
+ - HTML readme on worker root endpoint
176
+ - Complete deployment guide
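Reviewer note: the 50,000-character limit above can be sketched as a small pure function. The constant name `CHARACTER_LIMIT` appears in the PRD. The function name and the wording of the truncation notice are assumptions. This sketch keeps the first `limit` characters and appends the notice after them, so the returned string is slightly longer than the limit. The package may handle that differently.

```typescript
// Limit stated in NFR-3 of the PRD.
const CHARACTER_LIMIT = 50000;

// Returns the text unchanged when it fits; otherwise keeps the first `limit`
// characters and appends a notice so the reader knows content was cut.
function truncateOutput(text: string, limit: number = CHARACTER_LIMIT): string {
  if (text.length <= limit) return text;
  // The notice wording is an assumption, not the package's actual message.
  const notice = `\n\n[Output truncated at ${limit} characters]`;
  return text.slice(0, limit) + notice;
}

// Example with a small limit to make the behavior easy to see.
const shortened = truncateOutput("abcdef", 3);
```

An explicit notice lets an AI agent narrow the query or page through results, since it can tell the output is incomplete.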
143
177
 
144
178
  #### NFR-4: Compatibility
145
- - **CKAN Versions**: API v3 (compatibile con CKAN 2.x e 3.x)
146
- - **Node.js**: Richiede Node.js >= 18.0.0
179
+ - **CKAN Versions**: API v3 (compatible with CKAN 2.x and 3.x)
180
+ - **Node.js**: >= 18.0.0 (for local installation)
147
181
  - **Transport Modes**:
148
- - stdio (default) per integrazione locale
149
- - HTTP per accesso remoto
182
+ - stdio (default) for local integration
183
+ - HTTP for remote access
184
+ - Cloudflare Workers for global edge deployment
185
+ - **Runtimes**:
186
+ - Node.js (local/self-hosted)
187
+ - Cloudflare Workers (browser runtime, Web Standards API)
150
188
 
151
189
  #### NFR-5: Security
152
- - **Authentication**: Non supportata (solo endpoint pubblici)
153
- - **Read-Only**: Tutti i tool sono read-only, nessuna modifica ai dati
154
- - **Input Validation**: Strict schema validation con Zod
190
+ - **Authentication**: Not supported (public endpoints only)
191
+ - **Read-Only**: All tools are read-only, no data modification
192
+ - **Input Validation**: Strict schema validation with Zod
155
193
 
156
194
  ---
157
195
 
@@ -160,18 +198,23 @@ Un server MCP che espone tool per interagire con le API CKAN v3, permettendo agl
160
198
  ### 4.1 Technology Stack
161
199
 
162
200
  **Runtime**:
163
- - Node.js >= 18.0.0
201
+ - Node.js >= 18.0.0 (local/self-hosted)
202
+ - Cloudflare Workers (browser runtime, edge deployment)
164
203
  - TypeScript (ES2022)
165
204
 
166
205
  **Dependencies**:
167
206
  - `@modelcontextprotocol/sdk@^1.0.4` - MCP protocol implementation
168
207
  - `axios@^1.7.2` - HTTP client
169
208
  - `zod@^3.23.8` - Schema validation
170
- - `express@^4.19.2` - HTTP server (modalità HTTP)
209
+ - `express@^4.19.2` - HTTP server (HTTP mode, optional)
171
210
 
172
211
  **Build Tools**:
173
- - `esbuild@^0.27.2` - Bundler ultra-veloce (~4ms)
174
- - `typescript@^5.4.5` - Type checking e editor support
212
+ - `esbuild@^0.27.2` - Ultra-fast bundler (~50ms)
213
+ - `typescript@^5.4.5` - Type checking and editor support
214
+ - `wrangler@^4.58.0` - Cloudflare Workers CLI
215
+
216
+ **Test Framework**:
217
+ - `vitest@^4.0.16` - Test runner (130 tests, 100% passing)
175
218
 
176
219
  ### 4.2 Architecture Diagram
177
220
 
@@ -181,23 +224,29 @@ Un server MCP che espone tool per interagire con le API CKAN v3, permettendo agl
181
224
  │ (Claude Desktop, etc.) │
182
225
  └─────────────┬───────────────────────────────────────┘
183
226
 
184
- │ MCP Protocol (stdio or HTTP)
227
+ │ MCP Protocol (stdio, HTTP, or Workers)
185
228
 
186
229
  ┌─────────────▼───────────────────────────────────────┐
187
230
  │ CKAN MCP Server │
231
+ │ (Node.js or Workers runtime) │
188
232
  │ ┌───────────────────────────────────────────────┐ │
189
- │ │ MCP Tool Registry │ │
233
+ │ │ MCP Tool Registry (13 tools) │ │
190
234
  │ │ - ckan_package_search │ │
191
235
  │ │ - ckan_package_show │ │
236
+ │ │ - ckan_find_relevant_datasets │ │
192
237
  │ │ - ckan_organization_list/show/search │ │
238
+ │ │ - ckan_group_list/show/search │ │
239
+ │ │ - ckan_tag_list │ │
193
240
  │ │ - ckan_datastore_search │ │
241
+ │ │ - ckan_datastore_search_sql │ │
194
242
  │ │ - ckan_status_show │ │
195
243
  │ └───────────┬───────────────────────────────────┘ │
196
244
  │ │ │
197
245
  │ ┌───────────▼───────────────────────────────────┐ │
198
- │ │ HTTP Client (axios) │ │
246
+ │ │ HTTP Client (axios/fetch) │ │
199
247
  │ │ - Timeout: 30s │ │
200
- │ │ - User-Agent: CKAN-MCP-Server/1.0 │ │
248
+ │ │ - User-Agent: CKAN-MCP-Server/0.4.x │ │
249
+ │ │ - Portal config with search parser override │ │
201
250
  │ └───────────┬───────────────────────────────────┘ │
202
251
  └──────────────┼───────────────────────────────────────┘
203
252
 
@@ -210,29 +259,29 @@ Un server MCP che espone tool per interagire con le API CKAN v3, permettendo agl
210
259
  │ - open.canada.ca (CA) │
211
260
  │ - data.gov.uk (UK) │
212
261
  │ - data.europa.eu (EU) │
213
- │ - 500+ altri portali CKAN
262
+ │ - 500+ other CKAN portals
214
263
  └──────────────────────────────────────────────────────┘
215
264
  ```
216
265
 
217
266
  ### 4.3 Component Description
218
267
 
219
268
  #### MCP Tool Registry
220
- Registra i tool MCP disponibili con:
269
+ Registers available MCP tools with:
221
270
  - Input schema (Zod validation)
222
271
  - Output format (Markdown/JSON)
223
272
  - MCP annotations (readonly, idempotent, etc.)
224
273
  - Handler function
225
274
 
226
275
  #### HTTP Client Layer
227
- - Normalizza URL server (rimuove trailing slash)
228
- - Costruisce endpoint API: `{server_url}/api/3/action/{action}`
229
- - Gestisce timeout e errori
230
- - Valida response (`success: true`)
276
+ - Normalizes server URL (removes trailing slash)
277
+ - Builds API endpoint: `{server_url}/api/3/action/{action}`
278
+ - Handles timeout and errors
279
+ - Validates response (`success: true`)
231
280
 
232
281
  #### Output Formatter
233
- - Markdown: Tabelle, sezioni, formatting per leggibilità
234
- - JSON: Structured output con `structuredContent`
235
- - Truncation: Limita output a CHARACTER_LIMIT (50000)
282
+ - Markdown: Tables, sections, formatting for readability
283
+ - JSON: Structured output with `structuredContent`
284
+ - Truncation: Limits output to CHARACTER_LIMIT (50000)
236
285
 
237
286
  ---
238
287
 
@@ -240,29 +289,29 @@ Registra i tool MCP disponibili con:
240
289
 
241
290
  ### 5.1 ckan_package_search
242
291
 
243
- **Purpose**: Ricerca dataset con query Solr avanzate
292
+ **Purpose**: Search datasets with advanced Solr queries
244
293
 
245
294
  **Input Parameters**:
246
295
  ```typescript
247
296
  {
248
- server_url: string (required) // Base URL server CKAN
249
- q: string (default: "*:*") // Query Solr
297
+ server_url: string (required) // CKAN server base URL
298
+ q: string (default: "*:*") // Solr query
250
299
  fq?: string // Filter query
251
- rows: number (default: 10) // Risultati per pagina (max 1000)
252
- start: number (default: 0) // Offset paginazione
253
- sort?: string // Es: "metadata_modified desc"
254
- facet_field?: string[] // Campi per faceting
255
- facet_limit: number (default: 50) // Max valori per facet
300
+ rows: number (default: 10) // Results per page (max 1000)
301
+ start: number (default: 0) // Pagination offset
302
+ sort?: string // E.g.: "metadata_modified desc"
303
+ facet_field?: string[] // Fields for faceting
304
+ facet_limit: number (default: 50) // Max values per facet
256
305
  include_drafts: boolean (default: false)
257
306
  response_format: "markdown" | "json" (default: "markdown")
258
307
  }
259
308
  ```
260
309
 
261
310
  **Output**:
262
- - Conteggio totale risultati
263
- - Array di dataset con metadati base
264
- - Facets (se richiesti)
265
- - Link paginazione
311
+ - Total results count
312
+ - Array of datasets with basic metadata
313
+ - Facets (if requested)
314
+ - Pagination links
266
315
 
267
316
  **Solr Query Examples**:
268
317
  - `q: "popolazione"` - Full-text search
@@ -273,28 +322,28 @@ Registra i tool MCP disponibili con:
273
322
 
274
323
  ### 5.2 ckan_package_show
275
324
 
276
- **Purpose**: Dettagli completi di un dataset
325
+ **Purpose**: Complete details of a dataset
277
326
 
278
327
  **Input Parameters**:
279
328
  ```typescript
280
329
  {
281
330
  server_url: string (required)
282
- id: string (required) // Dataset ID o name
331
+ id: string (required) // Dataset ID or name
283
332
  include_tracking: boolean (default: false)
284
333
  response_format: "markdown" | "json"
285
334
  }
286
335
  ```
287
336
 
288
337
  **Output**:
289
- - Metadati completi (title, description, author, license)
290
- - Organizzazione
291
- - Tags e gruppi
292
- - Lista risorse con dettagli (format, size, URL, DataStore status)
293
- - Extra fields custom
338
+ - Complete metadata (title, description, author, license)
339
+ - Organization
340
+ - Tags and groups
341
+ - Resource list with details (format, size, URL, DataStore status)
342
+ - Custom extra fields
294
343
 
295
344
  ### 5.3 ckan_organization_list
296
345
 
297
- **Purpose**: Lista organizzazioni
346
+ **Purpose**: List organizations
298
347
 
299
348
  **Input Parameters**:
300
349
  ```typescript
@@ -302,25 +351,25 @@ Registra i tool MCP disponibili con:
302
351
  server_url: string (required)
303
352
  all_fields: boolean (default: false)
304
353
  sort: string (default: "name asc")
305
- limit: number (default: 100) // 0 per solo count
354
+ limit: number (default: 100) // 0 for count only
306
355
  offset: number (default: 0)
307
356
  response_format: "markdown" | "json"
308
357
  }
309
358
  ```
310
359
 
311
360
  **Output**:
312
- - Array di organizzazioni (nomi o oggetti completi)
313
- - Se `limit=0`: conteggio organizzazioni con dataset
361
+ - Array of organizations (names or complete objects)
362
+ - If `limit=0`: count of organizations with datasets
314
363
 
315
364
  ### 5.4 ckan_organization_show
316
365
 
317
- **Purpose**: Dettagli organizzazione specifica
366
+ **Purpose**: Specific organization details
318
367
 
319
368
  **Input Parameters**:
320
369
  ```typescript
321
370
  {
322
371
  server_url: string (required)
323
- id: string (required) // Organization ID o name
372
+ id: string (required) // Organization ID or name
324
373
  include_datasets: boolean (default: true)
325
374
  include_users: boolean (default: false)
326
375
  response_format: "markdown" | "json"
@@ -328,33 +377,33 @@ Registra i tool MCP disponibili con:
328
377
  ```
329
378
 
330
379
  **Output**:
331
- - Dettagli organizzazione
332
- - Lista dataset (opzionale)
333
- - Lista utenti con ruoli (opzionale)
380
+ - Organization details
381
+ - Dataset list (optional)
382
+ - User list with roles (optional)
334
383
 
335
384
  ### 5.5 ckan_organization_search
336
385
 
337
- **Purpose**: Ricerca organizzazioni per pattern
386
+ **Purpose**: Search organizations by pattern
338
387
 
339
388
  **Input Parameters**:
340
389
  ```typescript
341
390
  {
342
391
  server_url: string (required)
343
- pattern: string (required) // Pattern (wildcards automatici)
392
+ pattern: string (required) // Pattern (automatic wildcards)
344
393
  response_format: "markdown" | "json"
345
394
  }
346
395
  ```
347
396
 
348
397
  **Output**:
349
- - Lista organizzazioni matchanti
350
- - Conteggio dataset per organizzazione
351
- - Totale dataset
398
+ - List of matching organizations
399
+ - Dataset count per organization
400
+ - Total datasets
352
401
 
353
- **Implementation**: Usa `package_search` con `organization:*{pattern}*` e faceting
402
+ **Implementation**: Uses `package_search` with `organization:*{pattern}*` and faceting
354
403
 
355
404
  ### 5.6 ckan_datastore_search
356
405
 
357
- **Purpose**: Query dati tabulari nel DataStore
406
+ **Purpose**: Query tabular data in DataStore
358
407
 
359
408
  **Input Parameters**:
360
409
  ```typescript
@@ -362,11 +411,11 @@ Registra i tool MCP disponibili con:
362
411
  server_url: string (required)
363
412
  resource_id: string (required)
364
413
  q?: string // Full-text search
365
- filters?: Record<string, any> // Filtri chiave-valore
414
+ filters?: Record<string, any> // Key-value filters
366
415
  limit: number (default: 100) // Max 32000
367
416
  offset: number (default: 0)
368
- fields?: string[] // Campi da restituire
369
- sort?: string // Es: "anno desc"
417
+ fields?: string[] // Fields to return
418
+ sort?: string // E.g.: "anno desc"
370
419
  distinct: boolean (default: false)
371
420
  response_format: "markdown" | "json"
372
421
  }
@@ -375,16 +424,16 @@ Registra i tool MCP disponibili con:
375
424
  **Output**:
376
425
  - Total records count
377
426
  - Fields metadata (type, id)
378
- - Records (max 50 in markdown per leggibilità)
379
- - Paginazione info
427
+ - Records (max 50 in markdown for readability)
428
+ - Pagination info
380
429
 
381
430
  **Limitations**:
382
- - Non tutte le risorse hanno DataStore attivo
383
- - Max 32000 record per query
431
+ - Not all resources have active DataStore
432
+ - Max 32000 records per query
384
433
 
385
434
  ### 5.7 ckan_status_show
386
435
 
387
- **Purpose**: Verifica stato server CKAN
436
+ **Purpose**: Check CKAN server status
388
437
 
389
438
  **Input Parameters**:
390
439
  ```typescript
@@ -403,7 +452,7 @@ Registra i tool MCP disponibili con:
403
452
 
404
453
  ## 6. Supported CKAN Portals
405
454
 
406
- Il server può connettersi a **qualsiasi server CKAN pubblico**. Principali portali:
455
+ The server can connect to **any public CKAN server**. Main portals:
407
456
 
408
457
  | Country | Portal | URL |
409
458
  |---------|--------|-----|
@@ -414,9 +463,9 @@ Il server può connettersi a **qualsiasi server CKAN pubblico**. Principali port
414
463
  | 🇪🇺 EU | European Data Portal | https://data.europa.eu |
415
464
  | 🌍 Demo | CKAN Official Demo | https://demo.ckan.org |
416
465
 
417
- **Compatibilità**:
418
- - CKAN API v3 (CKAN 2.x e 3.x)
419
- - Oltre 500 portali nel mondo
466
+ **Compatibility**:
467
+ - CKAN API v3 (CKAN 2.x and 3.x)
468
+ - Over 500 portals worldwide
420
469
 
421
470
  ---
422
471
 
@@ -425,7 +474,7 @@ Il server può connettersi a **qualsiasi server CKAN pubblico**. Principali port
425
474
  ### 7.1 Prerequisites
426
475
 
427
476
  - Node.js >= 18.0.0
428
- - npm o yarn
477
+ - npm or yarn
429
478
 
430
479
  ### 7.2 Installation
431
480
 
@@ -446,7 +495,7 @@ npm install ckan-mcp-server
446
495
  npx ckan-mcp-server
447
496
  ```
448
497
 
449
- > **Note**: La pubblicazione su npm registry è pianificata per permettere installazione semplice come PyPI in Python. Attualmente richiede installazione da repository.
498
+ > **Note**: Publishing to npm registry is planned to enable simple installation like PyPI in Python. Currently requires installation from repository.
450
499
 
451
500
  #### Option 2: From Source (Current)
452
501
 
@@ -460,7 +509,7 @@ npm run build
460
509
  ### 7.3 Usage Modes
461
510
 
462
511
  #### stdio Mode (Default)
463
- Per integrazione con Claude Desktop e altri client MCP locali:
512
+ For integration with Claude Desktop and other local MCP clients:
464
513
 
465
514
  ```bash
466
515
  npm start
@@ -493,13 +542,13 @@ npm start
493
542
  ```
494
543
 
495
544
  #### HTTP Mode
496
- Per accesso remoto via HTTP:
545
+ For remote access via HTTP:
497
546
 
498
547
  ```bash
499
548
  TRANSPORT=http PORT=3000 npm start
500
549
  ```
501
550
 
502
- Server disponibile su: `http://localhost:3000/mcp`
551
+ Server available at: `http://localhost:3000/mcp`
503
552
 
504
553
  **Test HTTP endpoint**:
505
554
  ```bash
@@ -510,10 +559,10 @@ curl -X POST http://localhost:3000/mcp \
510
559
 
511
560
  ### 7.4 Build System
512
561
 
513
- Il progetto usa **esbuild** (non tsc) per:
514
- - Build ultra-veloce (~4ms vs minuti con tsc)
515
- - Minimo utilizzo memoria (importante in WSL)
516
- - Bundle automatico con tree-shaking
562
+ The project uses **esbuild** (not tsc) for:
563
+ - Ultra-fast build (~4ms vs minutes with tsc)
564
+ - Minimal memory usage (important in WSL)
565
+ - Automatic bundling with tree-shaking
517
566
 
518
567
  ```bash
519
568
  npm run build # Build with esbuild
@@ -527,7 +576,7 @@ npm run dev # Build + run
527
576
 
528
577
  ### 8.1 Use Case 1: Dataset Discovery
529
578
 
530
- **Scenario**: Un data scientist cerca dataset su popolazione italiana
579
+ **Scenario**: A data scientist searches for datasets on Italian population
531
580
 
532
581
  ```typescript
533
582
  // Step 1: Search for datasets
@@ -538,7 +587,7 @@ ckan_package_search({
538
587
  sort: "metadata_modified desc"
539
588
  })
540
589
 
541
- // Step 2: Ottieni dettagli dataset interessante
590
+ // Step 2: Get details of interesting dataset
542
591
  ckan_package_show({
543
592
  server_url: "https://www.dati.gov.it/opendata",
544
593
  id: "popolazione-residente-2023"
@@ -547,23 +596,23 @@ ckan_package_show({
547
596
 
548
597
  ### 8.2 Use Case 2: Organization Analysis
549
598
 
550
- **Scenario**: Analizzare la produzione di dati aperti regionali
599
+ **Scenario**: Analyze regional open data production
551
600
 
552
601
  ```typescript
553
- // Step 1: Cerca organizzazioni regionali
602
+ // Step 1: Search for regional organizations
554
603
  ckan_organization_search({
555
604
  server_url: "https://www.dati.gov.it/opendata",
556
605
  pattern: "regione"
557
606
  })
558
607
 
559
- // Step 2: Analizza dataset di una regione
608
+ // Step 2: Analyze datasets of a region
560
609
  ckan_organization_show({
561
610
  server_url: "https://www.dati.gov.it/opendata",
562
611
  id: "regione-siciliana",
563
612
  include_datasets: true
564
613
  })
565
614
 
566
- // Step 3: Cerca dataset specifici dell'organizzazione
615
+ // Step 3: Search for organization-specific datasets
567
616
  ckan_package_search({
568
617
  server_url: "https://www.dati.gov.it/opendata",
569
618
  fq: "organization:regione-siciliana",
@@ -574,7 +623,7 @@ ckan_package_search({
574
623
 
575
624
  ### 8.3 Use Case 3: Data Analysis with DataStore
576
625
 
577
- **Scenario**: Analizzare dati tabulari COVID-19
626
+ **Scenario**: Analyze COVID-19 tabular data
578
627
 
579
628
  ```typescript
580
629
  // Step 1: Search for COVID datasets
@@ -584,7 +633,7 @@ ckan_package_search({
584
633
  fq: "res_format:CSV"
585
634
  })
586
635
 
587
- // Step 2: Ottieni dettagli e resource_id
636
+ // Step 2: Get details and resource_id
588
637
  ckan_package_show({
589
638
  server_url: "https://www.dati.gov.it/opendata",
590
639
  id: "covid-19-italia"
@@ -602,18 +651,18 @@ ckan_datastore_search({
602
651
 
603
652
  ### 8.4 Use Case 4: Statistical Analysis with Faceting
604
653
 
605
- **Scenario**: Analizzare distribuzione dataset per formato e organizzazione
654
+ **Scenario**: Analyze dataset distribution by format and organization
606
655
 
607
656
  ```typescript
608
- // Statistiche formati
657
+ // Format statistics
609
658
  ckan_package_search({
610
659
  server_url: "https://www.dati.gov.it/opendata",
611
660
  facet_field: ["res_format"],
612
661
  facet_limit: 100,
613
- rows: 0 // Solo facets, no results
662
+ rows: 0 // Only facets, no results
614
663
  })
615
664
 
616
- // Statistiche organizzazioni
665
+ // Organization statistics
617
666
  ckan_package_search({
618
667
  server_url: "https://www.dati.gov.it/opendata",
619
668
  facet_field: ["organization"],
@@ -621,7 +670,7 @@ ckan_package_search({
621
670
  rows: 0
622
671
  })
623
672
 
624
- // Distribuzione tag
673
+ // Tag distribution
625
674
  ckan_package_search({
626
675
  server_url: "https://www.dati.gov.it/opendata",
627
676
  facet_field: ["tags"],
@@ -637,89 +686,112 @@ ckan_package_search({
  ### 9.1 Current Limitations

  1. **Read-Only**:
- - Non supporta creazione/modifica dataset
- - Solo endpoint pubblici (no autenticazione)
+ - Does not support dataset creation/modification
+ - Only public endpoints (no authentication)

  2. **Character Limit**:
- - Output troncato a 50.000 caratteri
- - Hardcoded, non configurabile
+ - Output truncated at 50,000 characters
+ - Hardcoded, not configurable

  3. **No Caching**:
- - Ogni richiesta fa chiamata HTTP fresca
- - Nessuna cache locale
+ - Every request makes a fresh HTTP call
+ - Cloudflare Workers can use edge cache (optional)

  4. **DataStore Limitations**:
- - Non tutte le risorse hanno DataStore attivo
- - Max 32.000 record per query
- - Dipende dalla configurazione del server CKAN
+ - Not all resources have active DataStore
+ - Max 32,000 records per query
+ - Depends on CKAN server configuration

  5. **SQL Support Limitations**:
- - `ckan_datastore_search_sql` funziona solo se il portale espone l'endpoint SQL
- - Alcuni portali disabilitano SQL per motivi di sicurezza
+ - `ckan_datastore_search_sql` works only if the portal exposes the SQL endpoint
+ - Some portals disable SQL for security reasons
+ - Workers runtime supports SQL queries without limitations

  6. **Timeout**:
- - 30 secondi fissi per HTTP request
- - Non configurabile
+ - Fixed 30 seconds for HTTP request
+ - Cloudflare Workers has stricter timeout (10s per fetch)

  7. **Locale**:
- - Date formattate in ISO `YYYY-MM-DD`
- - Non parametrizzato
+ - Dates formatted in ISO `YYYY-MM-DD`
+ - Not parameterized

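Limitations 2 and 6 in the added lines above (output truncated at 50,000 characters, fixed 30-second HTTP timeout) could be sketched in TypeScript roughly as follows. This is a minimal illustration, not the package's code: the helper names `truncateOutput` and `fetchWithTimeout` and the wording of the truncation notice are assumptions, and only the two numeric limits come from the PRD.

```typescript
// Hypothetical helpers illustrating the limits stated in PRD section 9.1.
// They are not taken from the package source.
const CHARACTER_LIMIT = 50_000; // section 9.1, item 2
const REQUEST_TIMEOUT_MS = 30_000; // section 9.1, item 6 (Node.js runtime)

// Keep the first `limit` characters and append a notice when text was dropped.
// The notice is added after the cut, so the result is slightly longer than `limit`.
function truncateOutput(text: string, limit: number = CHARACTER_LIMIT): string {
  if (text.length <= limit) return text;
  const omitted = text.length - limit;
  return `${text.slice(0, limit)}\n[truncated: ${omitted} characters omitted]`;
}

// Abort the request when it takes longer than `timeoutMs`.
async function fetchWithTimeout(
  url: string,
  timeoutMs: number = REQUEST_TIMEOUT_MS
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer); // always release the timer, on success or failure
  }
}
```

Passing the limit and the timeout as parameters with defaults is one way the "Configurable timeout" and "Configurable character limit" items planned in section 10 might be approached.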
  ### 9.2 External Dependencies

- - **Network**: Richiede connessione internet
- - **CKAN Server Availability**: Dipende dalla disponibilità dei server remoti
- - **CKAN API Compatibility**: Richiede CKAN API v3
+ - **Network**: Requires internet connection
+ - **CKAN Server Availability**: Depends on remote server availability
+ - **CKAN API Compatibility**: Requires CKAN API v3

  ### 9.3 Known Issues

- - Tool `ckan_package_list` documentato nel README ma non implementato
- - Nessun test automatizzato
- - Error messages potrebbero essere più specifici
+ - Cloudflare Workers has stricter timeout (10s) compared to Node.js (30s)
+ - Some CKAN portals have non-standard configurations that might require workarounds

  ---

  ## 10. Future Enhancements

- ### 10.1 Planned Features
+ ### 10.1 Completed Features

- #### High Priority
+ #### npm Package Publication (v0.3.2+)
+ - Published on npm registry: `@aborruso/ckan-mcp-server`
+ - Global installation: `npm install -g @aborruso/ckan-mcp-server`
+ - Executable CLI: `ckan-mcp-server`
+ - Semantic versioning (semver)
+
+ #### ✅ SQL Query Support (v0.4.4)
+ - Implemented `ckan_datastore_search_sql`
+ - Full support for SELECT, WHERE, JOIN, GROUP BY
+ - Requires portals with active DataStore SQL
+
+ #### ✅ AI-Powered Discovery (v0.4.6)
+ - Tool `ckan_find_relevant_datasets`
+ - Relevance ranking with scoring
+ - Natural language queries
+
+ #### ✅ Tags and Groups (v0.4.3)
+ - Tool `ckan_tag_list` with faceting
+ - Tool `ckan_group_list`, `ckan_group_show`, `ckan_group_search`
+ - Full support for taxonomy exploration
+
+ #### ✅ Cloudflare Workers Deployment (v0.4.0)
+ - Global edge deployment: https://ckan-mcp-server.andy-pr.workers.dev
+ - Free tier: 100k requests/day
+ - Cold start < 60ms
+ - Complete documentation in DEPLOYMENT.md
+
+ #### ✅ Portal Search Parser Configuration (v0.4.7)
+ - Per-portal query parser override
+ - Handling portals with restrictive parsers
+ - URL generator for browse/search links
+
+ ### 10.2 Planned Features

- - [ ] **npm Package Publication**
- - Pubblicare su npm registry pubblico
- - Installazione globale/locale senza build
- - CLI eseguibile: `ckan-mcp-server`
- - npx support per uso senza installazione
- - **Goal**: Semplicità di installazione come `pip install` in Python
+ #### High Priority

  - [ ] **Authentication Support**
- - API key per endpoint privati
- - OAuth per portali che lo supportano
+ - API key for private endpoints
+ - OAuth for portals that support it

- - [x] **SQL Query Support**
- - Implementato `ckan_datastore_search_sql`
- - Richiede portali con DataStore SQL attivo
-
  - [ ] **Caching Layer**
- - Cache risultati frequenti
- - TTL configurabile
+ - Cache frequent results in Workers KV
+ - Configurable TTL
  - Invalidation strategy

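The planned caching layer (cached results, configurable TTL, invalidation strategy) could look roughly like the sketch below. It is an assumption-laden illustration only: it uses an in-memory `Map` as a stand-in for Workers KV, and the `TtlCache` class, its method names, and the injectable clock are all hypothetical rather than part of the package.

```typescript
// Hypothetical in-memory TTL cache; a stand-in for the Workers KV cache
// that the PRD lists as planned. Not taken from the package source.
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value, or undefined when missing or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazy invalidation on read
      return undefined;
    }
    return entry.value;
  }

  // Store a value that expires `ttlMs` milliseconds from now.
  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Explicit invalidation; returns true when an entry was removed.
  invalidate(key: string): boolean {
    return this.entries.delete(key);
  }
}
```

A cache key could be derived from the server URL plus the serialized tool arguments, so that identical searches against the same portal share one entry; that keying scheme is also an assumption.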
  #### Medium Priority

  - [ ] **Advanced DataStore Features**
  - Support for aggregations
- - JOIN tra risorse
+ - JOIN between resources
  - Computed fields

  - [ ] **Batch Operations**
- - Query multiple in parallelo
+ - Multiple parallel queries
  - Bulk export

  - [ ] **Configuration**
- - Timeout configurabile
- - Character limit configurabile
- - Locale configurabile
+ - Configurable timeout
+ - Configurable character limit
+ - Configurable locale

  #### Low Priority

@@ -737,40 +809,45 @@ ckan_package_search({
  - Excel export
  - Graph visualization data

- ### 10.2 Distribution & Installation
-
- - [ ] **npm Package Publication**
- - Pubblicare su npm registry (come PyPI per Python)
- - Installazione globale: `npm install -g ckan-mcp-server`
- - Installazione locale: `npm install ckan-mcp-server`
- - Versioning semantico (semver)
- - Changelog automatico
- - Pre-built binaries per evitare build locale
+ ### 10.2 Distribution & Deployment

- - [ ] **Executable CLI**
- - Comando globale: `ckan-mcp-server`
- - npx support: `npx ckan-mcp-server`
- - Configurazione via flags o file
+ **Completed**:
+ - npm registry publication: `@aborruso/ckan-mcp-server`
+ - Global installation: `npm install -g @aborruso/ckan-mcp-server`
+ - CLI command: `ckan-mcp-server`
+ - GitHub Releases with semantic tags
+ - Cloudflare Workers deployment

- - [ ] **Distribution Channels**
- - npm registry (principale)
- - GitHub Releases con assets
- - Docker image (opzionale)
+ **Future**:
+ - [ ] Docker image (optional)
+ - [ ] Kubernetes deployment examples

  ### 10.3 Testing & Quality

- - [ ] Unit tests per utility functions
- - [ ] Integration tests per tool handlers
- - [ ] E2E tests con server CKAN demo
+ **Current State**:
+ - 130 unit and integration tests (100% passing)
+ - vitest test runner
+ - Coverage for all 13 tools
+ - Fixtures for offline testing
+
+ **Future**:
  - [ ] Performance benchmarks
- - [ ] Error scenario coverage
+ - [ ] E2E tests with live CKAN server
+ - [ ] Load testing on Workers

  ### 10.4 Documentation

- - [ ] OpenAPI/Swagger spec per HTTP mode
+ **Current State**:
+ - Complete README with examples
+ - EXAMPLES.md with advanced use cases
+ - DEPLOYMENT.md with release workflow
+ - HTML readme on worker root
+ - Updated PRD.md
+
+ **Future**:
+ - [ ] OpenAPI/Swagger spec for HTTP mode
  - [ ] Video tutorial
- - [ ] More real-world examples
- - [ ] Best practices guide
+ - [ ] Best practices guide for query optimization

  ---

@@ -778,18 +855,25 @@ ckan_package_search({

  ### 11.1 Technical Metrics

- - **Build Time**: < 5ms (esbuild)
- - **Bundle Size**: < 500KB
- - **Memory Usage**: < 100MB runtime
- - **Response Time**: < 30s (CKAN API timeout)
+ - **Build Time**: ~50-70ms (esbuild Node.js + Workers)
+ - **Bundle Size**: ~420KB (~135KB gzipped)
+ - **Memory Usage**: < 50MB runtime (Node.js), Workers limits apply
+ - **Response Time**: < 30s (CKAN API timeout), < 10s (Workers)
+ - **Cold Start**: < 60ms (Cloudflare Workers)
+ - **Test Coverage**: 130 tests (100% passing)

  ### 11.2 Distribution Metrics

- - **npm Downloads**: Weekly/monthly downloads from npm registry
- - **Installation Success Rate**: % of successful installations
- - **Installation Time**: < 2 minutes from `npm install` to running
- - **Global vs Local**: Ratio di installazioni globali vs locali
- - **npx Usage**: Utilizzo via npx senza installazione
+ **Achieved**:
+ - npm package published: `@aborruso/ckan-mcp-server`
+ - Installation time: < 1 minute
+ - GitHub releases with semantic versioning
+ - Cloudflare Workers deployment live
+
+ **Future tracking**:
+ - npm weekly/monthly downloads
+ - Workers request count (100k/day free tier)
+ - Installation success rate

  ### 11.3 Usage Metrics

@@ -824,9 +908,12 @@ ckan_package_search({

  ### 12.3 Code Repository

- - **GitHub**: https://github.com/ondata/ckan-mcp-server (presumed)
+ - **GitHub**: https://github.com/aborruso/ckan-mcp-server
+ - **npm**: https://www.npmjs.com/package/@aborruso/ckan-mcp-server
+ - **Live Demo**: https://ckan-mcp-server.andy-pr.workers.dev
  - **License**: MIT License
- - **Contact**: onData community
+ - **Author**: Andrea Borruso (@aborruso)
+ - **Community**: onData

  ---

@@ -834,13 +921,13 @@ ckan_package_search({

  ### 13.1 Glossary

- - **CKAN**: Comprehensive Knowledge Archive Network - piattaforma open source per portali dati aperti
- - **MCP**: Model Context Protocol - protocollo per integrare AI agent con strumenti esterni
- - **Solr**: Apache Solr - motore di ricerca full-text usato da CKAN
- - **DataStore**: Feature CKAN per query SQL-like su dati tabulari
- - **Faceting**: Aggregazioni statistiche per analisi distributiva
- - **Package**: Termine CKAN per "dataset"
- - **Resource**: File o API endpoint associato a un dataset
+ - **CKAN**: Comprehensive Knowledge Archive Network - open source platform for open data portals
+ - **MCP**: Model Context Protocol - protocol for integrating AI agents with external tools
+ - **Solr**: Apache Solr - full-text search engine used by CKAN
+ - **DataStore**: CKAN feature for SQL-like queries on tabular data
+ - **Faceting**: Statistical aggregations for distributive analysis
+ - **Package**: CKAN term for "dataset"
+ - **Resource**: File or API endpoint associated with a dataset

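The "Faceting" entry can be made concrete with a short TypeScript sketch of the request that a facet-only `ckan_package_search` call (with `rows: 0`) presumably maps to on the CKAN API v3 `package_search` action. The helper name `buildFacetUrl` is hypothetical, and how the MCP server actually builds its requests is an assumption here; `facet.field`, `facet.limit` and `rows` are standard `package_search` parameters.

```typescript
// Hypothetical helper: builds a CKAN API v3 package_search URL that asks
// only for facet counts. Not taken from the package source.
function buildFacetUrl(
  serverUrl: string,
  facetFields: string[],
  facetLimit: number = 100
): string {
  const base = serverUrl.replace(/\/+$/, ""); // drop trailing slashes
  const params = new URLSearchParams({
    "facet.field": JSON.stringify(facetFields), // CKAN expects a JSON-encoded list
    "facet.limit": String(facetLimit),
    rows: "0", // return no datasets, only the facet counts
  });
  return `${base}/api/3/action/package_search?${params.toString()}`;
}

// Format statistics for the portal used in the PRD examples.
const formatStatsUrl = buildFacetUrl("https://www.dati.gov.it/opendata", ["res_format"]);
console.log(formatStatsUrl);
```

In a standard CKAN response the counts for each requested field should appear under `result.search_facets`, which is what makes the distribution-style analysis described in the glossary possible without downloading any dataset records.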
  ### 13.2 Solr Query Syntax Quick Reference
846
933