@aborruso/ckan-mcp-server 0.4.54 → 0.4.56
- package/README.md +17 -736
- package/package.json +4 -2
- package/README.bak.md +0 -760
package/README.md
CHANGED
@@ -5,178 +5,37 @@
 
 # CKAN MCP Server
 
-*Turn any
+*Turn any CKAN open data portal into a conversation.*
 
-
+Most public open data portals worldwide — Italy's dati.gov.it, the US data.gov, Canada's open.canada.ca, and hundreds more — run on [CKAN](https://ckan.org/), an open-source platform with a fully documented public API. Navigating these portals usually requires knowing their structure, search syntax, and APIs.
 
-
+This MCP server removes that barrier. Once connected, your AI assistant can search datasets, explore organizations, query tabular data, and read metadata — all through natural language. No CKAN knowledge required.
 
-**This is possible because of open standards and open source.** CKAN exposes a
-
-**Who is this for?** Everyone. Journalists looking for data to verify a story. Researchers exploring public datasets. Public servants checking what data their administration publishes. Developers building data pipelines. No CKAN knowledge required.
-
-**Two ways to use it — pick the one that suits you:**
-
-| | Option A: Install locally | Option B: No install |
-|---|---|---|
-| **How** | `npm install -g @aborruso/ckan-mcp-server` | Point your tool to the hosted HTTP endpoint |
-| **Best for** | Runs on your machine, works with any local tool | Quick start, zero setup |
-| **Limits** | None | 100k requests/day shared quota |
-
-Hosted endpoint: `https://ckan-mcp-server.andy-pr.workers.dev/mcp`
-
-> **Recommendation**: Option B is a great way to get started and try things out without any setup. Once you're familiar with what the server can do, switching to Option A (local install) gives you unlimited usage with no shared quotas.
-
-👉 Want to explore the codebase? The [**AI-generated DeepWiki**](https://deepwiki.com/ondata/ckan-mcp-server) is a great starting point.
-
-**License**: MIT — see [LICENSE.txt](LICENSE.txt) for complete details. Third-party notices: [NOTICE.md](NOTICE.md).
-
-
+**This is possible because of open standards and open source.** CKAN exposes a public API. Metadata follows [DCAT](https://www.w3.org/TR/vocab-dcat/), an open W3C standard. Both are free to use and maintained by open communities. This server stands on that foundation.
 
 ---
 
-##
-
-[ChatGPT](#chatgpt) | [Claude Desktop](#claude-desktop) | [Claude Code](#claude-code) | [Gemini CLI](#gemini-cli) | [VS Code](#vs-code) | [Codex CLI](#codex-cli)
-
-This server works with any MCP-compatible client. The sections below cover some of the most popular ones — if your tool isn't listed, check its documentation for MCP configuration and use the same endpoint URL or command.
-
-All examples below work with **both** the local installation and the hosted endpoint. Where both options differ, both are shown.
-
-> **Using local installation?** You need to install the server first — see [Run locally](#run-locally).
-
-### ChatGPT
-
-> Requires a ChatGPT Plus, Team, or Enterprise plan.
-
-1. Open the profile menu and go to **Settings → Apps → Advanced settings**
-2. Enable **Developer mode**
-3. Click **Create app** (top-right)
-4. Fill in the form:
-- **Name:** CKAN MCP Server
-- **Description:** Search datasets on CKAN open data portals
-- **MCP Server URL:** `https://ckan-mcp-server.andy-pr.workers.dev/mcp`
-- **Authentication:** No Auth
-- Check the confirmation box, then click **Create**
-5. In a new chat, click **+** → **More** and select **CKAN MCP Server**
-
-> For a step-by-step walkthrough with screenshots, see the [full ChatGPT guide](docs/guide/chatgpt/chatgpt_web.md).
-
-### Claude Desktop
-
-**Using the hosted endpoint (no install) — via connector UI:**
-
-1. Open Claude Desktop and go to **Settings → Integrations**
-2. Click **Add custom integration**
-3. Fill in the details:
-- **Name:** CKAN MCP Server
-- **MCP Server URL:** `https://ckan-mcp-server.andy-pr.workers.dev/mcp`
-4. Click **Add** to save
-5. Open a new chat, click **+**, select **Integrations**, and enable **CKAN MCP Server**
-6. When Claude asks to use a tool, click **Allow** (or **Always allow**)
+## Quick Start
 
-
-
-**Using the hosted endpoint (no install) — via config file:**
-
-Configuration file location:
-
-- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
-- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
-- **Linux**: `~/.config/Claude/claude_desktop_config.json`
-
-```json
-{
-"mcpServers": {
-"ckan": {
-"url": "https://ckan-mcp-server.andy-pr.workers.dev/mcp"
-}
-}
-}
-```
-
-**Using local installation:**
-
-```json
-{
-"mcpServers": {
-"ckan": {
-"command": "npx",
-"args": ["@aborruso/ckan-mcp-server@latest"]
-}
-}
-}
-```
-
-### Claude Code
-
-**Using the hosted endpoint (no install):**
-
-```bash
-claude mcp add --transport http --scope user ckan https://ckan-mcp-server.andy-pr.workers.dev/mcp
-```
-
-**Using local installation:**
+**Option A — Install locally (recommended):**
 
 ```bash
-
-```
-
-> `--scope user` makes the server available globally across all your projects, not just the current one.
-
-To add it only for a specific project, run from the project folder without the `--scope user` flag:
-
-```bash
-claude mcp add --transport http ckan https://ckan-mcp-server.andy-pr.workers.dev/mcp
+npm install -g @aborruso/ckan-mcp-server
 ```
 
-
+**Option B — No install**, use the hosted endpoint directly:
 
-Add to `~/.gemini/settings.json`:
-
-**Using the hosted endpoint (no install):**
-
-```json
-{
-"mcpServers": {
-"ckan": {
-"httpUrl": "https://ckan-mcp-server.andy-pr.workers.dev/mcp"
-}
-}
-}
 ```
-
-**Using local installation:**
-
-```json
-{
-"mcpServers": {
-"ckan": {
-"command": "npx",
-"args": ["@aborruso/ckan-mcp-server@latest"]
-}
-}
-}
+https://ckan-mcp-server.andy-pr.workers.dev/mcp
 ```
 
-
+> The hosted endpoint has a 100k requests/day shared quota. Local installation has no limits.
 
-
+---
 
-
+## Connect to your AI tool
 
-
-{
-"mcpServers": {
-"ckan": {
-"url": "https://ckan-mcp-server.andy-pr.workers.dev/mcp",
-"type": "http"
-}
-}
-}
-```
-
-**Using local installation:**
+Most MCP-compatible tools accept a configuration like this:
 
 ```json
 {
@@ -189,593 +48,15 @@ Add to your User Settings or `.vscode/settings.json`:
 }
 ```
 
-
-
-Add to `~/.codex/config.toml`:
-
-**Using the hosted endpoint (no install):**
-
-```toml
-[mcp_servers.ckan]
-url = "https://ckan-mcp-server.andy-pr.workers.dev/mcp"
-```
-
-**Using local installation:**
-
-```toml
-[mcp_servers.ckan]
-command = "npx"
-args = ["-y", "@aborruso/ckan-mcp-server@latest"]
-```
-
----
-
-## 🖥️ Run locally
-
-### Option 1 — Install via npm
-
-The quickest way. Install the package globally and it's immediately available as a command:
-
-```bash
-npm install -g @aborruso/ckan-mcp-server
-```
-
-The server will be available as `ckan-mcp-server`, or you can run it without installing via:
-
-```bash
-npx @aborruso/ckan-mcp-server@latest
-```
-
-### Option 2 — Clone and build
-
-For development or if you want to run the latest unreleased code:
-
-```bash
-git clone https://github.com/ondata/ckan-mcp-server.git
-cd ckan-mcp-server
-npm install
-npm run build
-node dist/index.js
-```
+Works with ChatGPT, VS Code, Gemini CLI, Codex CLI, and more — see the full guide below.
 
 ---
-## 🛠️ Available Tools
 
-
-
-- **ckan_package_search**: Search datasets with Solr queries
-- **ckan_find_relevant_datasets**: Rank datasets by relevance score
-- **ckan_package_show**: Complete details of a dataset
-- **ckan_tag_list**: List tags with counts
-
-### Organizations
-
-- **ckan_organization_list**: List all organizations
-- **ckan_organization_show**: Details of an organization
-- **ckan_organization_search**: Search organizations by name
-
-### Groups
-
-- **ckan_group_list**: List groups
-- **ckan_group_show**: Show group details
-- **ckan_group_search**: Search groups by name
-
-### DataStore
-
-- **ckan_datastore_search**: Query tabular data
-- **ckan_datastore_search_sql**: SQL queries on DataStore
-
-### Quality Metrics
-
-- **ckan_get_mqa_quality**: Get MQA quality score and metrics for dati.gov.it datasets (accessibility, reusability, interoperability, findability)
-- **ckan_get_mqa_quality_details**: Get detailed MQA quality reasons and failing flags for dati.gov.it datasets
-
-### Utilities
-
-- **ckan_status_show**: Verify server status
-
----
-
-## 📎 MCP Resource Templates
-
-Direct data access via `ckan://` URI scheme:
-
-- `ckan://{server}/dataset/{id}` - Dataset metadata
-- `ckan://{server}/resource/{id}` - Resource metadata and download URL
-- `ckan://{server}/organization/{name}` - Organization details
-- `ckan://{server}/group/{name}/datasets` - Datasets by group (theme)
-- `ckan://{server}/organization/{name}/datasets` - Datasets by organization
-- `ckan://{server}/tag/{name}/datasets` - Datasets by tag
-- `ckan://{server}/format/{format}/datasets` - Datasets by resource format (res_format + distribution_format)
-
-Examples:
-
-```
-ckan://dati.gov.it/dataset/vaccini-covid
-ckan://demo.ckan.org/resource/abc-123
-ckan://data.gov/organization/sample-org
-ckan://dati.gov.it/group/ambiente/datasets
-ckan://dati.gov.it/organization/regione-toscana/datasets
-ckan://dati.gov.it/tag/turismo/datasets
-ckan://dati.gov.it/format/csv/datasets
-```
-
----
-
-## 💡 Usage Examples
-
-### A natural language conversation
-
-Once connected, just ask in plain language. No query syntax needed:
-
-> *"Search dati.gov.it for datasets about air quality in Milan, then summarize what each contains — time coverage, license, and best download format."*
-
-The server finds 31 datasets, groups them by structural pattern, and returns a clear summary — including series names, years covered, publisher, and format. No CKAN knowledge required.
-
----
-
-The examples below show natural language requests alongside the actual tool call the LLM will generate internally and send to the CKAN portal. You never write these queries yourself — they are shown here **to illustrate how your question gets translated under the hood**.
-
-### Search datasets (natural language: "search for population datasets")
-
-```typescript
-ckan_package_search({
-server_url: "https://www.dati.gov.it/opendata",
-q: "popolazione",
-rows: 20
-})
-```
-
-### Force text-field parser for long OR queries (natural language: "find hotel or accommodation datasets")
-
-```typescript
-ckan_package_search({
-server_url: "https://www.dati.gov.it/opendata",
-q: "hotel OR alberghi OR \"strutture ricettive\" OR ospitalità OR ricettività",
-query_parser: "text",
-rows: 0 // returns only the total count, no dataset records — useful to check how many results match before fetching them
-})
-```
-
-Note: when `query_parser: "text"` is used, Solr special characters in the query are escaped automatically.
-
-### Rank datasets by relevance (natural language: "find most relevant datasets about urban mobility")
-
-```typescript
-ckan_find_relevant_datasets({
-server_url: "https://www.dati.gov.it/opendata",
-query: "mobilità urbana",
-limit: 5
-})
-```
-
-### Filter by organization (natural language: "show recent datasets from Tuscany Region")
-
-```typescript
-ckan_package_search({
-server_url: "https://www.dati.gov.it/opendata",
-fq: "organization:regione-toscana",
-sort: "metadata_modified desc"
-})
-```
-
-### Get statistics with faceting (natural language: "show statistics by organization, tags and format")
-
-```typescript
-ckan_package_search({
-server_url: "https://www.dati.gov.it/opendata",
-facet_field: ["organization", "tags", "res_format"],
-rows: 0 // skip dataset records, return only the facet counts
-})
-```
-
-### List tags (natural language: "show top tags about health")
-
-```typescript
-ckan_tag_list({
-server_url: "https://www.dati.gov.it/opendata",
-tag_query: "salute",
-limit: 25
-})
-```
-
-### Search groups (natural language: "find groups about environment")
-
-```typescript
-ckan_group_search({
-server_url: "https://www.dati.gov.it/opendata",
-pattern: "ambiente"
-})
-```
-
-### DataStore Query (natural language: "query tabular data filtering by region and year")
-
-> **What is DataStore?** CKAN DataStore is an optional extension that imports tabular resources (CSV, Excel) into a queryable database. It allows filtering, sorting, and field selection directly on the data — without downloading the file. Not all portals have it enabled, and not all datasets use it even when the portal supports it. Check `datastore_active: true` on a resource to confirm availability.
-
-```typescript
-// Ordinanze viabili del Comune di Messina — resource with datastore_active: true
-ckan_datastore_search({
-server_url: "https://dati.comune.messina.it",
-resource_id: "17301b8b-2a5b-425f-80b0-5b75bb1793e9",
-filters: { "tipo": "lavori" },
-sort: "data_pubblicazione desc",
-limit: 10
-})
-```
-
-> 👏 A shout-out to [Comune di Messina](https://dati.comune.messina.it/) and all public administrations that enable the DataStore extension: by doing so, they make their data dramatically easier to query and explore — including through AI tools like this one.
-
-### DataStore SQL Query (natural language: "count road orders by type")
-
-```typescript
-// Count ordinanze viabili by tipo — Comune di Messina
-ckan_datastore_search_sql({
-server_url: "https://dati.comune.messina.it",
-sql: "SELECT tipo, COUNT(*) AS total FROM \"17301b8b-2a5b-425f-80b0-5b75bb1793e9\" GROUP BY tipo ORDER BY total DESC LIMIT 5"
-})
-```
-
----
-
-## 🌍 Supported CKAN Portals
-
-Some examples of supported portals:
-
-- 🇮🇹 **https://www.dati.gov.it/opendata** - Italian National Open Data Portal (CKAN 2.10.3)
-- 🇺🇸 **https://catalog.data.gov** - United States Open Data (CKAN 2.11.4)
-- 🇨🇦 **https://open.canada.ca/data** - Canada Open Government (CKAN 2.10.8)
-- 🇦🇺 **https://data.gov.au** - Australian Government Open Data (CKAN 2.11.4)
-- 🇬🇧 **https://data.gov.uk** - United Kingdom Open Data
-- And many more portals worldwide
-
----
-
-## 🔍 Advanced Solr Queries
-
-CKAN uses [Apache Solr](https://solr.apache.org/) as its default search engine. Understanding Solr syntax unlocks the full power of dataset search — from simple keywords to complex boolean expressions, fuzzy matching, proximity searches, and date math.
-
-### Basic syntax
-
-```
-# Basic search
-q: "popolazione"
-
-# Field search
-q: "title:popolazione"
-q: "notes:sanità"
-
-# Boolean operators
-q: "popolazione AND sicilia"
-q: "popolazione OR abitanti"
-q: "popolazione NOT censimento"
-
-# Filters (fq)
-fq: "organization:comune-palermo"
-fq: "tags:sanità"
-fq: "res_format:CSV"
-
-# Wildcard
-q: "popolaz*"
-
-# Date range
-fq: "metadata_modified:[2023-01-01T00:00:00Z TO *]"
-```
-
-### Advanced Query Examples
-
-These real-world examples demonstrate powerful Solr query combinations tested on the Italian open data portal (dati.gov.it):
-
-#### 1. Fuzzy Search + Date Math + Boosting (natural language: "find healthcare datasets modified in last 6 months")
-
-Find healthcare datasets (tolerating spelling errors) modified in the last 6 months, prioritizing title matches:
-
-```typescript
-ckan_package_search({
-server_url: "https://www.dati.gov.it/opendata",
-q: "(title:sanità~2^3 OR title:salute~2^3 OR notes:sanità~1) AND metadata_modified:[NOW-6MONTHS TO *]",
-sort: "score desc, metadata_modified desc",
-rows: 30
-})
-```
-
-**Techniques used**:
-
-- `sanità~2` - Fuzzy search with edit distance 2 (finds "sanita", "sanitá", minor typos)
-- `^3` - Boosts title matches 3x higher in relevance scoring
-- `NOW-6MONTHS` - Dynamic date math for rolling time windows
-- Combined boolean logic with multiple field searches
-
-**Results**: 949 datasets including hospital units, healthcare organizations, medical services
-
-#### 2. Proximity Search + Complex Boolean (natural language: "find air pollution datasets excluding water")
-
-Environmental datasets where "inquinamento" and "aria" (air pollution) appear close together, excluding water-related datasets:
-
-```typescript
-ckan_package_search({
-server_url: "https://www.dati.gov.it/opendata",
-q: "(notes:\"inquinamento aria\"~5 OR title:\"qualità aria\"~3) AND NOT (title:acqua OR title:mare)",
-facet_field: ["organization", "res_format"],
-rows: 25
-})
-```
-
-**Techniques used**:
-
-- `"inquinamento aria"~5` - Proximity search (words within 5 positions)
-- `~3` - Tighter proximity for title matches
-- `NOT (title:acqua OR title:mare)` - Exclude water/sea datasets
-- Faceting for statistical breakdown
-
-**Results**: 305 datasets
-
-#### 3. Wildcard + Field Existence + Date Math (natural language: "regional datasets with any format from last month")
-
-Regional datasets published in the last month that have at least one resource format declared:
-
-```typescript
-ckan_package_search({
-server_url: "https://www.dati.gov.it/opendata",
-q: "organization:regione* AND metadata_created:[NOW-1MONTH TO *] AND res_format:*",
-sort: "metadata_modified desc",
-facet_field: ["organization"],
-rows: 10
-})
-```
-
-**Techniques used**:
-
-- `regione*` - Wildcard matches all regional organizations
-- `res_format:*` - Field existence check (has at least one resource format declared)
-- `NOW-1MONTH` - Rolling 30-day window
-
-**Results**: 293 datasets
-
-#### 4. Explicit Date Range + Facets (natural language: "Ministry of Labour datasets updated in 2025")
-
-Datasets from the Italian Ministry of Labour modified during 2025, with facets by format and tags:
-
-```typescript
-ckan_package_search({
-server_url: "https://www.dati.gov.it/opendata",
-q: "organization:ministero-del-lavoro AND metadata_modified:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]",
-sort: "metadata_modified desc",
-facet_field: ["res_format", "tags"],
-rows: 10
-})
-```
-
-**Techniques used**:
-
-- `[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]` - Explicit date range (full year)
-- `organization:ministero-del-lavoro` - Filter by specific organization
-- Multiple facets for format and topic breakdown
-
-**Results**: 83 datasets
-
-### Solr Query Syntax Reference
-
-**Boolean Operators**: `AND`, `OR`, `NOT`, `+required`, `-excluded`
-**Wildcards**: `*` (multiple chars), `?` (single char) - Note: left truncation not supported
-**Fuzzy**: `~N` (edit distance), e.g., `health~2`
-**Proximity**: `"phrase"~N` (words within N positions)
-**Boosting**: `^N` (relevance multiplier), e.g., `title:water^2`
-**Ranges**:
-
-- Inclusive: `[a TO b]`, e.g., `num_resources:[5 TO 10]`
-- Exclusive: `{a TO b}`, e.g., `num_resources:{0 TO 100}`
-- Open-ended: `[2024-01-01T00:00:00Z TO *]`
-
-**Date Math**: `NOW`, `NOW-1YEAR`, `NOW-6MONTHS`, `NOW-7DAYS`, `NOW/DAY`
-**Field Existence**: `field:*` (field exists), `NOT field:*` (field missing)
-
----
-
-## 📅 Understanding date fields
-
-CKAN portals can be *source* catalogs (data published directly by the organization) or *harvesting aggregators* (data collected from many other portals). This distinction matters a lot when filtering by date.
-
-| Field | Meaning on source portal | Meaning on aggregator |
-|---|---|---|
-| `issued` | When the publisher released the dataset | When the publisher released the dataset |
-| `metadata_created` | When the record was first created | When the record was first harvested |
-| `metadata_modified` | When the record was last updated | When the record was last re-harvested |
-
-On an aggregator like `dati.gov.it`, `metadata_modified` is updated every time the portal re-harvests — even if the dataset content hasn't changed. This makes it unsuitable for finding "recently updated content".
-
-**Example — same dataset, three different timestamps on dati.gov.it (aggregator):**
-
-```json
-{
-"issued": "2024-12-10",
-"metadata_created": "2024-12-16",
-"metadata_modified": "2026-02-28"
-}
-```
-
-> `metadata_modified` is February 2026 only because the portal re-harvested it then — not because the data changed.
-
-**Which date fields are filterable on dati.gov.it?**
-
-All three fields are Solr-indexed and usable in queries:
-
-| Field | Solr-indexed | What queries return |
-|---|---|---|
-| `issued` | ✅ | Datasets by publisher release date — most meaningful, but ~14% of datasets lack it |
-| `metadata_created` | ✅ | Datasets by first harvesting date on dati.gov.it |
-| `metadata_modified` | ✅ | Datasets by last re-harvesting date — often noisy |
-
-**Query examples (dati.gov.it):**
-
-```typescript
-# Datasets about road accidents published by the original source in 2025
-ckan_package_search({
-server_url: "https://www.dati.gov.it/opendata",
-q: "incidenti stradali",
-fq: "issued:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]"
-})
-// → ~121 results (only datasets where publisher filled in `issued`)
-
-# Datasets first appearing on dati.gov.it in 2025
-ckan_package_search({
-server_url: "https://www.dati.gov.it/opendata",
-q: "incidenti stradali",
-fq: "metadata_created:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]"
-})
-// → ~164 results (includes older datasets harvested for the first time in 2025)
-```
-
-> **Note on `issued` coverage**: ~59,700 of 69,000+ datasets on dati.gov.it have `issued` populated. Queries on `issued` are accurate but incomplete — datasets without the field are silently excluded. Prefer `issued` for content-date queries; use `metadata_created` only as a fallback for "when did this appear on the portal".
-
-**Recommendation**: use `issued` to find datasets by publication date. Use `metadata_created` to find datasets that appeared on the portal recently.
-
----
-
-## 👩‍💻 Developer Reference
-
-### Project Structure
-
-```
-ckan-mcp-server/
-├── src/
-│ ├── index.ts # Entry point
-│ ├── server.ts # MCP server setup
-│ ├── worker.ts # Cloudflare Workers entry
-│ ├── types.ts # Types & schemas
-│ ├── utils/
-│ │ ├── http.ts # CKAN API client
-│ │ ├── formatting.ts # Output formatting
-│ │ └── url-generator.ts
-│ ├── tools/
-│ │ ├── package.ts # Package search/show
-│ │ ├── organization.ts # Organization tools
-│ │ ├── datastore.ts # DataStore queries
-│ │ ├── status.ts # Server status
-│ │ ├── tag.ts # Tag tools
-│ │ └── group.ts # Group tools
-│ ├── resources/ # MCP Resource Templates
-│ │ ├── index.ts
-│ │ ├── uri.ts
-│ │ ├── dataset.ts
-│ │ ├── resource.ts
-│ │ └── organization.ts
-│ ├── prompts/ # MCP Guided Prompts
-│ │ ├── index.ts
-│ │ ├── theme.ts
-│ │ ├── organization.ts
-│ │ ├── format.ts
-│ │ ├── recent.ts
-│ │ └── dataset-analysis.ts
-│ └── transport/
-│ ├── stdio.ts
-│ └── http.ts
-├── tests/ # Test suite
-├── dist/ # Compiled output (generated)
-├── package.json
-└── README.md
-```
-
-### Build & Test
-
-```bash
-# Build (esbuild, ~4ms)
-npm run build
-
-# Watch mode
-npm run watch
-
-# Run all tests
-npm test
-
-# Watch mode for tests
-npm run test:watch
-
-# Coverage report
-npm run test:coverage
-```
-
-### Explore with MCP Inspector
-
-The [MCP Inspector](https://github.com/modelcontextprotocol/inspector) lets you browse tools, test calls interactively, and debug responses in a web UI:
-
-```bash
-npm install -g @modelcontextprotocol/inspector
-npm run build
-npx @modelcontextprotocol/inspector node dist/index.js
-```
-
-Opens at `http://localhost:5173`.
-
-### Manual HTTP Testing
-
-```bash
-# Start server
-TRANSPORT=http PORT=3001 node dist/index.js
-
-# List available tools
-curl -s -X POST http://localhost:3001/mcp \
--H 'Content-Type: application/json' \
--H 'Accept: application/json, text/event-stream' \
--d '{"jsonrpc":"2.0","method":"tools/list","id":1}'
-
-# Call a tool
-curl -s -X POST http://localhost:3001/mcp \
--H 'Content-Type: application/json' \
--H 'Accept: application/json, text/event-stream' \
--d '{
-"jsonrpc":"2.0","method":"tools/call",
-"params":{"name":"ckan_package_search","arguments":{"server_url":"https://www.dati.gov.it/opendata","q":"ambiente","rows":3}},
-"id":1
-}' | jq -r '.result.content[0].text'
-```
-
-### Portal View URL Templates
-
-Some CKAN portals expose non-standard web URLs for viewing datasets or organizations. To support those cases, this project ships with [`src/portals.json`](src/portals.json), which maps known portal API URLs (and aliases) to custom view URL templates.
-
-When generating a dataset or organization view link, the server:
-
-- matches the `server_url` against `api_url` and `api_url_aliases` in [`src/portals.json`](src/portals.json)
-- uses the portal-specific `dataset_view_url` / `organization_view_url` template when available
-- falls back to the generic defaults (`{server_url}/dataset/{name}` and `{server_url}/organization/{name}`)
-
-### Troubleshooting
-
-**Wrong URL for Italian portal** — use `https://www.dati.gov.it/opendata` (not `https://dati.gov.it`).
-
-**Connection error**
-
-```
-Error: Server not found: https://example.gov
-```
-
-Verify the URL is reachable and use `ckan_status_show` to confirm the portal is responding.
-
-**No results** — broaden your query or check what's available with facets:
-
-```typescript
-ckan_package_search({
-server_url: "https://www.dati.gov.it/opendata",
-q: "*:*",
-facet_field: ["tags", "organization"],
-rows: 0
-})
-```
-
----
-
-## 🆘 Support
-
-For issues or questions, [open an issue on GitHub](https://github.com/ondata/ckan-mcp-server/issues/new/choose).
-
----
+## Full documentation
 
-
+Complete setup guides, all available tools, usage examples, and advanced Solr query reference:
 
-
-- [CKAN API Documentation](https://docs.ckan.org/en/latest/api/) — full reference for the CKAN API v3
-- [DCAT Vocabulary (W3C)](https://www.w3.org/TR/vocab-dcat/) — the metadata standard used by CKAN portals to describe datasets
-- [MCP Protocol](https://modelcontextprotocol.io/) — Model Context Protocol specification
+👉 **[Full README on GitHub](https://github.com/ondata/ckan-mcp-server#readme)**
 
 ---
 
|
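The "Manual HTTP Testing" section removed in this diff calls the server with a JSON-RPC 2.0 `tools/call` request. As a rough sketch, the same payload can be built in TypeScript as shown below. The snippet only constructs and prints the request body and sends nothing over the network. `ToolCallRequest` and `buildToolCall` are hypothetical names introduced here for illustration; they are not part of the package.

```typescript
// Hypothetical type for illustration: the JSON-RPC 2.0 body used by the
// removed curl example ("method": "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
  id: number;
}

// Hypothetical helper: wraps a tool name and its arguments in a request body.
function buildToolCall(
  name: string,
  args: Record<string, unknown>,
  id: number = 1
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    method: "tools/call",
    params: { name, arguments: args },
    id,
  };
}

// Same tool name and arguments as the removed curl example.
const request = buildToolCall("ckan_package_search", {
  server_url: "https://www.dati.gov.it/opendata",
  q: "ambiente",
  rows: 3,
});

// Print the body that would be POSTed to the /mcp endpoint.
console.log(JSON.stringify(request, null, 2));
```

The serialized body could then be sent to a locally running server with the `Content-Type` and `Accept` headers from the removed curl example.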