@aborruso/ckan-mcp-server 0.4.54 → 0.4.56

Files changed (3)
  1. package/README.md +17 -736
  2. package/package.json +4 -2
  3. package/README.bak.md +0 -760
package/README.md CHANGED
@@ -5,178 +5,37 @@
5
5
 
6
6
  # CKAN MCP Server
7
7
 
8
- *Turn any (CKAN) open data portal into a conversation.*
8
+ *Turn any CKAN open data portal into a conversation.*
9
9
 
10
- **Give your AI assistant direct access to any CKAN open data portal: search datasets, explore organizations, query tabular data, and read metadata, all through natural language.**
10
+ Most public open data portals worldwide (Italy's dati.gov.it, the US data.gov, Canada's open.canada.ca, and hundreds more) run on [CKAN](https://ckan.org/), an open-source platform with a fully documented public API. Navigating these portals usually requires knowing their structure, search syntax, and APIs.
11
11
 
12
- CKAN is the open-source platform behind most public open data portals worldwide (Italy's dati.gov.it, the US data.gov, Canada's open.canada.ca, and many more). Navigating these portals usually requires knowing their structure, APIs, and search syntax. This MCP server removes that barrier: once connected, your AI tool can do it all for you.
12
+ This MCP server removes that barrier. Once connected, your AI assistant can search datasets, explore organizations, query tabular data, and read metadata, all through natural language. No CKAN knowledge required.
13
13
 
14
- **This is possible because of open standards and open source.** CKAN exposes a fully documented, public API. Metadata follows [DCAT](https://www.w3.org/TR/vocab-dcat/), an open W3C standard for describing datasets. Both are free to use, free to build on, and maintained by open communities. This server stands on that foundation.
15
-
16
- **Who is this for?** Everyone. Journalists looking for data to verify a story. Researchers exploring public datasets. Public servants checking what data their administration publishes. Developers building data pipelines. No CKAN knowledge required.
17
-
18
- **Two ways to use it — pick the one that suits you:**
19
-
20
- | | Option A: Install locally | Option B: No install |
21
- |---|---|---|
22
- | **How** | `npm install -g @aborruso/ckan-mcp-server` | Point your tool to the hosted HTTP endpoint |
23
- | **Best for** | Runs on your machine, works with any local tool | Quick start, zero setup |
24
- | **Limits** | None | 100k requests/day shared quota |
25
-
26
- Hosted endpoint: `https://ckan-mcp-server.andy-pr.workers.dev/mcp`
27
-
28
- > **Recommendation**: Option B is a great way to get started and try things out without any setup. Once you're familiar with what the server can do, switching to Option A (local install) gives you unlimited usage with no shared quotas.
29
-
30
- 👉 Want to explore the codebase? The [**AI-generated DeepWiki**](https://deepwiki.com/ondata/ckan-mcp-server) is a great starting point.
31
-
32
- **License**: MIT — see [LICENSE.txt](LICENSE.txt) for complete details. Third-party notices: [NOTICE.md](NOTICE.md).
33
-
34
- ![CKAN MCP Server demo](docs/guide/mcp_server_demo.gif)
14
+ **This is possible because of open standards and open source.** CKAN exposes a public API. Metadata follows [DCAT](https://www.w3.org/TR/vocab-dcat/), an open W3C standard. Both are free to use and maintained by open communities. This server stands on that foundation.
35
15
 
36
16
  ---
37
17
 
38
- ## 🔌 Use it in your favorite tool
39
-
40
- [ChatGPT](#chatgpt) | [Claude Desktop](#claude-desktop) | [Claude Code](#claude-code) | [Gemini CLI](#gemini-cli) | [VS Code](#vs-code) | [Codex CLI](#codex-cli)
41
-
42
- This server works with any MCP-compatible client. The sections below cover some of the most popular ones — if your tool isn't listed, check its documentation for MCP configuration and use the same endpoint URL or command.
43
-
44
- All examples below work with **both** the local installation and the hosted endpoint. Where both options differ, both are shown.
45
-
46
- > **Using local installation?** You need to install the server first — see [Run locally](#run-locally).
47
-
48
- ### ChatGPT
49
-
50
- > Requires a ChatGPT Plus, Team, or Enterprise plan.
51
-
52
- 1. Open the profile menu and go to **Settings → Apps → Advanced settings**
53
- 2. Enable **Developer mode**
54
- 3. Click **Create app** (top-right)
55
- 4. Fill in the form:
56
- - **Name:** CKAN MCP Server
57
- - **Description:** Search datasets on CKAN open data portals
58
- - **MCP Server URL:** `https://ckan-mcp-server.andy-pr.workers.dev/mcp`
59
- - **Authentication:** No Auth
60
- - Check the confirmation box, then click **Create**
61
- 5. In a new chat, click **+** → **More** and select **CKAN MCP Server**
62
-
63
- > For a step-by-step walkthrough with screenshots, see the [full ChatGPT guide](docs/guide/chatgpt/chatgpt_web.md).
64
-
65
- ### Claude Desktop
66
-
67
- **Using the hosted endpoint (no install) — via connector UI:**
68
-
69
- 1. Open Claude Desktop and go to **Settings → Integrations**
70
- 2. Click **Add custom integration**
71
- 3. Fill in the details:
72
- - **Name:** CKAN MCP Server
73
- - **MCP Server URL:** `https://ckan-mcp-server.andy-pr.workers.dev/mcp`
74
- 4. Click **Add** to save
75
- 5. Open a new chat, click **+**, select **Integrations**, and enable **CKAN MCP Server**
76
- 6. When Claude asks to use a tool, click **Allow** (or **Always allow**)
18
+ ## Quick Start
77
19
 
78
- > For a detailed walkthrough with screenshots, see the [full Claude guide](docs/guide/claude/claude_web.md).
79
-
80
- **Using the hosted endpoint (no install) — via config file:**
81
-
82
- Configuration file location:
83
-
84
- - **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
85
- - **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
86
- - **Linux**: `~/.config/Claude/claude_desktop_config.json`
87
-
88
- ```json
89
- {
90
- "mcpServers": {
91
- "ckan": {
92
- "url": "https://ckan-mcp-server.andy-pr.workers.dev/mcp"
93
- }
94
- }
95
- }
96
- ```
97
-
98
- **Using local installation:**
99
-
100
- ```json
101
- {
102
- "mcpServers": {
103
- "ckan": {
104
- "command": "npx",
105
- "args": ["@aborruso/ckan-mcp-server@latest"]
106
- }
107
- }
108
- }
109
- ```
110
-
111
- ### Claude Code
112
-
113
- **Using the hosted endpoint (no install):**
114
-
115
- ```bash
116
- claude mcp add --transport http --scope user ckan https://ckan-mcp-server.andy-pr.workers.dev/mcp
117
- ```
118
-
119
- **Using local installation:**
20
+ **Option A (install locally, recommended):**
120
21
 
121
22
  ```bash
122
- claude mcp add --scope user ckan npx @aborruso/ckan-mcp-server@latest
123
- ```
124
-
125
- > `--scope user` makes the server available globally across all your projects, not just the current one.
126
-
127
- To add it only for a specific project, run from the project folder without the `--scope user` flag:
128
-
129
- ```bash
130
- claude mcp add --transport http ckan https://ckan-mcp-server.andy-pr.workers.dev/mcp
23
+ npm install -g @aborruso/ckan-mcp-server
131
24
  ```
132
25
 
133
- ### Gemini CLI
26
+ **Option B — No install**, use the hosted endpoint directly:
134
27
 
135
- Add to `~/.gemini/settings.json`:
136
-
137
- **Using the hosted endpoint (no install):**
138
-
139
- ```json
140
- {
141
- "mcpServers": {
142
- "ckan": {
143
- "httpUrl": "https://ckan-mcp-server.andy-pr.workers.dev/mcp"
144
- }
145
- }
146
- }
147
28
  ```
148
-
149
- **Using local installation:**
150
-
151
- ```json
152
- {
153
- "mcpServers": {
154
- "ckan": {
155
- "command": "npx",
156
- "args": ["@aborruso/ckan-mcp-server@latest"]
157
- }
158
- }
159
- }
29
+ https://ckan-mcp-server.andy-pr.workers.dev/mcp
160
30
  ```
161
31
 
162
- ### VS Code
32
+ > The hosted endpoint has a 100k requests/day shared quota. Local installation has no limits.
163
33
 
164
- Add to your User Settings or `.vscode/settings.json`:
34
+ ---
165
35
 
166
- **Using the hosted endpoint (no install):**
36
+ ## Connect to your AI tool
167
37
 
168
- ```json
169
- {
170
- "mcpServers": {
171
- "ckan": {
172
- "url": "https://ckan-mcp-server.andy-pr.workers.dev/mcp",
173
- "type": "http"
174
- }
175
- }
176
- }
177
- ```
178
-
179
- **Using local installation:**
38
+ Most MCP-compatible tools accept a configuration like this:
180
39
 
181
40
  ```json
182
41
  {
@@ -189,593 +48,15 @@ Add to your User Settings or `.vscode/settings.json`:
189
48
  }
190
49
  ```
191
50
 
192
- ### Codex CLI
193
-
194
- Add to `~/.codex/config.toml`:
195
-
196
- **Using the hosted endpoint (no install):**
197
-
198
- ```toml
199
- [mcp_servers.ckan]
200
- url = "https://ckan-mcp-server.andy-pr.workers.dev/mcp"
201
- ```
202
-
203
- **Using local installation:**
204
-
205
- ```toml
206
- [mcp_servers.ckan]
207
- command = "npx"
208
- args = ["-y", "@aborruso/ckan-mcp-server@latest"]
209
- ```
210
-
211
- ---
212
-
213
- ## 🖥️ Run locally
214
-
215
- ### Option 1 — Install via npm
216
-
217
- The quickest way. Install the package globally and it's immediately available as a command:
218
-
219
- ```bash
220
- npm install -g @aborruso/ckan-mcp-server
221
- ```
222
-
223
- The server will be available as `ckan-mcp-server`, or you can run it without installing via:
224
-
225
- ```bash
226
- npx @aborruso/ckan-mcp-server@latest
227
- ```
228
-
229
- ### Option 2 — Clone and build
230
-
231
- For development or if you want to run the latest unreleased code:
232
-
233
- ```bash
234
- git clone https://github.com/ondata/ckan-mcp-server.git
235
- cd ckan-mcp-server
236
- npm install
237
- npm run build
238
- node dist/index.js
239
- ```
51
+ Works with ChatGPT, VS Code, Gemini CLI, Codex CLI, and more — see the full guide below.
240
52
 
241
53
  ---
242
- ## 🛠️ Available Tools
243
54
 
244
- ### Search and Discovery
245
-
246
- - **ckan_package_search**: Search datasets with Solr queries
247
- - **ckan_find_relevant_datasets**: Rank datasets by relevance score
248
- - **ckan_package_show**: Complete details of a dataset
249
- - **ckan_tag_list**: List tags with counts
250
-
251
- ### Organizations
252
-
253
- - **ckan_organization_list**: List all organizations
254
- - **ckan_organization_show**: Details of an organization
255
- - **ckan_organization_search**: Search organizations by name
256
-
257
- ### Groups
258
-
259
- - **ckan_group_list**: List groups
260
- - **ckan_group_show**: Show group details
261
- - **ckan_group_search**: Search groups by name
262
-
263
- ### DataStore
264
-
265
- - **ckan_datastore_search**: Query tabular data
266
- - **ckan_datastore_search_sql**: SQL queries on DataStore
267
-
268
- ### Quality Metrics
269
-
270
- - **ckan_get_mqa_quality**: Get MQA quality score and metrics for dati.gov.it datasets (accessibility, reusability, interoperability, findability)
271
- - **ckan_get_mqa_quality_details**: Get detailed MQA quality reasons and failing flags for dati.gov.it datasets
272
-
273
- ### Utilities
274
-
275
- - **ckan_status_show**: Verify server status
276
-
277
- ---
278
-
279
- ## 📎 MCP Resource Templates
280
-
281
- Direct data access via `ckan://` URI scheme:
282
-
283
- - `ckan://{server}/dataset/{id}` - Dataset metadata
284
- - `ckan://{server}/resource/{id}` - Resource metadata and download URL
285
- - `ckan://{server}/organization/{name}` - Organization details
286
- - `ckan://{server}/group/{name}/datasets` - Datasets by group (theme)
287
- - `ckan://{server}/organization/{name}/datasets` - Datasets by organization
288
- - `ckan://{server}/tag/{name}/datasets` - Datasets by tag
289
- - `ckan://{server}/format/{format}/datasets` - Datasets by resource format (res_format + distribution_format)
290
-
291
- Examples:
292
-
293
- ```
294
- ckan://dati.gov.it/dataset/vaccini-covid
295
- ckan://demo.ckan.org/resource/abc-123
296
- ckan://data.gov/organization/sample-org
297
- ckan://dati.gov.it/group/ambiente/datasets
298
- ckan://dati.gov.it/organization/regione-toscana/datasets
299
- ckan://dati.gov.it/tag/turismo/datasets
300
- ckan://dati.gov.it/format/csv/datasets
301
- ```
302
-
303
- ---
304
-
305
- ## 💡 Usage Examples
306
-
307
- ### A natural language conversation
308
-
309
- Once connected, just ask in plain language. No query syntax needed:
310
-
311
- > *"Search dati.gov.it for datasets about air quality in Milan, then summarize what each contains — time coverage, license, and best download format."*
312
-
313
- The server finds 31 datasets, groups them by structural pattern, and returns a clear summary — including series names, years covered, publisher, and format. No CKAN knowledge required.
314
-
315
- ---
316
-
317
- The examples below show natural language requests alongside the actual tool call the LLM will generate internally and send to the CKAN portal. You never write these queries yourself — they are shown here **to illustrate how your question gets translated under the hood**.
318
-
319
- ### Search datasets (natural language: "search for population datasets")
320
-
321
- ```typescript
322
- ckan_package_search({
323
- server_url: "https://www.dati.gov.it/opendata",
324
- q: "popolazione",
325
- rows: 20
326
- })
327
- ```
328
-
329
- ### Force text-field parser for long OR queries (natural language: "find hotel or accommodation datasets")
330
-
331
- ```typescript
332
- ckan_package_search({
333
- server_url: "https://www.dati.gov.it/opendata",
334
- q: "hotel OR alberghi OR \"strutture ricettive\" OR ospitalità OR ricettività",
335
- query_parser: "text",
336
- rows: 0 // returns only the total count, no dataset records — useful to check how many results match before fetching them
337
- })
338
- ```
339
-
340
- Note: when `query_parser: "text"` is used, Solr special characters in the query are escaped automatically.
341
-
342
- ### Rank datasets by relevance (natural language: "find most relevant datasets about urban mobility")
343
-
344
- ```typescript
345
- ckan_find_relevant_datasets({
346
- server_url: "https://www.dati.gov.it/opendata",
347
- query: "mobilità urbana",
348
- limit: 5
349
- })
350
- ```
351
-
352
- ### Filter by organization (natural language: "show recent datasets from Tuscany Region")
353
-
354
- ```typescript
355
- ckan_package_search({
356
- server_url: "https://www.dati.gov.it/opendata",
357
- fq: "organization:regione-toscana",
358
- sort: "metadata_modified desc"
359
- })
360
- ```
361
-
362
- ### Get statistics with faceting (natural language: "show statistics by organization, tags and format")
363
-
364
- ```typescript
365
- ckan_package_search({
366
- server_url: "https://www.dati.gov.it/opendata",
367
- facet_field: ["organization", "tags", "res_format"],
368
- rows: 0 // skip dataset records, return only the facet counts
369
- })
370
- ```
371
-
372
- ### List tags (natural language: "show top tags about health")
373
-
374
- ```typescript
375
- ckan_tag_list({
376
- server_url: "https://www.dati.gov.it/opendata",
377
- tag_query: "salute",
378
- limit: 25
379
- })
380
- ```
381
-
382
- ### Search groups (natural language: "find groups about environment")
383
-
384
- ```typescript
385
- ckan_group_search({
386
- server_url: "https://www.dati.gov.it/opendata",
387
- pattern: "ambiente"
388
- })
389
- ```
390
-
391
- ### DataStore Query (natural language: "query tabular data filtering by region and year")
392
-
393
- > **What is DataStore?** CKAN DataStore is an optional extension that imports tabular resources (CSV, Excel) into a queryable database. It allows filtering, sorting, and field selection directly on the data — without downloading the file. Not all portals have it enabled, and not all datasets use it even when the portal supports it. Check `datastore_active: true` on a resource to confirm availability.
394
-
395
- ```typescript
396
- // Ordinanze viabili del Comune di Messina — resource with datastore_active: true
397
- ckan_datastore_search({
398
- server_url: "https://dati.comune.messina.it",
399
- resource_id: "17301b8b-2a5b-425f-80b0-5b75bb1793e9",
400
- filters: { "tipo": "lavori" },
401
- sort: "data_pubblicazione desc",
402
- limit: 10
403
- })
404
- ```
405
-
406
- > 👏 A shout-out to [Comune di Messina](https://dati.comune.messina.it/) and all public administrations that enable the DataStore extension: by doing so, they make their data dramatically easier to query and explore — including through AI tools like this one.
407
-
408
- ### DataStore SQL Query (natural language: "count road orders by type")
409
-
410
- ```typescript
411
- // Count ordinanze viabili by tipo — Comune di Messina
412
- ckan_datastore_search_sql({
413
- server_url: "https://dati.comune.messina.it",
414
- sql: "SELECT tipo, COUNT(*) AS total FROM \"17301b8b-2a5b-425f-80b0-5b75bb1793e9\" GROUP BY tipo ORDER BY total DESC LIMIT 5"
415
- })
416
- ```
417
-
418
- ---
419
-
420
- ## 🌍 Supported CKAN Portals
421
-
422
- Some examples of supported portals:
423
-
424
- - 🇮🇹 **https://www.dati.gov.it/opendata** - Italian National Open Data Portal (CKAN 2.10.3)
425
- - 🇺🇸 **https://catalog.data.gov** - United States Open Data (CKAN 2.11.4)
426
- - 🇨🇦 **https://open.canada.ca/data** - Canada Open Government (CKAN 2.10.8)
427
- - 🇦🇺 **https://data.gov.au** - Australian Government Open Data (CKAN 2.11.4)
428
- - 🇬🇧 **https://data.gov.uk** - United Kingdom Open Data
429
- - And many more portals worldwide
430
-
431
- ---
432
-
433
- ## 🔍 Advanced Solr Queries
434
-
435
- CKAN uses [Apache Solr](https://solr.apache.org/) as its default search engine. Understanding Solr syntax unlocks the full power of dataset search — from simple keywords to complex boolean expressions, fuzzy matching, proximity searches, and date math.
436
-
437
- ### Basic syntax
438
-
439
- ```
440
- # Basic search
441
- q: "popolazione"
442
-
443
- # Field search
444
- q: "title:popolazione"
445
- q: "notes:sanità"
446
-
447
- # Boolean operators
448
- q: "popolazione AND sicilia"
449
- q: "popolazione OR abitanti"
450
- q: "popolazione NOT censimento"
451
-
452
- # Filters (fq)
453
- fq: "organization:comune-palermo"
454
- fq: "tags:sanità"
455
- fq: "res_format:CSV"
456
-
457
- # Wildcard
458
- q: "popolaz*"
459
-
460
- # Date range
461
- fq: "metadata_modified:[2023-01-01T00:00:00Z TO *]"
462
- ```
463
-
464
- ### Advanced Query Examples
465
-
466
- These real-world examples demonstrate powerful Solr query combinations tested on the Italian open data portal (dati.gov.it):
467
-
468
- #### 1. Fuzzy Search + Date Math + Boosting (natural language: "find healthcare datasets modified in last 6 months")
469
-
470
- Find healthcare datasets (tolerating spelling errors) modified in the last 6 months, prioritizing title matches:
471
-
472
- ```typescript
473
- ckan_package_search({
474
- server_url: "https://www.dati.gov.it/opendata",
475
- q: "(title:sanità~2^3 OR title:salute~2^3 OR notes:sanità~1) AND metadata_modified:[NOW-6MONTHS TO *]",
476
- sort: "score desc, metadata_modified desc",
477
- rows: 30
478
- })
479
- ```
480
-
481
- **Techniques used**:
482
-
483
- - `sanità~2` - Fuzzy search with edit distance 2 (finds "sanita", "sanitá", minor typos)
484
- - `^3` - Boosts title matches 3x higher in relevance scoring
485
- - `NOW-6MONTHS` - Dynamic date math for rolling time windows
486
- - Combined boolean logic with multiple field searches
487
-
488
- **Results**: 949 datasets including hospital units, healthcare organizations, medical services
489
-
490
- #### 2. Proximity Search + Complex Boolean (natural language: "find air pollution datasets excluding water")
491
-
492
- Environmental datasets where "inquinamento" and "aria" (air pollution) appear close together, excluding water-related datasets:
493
-
494
- ```typescript
495
- ckan_package_search({
496
- server_url: "https://www.dati.gov.it/opendata",
497
- q: "(notes:\"inquinamento aria\"~5 OR title:\"qualità aria\"~3) AND NOT (title:acqua OR title:mare)",
498
- facet_field: ["organization", "res_format"],
499
- rows: 25
500
- })
501
- ```
502
-
503
- **Techniques used**:
504
-
505
- - `"inquinamento aria"~5` - Proximity search (words within 5 positions)
506
- - `~3` - Tighter proximity for title matches
507
- - `NOT (title:acqua OR title:mare)` - Exclude water/sea datasets
508
- - Faceting for statistical breakdown
509
-
510
- **Results**: 305 datasets
511
-
512
- #### 3. Wildcard + Field Existence + Date Math (natural language: "regional datasets with any format from last month")
513
-
514
- Regional datasets published in the last month that have at least one resource format declared:
515
-
516
- ```typescript
517
- ckan_package_search({
518
- server_url: "https://www.dati.gov.it/opendata",
519
- q: "organization:regione* AND metadata_created:[NOW-1MONTH TO *] AND res_format:*",
520
- sort: "metadata_modified desc",
521
- facet_field: ["organization"],
522
- rows: 10
523
- })
524
- ```
525
-
526
- **Techniques used**:
527
-
528
- - `regione*` - Wildcard matches all regional organizations
529
- - `res_format:*` - Field existence check (has at least one resource format declared)
530
- - `NOW-1MONTH` - Rolling 30-day window
531
-
532
- **Results**: 293 datasets
533
-
534
- #### 4. Explicit Date Range + Facets (natural language: "Ministry of Labour datasets updated in 2025")
535
-
536
- Datasets from the Italian Ministry of Labour modified during 2025, with facets by format and tags:
537
-
538
- ```typescript
539
- ckan_package_search({
540
- server_url: "https://www.dati.gov.it/opendata",
541
- q: "organization:ministero-del-lavoro AND metadata_modified:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]",
542
- sort: "metadata_modified desc",
543
- facet_field: ["res_format", "tags"],
544
- rows: 10
545
- })
546
- ```
547
-
548
- **Techniques used**:
549
-
550
- - `[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]` - Explicit date range (full year)
551
- - `organization:ministero-del-lavoro` - Filter by specific organization
552
- - Multiple facets for format and topic breakdown
553
-
554
- **Results**: 83 datasets
555
-
556
- ### Solr Query Syntax Reference
557
-
558
- **Boolean Operators**: `AND`, `OR`, `NOT`, `+required`, `-excluded`
559
- **Wildcards**: `*` (multiple chars), `?` (single char) - Note: left truncation not supported
560
- **Fuzzy**: `~N` (edit distance), e.g., `health~2`
561
- **Proximity**: `"phrase"~N` (words within N positions)
562
- **Boosting**: `^N` (relevance multiplier), e.g., `title:water^2`
563
- **Ranges**:
564
-
565
- - Inclusive: `[a TO b]`, e.g., `num_resources:[5 TO 10]`
566
- - Exclusive: `{a TO b}`, e.g., `num_resources:{0 TO 100}`
567
- - Open-ended: `[2024-01-01T00:00:00Z TO *]`
568
-
569
- **Date Math**: `NOW`, `NOW-1YEAR`, `NOW-6MONTHS`, `NOW-7DAYS`, `NOW/DAY`
570
- **Field Existence**: `field:*` (field exists), `NOT field:*` (field missing)
571
-
572
- ---
573
-
574
- ## 📅 Understanding date fields
575
-
576
- CKAN portals can be *source* catalogs (data published directly by the organization) or *harvesting aggregators* (data collected from many other portals). This distinction matters a lot when filtering by date.
577
-
578
- | Field | Meaning on source portal | Meaning on aggregator |
579
- |---|---|---|
580
- | `issued` | When the publisher released the dataset | When the publisher released the dataset |
581
- | `metadata_created` | When the record was first created | When the record was first harvested |
582
- | `metadata_modified` | When the record was last updated | When the record was last re-harvested |
583
-
584
- On an aggregator like `dati.gov.it`, `metadata_modified` is updated every time the portal re-harvests — even if the dataset content hasn't changed. This makes it unsuitable for finding "recently updated content".
585
-
586
- **Example — same dataset, three different timestamps on dati.gov.it (aggregator):**
587
-
588
- ```json
589
- {
590
- "issued": "2024-12-10",
591
- "metadata_created": "2024-12-16",
592
- "metadata_modified": "2026-02-28"
593
- }
594
- ```
595
-
596
- > `metadata_modified` is February 2026 only because the portal re-harvested it then — not because the data changed.
597
-
598
- **Which date fields are filterable on dati.gov.it?**
599
-
600
- All three fields are Solr-indexed and usable in queries:
601
-
602
- | Field | Solr-indexed | What queries return |
603
- |---|---|---|
604
- | `issued` | ✅ | Datasets by publisher release date — most meaningful, but ~14% of datasets lack it |
605
- | `metadata_created` | ✅ | Datasets by first harvesting date on dati.gov.it |
606
- | `metadata_modified` | ✅ | Datasets by last re-harvesting date — often noisy |
607
-
608
- **Query examples (dati.gov.it):**
609
-
610
- ```typescript
611
- // Datasets about road accidents published by the original source in 2025
612
- ckan_package_search({
613
- server_url: "https://www.dati.gov.it/opendata",
614
- q: "incidenti stradali",
615
- fq: "issued:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]"
616
- })
617
- // → ~121 results (only datasets where publisher filled in `issued`)
618
-
619
- // Datasets first appearing on dati.gov.it in 2025
620
- ckan_package_search({
621
- server_url: "https://www.dati.gov.it/opendata",
622
- q: "incidenti stradali",
623
- fq: "metadata_created:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]"
624
- })
625
- // → ~164 results (includes older datasets harvested for the first time in 2025)
626
- ```
627
-
628
- > **Note on `issued` coverage**: ~59,700 of 69,000+ datasets on dati.gov.it have `issued` populated. Queries on `issued` are accurate but incomplete — datasets without the field are silently excluded. Prefer `issued` for content-date queries; use `metadata_created` only as a fallback for "when did this appear on the portal".
629
-
630
- **Recommendation**: use `issued` to find datasets by publication date. Use `metadata_created` to find datasets that appeared on the portal recently.
631
-
632
- ---
633
-
634
- ## 👩‍💻 Developer Reference
635
-
636
- ### Project Structure
637
-
638
- ```
639
- ckan-mcp-server/
640
- ├── src/
641
- │ ├── index.ts # Entry point
642
- │ ├── server.ts # MCP server setup
643
- │ ├── worker.ts # Cloudflare Workers entry
644
- │ ├── types.ts # Types & schemas
645
- │ ├── utils/
646
- │ │ ├── http.ts # CKAN API client
647
- │ │ ├── formatting.ts # Output formatting
648
- │ │ └── url-generator.ts
649
- │ ├── tools/
650
- │ │ ├── package.ts # Package search/show
651
- │ │ ├── organization.ts # Organization tools
652
- │ │ ├── datastore.ts # DataStore queries
653
- │ │ ├── status.ts # Server status
654
- │ │ ├── tag.ts # Tag tools
655
- │ │ └── group.ts # Group tools
656
- │ ├── resources/ # MCP Resource Templates
657
- │ │ ├── index.ts
658
- │ │ ├── uri.ts
659
- │ │ ├── dataset.ts
660
- │ │ ├── resource.ts
661
- │ │ └── organization.ts
662
- │ ├── prompts/ # MCP Guided Prompts
663
- │ │ ├── index.ts
664
- │ │ ├── theme.ts
665
- │ │ ├── organization.ts
666
- │ │ ├── format.ts
667
- │ │ ├── recent.ts
668
- │ │ └── dataset-analysis.ts
669
- │ └── transport/
670
- │ ├── stdio.ts
671
- │ └── http.ts
672
- ├── tests/ # Test suite
673
- ├── dist/ # Compiled output (generated)
674
- ├── package.json
675
- └── README.md
676
- ```
677
-
678
- ### Build & Test
679
-
680
- ```bash
681
- # Build (esbuild, ~4ms)
682
- npm run build
683
-
684
- # Watch mode
685
- npm run watch
686
-
687
- # Run all tests
688
- npm test
689
-
690
- # Watch mode for tests
691
- npm run test:watch
692
-
693
- # Coverage report
694
- npm run test:coverage
695
- ```
696
-
697
- ### Explore with MCP Inspector
698
-
699
- The [MCP Inspector](https://github.com/modelcontextprotocol/inspector) lets you browse tools, test calls interactively, and debug responses in a web UI:
700
-
701
- ```bash
702
- npm install -g @modelcontextprotocol/inspector
703
- npm run build
704
- npx @modelcontextprotocol/inspector node dist/index.js
705
- ```
706
-
707
- Opens at `http://localhost:5173`.
708
-
709
- ### Manual HTTP Testing
710
-
711
- ```bash
712
- # Start server
713
- TRANSPORT=http PORT=3001 node dist/index.js
714
-
715
- # List available tools
716
- curl -s -X POST http://localhost:3001/mcp \
717
- -H 'Content-Type: application/json' \
718
- -H 'Accept: application/json, text/event-stream' \
719
- -d '{"jsonrpc":"2.0","method":"tools/list","id":1}'
720
-
721
- # Call a tool
722
- curl -s -X POST http://localhost:3001/mcp \
723
- -H 'Content-Type: application/json' \
724
- -H 'Accept: application/json, text/event-stream' \
725
- -d '{
726
- "jsonrpc":"2.0","method":"tools/call",
727
- "params":{"name":"ckan_package_search","arguments":{"server_url":"https://www.dati.gov.it/opendata","q":"ambiente","rows":3}},
728
- "id":1
729
- }' | jq -r '.result.content[0].text'
730
- ```
731
-
732
- ### Portal View URL Templates
733
-
734
- Some CKAN portals expose non-standard web URLs for viewing datasets or organizations. To support those cases, this project ships with [`src/portals.json`](src/portals.json), which maps known portal API URLs (and aliases) to custom view URL templates.
735
-
736
- When generating a dataset or organization view link, the server:
737
-
738
- - matches the `server_url` against `api_url` and `api_url_aliases` in [`src/portals.json`](src/portals.json)
739
- - uses the portal-specific `dataset_view_url` / `organization_view_url` template when available
740
- - falls back to the generic defaults (`{server_url}/dataset/{name}` and `{server_url}/organization/{name}`)
741
-
742
- ### Troubleshooting
743
-
744
- **Wrong URL for Italian portal** — use `https://www.dati.gov.it/opendata` (not `https://dati.gov.it`).
745
-
746
- **Connection error**
747
-
748
- ```
749
- Error: Server not found: https://example.gov
750
- ```
751
-
752
- Verify the URL is reachable and use `ckan_status_show` to confirm the portal is responding.
753
-
754
- **No results** — broaden your query or check what's available with facets:
755
-
756
- ```typescript
757
- ckan_package_search({
758
- server_url: "https://www.dati.gov.it/opendata",
759
- q: "*:*",
760
- facet_field: ["tags", "organization"],
761
- rows: 0
762
- })
763
- ```
764
-
765
- ---
766
-
767
- ## 🆘 Support
768
-
769
- For issues or questions, [open an issue on GitHub](https://github.com/ondata/ckan-mcp-server/issues/new/choose).
770
-
771
- ---
55
+ ## Full documentation
772
56
 
773
- ## 🔗 Useful Links
57
+ Complete setup guides, all available tools, usage examples, and advanced Solr query reference:
774
58
 
775
- - [CKAN](https://ckan.org/) — the open-source platform behind most public open data portals
776
- - [CKAN API Documentation](https://docs.ckan.org/en/latest/api/) — full reference for the CKAN API v3
777
- - [DCAT Vocabulary (W3C)](https://www.w3.org/TR/vocab-dcat/) — the metadata standard used by CKAN portals to describe datasets
778
- - [MCP Protocol](https://modelcontextprotocol.io/) — Model Context Protocol specification
59
+ 👉 **[Full README on GitHub](https://github.com/ondata/ckan-mcp-server#readme)**
779
60
 
780
61
  ---
781
62