@aborruso/ckan-mcp-server 0.4.17 → 0.4.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (87)
  1. package/LOG.md +59 -0
  2. package/README.md +104 -34
  3. package/dist/index.js +161 -45
  4. package/dist/worker.js +42 -42
  5. package/package.json +12 -1
  6. package/.devin/wiki.json +0 -273
  7. package/CLAUDE.md +0 -398
  8. package/PRD.md +0 -999
  9. package/REFACTORING.md +0 -238
  10. package/examples/langgraph/01_basic_workflow.py +0 -277
  11. package/examples/langgraph/02_data_exploration.py +0 -366
  12. package/examples/langgraph/README.md +0 -719
  13. package/examples/langgraph/metadata_quality.py +0 -299
  14. package/examples/langgraph/requirements.txt +0 -12
  15. package/examples/langgraph/setup.sh +0 -32
  16. package/examples/langgraph/test_setup.py +0 -106
  17. package/openspec/AGENTS.md +0 -456
  18. package/openspec/changes/add-ckan-analyze-dataset-structure/proposal.md +0 -17
  19. package/openspec/changes/add-ckan-analyze-dataset-structure/specs/ckan-insights/spec.md +0 -7
  20. package/openspec/changes/add-ckan-analyze-dataset-structure/tasks.md +0 -6
  21. package/openspec/changes/add-ckan-analyze-dataset-updates/proposal.md +0 -17
  22. package/openspec/changes/add-ckan-analyze-dataset-updates/specs/ckan-insights/spec.md +0 -7
  23. package/openspec/changes/add-ckan-analyze-dataset-updates/tasks.md +0 -6
  24. package/openspec/changes/add-ckan-audit-tool/proposal.md +0 -17
  25. package/openspec/changes/add-ckan-audit-tool/specs/ckan-insights/spec.md +0 -7
  26. package/openspec/changes/add-ckan-audit-tool/tasks.md +0 -6
  27. package/openspec/changes/add-ckan-dataset-insights/proposal.md +0 -17
  28. package/openspec/changes/add-ckan-dataset-insights/specs/ckan-insights/spec.md +0 -7
  29. package/openspec/changes/add-ckan-dataset-insights/tasks.md +0 -6
  30. package/openspec/changes/add-ckan-host-allowlist-env/design.md +0 -38
  31. package/openspec/changes/add-ckan-host-allowlist-env/proposal.md +0 -16
  32. package/openspec/changes/add-ckan-host-allowlist-env/specs/ckan-request-allowlist/spec.md +0 -15
  33. package/openspec/changes/add-ckan-host-allowlist-env/specs/cloudflare-deployment/spec.md +0 -11
  34. package/openspec/changes/add-ckan-host-allowlist-env/tasks.md +0 -12
  35. package/openspec/changes/add-escape-text-query/proposal.md +0 -12
  36. package/openspec/changes/add-escape-text-query/specs/ckan-search/spec.md +0 -11
  37. package/openspec/changes/add-escape-text-query/tasks.md +0 -8
  38. package/openspec/changes/add-mqa-quality-tool/proposal.md +0 -21
  39. package/openspec/changes/add-mqa-quality-tool/specs/ckan-quality/spec.md +0 -71
  40. package/openspec/changes/add-mqa-quality-tool/tasks.md +0 -29
  41. package/openspec/changes/archive/2026-01-08-add-mcp-resources/design.md +0 -115
  42. package/openspec/changes/archive/2026-01-08-add-mcp-resources/proposal.md +0 -52
  43. package/openspec/changes/archive/2026-01-08-add-mcp-resources/specs/mcp-resources/spec.md +0 -92
  44. package/openspec/changes/archive/2026-01-08-add-mcp-resources/tasks.md +0 -56
  45. package/openspec/changes/archive/2026-01-08-expand-test-coverage-specs/design.md +0 -355
  46. package/openspec/changes/archive/2026-01-08-expand-test-coverage-specs/proposal.md +0 -161
  47. package/openspec/changes/archive/2026-01-08-expand-test-coverage-specs/tasks.md +0 -162
  48. package/openspec/changes/archive/2026-01-08-translate-project-to-english/proposal.md +0 -115
  49. package/openspec/changes/archive/2026-01-08-translate-project-to-english/specs/documentation-language/spec.md +0 -32
  50. package/openspec/changes/archive/2026-01-08-translate-project-to-english/tasks.md +0 -115
  51. package/openspec/changes/archive/2026-01-10-add-ckan-find-relevant-datasets/proposal.md +0 -17
  52. package/openspec/changes/archive/2026-01-10-add-ckan-find-relevant-datasets/specs/ckan-insights/spec.md +0 -7
  53. package/openspec/changes/archive/2026-01-10-add-ckan-find-relevant-datasets/tasks.md +0 -6
  54. package/openspec/changes/archive/2026-01-10-add-cloudflare-workers/design.md +0 -734
  55. package/openspec/changes/archive/2026-01-10-add-cloudflare-workers/proposal.md +0 -183
  56. package/openspec/changes/archive/2026-01-10-add-cloudflare-workers/specs/cloudflare-deployment/spec.md +0 -389
  57. package/openspec/changes/archive/2026-01-10-add-cloudflare-workers/tasks.md +0 -519
  58. package/openspec/changes/archive/2026-01-15-add-mcp-prompts/proposal.md +0 -13
  59. package/openspec/changes/archive/2026-01-15-add-mcp-prompts/specs/mcp-prompts/spec.md +0 -22
  60. package/openspec/changes/archive/2026-01-15-add-mcp-prompts/tasks.md +0 -10
  61. package/openspec/changes/archive/2026-01-15-add-mcp-resource-filters/proposal.md +0 -13
  62. package/openspec/changes/archive/2026-01-15-add-mcp-resource-filters/specs/mcp-resources/spec.md +0 -38
  63. package/openspec/changes/archive/2026-01-15-add-mcp-resource-filters/tasks.md +0 -10
  64. package/openspec/changes/archive/2026-01-19-update-repo-owner-ondata/proposal.md +0 -13
  65. package/openspec/changes/archive/2026-01-19-update-repo-owner-ondata/specs/repository-metadata/spec.md +0 -14
  66. package/openspec/changes/archive/2026-01-19-update-repo-owner-ondata/tasks.md +0 -12
  67. package/openspec/changes/archive/2026-01-19-update-search-parser-config/proposal.md +0 -13
  68. package/openspec/changes/archive/2026-01-19-update-search-parser-config/specs/ckan-insights/spec.md +0 -11
  69. package/openspec/changes/archive/2026-01-19-update-search-parser-config/specs/ckan-search/spec.md +0 -11
  70. package/openspec/changes/archive/2026-01-19-update-search-parser-config/tasks.md +0 -6
  71. package/openspec/changes/archive/add-automated-tests/design.md +0 -324
  72. package/openspec/changes/archive/add-automated-tests/proposal.md +0 -167
  73. package/openspec/changes/archive/add-automated-tests/specs/automated-testing/spec.md +0 -143
  74. package/openspec/changes/archive/add-automated-tests/tasks.md +0 -132
  75. package/openspec/project.md +0 -115
  76. package/openspec/specs/ckan-insights/spec.md +0 -23
  77. package/openspec/specs/ckan-search/spec.md +0 -16
  78. package/openspec/specs/cloudflare-deployment/spec.md +0 -344
  79. package/openspec/specs/documentation-language/spec.md +0 -32
  80. package/openspec/specs/mcp-prompts/spec.md +0 -26
  81. package/openspec/specs/mcp-resources/spec.md +0 -120
  82. package/openspec/specs/repository-metadata/spec.md +0 -19
  83. package/private/commenti-privati.yaml +0 -14
  84. package/testo.md +0 -12
  85. package/web-gui/PRD.md +0 -158
  86. package/web-gui/public/index.html +0 -883
  87. package/wrangler.toml +0 -6
@@ -1,719 +0,0 @@
1
- # LangGraph + CKAN MCP Server Examples
2
-
3
- Orchestrate complex workflows for open data analysis: search, filter, and analyze CKAN datasets with smart workflows.
4
-
5
- ## Why This Combination Is Powerful
6
-
7
- **Typical scenario without LangGraph:**
8
- ```python
9
- # Rigid sequential script
10
- datasets = search_ckan("mobility")
11
- for ds in datasets:
12
-     if has_csv(ds):
13
-         download(ds)
14
- # What if I want to analyze DataStore? More code...
15
- # What if I want to ask for confirmation? Even more code...
16
- # What if the script crashes? Start over...
17
- ```
18
-
19
- **With LangGraph + CKAN MCP:**
20
- ```python
21
- # Flexible, stateful, resumable workflow
22
- graph = StateGraph(State)
23
- graph.add_node("search", search_datasets)
24
- graph.add_conditional_edges("check_type", route_by_resource_type)
25
- # Resumes where it left off, handles errors, adapts behavior
26
- ```
27
-
28
- ### Concrete Benefits
29
-
30
- ```mermaid
31
- graph LR
32
- A[Classic Script] -->|rigid| B[A fixed flow]
33
- A -->|fragile| C[Crash = restart]
34
- A -->|complex| D[Nested if/else code]
35
-
36
- E[LangGraph + MCP] -->|flexible| F[Conditional branches]
37
- E -->|resilient| G[Checkpoint + resume]
38
- E -->|clear| H[Declarative nodes]
39
-
40
- style E fill:#90EE90
41
- style A fill:#FFB6C6
42
- ```
43
-
44
- **1. Automatic State Management**
45
- - Each workflow step has access to shared state
46
- - No global variables or manual handoffs
47
- - Easy debugging: `print(state)` in each node
48
-
49
- **2. Conditional Branching**
50
- - Adapts behavior to the data type found
51
- - DataStore? -> SQL query
52
- - CSV? -> Download
53
- - Other formats? -> Skip or convert
54
-
55
- **3. Human-in-the-Loop**
56
- - Ask for confirmation before heavy operations
57
- - Show previews and let the user choose
58
- - Resume execution after input
59
-
60
- **4. Resilience**
61
- - Automatic checkpoints between nodes
62
- - If it crashes: resume from the last completed step
63
- - Configurable retry logic
64
-
65
- ---
66
-
67
- ## Quick Start
68
-
69
- ```bash
70
- # 1. Build CKAN MCP Server (from repo root)
71
- cd /path/to/ckan-mcp-server
72
- npm install && npm run build
73
-
74
- # 2. Test setup
75
- cd examples/langgraph
76
- uvx --with langgraph --with mcp --with langchain-core python test_setup.py
77
-
78
- # 3. Run the basic workflow
79
- uvx --with langgraph --with mcp --with langchain-core python 01_basic_workflow.py
80
- ```
81
-
82
- **Output:**
83
- ```
84
- [1/3] Searching datasets for: 'mobilità urbana'
85
- ✓ Found 51 total, showing 5
86
-
87
- [2/3] Filtering by metadata quality
88
- ✓ Mobilità urbana: 73/100 (good)
89
- ✓ INTERVENTI PER LA MOBILITÀ URBANA: 68/100 (good)
90
- → 5/5 datasets pass quality threshold (40)
91
-
92
- [3/3] Extracting CSV resources
93
- ✓ Found 4 CSV resources
94
- ```
95
-
96
- ---
97
-
98
- ## What Is LangGraph
99
-
100
- [LangGraph](https://python.langchain.com/docs/langgraph/) is a framework to build **stateful agents** with execution graphs. Think of a workflow as a graph:
101
-
102
- ```mermaid
103
- graph TD
104
- START([Start]) --> A[Search Datasets]
105
- A --> B{Has Resources?}
106
- B -->|Yes| C[Filter Quality]
107
- B -->|No| END([End])
108
- C --> D{Resource Type?}
109
- D -->|DataStore| E[SQL Query]
110
- D -->|CSV| F[Download]
111
- D -->|Other| G[Skip]
112
- E --> END
113
- F --> END
114
- G --> END
115
-
116
- style START fill:#90EE90
117
- style END fill:#FFB6C6
118
- style B fill:#FFD700
119
- style D fill:#FFD700
120
- ```
121
-
122
- Each node is a function. The edges define flow. State propagates automatically.
123
-
124
- **Key capabilities:**
125
- - **Durable execution**: persistent workflow, resumes after crashes
126
- - **State management**: tracks state across multiple steps
127
- - **Conditional routing**: dynamic branches based on data
128
- - **Human-in-the-loop**: pause for user input
129
- - **Memory**: remembers previous interactions
130
- - **Streaming**: real-time updates
131
-
132
- ## Why LangGraph + CKAN MCP
133
-
134
- The CKAN MCP Server provides **atomic tools** (package_search, datastore_search, etc.). LangGraph orchestrates them into **smart workflows**:
135
-
136
- ```mermaid
137
- graph TB
138
- subgraph LangGraph
139
- direction TB
140
- W1[Workflow State]
141
- N1[Node: Search]
142
- N2[Node: Filter]
143
- N3[Node: Analyze]
144
- W1 --> N1
145
- N1 --> N2
146
- N2 --> N3
147
- end
148
-
149
- subgraph CKAN_MCP[CKAN MCP Server]
150
- direction TB
151
- T1[package_search]
152
- T2[package_show]
153
- T3[datastore_search]
154
- end
155
-
156
- N1 -.call.-> T1
157
- N2 -.call.-> T2
158
- N3 -.call.-> T3
159
-
160
- style LangGraph fill:#E6F3FF
161
- style CKAN_MCP fill:#FFE6E6
162
- ```
163
-
164
- **Separation of concerns:**
165
- - **CKAN MCP**: provides data access (stateless, single-purpose tools)
166
- - **LangGraph**: coordinates business logic (stateful, complex workflows)
167
-
168
- **Concrete example:**
169
-
170
- *Multi-stage search with adaptive decisions*
171
-
172
- 1. Search "mobilità urbana" on dati.gov.it
173
- 2. If < 10 results -> broaden query to "trasporti"
174
- 3. Filter datasets with quality score > 60
175
- 4. For each dataset:
176
-    - If it has DataStore -> sample 100 rows with SQL
177
-    - If it has CSV -> download and analyze with DuckDB
178
-    - Otherwise -> skip
179
- 5. Aggregate results and generate a report
180
-
181
- With a classic script: 200+ lines, tangled logic, hard to debug.
182
- With LangGraph: 6 declarative nodes, visible flow, maintainable.
183
-
184
- ---
185
-
186
- ## Available Examples
187
-
188
- ### 0. Setup Test (`test_setup.py`)
189
-
190
- Verify prerequisites:
191
-
192
- ```bash
193
- uvx --with langgraph --with mcp --with langchain-core python test_setup.py
194
- ```
195
-
196
- Checks:
197
- - Python dependencies (langgraph, mcp, langchain-core)
198
- - Node.js >= 18
199
- - `dist/index.js` built
200
-
201
- ---
202
-
203
- ### 1. Basic Workflow (`01_basic_workflow.py`)
204
-
205
- **Sequential workflow** for dataset search and analysis.
206
-
207
- ```mermaid
208
- graph LR
209
- START([Start]) --> A[Search Datasets]
210
- A --> B[Filter Quality]
211
- B --> C[Extract CSV Resources]
212
- C --> END([End])
213
-
214
- style START fill:#90EE90
215
- style END fill:#FFB6C6
216
- ```
217
-
218
- **What it does:**
219
- 1. Search datasets by keyword ("mobilità urbana")
220
- 2. **Filter by metadata quality** using a 0-100 scorer:
221
-    - Completeness (30pt): title, description, license, org
222
-    - Richness (30pt): description length, tags, coverage
223
-    - Resources (30pt): open formats, DataStore
224
-    - Freshness (10pt): last update
225
- 3. Extract CSV resources
226
- 4. Show results
227
-
228
- **Run:**
229
- ```bash
230
- uvx --with langgraph --with mcp --with langchain-core python 01_basic_workflow.py
231
- ```
232
-
233
- **Example output:**
234
- ```
235
- [1/3] Searching datasets for: 'mobilità urbana'
236
- ✓ Found 51 total, showing 5
237
-
238
- [2/3] Filtering by metadata quality
239
- ✓ Mobilità urbana: 73/100 (good)
240
- ✓ INTERVENTI MOBILITÀ URBANA INTERVENTIONS 2010: 68/100 (good)
241
- ✓ INTERVENTI MOBILITÀ URBANA INTERVENTIONS 2011: 65/100 (good)
242
- ✓ Dataset trasporti Palermo: 62/100 (acceptable)
243
- ✓ Mobilità sostenibile: 74/100 (good)
244
- → 5/5 datasets pass quality threshold (40)
245
-
246
- [3/3] Extracting CSV resources
247
- ✓ Found 4 CSV resources
248
-
249
- Query: mobilità urbana
250
- Total datasets: 5
251
- Quality datasets: 5
252
- CSV resources: 4
253
- ```
254
-
255
- **Pattern demonstrated:** Sequential pipeline with state propagation
256
-
257
- ---
258
-
259
- ### 2. Data Exploration (`02_data_exploration.py`)
260
-
261
- **Workflow with conditional branching** and human-in-the-loop.
262
-
263
- ```mermaid
264
- graph TD
265
- START([Start]) --> A[Search Datasets]
266
- A --> B[User: Select Dataset]
267
- B --> C[Detect Resource Type]
268
- C --> D{Resource Type?}
269
- D -->|DataStore| E[SQL Query + Preview]
270
- D -->|CSV| F[Show Download URL]
271
- D -->|Unknown| G[Skip Analysis]
272
- E --> END([End])
273
- F --> END
274
- G --> END
275
-
276
- style START fill:#90EE90
277
- style END fill:#FFB6C6
278
- style D fill:#FFD700
279
- style B fill:#87CEEB
280
- ```
281
-
282
- **What it does:**
283
- 1. Search datasets ("trasporti")
284
- 2. **Human-in-the-loop**: show list, user picks (simulated)
285
- 3. Detect resource type automatically
286
- 4. **Conditional routing**:
287
-    - `datastore_active=true` -> SQL query with LIMIT
288
-    - `format=CSV` -> download URL for DuckDB
289
-    - Other -> skip
290
- 5. Adaptive analysis based on type
291
-
292
- **Run:**
293
- ```bash
294
- uvx --with langgraph --with mcp --with langchain-core python 02_data_exploration.py
295
- ```
296
-
297
- **Example output:**
298
- ```
299
- [SEARCH] Query: 'trasporti'
300
- ✓ Found 2983 total, showing 5
301
-
302
- [SELECT DATASET] Available datasets:
303
-
304
- 1. Evoluzione offerta trasporto ferroviario merci
305
- Resources: 2
306
- Org: Autorità regolazione trasporti
307
-
308
- 2. Diffusione TAXI e NCC
309
- Resources: 1
310
- Org: Autorità regolazione trasporti
311
-
312
- 3. Trafori internazionali
313
- Resources: 2
314
- Org: Autorità regolazione trasporti
315
-
316
- → Selected: Evoluzione offerta trasporto ferroviario merci
317
-
318
- [SELECT RESOURCE]
319
- Available resources:
320
- 1. Offerta trasporti ferroviario merci (2018-2022) (CSV)
321
- 2. Offerta trasporti ferroviario merci (2019-2023) (CSV)
322
-
323
- → Type: CSV (download required)
324
-
325
- [ANALYZE CSV]
326
- → URL: https://bdt.autorita-trasporti.it/[...]/D12-Offerta-merci.csv
327
- (Download and analyze with DuckDB/pandas)
328
-
329
- WORKFLOW RESULT:
330
- Analysis Type: csv
331
- URL: https://bdt.autorita-trasporti.it/[...]
332
- ```
333
-
334
- **Pattern demonstrated:** Conditional branching + human-in-the-loop
335
-
336
- ---
337
-
338
- ### 3. Metadata Quality Scorer (`metadata_quality.py`)
339
-
340
- Reusable module to evaluate CKAN metadata quality.
341
-
342
- **Scoring criteria (0-100):**
343
-
344
- | Category | Weight | Criteria |
345
- |-----------|------|---------|
346
- | **Completeness** | 30pt | Title, description, license, organization, contact |
347
- | **Richness** | 30pt | Description length (>200 chars), tags (>2), temporal coverage |
348
- | **Resources** | 30pt | Open formats (CSV/JSON/XML), active DataStore, valid URLs |
349
- | **Freshness** | 10pt | Updated in the last 12 months |
350
-
351
- **Quality levels:**
352
- - `excellent` (80-100): complete and rich metadata
353
- - `good` (60-79): good quality, some missing fields
354
- - `acceptable` (40-59): basic metadata present
355
- - `poor` (0-39): many missing fields
356
-
357
- **Standalone test:**
358
- ```bash
359
- uvx python metadata_quality.py
360
- ```
361
-
362
- **Output:**
363
- ```python
364
- {
365
-     "score": 73,
366
-     "level": "good",
367
-     "breakdown": {
368
-         "completeness": 24, # 24/30
369
-         "richness": 22, # 22/30
370
-         "resources": 20, # 20/30
371
-         "freshness": 7 # 7/10
372
-     },
373
-     "reasons": [
374
-         "Has complete title and description",
375
-         "Has 5 tags and temporal coverage",
376
-         "2 open format resources with DataStore",
377
-         "Updated 3 months ago"
378
-     ]
379
- }
380
- ```
381
-
382
- ---
383
-
384
- ## Prerequisites
385
-
386
- ### 1. Build CKAN MCP Server
387
-
388
- The examples connect to the local server via stdio.
389
-
390
- ```bash
391
- # From the repo root
392
- cd /path/to/ckan-mcp-server
393
- npm install
394
- npm run build
395
- ```
396
-
397
- Verify that `dist/index.js` exists.
398
-
399
- ### 2. Install uv (Recommended)
400
-
401
- [uv](https://docs.astral.sh/uv/) is the fastest way to test:
402
-
403
- ```bash
404
- # macOS/Linux
405
- curl -LsSf https://astral.sh/uv/install.sh | sh
406
-
407
- # Windows
408
- powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
409
- ```
410
-
411
- Or use classic pip/venv (see below).
412
-
413
- ---
414
-
415
- ## Installation
416
-
417
- ### Method 1: Quick Test with uvx (Zero Install)
418
-
419
- Test without installing anything permanently:
420
-
421
- ```bash
422
- cd examples/langgraph
423
-
424
- # Quick test
425
- uvx --with langgraph --with mcp --with langchain-core \
426
- python 01_basic_workflow.py
427
- ```
428
-
429
- `uvx` creates an isolated environment, installs dependencies, runs, then cleans up.
430
-
431
- ### Method 2: Classic Virtual Environment
432
-
433
- ```bash
434
- # Create venv
435
- python3 -m venv venv
436
- source venv/bin/activate # Windows: venv\Scripts\activate
437
-
438
- # Install dependencies
439
- pip install -r requirements.txt
440
-
441
- # Run examples
442
- python 01_basic_workflow.py
443
- python 02_data_exploration.py
444
- ```
445
-
446
- ### Method 3: Automated Setup
447
-
448
- ```bash
449
- # Full setup (venv + install)
450
- ./setup.sh
451
-
452
- # Activate
453
- source venv/bin/activate
454
-
455
- # Run
456
- python 01_basic_workflow.py
457
- ```
458
-
459
- ---
460
-
461
- ## Architectural Patterns
462
-
463
- ### Pattern 1: Sequential Pipeline
464
-
465
- Linear workflow where each step depends on the previous one.
466
-
467
- ```python
468
- graph = StateGraph(State)
469
-
470
- # Define nodes
471
- graph.add_node("search", search_datasets_node)
472
- graph.add_node("filter", filter_quality_node)
473
- graph.add_node("extract", extract_resources_node)
474
-
475
- # Connect sequentially
476
- graph.add_edge(START, "search")
477
- graph.add_edge("search", "filter")
478
- graph.add_edge("filter", "extract")
479
- graph.add_edge("extract", END)
480
- ```
481
-
482
- **Use case:** ETL, report generation, data pipelines.
483
-
484
- ---
485
-
486
- ### Pattern 2: Conditional Branching
487
-
488
- Workflow that adapts at runtime based on data.
489
-
490
- ```python
491
- def route_by_resource_type(state: State) -> str:
492
-     """Decide path based on resource type."""
493
-     resource = state["selected_resource"]
494
-     if resource.get("datastore_active"):
495
-         return "analyze_datastore"
496
-     elif resource.get("format") == "CSV":
497
-         return "analyze_csv"
498
-     else:
499
-         return "skip"
500
-
501
- graph.add_conditional_edges(
502
-     "select_resource",
503
-     route_by_resource_type,
504
-     {
505
-         "analyze_datastore": "datastore_node",
506
-         "analyze_csv": "csv_node",
507
-         "skip": "skip_node"
508
-     }
509
- )
510
- ```
511
-
512
- **Use case:** Adapt analysis to data type, handle multiple formats.
513
-
514
- ---
515
-
516
- ### Pattern 3: Human-in-the-Loop
517
-
518
- Workflow that requires user input before proceeding.
519
-
520
- ```python
521
- async def select_dataset_node(state: State) -> State:
522
-     """Show options and get user selection."""
523
-     print("Available datasets:")
524
-     for i, ds in enumerate(state["datasets"]):
525
-         print(f"{i+1}. {ds['title']}")
526
-
527
-     # In production: use input() or web UI
528
-     selection = int(input("Select dataset: ")) - 1
529
-     state["selected"] = state["datasets"][selection]
530
-     return state
531
- ```
532
-
533
- **Use case:** Qualitative decisions, data cleaning, validation.
534
-
535
- ---
536
-
537
- ### Pattern 4: Parallel Execution
538
-
539
- Run nodes in parallel for performance (future example).
540
-
541
- ```python
542
- # Query multiple CKAN portals simultaneously
543
- from langgraph.pregel import Pregel
544
-
545
- results = await graph.arun_parallel([
546
-     {"query": "mobility", "server": "dati.gov.it"},
547
-     {"query": "mobility", "server": "data.gov"},
548
-     {"query": "mobilite", "server": "data.gouv.fr"}
549
- ])
550
- ```
551
-
552
- **Use case:** Multi-source aggregation, comparative analysis.
553
-
554
- ---
555
-
556
- ## MCP Client Integration
557
-
558
- The examples use the [MCP Python SDK](https://github.com/modelcontextprotocol/python-sdk) to communicate with CKAN MCP Server via stdio.
559
-
560
- ```python
561
- from mcp import ClientSession, StdioServerParameters
562
- from mcp.client.stdio import stdio_client
563
-
564
- # Connect to local server
565
- server_params = StdioServerParameters(
566
-     command="node",
567
-     args=["../../dist/index.js"]
568
- )
569
-
570
- async with stdio_client(server_params) as (read, write):
571
-     async with ClientSession(read, write) as session:
572
-         await session.initialize()
573
-
574
-         # Call MCP tool
575
-         result = await session.call_tool(
576
-             "ckan_package_search",
577
-             arguments={
578
-                 "server_url": "https://www.dati.gov.it/opendata",
579
-                 "q": "mobilità urbana",
580
-                 "rows": 5,
581
-                 "response_format": "json"
582
-             }
583
-         )
584
-
585
-         # Parse response
586
-         for content in result.content:
587
-             if content.type == "text":
588
-                 data = json.loads(content.text)
589
-                 datasets = data["results"]
590
- ```
591
-
592
- **Important note:**
593
- - Response format: use `response_format` (not `format`)
594
- - Response structure: CKAN result is direct, not wrapped in `{success, result}`
595
- - JSON parsing: handle truncation if response > 50KB
596
-
597
- ---
598
-
599
- ## Debugging with LangSmith
600
-
601
- To visualize workflows graphically:
602
-
603
- ```bash
604
- # Setup LangSmith (free up to 5k traces/month)
605
- export LANGCHAIN_TRACING_V2=true
606
- export LANGCHAIN_API_KEY=your_api_key
607
-
608
- # Run with tracing
609
- uvx --with langgraph --with mcp --with langchain-core --with langsmith \
610
- python 01_basic_workflow.py
611
- ```
612
-
613
- View traces at [smith.langchain.com](https://smith.langchain.com):
614
- - See each node executed
615
- - Timing for each step
616
- - State evolution
617
- - Errors and retries
618
-
619
- ---
620
-
621
- ## Troubleshooting
622
-
623
- ### ❌ `dist/index.js` not found
624
-
625
- ```bash
626
- # Build the server from repo root
627
- cd ../..
628
- npm run build
629
- cd examples/langgraph
630
- ```
631
-
632
- ### ❌ `Cannot connect to MCP server`
633
-
634
- Verify Node.js and file:
635
-
636
- ```bash
637
- node --version # Must be >= 18
638
- ls -la ../../dist/index.js # Must exist
639
- ```
640
-
641
- ### ❌ JSON parse error / Response truncated
642
-
643
- Some CKAN queries return huge metadata (>50KB) that gets truncated.
644
-
645
- **Solution:** Use specific queries instead of generic ones:
646
- - ✅ Works: `trasporti`, `mobilità urbana`, `sanità`
647
- - ❌ Problems: `CSV`, `data`, `popolazione` (too generic)
648
-
649
- ```python
650
- # In script, change query
651
- initial_state = {
652
-     "query": "trasporti", # Specific ✅
653
-     # "query": "CSV", # Generic ❌
654
- }
655
- ```
656
-
657
- ### ❌ Timeout on dati.gov.it
658
-
659
- The portal can be slow. Reduce `rows`:
660
-
661
- ```python
662
- SEARCH_ROWS = 3 # Instead of 10
663
- ```
664
-
665
- ### ❌ Import errors with uvx
666
-
667
- If `uvx` fails, use classic venv:
668
-
669
- ```bash
670
- python3 -m venv venv
671
- source venv/bin/activate
672
- pip install -r requirements.txt
673
- python 01_basic_workflow.py
674
- ```
675
-
676
- ---
677
-
678
- ## Configuration
679
-
680
- The examples default to `dati.gov.it`. To change:
681
-
682
- ```python
683
- # In script, edit:
684
- CKAN_SERVER = "https://data.gov" # US portal
685
- # CKAN_SERVER = "https://demo.ckan.org" # Demo
686
- ```
687
-
688
- Other CKAN portals: [instances.ckan.org](https://instances.ckan.org)
689
-
690
- ---
691
-
692
- ## Next Steps
693
-
694
- **Patterns to add:**
695
-
696
- 1. **Multi-Region Analysis**: Parallel queries across multiple portals, aggregate results
697
- 2. **Streaming Updates**: Show real-time progress with `StreamingStdOutCallbackHandler`
698
- 3. **Error Recovery**: Retry logic with exponential backoff
699
- 4. **Memory Persistence**: Save checkpoints to disk, resume days later
700
- 5. **LangGraph Studio**: Visual UI to build and debug workflows
701
-
702
- Contribute by adding examples in this directory!
703
-
704
- ---
705
-
706
- ## Resources
707
-
708
- - [LangGraph Documentation](https://python.langchain.com/docs/langgraph/)
709
- - [LangGraph Tutorials](https://langchain-ai.github.io/langgraph/tutorials/)
710
- - [MCP Protocol](https://modelcontextprotocol.io)
711
- - [CKAN MCP Server](../../README.md)
712
- - [LangSmith](https://docs.smith.langchain.com)
713
- - [CKAN API Guide](https://docs.ckan.org/en/latest/api/)
714
-
715
- ---
716
-
717
- ## License
718
-
719
- Same license as the parent project (see root README).
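
Pattern 4 in the removed README is marked as a future example, and its `arun_parallel` call may not correspond to an actual LangGraph method. The same fan-out across portals can be sketched with the standard library alone. In this minimal sketch, `search_portal` and `search_all` are hypothetical names, and `search_portal` is only a stand-in for a node that would call the `ckan_package_search` tool; it performs no network I/O.

```python
import asyncio


async def search_portal(query: str, server: str) -> dict:
    """Hypothetical stand-in for a node that would query one CKAN portal."""
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    return {"server": server, "query": query, "results": []}


async def search_all(requests: list[dict]) -> list[dict]:
    """Run one search per portal concurrently; gather() keeps input order."""
    tasks = (search_portal(r["query"], r["server"]) for r in requests)
    return await asyncio.gather(*tasks)


# Same three portals as in the README's Pattern 4 snippet
REQUESTS = [
    {"query": "mobility", "server": "dati.gov.it"},
    {"query": "mobility", "server": "data.gov"},
    {"query": "mobilite", "server": "data.gouv.fr"},
]

results = asyncio.run(search_all(REQUESTS))
for r in results:
    print(r["server"], r["query"], len(r["results"]))
```

Because `asyncio.gather` returns results in the order the awaitables were passed, each result can be matched back to its portal by position. A real version would replace the body of `search_portal` with an MCP `call_tool` request and would likely add per-portal error handling.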