kailash 0.1.1__py3-none-any.whl → 0.1.2__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- kailash/nodes/__init__.py +2 -1
- kailash/nodes/ai/__init__.py +26 -0
- kailash/nodes/ai/ai_providers.py +1272 -0
- kailash/nodes/ai/embedding_generator.py +853 -0
- kailash/nodes/ai/llm_agent.py +1166 -0
- kailash/nodes/api/auth.py +3 -3
- kailash/nodes/api/graphql.py +2 -2
- kailash/nodes/api/http.py +391 -44
- kailash/nodes/api/rate_limiting.py +2 -2
- kailash/nodes/api/rest.py +464 -56
- kailash/nodes/base.py +71 -12
- kailash/nodes/code/python.py +2 -1
- kailash/nodes/data/__init__.py +7 -0
- kailash/nodes/data/readers.py +28 -26
- kailash/nodes/data/retrieval.py +178 -0
- kailash/nodes/data/sharepoint_graph.py +7 -7
- kailash/nodes/data/sources.py +65 -0
- kailash/nodes/data/sql.py +4 -2
- kailash/nodes/data/writers.py +6 -3
- kailash/nodes/logic/operations.py +2 -1
- kailash/nodes/mcp/__init__.py +11 -0
- kailash/nodes/mcp/client.py +558 -0
- kailash/nodes/mcp/resource.py +682 -0
- kailash/nodes/mcp/server.py +571 -0
- kailash/nodes/transform/__init__.py +16 -1
- kailash/nodes/transform/chunkers.py +78 -0
- kailash/nodes/transform/formatters.py +96 -0
- kailash/runtime/docker.py +6 -6
- kailash/sdk_exceptions.py +24 -10
- kailash/tracking/metrics_collector.py +2 -1
- kailash/utils/templates.py +6 -6
- {kailash-0.1.1.dist-info → kailash-0.1.2.dist-info}/METADATA +344 -46
- {kailash-0.1.1.dist-info → kailash-0.1.2.dist-info}/RECORD +37 -26
- {kailash-0.1.1.dist-info → kailash-0.1.2.dist-info}/WHEEL +0 -0
- {kailash-0.1.1.dist-info → kailash-0.1.2.dist-info}/entry_points.txt +0 -0
- {kailash-0.1.1.dist-info → kailash-0.1.2.dist-info}/licenses/LICENSE +0 -0
- {kailash-0.1.1.dist-info → kailash-0.1.2.dist-info}/top_level.txt +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: kailash
-Version: 0.1.1
+Version: 0.1.2
 Summary: Python SDK for the Kailash container-node architecture
 Home-page: https://github.com/integrum/kailash-python-sdk
 Author: Integrum
@@ -10,9 +10,8 @@ Project-URL: Bug Tracker, https://github.com/integrum/kailash-python-sdk/issues
 Classifier: Development Status :: 3 - Alpha
 Classifier: Intended Audience :: Developers
 Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.
-Classifier: Programming Language :: Python :: 3.
-Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
 Requires-Python: >=3.11
 Description-Content-Type: text/markdown
 License-File: LICENSE
@@ -22,7 +21,7 @@ Requires-Dist: matplotlib>=3.5
 Requires-Dist: pyyaml>=6.0
 Requires-Dist: click>=8.0
 Requires-Dist: pytest>=8.3.5
-Requires-Dist: mcp[cli]>=1.9.
+Requires-Dist: mcp[cli]>=1.9.2
 Requires-Dist: pandas>=2.2.3
 Requires-Dist: numpy>=2.2.5
 Requires-Dist: scipy>=1.15.3
@@ -46,6 +45,7 @@ Requires-Dist: fastapi[all]>=0.115.12
 Requires-Dist: pytest-asyncio>=1.0.0
 Requires-Dist: pre-commit>=4.2.0
 Requires-Dist: twine>=6.1.0
+Requires-Dist: ollama>=0.5.1
 Provides-Extra: dev
 Requires-Dist: pytest>=7.0; extra == "dev"
 Requires-Dist: pytest-cov>=3.0; extra == "dev"
@@ -62,10 +62,10 @@ Dynamic: requires-python
 <p align="center">
 <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/v/kailash.svg" alt="PyPI version"></a>
 <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/pyversions/kailash.svg" alt="Python versions"></a>
-<a href="https://
+<a href="https://pepy.tech/project/kailash"><img src="https://static.pepy.tech/badge/kailash" alt="Downloads"></a>
 <img src="https://img.shields.io/badge/license-MIT-green.svg" alt="MIT License">
 <img src="https://img.shields.io/badge/code%20style-black-000000.svg" alt="Code style: black">
-<img src="https://img.shields.io/badge/tests-
+<img src="https://img.shields.io/badge/tests-746%20passing-brightgreen.svg" alt="Tests: 746 passing">
 <img src="https://img.shields.io/badge/coverage-100%25-brightgreen.svg" alt="Coverage: 100%">
 </p>

@@ -87,6 +87,8 @@ Dynamic: requires-python
 - 📊 **Real-time Monitoring**: Live dashboards with WebSocket streaming and performance metrics
 - 🧩 **Extensible**: Easy to create custom nodes for domain-specific operations
 - ⚡ **Fast Installation**: Uses `uv` for lightning-fast Python package management
+- 🤖 **AI-Powered**: Complete LLM agents, embeddings, and hierarchical RAG architecture
+- 🧠 **Retrieval-Augmented Generation**: Full RAG pipeline with intelligent document processing

 ## 🎯 Who Is This For?

@@ -101,6 +103,8 @@ The Kailash Python SDK is designed for:

 ### Installation

+**Requirements:** Python 3.11 or higher
+
 ```bash
 # Install uv if you haven't already
 curl -LsSf https://astral.sh/uv/install.sh | sh
@@ -137,9 +141,11 @@ def analyze_customers(data):
     # Convert total_spent to numeric
     df['total_spent'] = pd.to_numeric(df['total_spent'])
     return {
-        "
-
-
+        "result": {
+            "total_customers": len(df),
+            "avg_spend": df["total_spent"].mean(),
+            "top_customers": df.nlargest(10, "total_spent").to_dict("records")
+        }
     }

 analyzer = PythonCodeNode.from_function(analyze_customers, name="analyzer")
@@ -174,7 +180,7 @@ sharepoint = SharePointGraphReader()
 workflow.add_node("read_sharepoint", sharepoint)

 # Process downloaded files
-csv_writer = CSVWriter()
+csv_writer = CSVWriter(file_path="sharepoint_output.csv")
 workflow.add_node("save_locally", csv_writer)

 # Connect nodes
@@ -198,6 +204,75 @@ runtime = LocalRuntime()
 results, run_id = runtime.execute(workflow, inputs=inputs)
 ```

+### Hierarchical RAG Example
+
+```python
+from kailash.workflow import Workflow
+from kailash.nodes.ai.embedding_generator import EmbeddingGenerator
+from kailash.nodes.ai.llm_agent import LLMAgent
+from kailash.nodes.data.sources import DocumentSourceNode, QuerySourceNode
+from kailash.nodes.data.retrieval import RelevanceScorerNode
+from kailash.nodes.transform.chunkers import HierarchicalChunkerNode
+from kailash.nodes.transform.formatters import (
+    ChunkTextExtractorNode, QueryTextWrapperNode, ContextFormatterNode
+)
+
+# Create hierarchical RAG workflow
+workflow = Workflow("hierarchical_rag", name="Hierarchical RAG Workflow")
+
+# Data sources (autonomous - no external files needed)
+doc_source = DocumentSourceNode()
+query_source = QuerySourceNode()
+
+# Document processing pipeline
+chunker = HierarchicalChunkerNode()
+chunk_text_extractor = ChunkTextExtractorNode()
+query_text_wrapper = QueryTextWrapperNode()
+
+# AI processing with Ollama
+chunk_embedder = EmbeddingGenerator(
+    provider="ollama", model="nomic-embed-text", operation="embed_batch"
+)
+query_embedder = EmbeddingGenerator(
+    provider="ollama", model="nomic-embed-text", operation="embed_batch"
+)
+
+# Retrieval and response generation
+relevance_scorer = RelevanceScorerNode()
+context_formatter = ContextFormatterNode()
+llm_agent = LLMAgent(provider="ollama", model="llama3.2", temperature=0.7)
+
+# Add all nodes to workflow
+for name, node in {
+    "doc_source": doc_source, "query_source": query_source,
+    "chunker": chunker, "chunk_text_extractor": chunk_text_extractor,
+    "query_text_wrapper": query_text_wrapper, "chunk_embedder": chunk_embedder,
+    "query_embedder": query_embedder, "relevance_scorer": relevance_scorer,
+    "context_formatter": context_formatter, "llm_agent": llm_agent
+}.items():
+    workflow.add_node(name, node)
+
+# Connect the RAG pipeline
+workflow.connect("doc_source", "chunker", {"documents": "documents"})
+workflow.connect("chunker", "chunk_text_extractor", {"chunks": "chunks"})
+workflow.connect("chunk_text_extractor", "chunk_embedder", {"input_texts": "input_texts"})
+workflow.connect("query_source", "query_text_wrapper", {"query": "query"})
+workflow.connect("query_text_wrapper", "query_embedder", {"input_texts": "input_texts"})
+workflow.connect("chunker", "relevance_scorer", {"chunks": "chunks"})
+workflow.connect("query_embedder", "relevance_scorer", {"embeddings": "query_embedding"})
+workflow.connect("chunk_embedder", "relevance_scorer", {"embeddings": "chunk_embeddings"})
+workflow.connect("relevance_scorer", "context_formatter", {"relevant_chunks": "relevant_chunks"})
+workflow.connect("query_source", "context_formatter", {"query": "query"})
+workflow.connect("context_formatter", "llm_agent", {"messages": "messages"})
+
+# Execute the RAG workflow
+from kailash.runtime.local import LocalRuntime
+runtime = LocalRuntime()
+results, run_id = runtime.execute(workflow)
+
+print("RAG Response:", results["llm_agent"]["response"])
+```
+
 ## 📚 Documentation

 | Resource | Description |
@@ -221,6 +296,9 @@ The SDK includes a rich set of pre-built nodes for common operations:
 **Data Operations**
 - `CSVReader` - Read CSV files
 - `JSONReader` - Read JSON files
+- `DocumentSourceNode` - Sample document provider
+- `QuerySourceNode` - Sample query provider
+- `RelevanceScorerNode` - Multi-method similarity
 - `SQLDatabaseNode` - Query databases
 - `CSVWriter` - Write CSV files
 - `JSONWriter` - Write JSON files
@@ -228,12 +306,15 @@ The SDK includes a rich set of pre-built nodes for common operations:
 </td>
 <td width="50%">

-**
+**Transform Nodes**
 - `PythonCodeNode` - Custom Python logic
 - `DataTransformer` - Transform data
+- `HierarchicalChunkerNode` - Document chunking
+- `ChunkTextExtractorNode` - Extract chunk text
+- `QueryTextWrapperNode` - Wrap queries for processing
+- `ContextFormatterNode` - Format LLM context
 - `Filter` - Filter records
 - `Aggregator` - Aggregate data
-- `TextProcessor` - Process text

 </td>
 </tr>
@@ -241,10 +322,12 @@ The SDK includes a rich set of pre-built nodes for common operations:
 <td width="50%">

 **AI/ML Nodes**
-- `
-- `
-- `
-- `
+- `LLMAgent` - Multi-provider LLM with memory & tools
+- `EmbeddingGenerator` - Vector embeddings with caching
+- `MCPClient/MCPServer` - Model Context Protocol
+- `TextClassifier` - Text classification
+- `SentimentAnalyzer` - Sentiment analysis
+- `NamedEntityRecognizer` - NER extraction

 </td>
 <td width="50%">
@@ -280,25 +363,30 @@ The SDK includes a rich set of pre-built nodes for common operations:
 #### Workflow Management
 ```python
 from kailash.workflow import Workflow
+from kailash.nodes.logic import Switch
+from kailash.nodes.transform import DataTransformer

 # Create complex workflows with branching logic
 workflow = Workflow("data_pipeline", name="data_pipeline")

-# Add conditional branching
-
-workflow.add_node("
+# Add conditional branching with Switch node
+switch = Switch()
+workflow.add_node("route", switch)

 # Different paths based on validation
+processor_a = DataTransformer(transformations=["lambda x: x"])
+error_handler = DataTransformer(transformations=["lambda x: {'error': str(x)}"])
 workflow.add_node("process_valid", processor_a)
 workflow.add_node("handle_errors", error_handler)

-# Connect with
-workflow.connect("
-workflow.connect("
+# Connect with switch routing
+workflow.connect("route", "process_valid")
+workflow.connect("route", "handle_errors")
 ```

 #### Immutable State Management
 ```python
+from kailash.workflow import Workflow
 from kailash.workflow.state import WorkflowStateWrapper
 from pydantic import BaseModel

@@ -308,6 +396,9 @@ class MyStateModel(BaseModel):
     status: str = "pending"
     nested: dict = {}

+# Create workflow
+workflow = Workflow("state_workflow", name="state_workflow")
+
 # Create and wrap state object
 state = MyStateModel()
 state_wrapper = workflow.create_state_wrapper(state)
@@ -324,8 +415,9 @@ updated_wrapper = state_wrapper.batch_update([
     (["status"], "processing")
 ])

-#
-
+# Access the updated state
+print(f"Updated counter: {updated_wrapper._state.counter}")
+print(f"Updated status: {updated_wrapper._state.status}")
 ```

 #### Task Tracking
@@ -342,45 +434,75 @@ workflow = Workflow("sample_workflow", name="Sample Workflow")
 # Run workflow with tracking
 from kailash.runtime.local import LocalRuntime
 runtime = LocalRuntime()
-results, run_id = runtime.execute(workflow
+results, run_id = runtime.execute(workflow)

 # Query execution history
-
-
+# Note: list_runs() may fail with timezone comparison errors in some cases
+try:
+    # List all runs
+    all_runs = task_manager.list_runs()
+
+    # Filter by status
+    completed_runs = task_manager.list_runs(status="completed")
+    failed_runs = task_manager.list_runs(status="failed")
+
+    # Filter by workflow name
+    workflow_runs = task_manager.list_runs(workflow_name="sample_workflow")
+
+    # Process run information
+    for run in completed_runs[:5]:  # First 5 runs
+        print(f"Run {run.run_id[:8]}: {run.workflow_name} - {run.status}")
+
+except Exception as e:
+    print(f"Error listing runs: {e}")
+    # Fallback: Access run details directly if available
+    if hasattr(task_manager, 'storage'):
+        run = task_manager.get_run(run_id)
 ```

 #### Local Testing
 ```python
 from kailash.runtime.local import LocalRuntime
+from kailash.workflow import Workflow
+
+# Create a test workflow
+workflow = Workflow("test_workflow", name="test_workflow")

 # Create test runtime with debugging enabled
 runtime = LocalRuntime(debug=True)

 # Execute with test data
-
-results = runtime.execute(workflow, inputs=test_data)
+results, run_id = runtime.execute(workflow)

 # Validate results
-assert results
+assert isinstance(results, dict)
 ```

 #### Performance Monitoring & Real-time Dashboards
 ```python
 from kailash.visualization.performance import PerformanceVisualizer
 from kailash.visualization.dashboard import RealTimeDashboard, DashboardConfig
-from kailash.visualization.reports import WorkflowPerformanceReporter
+from kailash.visualization.reports import WorkflowPerformanceReporter, ReportFormat
 from kailash.tracking import TaskManager
 from kailash.runtime.local import LocalRuntime
+from kailash.workflow import Workflow
+from kailash.nodes.transform import DataTransformer
+
+# Create a workflow to monitor
+workflow = Workflow("monitored_workflow", name="monitored_workflow")
+node = DataTransformer(transformations=["lambda x: x"])
+workflow.add_node("transform", node)

 # Run workflow with task tracking
+# Note: Pass task_manager to execute() to enable performance tracking
 task_manager = TaskManager()
 runtime = LocalRuntime()
 results, run_id = runtime.execute(workflow, task_manager=task_manager)

 # Static performance analysis
+from pathlib import Path
 perf_viz = PerformanceVisualizer(task_manager)
-outputs = perf_viz.create_run_performance_summary(run_id, output_dir="performance_report")
-perf_viz.compare_runs([run_id_1, run_id_2], output_path="comparison.png")
+outputs = perf_viz.create_run_performance_summary(run_id, output_dir=Path("performance_report"))

 # Real-time monitoring dashboard
 config = DashboardConfig(
@@ -408,8 +530,7 @@ reporter = WorkflowPerformanceReporter(task_manager)
 report_path = reporter.generate_report(
     run_id,
     output_path="workflow_report.html",
-    format=ReportFormat.HTML
-    compare_runs=[run_id_1, run_id_2]
+    format=ReportFormat.HTML
 )
 ```

@@ -466,6 +587,13 @@ api_client = RESTAPINode(
 #### Export Formats
 ```python
 from kailash.utils.export import WorkflowExporter, ExportConfig
+from kailash.workflow import Workflow
+from kailash.nodes.transform import DataTransformer
+
+# Create a workflow to export
+workflow = Workflow("export_example", name="export_example")
+node = DataTransformer(transformations=["lambda x: x"])
+workflow.add_node("transform", node)

 exporter = WorkflowExporter()

@@ -478,22 +606,147 @@ config = ExportConfig(
     include_metadata=True,
     container_tag="latest"
 )
-workflow.save("deployment.yaml"
+workflow.save("deployment.yaml")
 ```

 ### 🎨 Visualization

 ```python
+from kailash.workflow import Workflow
 from kailash.workflow.visualization import WorkflowVisualizer
+from kailash.nodes.transform import DataTransformer
+
+# Create a workflow to visualize
+workflow = Workflow("viz_example", name="viz_example")
+node = DataTransformer(transformations=["lambda x: x"])
+workflow.add_node("transform", node)

-#
+# Generate Mermaid diagram (recommended for documentation)
+mermaid_code = workflow.to_mermaid()
+print(mermaid_code)
+
+# Save as Mermaid markdown file
+with open("workflow.md", "w") as f:
+    f.write(workflow.to_mermaid_markdown(title="My Workflow"))
+
+# Or use matplotlib visualization
 visualizer = WorkflowVisualizer(workflow)
-visualizer.visualize(
+visualizer.visualize()
+visualizer.save("workflow.png", dpi=300)  # Save as PNG
+```
+
+#### Hierarchical RAG (Retrieval-Augmented Generation)
+```python
+from kailash.workflow import Workflow
+from kailash.nodes.data.sources import DocumentSourceNode, QuerySourceNode
+from kailash.nodes.data.retrieval import RelevanceScorerNode
+from kailash.nodes.transform.chunkers import HierarchicalChunkerNode
+from kailash.nodes.transform.formatters import (
+    ChunkTextExtractorNode,
+    QueryTextWrapperNode,
+    ContextFormatterNode,
+)
+from kailash.nodes.ai.llm_agent import LLMAgent
+from kailash.nodes.ai.embedding_generator import EmbeddingGenerator
+
+# Create hierarchical RAG workflow
+workflow = Workflow(
+    workflow_id="hierarchical_rag_example",
+    name="Hierarchical RAG Workflow",
+    description="Complete RAG pipeline with embedding-based retrieval",
+    version="1.0.0"
+)

-#
-
+# Create data source nodes
+doc_source = DocumentSourceNode()
+query_source = QuerySourceNode()
+
+# Create document processing pipeline
+chunker = HierarchicalChunkerNode()
+chunk_text_extractor = ChunkTextExtractorNode()
+query_text_wrapper = QueryTextWrapperNode()
+
+# Create embedding generators
+chunk_embedder = EmbeddingGenerator(
+    provider="ollama",
+    model="nomic-embed-text",
+    operation="embed_batch"
+)
+
+query_embedder = EmbeddingGenerator(
+    provider="ollama",
+    model="nomic-embed-text",
+    operation="embed_batch"
+)
+
+# Create retrieval and formatting nodes
+relevance_scorer = RelevanceScorerNode(similarity_method="cosine")
+context_formatter = ContextFormatterNode()
+
+# Create LLM agent for final answer generation
+llm_agent = LLMAgent(
+    provider="ollama",
+    model="llama3.2",
+    temperature=0.7,
+    max_tokens=500
+)
+
+# Add all nodes to workflow
+for node_id, node in [
+    ("doc_source", doc_source),
+    ("chunker", chunker),
+    ("query_source", query_source),
+    ("chunk_text_extractor", chunk_text_extractor),
+    ("query_text_wrapper", query_text_wrapper),
+    ("chunk_embedder", chunk_embedder),
+    ("query_embedder", query_embedder),
+    ("relevance_scorer", relevance_scorer),
+    ("context_formatter", context_formatter),
+    ("llm_agent", llm_agent)
+]:
+    workflow.add_node(node_id, node)
+
+# Connect the workflow pipeline
+# Document processing: docs → chunks → text → embeddings
+workflow.connect("doc_source", "chunker", {"documents": "documents"})
+workflow.connect("chunker", "chunk_text_extractor", {"chunks": "chunks"})
+workflow.connect("chunk_text_extractor", "chunk_embedder", {"input_texts": "input_texts"})
+
+# Query processing: query → text wrapper → embeddings
+workflow.connect("query_source", "query_text_wrapper", {"query": "query"})
+workflow.connect("query_text_wrapper", "query_embedder", {"input_texts": "input_texts"})
+
+# Relevance scoring: chunks + embeddings → scored chunks
+workflow.connect("chunker", "relevance_scorer", {"chunks": "chunks"})
+workflow.connect("query_embedder", "relevance_scorer", {"embeddings": "query_embedding"})
+workflow.connect("chunk_embedder", "relevance_scorer", {"embeddings": "chunk_embeddings"})
+
+# Context formatting: relevant chunks + query → formatted context
+workflow.connect("relevance_scorer", "context_formatter", {"relevant_chunks": "relevant_chunks"})
+workflow.connect("query_source", "context_formatter", {"query": "query"})
+
+# Final answer generation: formatted context → LLM response
+workflow.connect("context_formatter", "llm_agent", {"messages": "messages"})
+
+# Execute workflow
+results, run_id = workflow.run()
+
+# Access results
+print("🎯 Top Relevant Chunks:")
+for chunk in results["relevance_scorer"]["relevant_chunks"]:
+    print(f"  - {chunk['document_title']}: {chunk['relevance_score']:.3f}")
+
+print("\n🤖 Final Answer:")
+print(results["llm_agent"]["response"]["content"])
 ```

+This example demonstrates:
+- **Document chunking** with hierarchical structure
+- **Vector embeddings** using Ollama's nomic-embed-text model
+- **Semantic similarity** scoring with cosine similarity
+- **Context formatting** for LLM input
+- **Answer generation** using Ollama's llama3.2 model
+
 ## 💻 CLI Commands

 The SDK includes a comprehensive CLI for workflow management:
@@ -545,6 +798,45 @@ kailash/
 └── utils/              # Utilities and helpers
 ```

+### 🤖 Unified AI Provider Architecture
+
+The SDK features a unified provider architecture for AI capabilities:
+
+```python
+from kailash.nodes.ai import LLMAgent, EmbeddingGenerator
+
+# Multi-provider LLM support
+agent = LLMAgent()
+result = agent.run(
+    provider="ollama",  # or "openai", "anthropic", "mock"
+    model="llama3.1:8b-instruct-q8_0",
+    messages=[{"role": "user", "content": "Explain quantum computing"}],
+    generation_config={"temperature": 0.7, "max_tokens": 500}
+)
+
+# Vector embeddings with the same providers
+embedder = EmbeddingGenerator()
+embedding = embedder.run(
+    provider="ollama",  # Same providers support embeddings
+    model="snowflake-arctic-embed2",
+    operation="embed_text",
+    input_text="Quantum computing uses quantum mechanics principles"
+)
+
+# Check available providers and capabilities
+from kailash.nodes.ai.ai_providers import get_available_providers
+providers = get_available_providers()
+# Returns: {"ollama": {"available": True, "chat": True, "embeddings": True}, ...}
+```
+
+**Supported AI Providers:**
+- **Ollama**: Local LLMs with both chat and embeddings (llama3.1, mistral, etc.)
+- **OpenAI**: GPT models and text-embedding-3 series
+- **Anthropic**: Claude models (chat only)
+- **Cohere**: Embedding models (embed-english-v3.0)
+- **HuggingFace**: Sentence transformers and local models
+- **Mock**: Testing provider with consistent outputs
+
 ## 🧪 Testing

 The SDK is thoroughly tested with comprehensive test suites:
@@ -656,9 +948,9 @@ pre-commit run pytest-check
 - **Performance visualization dashboards**
 - **Real-time monitoring dashboard with WebSocket streaming**
 - **Comprehensive performance reports (HTML, Markdown, JSON)**
-- **
+- **89% test coverage (571 tests)**
 - **15 test categories all passing**
--
+- 37 working examples

 </td>
 <td width="30%">
@@ -683,11 +975,17 @@ pre-commit run pytest-check
 </table>

 ### 🎯 Test Suite Status
-- **Total Tests**:
+- **Total Tests**: 571 passing (89%)
 - **Test Categories**: 15/15 at 100%
 - **Integration Tests**: 65 passing
-- **Examples**:
-- **Code Coverage**:
+- **Examples**: 37/37 working
+- **Code Coverage**: 89%
+
+## ⚠️ Known Issues
+
+1. **DateTime Comparison in `list_runs()`**: The `TaskManager.list_runs()` method may encounter timezone comparison errors between timezone-aware and timezone-naive datetime objects. Workaround: Use try-catch blocks when calling `list_runs()` or access run details directly via `get_run(run_id)`.
+
+2. **Performance Tracking**: To enable performance metrics collection, you must pass the `task_manager` parameter to the `runtime.execute()` method: `runtime.execute(workflow, task_manager=task_manager)`.

 ## 📄 License
