kailash 0.1.0__py3-none-any.whl → 0.1.2__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- kailash/__init__.py +1 -1
- kailash/nodes/__init__.py +2 -1
- kailash/nodes/ai/__init__.py +26 -0
- kailash/nodes/ai/ai_providers.py +1272 -0
- kailash/nodes/ai/embedding_generator.py +853 -0
- kailash/nodes/ai/llm_agent.py +1166 -0
- kailash/nodes/api/auth.py +3 -3
- kailash/nodes/api/graphql.py +2 -2
- kailash/nodes/api/http.py +391 -44
- kailash/nodes/api/rate_limiting.py +2 -2
- kailash/nodes/api/rest.py +464 -56
- kailash/nodes/base.py +71 -12
- kailash/nodes/code/python.py +2 -1
- kailash/nodes/data/__init__.py +7 -0
- kailash/nodes/data/readers.py +28 -26
- kailash/nodes/data/retrieval.py +178 -0
- kailash/nodes/data/sharepoint_graph.py +7 -7
- kailash/nodes/data/sources.py +65 -0
- kailash/nodes/data/sql.py +4 -2
- kailash/nodes/data/writers.py +6 -3
- kailash/nodes/logic/operations.py +2 -1
- kailash/nodes/mcp/__init__.py +11 -0
- kailash/nodes/mcp/client.py +558 -0
- kailash/nodes/mcp/resource.py +682 -0
- kailash/nodes/mcp/server.py +571 -0
- kailash/nodes/transform/__init__.py +16 -1
- kailash/nodes/transform/chunkers.py +78 -0
- kailash/nodes/transform/formatters.py +96 -0
- kailash/runtime/docker.py +6 -6
- kailash/sdk_exceptions.py +24 -10
- kailash/tracking/metrics_collector.py +2 -1
- kailash/utils/templates.py +6 -6
- {kailash-0.1.0.dist-info → kailash-0.1.2.dist-info}/METADATA +349 -49
- {kailash-0.1.0.dist-info → kailash-0.1.2.dist-info}/RECORD +38 -27
- {kailash-0.1.0.dist-info → kailash-0.1.2.dist-info}/WHEEL +0 -0
- {kailash-0.1.0.dist-info → kailash-0.1.2.dist-info}/entry_points.txt +0 -0
- {kailash-0.1.0.dist-info → kailash-0.1.2.dist-info}/licenses/LICENSE +0 -0
- {kailash-0.1.0.dist-info → kailash-0.1.2.dist-info}/top_level.txt +0 -0
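Wheels are ordinary zip archives, so a per-file added/removed summary like the list above can be approximated by comparing the archives' member-name lists. A minimal sketch with the standard library's `zipfile` — the tiny in-memory archives and file names below are illustrative stand-ins, not the actual kailash wheel contents:

```python
import io
import zipfile

def namelist(data: bytes) -> set:
    """Return the set of member paths in a wheel/zip given as bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return set(zf.namelist())

def make_wheel(members: dict) -> bytes:
    """Build a tiny in-memory zip standing in for a downloaded wheel."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path, text in members.items():
            zf.writestr(path, text)
    return buf.getvalue()

old = make_wheel({"pkg/__init__.py": "v1", "pkg/base.py": "x"})
new = make_wheel({"pkg/__init__.py": "v2", "pkg/base.py": "x",
                  "pkg/ai/llm_agent.py": "new module"})

# Names present only in one wheel are additions/removals; per-file
# +N/-N line counts would additionally require diffing member contents.
added = namelist(new) - namelist(old)
removed = namelist(old) - namelist(new)
print("added:", sorted(added))
print("removed:", sorted(removed))
```

For real packages the two wheels could be fetched first (e.g. `pip download kailash==0.1.0 --no-deps`) and their bytes passed to `namelist` the same way.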
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: kailash
-Version: 0.1.0
+Version: 0.1.2
 Summary: Python SDK for the Kailash container-node architecture
 Home-page: https://github.com/integrum/kailash-python-sdk
 Author: Integrum
@@ -10,9 +10,8 @@ Project-URL: Bug Tracker, https://github.com/integrum/kailash-python-sdk/issues
 Classifier: Development Status :: 3 - Alpha
 Classifier: Intended Audience :: Developers
 Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.
-Classifier: Programming Language :: Python :: 3.
-Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
 Requires-Python: >=3.11
 Description-Content-Type: text/markdown
 License-File: LICENSE
@@ -22,7 +21,7 @@ Requires-Dist: matplotlib>=3.5
 Requires-Dist: pyyaml>=6.0
 Requires-Dist: click>=8.0
 Requires-Dist: pytest>=8.3.5
-Requires-Dist: mcp[cli]>=1.9.
+Requires-Dist: mcp[cli]>=1.9.2
 Requires-Dist: pandas>=2.2.3
 Requires-Dist: numpy>=2.2.5
 Requires-Dist: scipy>=1.15.3
@@ -45,6 +44,8 @@ Requires-Dist: psutil>=7.0.0
 Requires-Dist: fastapi[all]>=0.115.12
 Requires-Dist: pytest-asyncio>=1.0.0
 Requires-Dist: pre-commit>=4.2.0
+Requires-Dist: twine>=6.1.0
+Requires-Dist: ollama>=0.5.1
 Provides-Extra: dev
 Requires-Dist: pytest>=7.0; extra == "dev"
 Requires-Dist: pytest-cov>=3.0; extra == "dev"
@@ -59,10 +60,12 @@ Dynamic: requires-python
 # Kailash Python SDK
 
 <p align="center">
-  <img src="https://img.shields.io/
+  <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/v/kailash.svg" alt="PyPI version"></a>
+  <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/pyversions/kailash.svg" alt="Python versions"></a>
+  <a href="https://pepy.tech/project/kailash"><img src="https://static.pepy.tech/badge/kailash" alt="Downloads"></a>
   <img src="https://img.shields.io/badge/license-MIT-green.svg" alt="MIT License">
   <img src="https://img.shields.io/badge/code%20style-black-000000.svg" alt="Code style: black">
-  <img src="https://img.shields.io/badge/tests-
+  <img src="https://img.shields.io/badge/tests-746%20passing-brightgreen.svg" alt="Tests: 746 passing">
   <img src="https://img.shields.io/badge/coverage-100%25-brightgreen.svg" alt="Coverage: 100%">
 </p>
 
@@ -84,6 +87,8 @@ Dynamic: requires-python
 - 📊 **Real-time Monitoring**: Live dashboards with WebSocket streaming and performance metrics
 - 🧩 **Extensible**: Easy to create custom nodes for domain-specific operations
 - ⚡ **Fast Installation**: Uses `uv` for lightning-fast Python package management
+- 🤖 **AI-Powered**: Complete LLM agents, embeddings, and hierarchical RAG architecture
+- 🧠 **Retrieval-Augmented Generation**: Full RAG pipeline with intelligent document processing
 
 ## 🎯 Who Is This For?
 
@@ -98,12 +103,14 @@ The Kailash Python SDK is designed for:
 
 ### Installation
 
+**Requirements:** Python 3.11 or higher
+
 ```bash
 # Install uv if you haven't already
 curl -LsSf https://astral.sh/uv/install.sh | sh
 
 # For users: Install from PyPI
-
+pip install kailash
 
 # For developers: Clone and sync
 git clone https://github.com/integrum/kailash-python-sdk.git
@@ -134,9 +141,11 @@ def analyze_customers(data):
     # Convert total_spent to numeric
     df['total_spent'] = pd.to_numeric(df['total_spent'])
     return {
-        "
-
-
+        "result": {
+            "total_customers": len(df),
+            "avg_spend": df["total_spent"].mean(),
+            "top_customers": df.nlargest(10, "total_spent").to_dict("records")
+        }
     }
 
 analyzer = PythonCodeNode.from_function(analyze_customers, name="analyzer")
@@ -171,7 +180,7 @@ sharepoint = SharePointGraphReader()
 workflow.add_node("read_sharepoint", sharepoint)
 
 # Process downloaded files
-csv_writer = CSVWriter()
+csv_writer = CSVWriter(file_path="sharepoint_output.csv")
 workflow.add_node("save_locally", csv_writer)
 
 # Connect nodes
@@ -195,13 +204,81 @@ runtime = LocalRuntime()
 results, run_id = runtime.execute(workflow, inputs=inputs)
 ```
 
+### Hierarchical RAG Example
+
+```python
+from kailash.workflow import Workflow
+from kailash.nodes.ai.embedding_generator import EmbeddingGenerator
+from kailash.nodes.ai.llm_agent import LLMAgent
+from kailash.nodes.data.sources import DocumentSourceNode, QuerySourceNode
+from kailash.nodes.data.retrieval import RelevanceScorerNode
+from kailash.nodes.transform.chunkers import HierarchicalChunkerNode
+from kailash.nodes.transform.formatters import (
+    ChunkTextExtractorNode, QueryTextWrapperNode, ContextFormatterNode
+)
+
+# Create hierarchical RAG workflow
+workflow = Workflow("hierarchical_rag", name="Hierarchical RAG Workflow")
+
+# Data sources (autonomous - no external files needed)
+doc_source = DocumentSourceNode()
+query_source = QuerySourceNode()
+
+# Document processing pipeline
+chunker = HierarchicalChunkerNode()
+chunk_text_extractor = ChunkTextExtractorNode()
+query_text_wrapper = QueryTextWrapperNode()
+
+# AI processing with Ollama
+chunk_embedder = EmbeddingGenerator(
+    provider="ollama", model="nomic-embed-text", operation="embed_batch"
+)
+query_embedder = EmbeddingGenerator(
+    provider="ollama", model="nomic-embed-text", operation="embed_batch"
+)
+
+# Retrieval and response generation
+relevance_scorer = RelevanceScorerNode()
+context_formatter = ContextFormatterNode()
+llm_agent = LLMAgent(provider="ollama", model="llama3.2", temperature=0.7)
+
+# Add all nodes to workflow
+for name, node in {
+    "doc_source": doc_source, "query_source": query_source,
+    "chunker": chunker, "chunk_text_extractor": chunk_text_extractor,
+    "query_text_wrapper": query_text_wrapper, "chunk_embedder": chunk_embedder,
+    "query_embedder": query_embedder, "relevance_scorer": relevance_scorer,
+    "context_formatter": context_formatter, "llm_agent": llm_agent
+}.items():
+    workflow.add_node(name, node)
+
+# Connect the RAG pipeline
+workflow.connect("doc_source", "chunker", {"documents": "documents"})
+workflow.connect("chunker", "chunk_text_extractor", {"chunks": "chunks"})
+workflow.connect("chunk_text_extractor", "chunk_embedder", {"input_texts": "input_texts"})
+workflow.connect("query_source", "query_text_wrapper", {"query": "query"})
+workflow.connect("query_text_wrapper", "query_embedder", {"input_texts": "input_texts"})
+workflow.connect("chunker", "relevance_scorer", {"chunks": "chunks"})
+workflow.connect("query_embedder", "relevance_scorer", {"embeddings": "query_embedding"})
+workflow.connect("chunk_embedder", "relevance_scorer", {"embeddings": "chunk_embeddings"})
+workflow.connect("relevance_scorer", "context_formatter", {"relevant_chunks": "relevant_chunks"})
+workflow.connect("query_source", "context_formatter", {"query": "query"})
+workflow.connect("context_formatter", "llm_agent", {"messages": "messages"})
+
+# Execute the RAG workflow
+from kailash.runtime.local import LocalRuntime
+runtime = LocalRuntime()
+results, run_id = runtime.execute(workflow)
+
+print("RAG Response:", results["llm_agent"]["response"])
+```
+
 ## 📚 Documentation
 
 | Resource | Description |
 |----------|-------------|
 | 📖 [User Guide](docs/user-guide.md) | Comprehensive guide for using the SDK |
-
-| 📋 [API Reference](docs/api/) | Detailed API documentation |
+| 📋 [API Reference](docs/) | Detailed API documentation |
 | 🌐 [API Integration Guide](examples/API_INTEGRATION_README.md) | Complete API integration documentation |
 | 🎓 [Examples](examples/) | Working examples and tutorials |
 | 🤝 [Contributing](CONTRIBUTING.md) | Contribution guidelines |
@@ -219,6 +296,9 @@ The SDK includes a rich set of pre-built nodes for common operations:
 **Data Operations**
 - `CSVReader` - Read CSV files
 - `JSONReader` - Read JSON files
+- `DocumentSourceNode` - Sample document provider
+- `QuerySourceNode` - Sample query provider
+- `RelevanceScorerNode` - Multi-method similarity
 - `SQLDatabaseNode` - Query databases
 - `CSVWriter` - Write CSV files
 - `JSONWriter` - Write JSON files
@@ -226,12 +306,15 @@ The SDK includes a rich set of pre-built nodes for common operations:
 </td>
 <td width="50%">
 
-**
+**Transform Nodes**
 - `PythonCodeNode` - Custom Python logic
 - `DataTransformer` - Transform data
+- `HierarchicalChunkerNode` - Document chunking
+- `ChunkTextExtractorNode` - Extract chunk text
+- `QueryTextWrapperNode` - Wrap queries for processing
+- `ContextFormatterNode` - Format LLM context
 - `Filter` - Filter records
 - `Aggregator` - Aggregate data
-- `TextProcessor` - Process text
 
 </td>
 </tr>
@@ -239,10 +322,12 @@ The SDK includes a rich set of pre-built nodes for common operations:
 <td width="50%">
 
 **AI/ML Nodes**
-- `
-- `
-- `
-- `
+- `LLMAgent` - Multi-provider LLM with memory & tools
+- `EmbeddingGenerator` - Vector embeddings with caching
+- `MCPClient/MCPServer` - Model Context Protocol
+- `TextClassifier` - Text classification
+- `SentimentAnalyzer` - Sentiment analysis
+- `NamedEntityRecognizer` - NER extraction
 
 </td>
 <td width="50%">
@@ -278,25 +363,30 @@ The SDK includes a rich set of pre-built nodes for common operations:
 #### Workflow Management
 ```python
 from kailash.workflow import Workflow
+from kailash.nodes.logic import Switch
+from kailash.nodes.transform import DataTransformer
 
 # Create complex workflows with branching logic
 workflow = Workflow("data_pipeline", name="data_pipeline")
 
-# Add conditional branching
-
-workflow.add_node("
+# Add conditional branching with Switch node
+switch = Switch()
+workflow.add_node("route", switch)
 
 # Different paths based on validation
+processor_a = DataTransformer(transformations=["lambda x: x"])
+error_handler = DataTransformer(transformations=["lambda x: {'error': str(x)}"])
 workflow.add_node("process_valid", processor_a)
 workflow.add_node("handle_errors", error_handler)
 
-# Connect with
-workflow.connect("
-workflow.connect("
+# Connect with switch routing
+workflow.connect("route", "process_valid")
+workflow.connect("route", "handle_errors")
 ```
 
 #### Immutable State Management
 ```python
+from kailash.workflow import Workflow
 from kailash.workflow.state import WorkflowStateWrapper
 from pydantic import BaseModel
 
@@ -306,6 +396,9 @@ class MyStateModel(BaseModel):
     status: str = "pending"
     nested: dict = {}
 
+# Create workflow
+workflow = Workflow("state_workflow", name="state_workflow")
+
 # Create and wrap state object
 state = MyStateModel()
 state_wrapper = workflow.create_state_wrapper(state)
@@ -322,8 +415,9 @@ updated_wrapper = state_wrapper.batch_update([
     (["status"], "processing")
 ])
 
-#
-
+# Access the updated state
+print(f"Updated counter: {updated_wrapper._state.counter}")
+print(f"Updated status: {updated_wrapper._state.status}")
 ```
 
 #### Task Tracking
@@ -340,45 +434,75 @@ workflow = Workflow("sample_workflow", name="Sample Workflow")
 # Run workflow with tracking
 from kailash.runtime.local import LocalRuntime
 runtime = LocalRuntime()
-results, run_id = runtime.execute(workflow
+results, run_id = runtime.execute(workflow)
 
 # Query execution history
-
-
+# Note: list_runs() may fail with timezone comparison errors in some cases
+try:
+    # List all runs
+    all_runs = task_manager.list_runs()
+
+    # Filter by status
+    completed_runs = task_manager.list_runs(status="completed")
+    failed_runs = task_manager.list_runs(status="failed")
+
+    # Filter by workflow name
+    workflow_runs = task_manager.list_runs(workflow_name="sample_workflow")
+
+    # Process run information
+    for run in completed_runs[:5]:  # First 5 runs
+        print(f"Run {run.run_id[:8]}: {run.workflow_name} - {run.status}")
+
+except Exception as e:
+    print(f"Error listing runs: {e}")
+    # Fallback: Access run details directly if available
+    if hasattr(task_manager, 'storage'):
+        run = task_manager.get_run(run_id)
 ```
 
 #### Local Testing
 ```python
 from kailash.runtime.local import LocalRuntime
+from kailash.workflow import Workflow
+
+# Create a test workflow
+workflow = Workflow("test_workflow", name="test_workflow")
 
 # Create test runtime with debugging enabled
 runtime = LocalRuntime(debug=True)
 
 # Execute with test data
-
-results = runtime.execute(workflow, inputs=test_data)
+results, run_id = runtime.execute(workflow)
 
 # Validate results
-assert results
+assert isinstance(results, dict)
 ```
 
 #### Performance Monitoring & Real-time Dashboards
 ```python
 from kailash.visualization.performance import PerformanceVisualizer
 from kailash.visualization.dashboard import RealTimeDashboard, DashboardConfig
-from kailash.visualization.reports import WorkflowPerformanceReporter
+from kailash.visualization.reports import WorkflowPerformanceReporter, ReportFormat
 from kailash.tracking import TaskManager
 from kailash.runtime.local import LocalRuntime
+from kailash.workflow import Workflow
+from kailash.nodes.transform import DataTransformer
+
+# Create a workflow to monitor
+workflow = Workflow("monitored_workflow", name="monitored_workflow")
+node = DataTransformer(transformations=["lambda x: x"])
+workflow.add_node("transform", node)
 
 # Run workflow with task tracking
+# Note: Pass task_manager to execute() to enable performance tracking
 task_manager = TaskManager()
 runtime = LocalRuntime()
 results, run_id = runtime.execute(workflow, task_manager=task_manager)
 
 # Static performance analysis
+from pathlib import Path
 perf_viz = PerformanceVisualizer(task_manager)
-outputs = perf_viz.create_run_performance_summary(run_id, output_dir="performance_report")
-perf_viz.compare_runs([run_id_1, run_id_2], output_path="comparison.png")
+outputs = perf_viz.create_run_performance_summary(run_id, output_dir=Path("performance_report"))
 
 # Real-time monitoring dashboard
 config = DashboardConfig(
@@ -406,8 +530,7 @@ reporter = WorkflowPerformanceReporter(task_manager)
 report_path = reporter.generate_report(
     run_id,
     output_path="workflow_report.html",
-    format=ReportFormat.HTML
-    compare_runs=[run_id_1, run_id_2]
+    format=ReportFormat.HTML
 )
 ```
 
@@ -464,6 +587,13 @@ api_client = RESTAPINode(
 #### Export Formats
 ```python
 from kailash.utils.export import WorkflowExporter, ExportConfig
+from kailash.workflow import Workflow
+from kailash.nodes.transform import DataTransformer
+
+# Create a workflow to export
+workflow = Workflow("export_example", name="export_example")
+node = DataTransformer(transformations=["lambda x: x"])
+workflow.add_node("transform", node)
 
 exporter = WorkflowExporter()
 
@@ -476,22 +606,147 @@ config = ExportConfig(
     include_metadata=True,
     container_tag="latest"
 )
-workflow.save("deployment.yaml"
+workflow.save("deployment.yaml")
 ```
 
 ### 🎨 Visualization
 
 ```python
+from kailash.workflow import Workflow
 from kailash.workflow.visualization import WorkflowVisualizer
+from kailash.nodes.transform import DataTransformer
+
+# Create a workflow to visualize
+workflow = Workflow("viz_example", name="viz_example")
+node = DataTransformer(transformations=["lambda x: x"])
+workflow.add_node("transform", node)
 
-#
+# Generate Mermaid diagram (recommended for documentation)
+mermaid_code = workflow.to_mermaid()
+print(mermaid_code)
+
+# Save as Mermaid markdown file
+with open("workflow.md", "w") as f:
+    f.write(workflow.to_mermaid_markdown(title="My Workflow"))
+
+# Or use matplotlib visualization
 visualizer = WorkflowVisualizer(workflow)
-visualizer.visualize(
+visualizer.visualize()
+visualizer.save("workflow.png", dpi=300)  # Save as PNG
+```
+
+#### Hierarchical RAG (Retrieval-Augmented Generation)
+```python
+from kailash.workflow import Workflow
+from kailash.nodes.data.sources import DocumentSourceNode, QuerySourceNode
+from kailash.nodes.data.retrieval import RelevanceScorerNode
+from kailash.nodes.transform.chunkers import HierarchicalChunkerNode
+from kailash.nodes.transform.formatters import (
+    ChunkTextExtractorNode,
+    QueryTextWrapperNode,
+    ContextFormatterNode,
+)
+from kailash.nodes.ai.llm_agent import LLMAgent
+from kailash.nodes.ai.embedding_generator import EmbeddingGenerator
+
+# Create hierarchical RAG workflow
+workflow = Workflow(
+    workflow_id="hierarchical_rag_example",
+    name="Hierarchical RAG Workflow",
+    description="Complete RAG pipeline with embedding-based retrieval",
+    version="1.0.0"
+)
 
-#
-
+# Create data source nodes
+doc_source = DocumentSourceNode()
+query_source = QuerySourceNode()
+
+# Create document processing pipeline
+chunker = HierarchicalChunkerNode()
+chunk_text_extractor = ChunkTextExtractorNode()
+query_text_wrapper = QueryTextWrapperNode()
+
+# Create embedding generators
+chunk_embedder = EmbeddingGenerator(
+    provider="ollama",
+    model="nomic-embed-text",
+    operation="embed_batch"
+)
+
+query_embedder = EmbeddingGenerator(
+    provider="ollama",
+    model="nomic-embed-text",
+    operation="embed_batch"
+)
+
+# Create retrieval and formatting nodes
+relevance_scorer = RelevanceScorerNode(similarity_method="cosine")
+context_formatter = ContextFormatterNode()
+
+# Create LLM agent for final answer generation
+llm_agent = LLMAgent(
+    provider="ollama",
+    model="llama3.2",
+    temperature=0.7,
+    max_tokens=500
+)
+
+# Add all nodes to workflow
+for node_id, node in [
+    ("doc_source", doc_source),
+    ("chunker", chunker),
+    ("query_source", query_source),
+    ("chunk_text_extractor", chunk_text_extractor),
+    ("query_text_wrapper", query_text_wrapper),
+    ("chunk_embedder", chunk_embedder),
+    ("query_embedder", query_embedder),
+    ("relevance_scorer", relevance_scorer),
+    ("context_formatter", context_formatter),
+    ("llm_agent", llm_agent)
+]:
+    workflow.add_node(node_id, node)
+
+# Connect the workflow pipeline
+# Document processing: docs → chunks → text → embeddings
+workflow.connect("doc_source", "chunker", {"documents": "documents"})
+workflow.connect("chunker", "chunk_text_extractor", {"chunks": "chunks"})
+workflow.connect("chunk_text_extractor", "chunk_embedder", {"input_texts": "input_texts"})
+
+# Query processing: query → text wrapper → embeddings
+workflow.connect("query_source", "query_text_wrapper", {"query": "query"})
+workflow.connect("query_text_wrapper", "query_embedder", {"input_texts": "input_texts"})
+
+# Relevance scoring: chunks + embeddings → scored chunks
+workflow.connect("chunker", "relevance_scorer", {"chunks": "chunks"})
+workflow.connect("query_embedder", "relevance_scorer", {"embeddings": "query_embedding"})
+workflow.connect("chunk_embedder", "relevance_scorer", {"embeddings": "chunk_embeddings"})
+
+# Context formatting: relevant chunks + query → formatted context
+workflow.connect("relevance_scorer", "context_formatter", {"relevant_chunks": "relevant_chunks"})
+workflow.connect("query_source", "context_formatter", {"query": "query"})
+
+# Final answer generation: formatted context → LLM response
+workflow.connect("context_formatter", "llm_agent", {"messages": "messages"})
+
+# Execute workflow
+results, run_id = workflow.run()
+
+# Access results
+print("🎯 Top Relevant Chunks:")
+for chunk in results["relevance_scorer"]["relevant_chunks"]:
+    print(f"  - {chunk['document_title']}: {chunk['relevance_score']:.3f}")
+
+print("\n🤖 Final Answer:")
+print(results["llm_agent"]["response"]["content"])
 ```
 
+This example demonstrates:
+- **Document chunking** with hierarchical structure
+- **Vector embeddings** using Ollama's nomic-embed-text model
+- **Semantic similarity** scoring with cosine similarity
+- **Context formatting** for LLM input
+- **Answer generation** using Ollama's llama3.2 model
+
 ## 💻 CLI Commands
 
 The SDK includes a comprehensive CLI for workflow management:
@@ -543,6 +798,45 @@ kailash/
 └── utils/ # Utilities and helpers
 ```
 
+### 🤖 Unified AI Provider Architecture
+
+The SDK features a unified provider architecture for AI capabilities:
+
+```python
+from kailash.nodes.ai import LLMAgent, EmbeddingGenerator
+
+# Multi-provider LLM support
+agent = LLMAgent()
+result = agent.run(
+    provider="ollama",  # or "openai", "anthropic", "mock"
+    model="llama3.1:8b-instruct-q8_0",
+    messages=[{"role": "user", "content": "Explain quantum computing"}],
+    generation_config={"temperature": 0.7, "max_tokens": 500}
+)
+
+# Vector embeddings with the same providers
+embedder = EmbeddingGenerator()
+embedding = embedder.run(
+    provider="ollama",  # Same providers support embeddings
+    model="snowflake-arctic-embed2",
+    operation="embed_text",
+    input_text="Quantum computing uses quantum mechanics principles"
+)
+
+# Check available providers and capabilities
+from kailash.nodes.ai.ai_providers import get_available_providers
+providers = get_available_providers()
+# Returns: {"ollama": {"available": True, "chat": True, "embeddings": True}, ...}
+```
+
+**Supported AI Providers:**
+- **Ollama**: Local LLMs with both chat and embeddings (llama3.1, mistral, etc.)
+- **OpenAI**: GPT models and text-embedding-3 series
+- **Anthropic**: Claude models (chat only)
+- **Cohere**: Embedding models (embed-english-v3.0)
+- **HuggingFace**: Sentence transformers and local models
+- **Mock**: Testing provider with consistent outputs
+
 ## 🧪 Testing
 
 The SDK is thoroughly tested with comprehensive test suites:
@@ -654,9 +948,9 @@ pre-commit run pytest-check
 - **Performance visualization dashboards**
 - **Real-time monitoring dashboard with WebSocket streaming**
 - **Comprehensive performance reports (HTML, Markdown, JSON)**
-- **
+- **89% test coverage (571 tests)**
 - **15 test categories all passing**
--
+- 37 working examples
 
 </td>
 <td width="30%">
@@ -681,11 +975,17 @@ pre-commit run pytest-check
 </table>
 
 ### 🎯 Test Suite Status
-- **Total Tests**:
+- **Total Tests**: 571 passing (89%)
 - **Test Categories**: 15/15 at 100%
 - **Integration Tests**: 65 passing
-- **Examples**:
-- **Code Coverage**:
+- **Examples**: 37/37 working
+- **Code Coverage**: 89%
+
+## ⚠️ Known Issues
+
+1. **DateTime Comparison in `list_runs()`**: The `TaskManager.list_runs()` method may encounter timezone comparison errors between timezone-aware and timezone-naive datetime objects. Workaround: Use try-catch blocks when calling `list_runs()` or access run details directly via `get_run(run_id)`.
+
+2. **Performance Tracking**: To enable performance metrics collection, you must pass the `task_manager` parameter to the `runtime.execute()` method: `runtime.execute(workflow, task_manager=task_manager)`.
 
 ## 📄 License
 