kailash 0.1.1__py3-none-any.whl → 0.1.3__py3-none-any.whl

This diff compares the contents of two publicly released versions of this package, as published to its public registry. It is provided for informational purposes only.
Files changed (51)
  1. kailash/api/__init__.py +7 -0
  2. kailash/api/workflow_api.py +383 -0
  3. kailash/nodes/__init__.py +2 -1
  4. kailash/nodes/ai/__init__.py +26 -0
  5. kailash/nodes/ai/ai_providers.py +1272 -0
  6. kailash/nodes/ai/embedding_generator.py +853 -0
  7. kailash/nodes/ai/llm_agent.py +1166 -0
  8. kailash/nodes/api/auth.py +3 -3
  9. kailash/nodes/api/graphql.py +2 -2
  10. kailash/nodes/api/http.py +391 -48
  11. kailash/nodes/api/rate_limiting.py +2 -2
  12. kailash/nodes/api/rest.py +465 -57
  13. kailash/nodes/base.py +71 -12
  14. kailash/nodes/code/python.py +2 -1
  15. kailash/nodes/data/__init__.py +7 -0
  16. kailash/nodes/data/readers.py +28 -26
  17. kailash/nodes/data/retrieval.py +178 -0
  18. kailash/nodes/data/sharepoint_graph.py +7 -7
  19. kailash/nodes/data/sources.py +65 -0
  20. kailash/nodes/data/sql.py +7 -5
  21. kailash/nodes/data/vector_db.py +2 -2
  22. kailash/nodes/data/writers.py +6 -3
  23. kailash/nodes/logic/__init__.py +2 -1
  24. kailash/nodes/logic/operations.py +2 -1
  25. kailash/nodes/logic/workflow.py +439 -0
  26. kailash/nodes/mcp/__init__.py +11 -0
  27. kailash/nodes/mcp/client.py +558 -0
  28. kailash/nodes/mcp/resource.py +682 -0
  29. kailash/nodes/mcp/server.py +577 -0
  30. kailash/nodes/transform/__init__.py +16 -1
  31. kailash/nodes/transform/chunkers.py +78 -0
  32. kailash/nodes/transform/formatters.py +96 -0
  33. kailash/nodes/transform/processors.py +5 -3
  34. kailash/runtime/docker.py +8 -6
  35. kailash/sdk_exceptions.py +24 -10
  36. kailash/tracking/metrics_collector.py +2 -1
  37. kailash/tracking/models.py +0 -20
  38. kailash/tracking/storage/database.py +4 -4
  39. kailash/tracking/storage/filesystem.py +0 -1
  40. kailash/utils/templates.py +6 -6
  41. kailash/visualization/performance.py +7 -7
  42. kailash/visualization/reports.py +1 -1
  43. kailash/workflow/graph.py +4 -4
  44. kailash/workflow/mock_registry.py +1 -1
  45. {kailash-0.1.1.dist-info → kailash-0.1.3.dist-info}/METADATA +441 -47
  46. kailash-0.1.3.dist-info/RECORD +83 -0
  47. kailash-0.1.1.dist-info/RECORD +0 -69
  48. {kailash-0.1.1.dist-info → kailash-0.1.3.dist-info}/WHEEL +0 -0
  49. {kailash-0.1.1.dist-info → kailash-0.1.3.dist-info}/entry_points.txt +0 -0
  50. {kailash-0.1.1.dist-info → kailash-0.1.3.dist-info}/licenses/LICENSE +0 -0
  51. {kailash-0.1.1.dist-info → kailash-0.1.3.dist-info}/top_level.txt +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: kailash
- Version: 0.1.1
+ Version: 0.1.3
  Summary: Python SDK for the Kailash container-node architecture
  Home-page: https://github.com/integrum/kailash-python-sdk
  Author: Integrum
@@ -10,9 +10,8 @@ Project-URL: Bug Tracker, https://github.com/integrum/kailash-python-sdk/issues
  Classifier: Development Status :: 3 - Alpha
  Classifier: Intended Audience :: Developers
  Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.8
- Classifier: Programming Language :: Python :: 3.9
- Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
  Requires-Python: >=3.11
  Description-Content-Type: text/markdown
  License-File: LICENSE
@@ -22,7 +21,7 @@ Requires-Dist: matplotlib>=3.5
  Requires-Dist: pyyaml>=6.0
  Requires-Dist: click>=8.0
  Requires-Dist: pytest>=8.3.5
- Requires-Dist: mcp[cli]>=1.9.0
+ Requires-Dist: mcp[cli]>=1.9.2
  Requires-Dist: pandas>=2.2.3
  Requires-Dist: numpy>=2.2.5
  Requires-Dist: scipy>=1.15.3
@@ -42,10 +41,12 @@ Requires-Dist: autodoc>=0.5.0
  Requires-Dist: myst-parser>=4.0.1
  Requires-Dist: black>=25.1.0
  Requires-Dist: psutil>=7.0.0
- Requires-Dist: fastapi[all]>=0.115.12
+ Requires-Dist: fastapi>=0.115.12
+ Requires-Dist: uvicorn[standard]>=0.31.0
  Requires-Dist: pytest-asyncio>=1.0.0
  Requires-Dist: pre-commit>=4.2.0
  Requires-Dist: twine>=6.1.0
+ Requires-Dist: ollama>=0.5.1
  Provides-Extra: dev
  Requires-Dist: pytest>=7.0; extra == "dev"
  Requires-Dist: pytest-cov>=3.0; extra == "dev"
@@ -62,10 +63,10 @@ Dynamic: requires-python
  <p align="center">
  <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/v/kailash.svg" alt="PyPI version"></a>
  <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/pyversions/kailash.svg" alt="Python versions"></a>
- <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/dm/kailash.svg" alt="Downloads"></a>
+ <a href="https://pepy.tech/project/kailash"><img src="https://static.pepy.tech/badge/kailash" alt="Downloads"></a>
  <img src="https://img.shields.io/badge/license-MIT-green.svg" alt="MIT License">
  <img src="https://img.shields.io/badge/code%20style-black-000000.svg" alt="Code style: black">
- <img src="https://img.shields.io/badge/tests-544%20passing-brightgreen.svg" alt="Tests: 544 passing">
+ <img src="https://img.shields.io/badge/tests-746%20passing-brightgreen.svg" alt="Tests: 746 passing">
  <img src="https://img.shields.io/badge/coverage-100%25-brightgreen.svg" alt="Coverage: 100%">
  </p>

@@ -87,6 +88,9 @@ Dynamic: requires-python
  - 📊 **Real-time Monitoring**: Live dashboards with WebSocket streaming and performance metrics
  - 🧩 **Extensible**: Easy to create custom nodes for domain-specific operations
  - ⚡ **Fast Installation**: Uses `uv` for lightning-fast Python package management
+ - 🤖 **AI-Powered**: Complete LLM agents, embeddings, and hierarchical RAG architecture
+ - 🧠 **Retrieval-Augmented Generation**: Full RAG pipeline with intelligent document processing
+ - 🌐 **REST API Wrapper**: Expose any workflow as a production-ready API in 3 lines

  ## 🎯 Who Is This For?

@@ -101,6 +105,8 @@ The Kailash Python SDK is designed for:

  ### Installation

+ **Requirements:** Python 3.11 or higher
+
  ```bash
  # Install uv if you haven't already
  curl -LsSf https://astral.sh/uv/install.sh | sh
@@ -137,9 +143,11 @@ def analyze_customers(data):
      # Convert total_spent to numeric
      df['total_spent'] = pd.to_numeric(df['total_spent'])
      return {
-         "total_customers": len(df),
-         "avg_spend": df["total_spent"].mean(),
-         "top_customers": df.nlargest(10, "total_spent").to_dict("records")
+         "result": {
+             "total_customers": len(df),
+             "avg_spend": df["total_spent"].mean(),
+             "top_customers": df.nlargest(10, "total_spent").to_dict("records")
+         }
      }

  analyzer = PythonCodeNode.from_function(analyze_customers, name="analyzer")
@@ -174,7 +182,7 @@ sharepoint = SharePointGraphReader()
  workflow.add_node("read_sharepoint", sharepoint)

  # Process downloaded files
- csv_writer = CSVWriter()
+ csv_writer = CSVWriter(file_path="sharepoint_output.csv")
  workflow.add_node("save_locally", csv_writer)

  # Connect nodes
@@ -198,6 +206,135 @@ runtime = LocalRuntime()
  results, run_id = runtime.execute(workflow, inputs=inputs)
  ```

+ ### Hierarchical RAG Example
+
+ ```python
+ from kailash.workflow import Workflow
+ from kailash.nodes.ai.embedding_generator import EmbeddingGenerator
+ from kailash.nodes.ai.llm_agent import LLMAgent
+ from kailash.nodes.data.sources import DocumentSourceNode, QuerySourceNode
+ from kailash.nodes.data.retrieval import RelevanceScorerNode
+ from kailash.nodes.transform.chunkers import HierarchicalChunkerNode
+ from kailash.nodes.transform.formatters import (
+     ChunkTextExtractorNode, QueryTextWrapperNode, ContextFormatterNode
+ )
+
+ # Create hierarchical RAG workflow
+ workflow = Workflow("hierarchical_rag", name="Hierarchical RAG Workflow")
+
+ # Data sources (autonomous - no external files needed)
+ doc_source = DocumentSourceNode()
+ query_source = QuerySourceNode()
+
+ # Document processing pipeline
+ chunker = HierarchicalChunkerNode()
+ chunk_text_extractor = ChunkTextExtractorNode()
+ query_text_wrapper = QueryTextWrapperNode()
+
+ # AI processing with Ollama
+ chunk_embedder = EmbeddingGenerator(
+     provider="ollama", model="nomic-embed-text", operation="embed_batch"
+ )
+ query_embedder = EmbeddingGenerator(
+     provider="ollama", model="nomic-embed-text", operation="embed_batch"
+ )
+
+ # Retrieval and response generation
+ relevance_scorer = RelevanceScorerNode()
+ context_formatter = ContextFormatterNode()
+ llm_agent = LLMAgent(provider="ollama", model="llama3.2", temperature=0.7)
+
+ # Add all nodes to workflow
+ for name, node in {
+     "doc_source": doc_source, "query_source": query_source,
+     "chunker": chunker, "chunk_text_extractor": chunk_text_extractor,
+     "query_text_wrapper": query_text_wrapper, "chunk_embedder": chunk_embedder,
+     "query_embedder": query_embedder, "relevance_scorer": relevance_scorer,
+     "context_formatter": context_formatter, "llm_agent": llm_agent
+ }.items():
+     workflow.add_node(name, node)
+
+ # Connect the RAG pipeline
+ workflow.connect("doc_source", "chunker", {"documents": "documents"})
+ workflow.connect("chunker", "chunk_text_extractor", {"chunks": "chunks"})
+ workflow.connect("chunk_text_extractor", "chunk_embedder", {"input_texts": "input_texts"})
+ workflow.connect("query_source", "query_text_wrapper", {"query": "query"})
+ workflow.connect("query_text_wrapper", "query_embedder", {"input_texts": "input_texts"})
+ workflow.connect("chunker", "relevance_scorer", {"chunks": "chunks"})
+ workflow.connect("query_embedder", "relevance_scorer", {"embeddings": "query_embedding"})
+ workflow.connect("chunk_embedder", "relevance_scorer", {"embeddings": "chunk_embeddings"})
+ workflow.connect("relevance_scorer", "context_formatter", {"relevant_chunks": "relevant_chunks"})
+ workflow.connect("query_source", "context_formatter", {"query": "query"})
+ workflow.connect("context_formatter", "llm_agent", {"messages": "messages"})
+
+ # Execute the RAG workflow
+ from kailash.runtime.local import LocalRuntime
+ runtime = LocalRuntime()
+ results, run_id = runtime.execute(workflow)
+
+ print("RAG Response:", results["llm_agent"]["response"])
+ ```
+
+ ### Workflow API Wrapper - Expose Workflows as REST APIs
+
+ Transform any Kailash workflow into a production-ready REST API in just 3 lines of code:
+
+ ```python
+ from kailash.api.workflow_api import WorkflowAPI
+
+ # Take any workflow and expose it as an API
+ api = WorkflowAPI(workflow)
+ api.run(port=8000)  # That's it! Your workflow is now a REST API
+ ```
+
+ #### Features
+
+ - **Automatic REST Endpoints**:
+   - `POST /execute` - Execute workflow with inputs
+   - `GET /workflow/info` - Get workflow metadata
+   - `GET /health` - Health check endpoint
+   - Automatic OpenAPI docs at `/docs`
+
+ - **Multiple Execution Modes**:
+   ```python
+   # Synchronous execution (wait for results)
+   curl -X POST http://localhost:8000/execute \
+     -d '{"inputs": {...}, "mode": "sync"}'
+
+   # Asynchronous execution (get execution ID)
+   curl -X POST http://localhost:8000/execute \
+     -d '{"inputs": {...}, "mode": "async"}'
+
+   # Check async status
+   curl http://localhost:8000/status/{execution_id}
+   ```
+
+ - **Specialized APIs** for specific domains:
+   ```python
+   from kailash.api.workflow_api import create_workflow_api
+
+   # Create a RAG-specific API with custom endpoints
+   api = create_workflow_api(rag_workflow, api_type="rag")
+   # Adds /documents and /query endpoints
+   ```
+
+ - **Production Ready**:
+   ```python
+   # Development
+   api.run(reload=True, log_level="debug")
+
+   # Production with SSL
+   api.run(
+       host="0.0.0.0",
+       port=443,
+       ssl_keyfile="key.pem",
+       ssl_certfile="cert.pem",
+       workers=4
+   )
+   ```
+
+ See the [API demo example](examples/integration_examples/integration_api_demo.py) for complete usage patterns.
+
  ## 📚 Documentation

  | Resource | Description |
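The new README section above lists the wrapper's REST endpoints and execution modes but only shows `curl` calls. As a companion, here is a minimal Python client sketch, assuming the `/execute`, `/status/{execution_id}`, and `/health` routes behave as listed; the request payload shape and the `execution_id` response field are inferred from the curl examples and are otherwise assumptions.

```python
import requests

BASE_URL = "http://localhost:8000"  # assumes api.run(port=8000) as in the README snippet

# Synchronous execution: block until the workflow finishes
resp = requests.post(f"{BASE_URL}/execute", json={"inputs": {}, "mode": "sync"})
resp.raise_for_status()
print("sync results:", resp.json())

# Asynchronous execution: returns an execution ID to poll
resp = requests.post(f"{BASE_URL}/execute", json={"inputs": {}, "mode": "async"})
execution_id = resp.json().get("execution_id")  # field name assumed from the /status/{execution_id} route
print("async status:", requests.get(f"{BASE_URL}/status/{execution_id}").json())

# Health check
print("health:", requests.get(f"{BASE_URL}/health").json())
```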
@@ -221,6 +358,9 @@ The SDK includes a rich set of pre-built nodes for common operations:
  **Data Operations**
  - `CSVReader` - Read CSV files
  - `JSONReader` - Read JSON files
+ - `DocumentSourceNode` - Sample document provider
+ - `QuerySourceNode` - Sample query provider
+ - `RelevanceScorerNode` - Multi-method similarity
  - `SQLDatabaseNode` - Query databases
  - `CSVWriter` - Write CSV files
  - `JSONWriter` - Write JSON files
@@ -228,12 +368,20 @@ The SDK includes a rich set of pre-built nodes for common operations:
  </td>
  <td width="50%">

- **Processing Nodes**
+ **Transform Nodes**
  - `PythonCodeNode` - Custom Python logic
  - `DataTransformer` - Transform data
+ - `HierarchicalChunkerNode` - Document chunking
+ - `ChunkTextExtractorNode` - Extract chunk text
+ - `QueryTextWrapperNode` - Wrap queries for processing
+ - `ContextFormatterNode` - Format LLM context
  - `Filter` - Filter records
  - `Aggregator` - Aggregate data
- - `TextProcessor` - Process text
+
+ **Logic Nodes**
+ - `Switch` - Conditional routing
+ - `Merge` - Combine multiple inputs
+ - `WorkflowNode` - Wrap workflows as reusable nodes

  </td>
  </tr>
@@ -241,10 +389,12 @@ The SDK includes a rich set of pre-built nodes for common operations:
  <td width="50%">

  **AI/ML Nodes**
- - `EmbeddingNode` - Generate embeddings
- - `VectorDatabaseNode` - Vector search
- - `ModelPredictorNode` - ML predictions
- - `LLMNode` - LLM integration
+ - `LLMAgent` - Multi-provider LLM with memory & tools
+ - `EmbeddingGenerator` - Vector embeddings with caching
+ - `MCPClient/MCPServer` - Model Context Protocol
+ - `TextClassifier` - Text classification
+ - `SentimentAnalyzer` - Sentiment analysis
+ - `NamedEntityRecognizer` - NER extraction

  </td>
  <td width="50%">
@@ -280,25 +430,59 @@ The SDK includes a rich set of pre-built nodes for common operations:
  #### Workflow Management
  ```python
  from kailash.workflow import Workflow
+ from kailash.nodes.logic import Switch
+ from kailash.nodes.transform import DataTransformer

  # Create complex workflows with branching logic
  workflow = Workflow("data_pipeline", name="data_pipeline")

- # Add conditional branching
- validator = ValidationNode()
- workflow.add_node("validate", validator)
+ # Add conditional branching with Switch node
+ switch = Switch()
+ workflow.add_node("route", switch)

  # Different paths based on validation
+ processor_a = DataTransformer(transformations=["lambda x: x"])
+ error_handler = DataTransformer(transformations=["lambda x: {'error': str(x)}"])
  workflow.add_node("process_valid", processor_a)
  workflow.add_node("handle_errors", error_handler)

- # Connect with conditions
- workflow.connect("validate", "process_valid", condition="is_valid")
- workflow.connect("validate", "handle_errors", condition="has_errors")
+ # Connect with switch routing
+ workflow.connect("route", "process_valid")
+ workflow.connect("route", "handle_errors")
+ ```
+
+ #### Hierarchical Workflow Composition
+ ```python
+ from kailash.workflow import Workflow
+ from kailash.nodes.logic import WorkflowNode
+ from kailash.runtime.local import LocalRuntime
+
+ # Create a reusable data processing workflow
+ inner_workflow = Workflow("data_processor", name="Data Processor")
+ # ... add nodes to inner workflow ...
+
+ # Wrap the workflow as a node
+ processor_node = WorkflowNode(
+     workflow=inner_workflow,
+     name="data_processor"
+ )
+
+ # Use in a larger workflow
+ main_workflow = Workflow("main", name="Main Pipeline")
+ main_workflow.add_node("process", processor_node)
+ main_workflow.add_node("analyze", analyzer_node)
+
+ # Connect workflows
+ main_workflow.connect("process", "analyze")
+
+ # Execute - parameters automatically mapped to inner workflow
+ runtime = LocalRuntime()
+ results, _ = runtime.execute(main_workflow)
  ```

  #### Immutable State Management
  ```python
+ from kailash.workflow import Workflow
  from kailash.workflow.state import WorkflowStateWrapper
  from pydantic import BaseModel

@@ -308,6 +492,9 @@ class MyStateModel(BaseModel):
      status: str = "pending"
      nested: dict = {}

+ # Create workflow
+ workflow = Workflow("state_workflow", name="state_workflow")
+
  # Create and wrap state object
  state = MyStateModel()
  state_wrapper = workflow.create_state_wrapper(state)
@@ -324,8 +511,9 @@ updated_wrapper = state_wrapper.batch_update([
      (["status"], "processing")
  ])

- # Execute workflow with state management
- final_state, results = workflow.execute_with_state(state_model=state)
+ # Access the updated state
+ print(f"Updated counter: {updated_wrapper._state.counter}")
+ print(f"Updated status: {updated_wrapper._state.status}")
  ```

  #### Task Tracking
@@ -342,45 +530,75 @@ workflow = Workflow("sample_workflow", name="Sample Workflow")
  # Run workflow with tracking
  from kailash.runtime.local import LocalRuntime
  runtime = LocalRuntime()
- results, run_id = runtime.execute(workflow, task_manager=task_manager)
+ results, run_id = runtime.execute(workflow)

  # Query execution history
- runs = task_manager.list_runs(status="completed", limit=10)
- details = task_manager.get_run(run_id)
+ # Note: list_runs() may fail with timezone comparison errors in some cases
+ try:
+     # List all runs
+     all_runs = task_manager.list_runs()
+
+     # Filter by status
+     completed_runs = task_manager.list_runs(status="completed")
+     failed_runs = task_manager.list_runs(status="failed")
+
+     # Filter by workflow name
+     workflow_runs = task_manager.list_runs(workflow_name="sample_workflow")
+
+     # Process run information
+     for run in completed_runs[:5]:  # First 5 runs
+         print(f"Run {run.run_id[:8]}: {run.workflow_name} - {run.status}")
+
+ except Exception as e:
+     print(f"Error listing runs: {e}")
+     # Fallback: Access run details directly if available
+     if hasattr(task_manager, 'storage'):
+         run = task_manager.get_run(run_id)
  ```

  #### Local Testing
  ```python
  from kailash.runtime.local import LocalRuntime
+ from kailash.workflow import Workflow
+
+ # Create a test workflow
+ workflow = Workflow("test_workflow", name="test_workflow")

  # Create test runtime with debugging enabled
  runtime = LocalRuntime(debug=True)

  # Execute with test data
- test_data = {"customers": [...]}
- results = runtime.execute(workflow, inputs=test_data)
+ results, run_id = runtime.execute(workflow)

  # Validate results
- assert results["node_id"]["output_key"] == expected_value
+ assert isinstance(results, dict)
  ```

  #### Performance Monitoring & Real-time Dashboards
  ```python
  from kailash.visualization.performance import PerformanceVisualizer
  from kailash.visualization.dashboard import RealTimeDashboard, DashboardConfig
- from kailash.visualization.reports import WorkflowPerformanceReporter
+ from kailash.visualization.reports import WorkflowPerformanceReporter, ReportFormat
  from kailash.tracking import TaskManager
  from kailash.runtime.local import LocalRuntime
+ from kailash.workflow import Workflow
+ from kailash.nodes.transform import DataTransformer
+
+ # Create a workflow to monitor
+ workflow = Workflow("monitored_workflow", name="monitored_workflow")
+ node = DataTransformer(transformations=["lambda x: x"])
+ workflow.add_node("transform", node)

  # Run workflow with task tracking
+ # Note: Pass task_manager to execute() to enable performance tracking
  task_manager = TaskManager()
  runtime = LocalRuntime()
  results, run_id = runtime.execute(workflow, task_manager=task_manager)

  # Static performance analysis
+ from pathlib import Path
  perf_viz = PerformanceVisualizer(task_manager)
- outputs = perf_viz.create_run_performance_summary(run_id, output_dir="performance_report")
- perf_viz.compare_runs([run_id_1, run_id_2], output_path="comparison.png")
+ outputs = perf_viz.create_run_performance_summary(run_id, output_dir=Path("performance_report"))

  # Real-time monitoring dashboard
  config = DashboardConfig(
@@ -408,8 +626,7 @@ reporter = WorkflowPerformanceReporter(task_manager)
  report_path = reporter.generate_report(
      run_id,
      output_path="workflow_report.html",
-     format=ReportFormat.HTML,
-     compare_runs=[run_id_1, run_id_2]
+     format=ReportFormat.HTML
  )
  ```

@@ -466,6 +683,13 @@ api_client = RESTAPINode(
  #### Export Formats
  ```python
  from kailash.utils.export import WorkflowExporter, ExportConfig
+ from kailash.workflow import Workflow
+ from kailash.nodes.transform import DataTransformer
+
+ # Create a workflow to export
+ workflow = Workflow("export_example", name="export_example")
+ node = DataTransformer(transformations=["lambda x: x"])
+ workflow.add_node("transform", node)

  exporter = WorkflowExporter()

@@ -478,22 +702,147 @@ config = ExportConfig(
      include_metadata=True,
      container_tag="latest"
  )
- workflow.save("deployment.yaml", format="yaml")
+ workflow.save("deployment.yaml")
  ```

  ### 🎨 Visualization

  ```python
+ from kailash.workflow import Workflow
  from kailash.workflow.visualization import WorkflowVisualizer
+ from kailash.nodes.transform import DataTransformer
+
+ # Create a workflow to visualize
+ workflow = Workflow("viz_example", name="viz_example")
+ node = DataTransformer(transformations=["lambda x: x"])
+ workflow.add_node("transform", node)
+
+ # Generate Mermaid diagram (recommended for documentation)
+ mermaid_code = workflow.to_mermaid()
+ print(mermaid_code)

- # Visualize workflow structure
+ # Save as Mermaid markdown file
+ with open("workflow.md", "w") as f:
+     f.write(workflow.to_mermaid_markdown(title="My Workflow"))
+
+ # Or use matplotlib visualization
  visualizer = WorkflowVisualizer(workflow)
- visualizer.visualize(output_path="workflow.png")
+ visualizer.visualize()
+ visualizer.save("workflow.png", dpi=300)  # Save as PNG
+ ```
+
+ #### Hierarchical RAG (Retrieval-Augmented Generation)
+ ```python
+ from kailash.workflow import Workflow
+ from kailash.nodes.data.sources import DocumentSourceNode, QuerySourceNode
+ from kailash.nodes.data.retrieval import RelevanceScorerNode
+ from kailash.nodes.transform.chunkers import HierarchicalChunkerNode
+ from kailash.nodes.transform.formatters import (
+     ChunkTextExtractorNode,
+     QueryTextWrapperNode,
+     ContextFormatterNode,
+ )
+ from kailash.nodes.ai.llm_agent import LLMAgent
+ from kailash.nodes.ai.embedding_generator import EmbeddingGenerator
+
+ # Create hierarchical RAG workflow
+ workflow = Workflow(
+     workflow_id="hierarchical_rag_example",
+     name="Hierarchical RAG Workflow",
+     description="Complete RAG pipeline with embedding-based retrieval",
+     version="1.0.0"
+ )
+
+ # Create data source nodes
+ doc_source = DocumentSourceNode()
+ query_source = QuerySourceNode()

- # Show in Jupyter notebook
- visualizer.show()
+ # Create document processing pipeline
+ chunker = HierarchicalChunkerNode()
+ chunk_text_extractor = ChunkTextExtractorNode()
+ query_text_wrapper = QueryTextWrapperNode()
+
+ # Create embedding generators
+ chunk_embedder = EmbeddingGenerator(
+     provider="ollama",
+     model="nomic-embed-text",
+     operation="embed_batch"
+ )
+
+ query_embedder = EmbeddingGenerator(
+     provider="ollama",
+     model="nomic-embed-text",
+     operation="embed_batch"
+ )
+
+ # Create retrieval and formatting nodes
+ relevance_scorer = RelevanceScorerNode(similarity_method="cosine")
+ context_formatter = ContextFormatterNode()
+
+ # Create LLM agent for final answer generation
+ llm_agent = LLMAgent(
+     provider="ollama",
+     model="llama3.2",
+     temperature=0.7,
+     max_tokens=500
+ )
+
+ # Add all nodes to workflow
+ for node_id, node in [
+     ("doc_source", doc_source),
+     ("chunker", chunker),
+     ("query_source", query_source),
+     ("chunk_text_extractor", chunk_text_extractor),
+     ("query_text_wrapper", query_text_wrapper),
+     ("chunk_embedder", chunk_embedder),
+     ("query_embedder", query_embedder),
+     ("relevance_scorer", relevance_scorer),
+     ("context_formatter", context_formatter),
+     ("llm_agent", llm_agent)
+ ]:
+     workflow.add_node(node_id, node)
+
+ # Connect the workflow pipeline
+ # Document processing: docs → chunks → text → embeddings
+ workflow.connect("doc_source", "chunker", {"documents": "documents"})
+ workflow.connect("chunker", "chunk_text_extractor", {"chunks": "chunks"})
+ workflow.connect("chunk_text_extractor", "chunk_embedder", {"input_texts": "input_texts"})
+
+ # Query processing: query → text wrapper → embeddings
+ workflow.connect("query_source", "query_text_wrapper", {"query": "query"})
+ workflow.connect("query_text_wrapper", "query_embedder", {"input_texts": "input_texts"})
+
+ # Relevance scoring: chunks + embeddings → scored chunks
+ workflow.connect("chunker", "relevance_scorer", {"chunks": "chunks"})
+ workflow.connect("query_embedder", "relevance_scorer", {"embeddings": "query_embedding"})
+ workflow.connect("chunk_embedder", "relevance_scorer", {"embeddings": "chunk_embeddings"})
+
+ # Context formatting: relevant chunks + query → formatted context
+ workflow.connect("relevance_scorer", "context_formatter", {"relevant_chunks": "relevant_chunks"})
+ workflow.connect("query_source", "context_formatter", {"query": "query"})
+
+ # Final answer generation: formatted context → LLM response
+ workflow.connect("context_formatter", "llm_agent", {"messages": "messages"})
+
+ # Execute workflow
+ results, run_id = workflow.run()
+
+ # Access results
+ print("🎯 Top Relevant Chunks:")
+ for chunk in results["relevance_scorer"]["relevant_chunks"]:
+     print(f"  - {chunk['document_title']}: {chunk['relevance_score']:.3f}")
+
+ print("\n🤖 Final Answer:")
+ print(results["llm_agent"]["response"]["content"])
  ```

+ This example demonstrates:
+ - **Document chunking** with hierarchical structure
+ - **Vector embeddings** using Ollama's nomic-embed-text model
+ - **Semantic similarity** scoring with cosine similarity
+ - **Context formatting** for LLM input
+ - **Answer generation** using Ollama's llama3.2 model
+
  ## 💻 CLI Commands

  The SDK includes a comprehensive CLI for workflow management:
@@ -545,6 +894,45 @@ kailash/
  └── utils/               # Utilities and helpers
  ```

+ ### 🤖 Unified AI Provider Architecture
+
+ The SDK features a unified provider architecture for AI capabilities:
+
+ ```python
+ from kailash.nodes.ai import LLMAgent, EmbeddingGenerator
+
+ # Multi-provider LLM support
+ agent = LLMAgent()
+ result = agent.run(
+     provider="ollama",  # or "openai", "anthropic", "mock"
+     model="llama3.1:8b-instruct-q8_0",
+     messages=[{"role": "user", "content": "Explain quantum computing"}],
+     generation_config={"temperature": 0.7, "max_tokens": 500}
+ )
+
+ # Vector embeddings with the same providers
+ embedder = EmbeddingGenerator()
+ embedding = embedder.run(
+     provider="ollama",  # Same providers support embeddings
+     model="snowflake-arctic-embed2",
+     operation="embed_text",
+     input_text="Quantum computing uses quantum mechanics principles"
+ )
+
+ # Check available providers and capabilities
+ from kailash.nodes.ai.ai_providers import get_available_providers
+ providers = get_available_providers()
+ # Returns: {"ollama": {"available": True, "chat": True, "embeddings": True}, ...}
+ ```
+
+ **Supported AI Providers:**
+ - **Ollama**: Local LLMs with both chat and embeddings (llama3.1, mistral, etc.)
+ - **OpenAI**: GPT models and text-embedding-3 series
+ - **Anthropic**: Claude models (chat only)
+ - **Cohere**: Embedding models (embed-english-v3.0)
+ - **HuggingFace**: Sentence transformers and local models
+ - **Mock**: Testing provider with consistent outputs
+
  ## 🧪 Testing

  The SDK is thoroughly tested with comprehensive test suites:
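The provider list added in this hunk includes a mock provider intended for testing. Below is a minimal sketch of exercising it without Ollama or API keys, assuming the mock provider accepts the same `run()` signature shown in the README snippet above; the model names passed here are hypothetical placeholders, and the exact shape of the mock output is not specified by the README.

```python
from kailash.nodes.ai import LLMAgent, EmbeddingGenerator

# Chat through the mock provider: no local model or API key required
agent = LLMAgent()
result = agent.run(
    provider="mock",
    model="mock-model",  # hypothetical name; accepted model strings for the mock provider are assumed
    messages=[{"role": "user", "content": "ping"}],
)
print(result)

# Embeddings through the same provider interface
embedder = EmbeddingGenerator()
embedding = embedder.run(
    provider="mock",
    model="mock-embed",  # hypothetical name
    operation="embed_text",
    input_text="ping",
)
print(embedding)
```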
@@ -656,9 +1044,9 @@ pre-commit run pytest-check
  - **Performance visualization dashboards**
  - **Real-time monitoring dashboard with WebSocket streaming**
  - **Comprehensive performance reports (HTML, Markdown, JSON)**
- - **100% test coverage (544 tests)**
+ - **89% test coverage (571 tests)**
  - **15 test categories all passing**
- - 21+ working examples
+ - 37 working examples

  </td>
  <td width="30%">
@@ -683,11 +1071,17 @@ pre-commit run pytest-check
  </table>

  ### 🎯 Test Suite Status
- - **Total Tests**: 544 passing (100%)
+ - **Total Tests**: 571 passing (89%)
  - **Test Categories**: 15/15 at 100%
  - **Integration Tests**: 65 passing
- - **Examples**: 21/21 working
- - **Code Coverage**: Comprehensive
+ - **Examples**: 37/37 working
+ - **Code Coverage**: 89%
+
+ ## ⚠️ Known Issues
+
+ 1. **DateTime Comparison in `list_runs()`**: The `TaskManager.list_runs()` method may encounter timezone comparison errors between timezone-aware and timezone-naive datetime objects. Workaround: Use try-catch blocks when calling `list_runs()` or access run details directly via `get_run(run_id)`.
+
+ 2. **Performance Tracking**: To enable performance metrics collection, you must pass the `task_manager` parameter to the `runtime.execute()` method: `runtime.execute(workflow, task_manager=task_manager)`.

  ## 📄 License