kailash 0.1.1__py3-none-any.whl → 0.1.2__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (37)
  1. kailash/nodes/__init__.py +2 -1
  2. kailash/nodes/ai/__init__.py +26 -0
  3. kailash/nodes/ai/ai_providers.py +1272 -0
  4. kailash/nodes/ai/embedding_generator.py +853 -0
  5. kailash/nodes/ai/llm_agent.py +1166 -0
  6. kailash/nodes/api/auth.py +3 -3
  7. kailash/nodes/api/graphql.py +2 -2
  8. kailash/nodes/api/http.py +391 -44
  9. kailash/nodes/api/rate_limiting.py +2 -2
  10. kailash/nodes/api/rest.py +464 -56
  11. kailash/nodes/base.py +71 -12
  12. kailash/nodes/code/python.py +2 -1
  13. kailash/nodes/data/__init__.py +7 -0
  14. kailash/nodes/data/readers.py +28 -26
  15. kailash/nodes/data/retrieval.py +178 -0
  16. kailash/nodes/data/sharepoint_graph.py +7 -7
  17. kailash/nodes/data/sources.py +65 -0
  18. kailash/nodes/data/sql.py +4 -2
  19. kailash/nodes/data/writers.py +6 -3
  20. kailash/nodes/logic/operations.py +2 -1
  21. kailash/nodes/mcp/__init__.py +11 -0
  22. kailash/nodes/mcp/client.py +558 -0
  23. kailash/nodes/mcp/resource.py +682 -0
  24. kailash/nodes/mcp/server.py +571 -0
  25. kailash/nodes/transform/__init__.py +16 -1
  26. kailash/nodes/transform/chunkers.py +78 -0
  27. kailash/nodes/transform/formatters.py +96 -0
  28. kailash/runtime/docker.py +6 -6
  29. kailash/sdk_exceptions.py +24 -10
  30. kailash/tracking/metrics_collector.py +2 -1
  31. kailash/utils/templates.py +6 -6
  32. {kailash-0.1.1.dist-info → kailash-0.1.2.dist-info}/METADATA +344 -46
  33. {kailash-0.1.1.dist-info → kailash-0.1.2.dist-info}/RECORD +37 -26
  34. {kailash-0.1.1.dist-info → kailash-0.1.2.dist-info}/WHEEL +0 -0
  35. {kailash-0.1.1.dist-info → kailash-0.1.2.dist-info}/entry_points.txt +0 -0
  36. {kailash-0.1.1.dist-info → kailash-0.1.2.dist-info}/licenses/LICENSE +0 -0
  37. {kailash-0.1.1.dist-info → kailash-0.1.2.dist-info}/top_level.txt +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: kailash
- Version: 0.1.1
+ Version: 0.1.2
  Summary: Python SDK for the Kailash container-node architecture
  Home-page: https://github.com/integrum/kailash-python-sdk
  Author: Integrum
@@ -10,9 +10,8 @@ Project-URL: Bug Tracker, https://github.com/integrum/kailash-python-sdk/issues
  Classifier: Development Status :: 3 - Alpha
  Classifier: Intended Audience :: Developers
  Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.8
- Classifier: Programming Language :: Python :: 3.9
- Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
  Requires-Python: >=3.11
  Description-Content-Type: text/markdown
  License-File: LICENSE
@@ -22,7 +21,7 @@ Requires-Dist: matplotlib>=3.5
  Requires-Dist: pyyaml>=6.0
  Requires-Dist: click>=8.0
  Requires-Dist: pytest>=8.3.5
- Requires-Dist: mcp[cli]>=1.9.0
+ Requires-Dist: mcp[cli]>=1.9.2
  Requires-Dist: pandas>=2.2.3
  Requires-Dist: numpy>=2.2.5
  Requires-Dist: scipy>=1.15.3
@@ -46,6 +45,7 @@ Requires-Dist: fastapi[all]>=0.115.12
  Requires-Dist: pytest-asyncio>=1.0.0
  Requires-Dist: pre-commit>=4.2.0
  Requires-Dist: twine>=6.1.0
+ Requires-Dist: ollama>=0.5.1
  Provides-Extra: dev
  Requires-Dist: pytest>=7.0; extra == "dev"
  Requires-Dist: pytest-cov>=3.0; extra == "dev"
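Editor's note: a quick way to confirm an environment actually picked up the metadata above (the 0.1.2 version bump, the Python >=3.11 floor, and the new `ollama` dependency) is a small standard-library check; a minimal sketch:

```python
import sys
from importlib import metadata

# The wheel now declares Requires-Python: >=3.11
assert sys.version_info >= (3, 11), "kailash 0.1.2 requires Python 3.11+"

# Confirm the installed distributions match this diff
print(metadata.version("kailash"))  # expected: 0.1.2
print(metadata.version("ollama"))   # new in 0.1.2, expected: >=0.5.1
```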
@@ -62,10 +62,10 @@ Dynamic: requires-python
  <p align="center">
  <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/v/kailash.svg" alt="PyPI version"></a>
  <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/pyversions/kailash.svg" alt="Python versions"></a>
- <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/dm/kailash.svg" alt="Downloads"></a>
+ <a href="https://pepy.tech/project/kailash"><img src="https://static.pepy.tech/badge/kailash" alt="Downloads"></a>
  <img src="https://img.shields.io/badge/license-MIT-green.svg" alt="MIT License">
  <img src="https://img.shields.io/badge/code%20style-black-000000.svg" alt="Code style: black">
- <img src="https://img.shields.io/badge/tests-544%20passing-brightgreen.svg" alt="Tests: 544 passing">
+ <img src="https://img.shields.io/badge/tests-746%20passing-brightgreen.svg" alt="Tests: 746 passing">
  <img src="https://img.shields.io/badge/coverage-100%25-brightgreen.svg" alt="Coverage: 100%">
  </p>

@@ -87,6 +87,8 @@ Dynamic: requires-python
  - 📊 **Real-time Monitoring**: Live dashboards with WebSocket streaming and performance metrics
  - 🧩 **Extensible**: Easy to create custom nodes for domain-specific operations
  - ⚡ **Fast Installation**: Uses `uv` for lightning-fast Python package management
+ - 🤖 **AI-Powered**: Complete LLM agents, embeddings, and hierarchical RAG architecture
+ - 🧠 **Retrieval-Augmented Generation**: Full RAG pipeline with intelligent document processing

  ## 🎯 Who Is This For?

@@ -101,6 +103,8 @@ The Kailash Python SDK is designed for:

  ### Installation

+ **Requirements:** Python 3.11 or higher
+
  ```bash
  # Install uv if you haven't already
  curl -LsSf https://astral.sh/uv/install.sh | sh
@@ -137,9 +141,11 @@ def analyze_customers(data):
      # Convert total_spent to numeric
      df['total_spent'] = pd.to_numeric(df['total_spent'])
      return {
-         "total_customers": len(df),
-         "avg_spend": df["total_spent"].mean(),
-         "top_customers": df.nlargest(10, "total_spent").to_dict("records")
+         "result": {
+             "total_customers": len(df),
+             "avg_spend": df["total_spent"].mean(),
+             "top_customers": df.nlargest(10, "total_spent").to_dict("records")
+         }
      }

  analyzer = PythonCodeNode.from_function(analyze_customers, name="analyzer")
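Editor's note: the change above nests the function's outputs under a `"result"` key, so downstream consumers index one level deeper. A minimal sketch of the adjustment, assuming the runtime maps each node's return value into `results` under its node id, as the examples elsewhere in this README suggest:

```python
from kailash.runtime.local import LocalRuntime

runtime = LocalRuntime()
results, run_id = runtime.execute(workflow)  # workflow from the example above

# 0.1.1 shape: results["analyzer"]["avg_spend"]
# 0.1.2 shape: function outputs are wrapped under "result"
analysis = results["analyzer"]["result"]
print(analysis["total_customers"], analysis["avg_spend"])
```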
@@ -174,7 +180,7 @@ sharepoint = SharePointGraphReader()
  workflow.add_node("read_sharepoint", sharepoint)

  # Process downloaded files
- csv_writer = CSVWriter()
+ csv_writer = CSVWriter(file_path="sharepoint_output.csv")
  workflow.add_node("save_locally", csv_writer)

  # Connect nodes
@@ -198,6 +204,75 @@ runtime = LocalRuntime()
  results, run_id = runtime.execute(workflow, inputs=inputs)
  ```

+ ### Hierarchical RAG Example
+
+ ```python
+ from kailash.workflow import Workflow
+ from kailash.nodes.ai.embedding_generator import EmbeddingGenerator
+ from kailash.nodes.ai.llm_agent import LLMAgent
+ from kailash.nodes.data.sources import DocumentSourceNode, QuerySourceNode
+ from kailash.nodes.data.retrieval import RelevanceScorerNode
+ from kailash.nodes.transform.chunkers import HierarchicalChunkerNode
+ from kailash.nodes.transform.formatters import (
+     ChunkTextExtractorNode, QueryTextWrapperNode, ContextFormatterNode
+ )
+
+ # Create hierarchical RAG workflow
+ workflow = Workflow("hierarchical_rag", name="Hierarchical RAG Workflow")
+
+ # Data sources (autonomous - no external files needed)
+ doc_source = DocumentSourceNode()
+ query_source = QuerySourceNode()
+
+ # Document processing pipeline
+ chunker = HierarchicalChunkerNode()
+ chunk_text_extractor = ChunkTextExtractorNode()
+ query_text_wrapper = QueryTextWrapperNode()
+
+ # AI processing with Ollama
+ chunk_embedder = EmbeddingGenerator(
+     provider="ollama", model="nomic-embed-text", operation="embed_batch"
+ )
+ query_embedder = EmbeddingGenerator(
+     provider="ollama", model="nomic-embed-text", operation="embed_batch"
+ )
+
+ # Retrieval and response generation
+ relevance_scorer = RelevanceScorerNode()
+ context_formatter = ContextFormatterNode()
+ llm_agent = LLMAgent(provider="ollama", model="llama3.2", temperature=0.7)
+
+ # Add all nodes to workflow
+ for name, node in {
+     "doc_source": doc_source, "query_source": query_source,
+     "chunker": chunker, "chunk_text_extractor": chunk_text_extractor,
+     "query_text_wrapper": query_text_wrapper, "chunk_embedder": chunk_embedder,
+     "query_embedder": query_embedder, "relevance_scorer": relevance_scorer,
+     "context_formatter": context_formatter, "llm_agent": llm_agent
+ }.items():
+     workflow.add_node(name, node)
+
+ # Connect the RAG pipeline
+ workflow.connect("doc_source", "chunker", {"documents": "documents"})
+ workflow.connect("chunker", "chunk_text_extractor", {"chunks": "chunks"})
+ workflow.connect("chunk_text_extractor", "chunk_embedder", {"input_texts": "input_texts"})
+ workflow.connect("query_source", "query_text_wrapper", {"query": "query"})
+ workflow.connect("query_text_wrapper", "query_embedder", {"input_texts": "input_texts"})
+ workflow.connect("chunker", "relevance_scorer", {"chunks": "chunks"})
+ workflow.connect("query_embedder", "relevance_scorer", {"embeddings": "query_embedding"})
+ workflow.connect("chunk_embedder", "relevance_scorer", {"embeddings": "chunk_embeddings"})
+ workflow.connect("relevance_scorer", "context_formatter", {"relevant_chunks": "relevant_chunks"})
+ workflow.connect("query_source", "context_formatter", {"query": "query"})
+ workflow.connect("context_formatter", "llm_agent", {"messages": "messages"})
+
+ # Execute the RAG workflow
+ from kailash.runtime.local import LocalRuntime
+ runtime = LocalRuntime()
+ results, run_id = runtime.execute(workflow)
+
+ print("RAG Response:", results["llm_agent"]["response"])
+ ```
+
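Editor's note: the example above assumes a local Ollama server with the `nomic-embed-text` and `llama3.2` models already pulled. For a dependency-free dry run, the provider list later in this README includes a mock testing provider; a hedged sketch of swapping it in, with placeholder model names:

```python
from kailash.nodes.ai.embedding_generator import EmbeddingGenerator
from kailash.nodes.ai.llm_agent import LLMAgent

# Hypothetical offline variant using the "mock" testing provider;
# the model names below are placeholders, not documented models.
chunk_embedder = EmbeddingGenerator(
    provider="mock", model="mock-embed", operation="embed_batch"
)
query_embedder = EmbeddingGenerator(
    provider="mock", model="mock-embed", operation="embed_batch"
)
llm_agent = LLMAgent(provider="mock", model="mock-model", temperature=0.7)
```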
  ## 📚 Documentation

  | Resource | Description |
@@ -221,6 +296,9 @@ The SDK includes a rich set of pre-built nodes for common operations:
  **Data Operations**
  - `CSVReader` - Read CSV files
  - `JSONReader` - Read JSON files
+ - `DocumentSourceNode` - Sample document provider
+ - `QuerySourceNode` - Sample query provider
+ - `RelevanceScorerNode` - Multi-method similarity
  - `SQLDatabaseNode` - Query databases
  - `CSVWriter` - Write CSV files
  - `JSONWriter` - Write JSON files
@@ -228,12 +306,15 @@ The SDK includes a rich set of pre-built nodes for common operations:
  </td>
  <td width="50%">

- **Processing Nodes**
+ **Transform Nodes**
  - `PythonCodeNode` - Custom Python logic
  - `DataTransformer` - Transform data
+ - `HierarchicalChunkerNode` - Document chunking
+ - `ChunkTextExtractorNode` - Extract chunk text
+ - `QueryTextWrapperNode` - Wrap queries for processing
+ - `ContextFormatterNode` - Format LLM context
  - `Filter` - Filter records
  - `Aggregator` - Aggregate data
- - `TextProcessor` - Process text

  </td>
  </tr>
@@ -241,10 +322,12 @@ The SDK includes a rich set of pre-built nodes for common operations:
  <td width="50%">

  **AI/ML Nodes**
- - `EmbeddingNode` - Generate embeddings
- - `VectorDatabaseNode` - Vector search
- - `ModelPredictorNode` - ML predictions
- - `LLMNode` - LLM integration
+ - `LLMAgent` - Multi-provider LLM with memory & tools
+ - `EmbeddingGenerator` - Vector embeddings with caching
+ - `MCPClient/MCPServer` - Model Context Protocol
+ - `TextClassifier` - Text classification
+ - `SentimentAnalyzer` - Sentiment analysis
+ - `NamedEntityRecognizer` - NER extraction

  </td>
  <td width="50%">
@@ -280,25 +363,30 @@ The SDK includes a rich set of pre-built nodes for common operations:
  #### Workflow Management
  ```python
  from kailash.workflow import Workflow
+ from kailash.nodes.logic import Switch
+ from kailash.nodes.transform import DataTransformer

  # Create complex workflows with branching logic
  workflow = Workflow("data_pipeline", name="data_pipeline")

- # Add conditional branching
- validator = ValidationNode()
- workflow.add_node("validate", validator)
+ # Add conditional branching with Switch node
+ switch = Switch()
+ workflow.add_node("route", switch)

  # Different paths based on validation
+ processor_a = DataTransformer(transformations=["lambda x: x"])
+ error_handler = DataTransformer(transformations=["lambda x: {'error': str(x)}"])
  workflow.add_node("process_valid", processor_a)
  workflow.add_node("handle_errors", error_handler)

- # Connect with conditions
- workflow.connect("validate", "process_valid", condition="is_valid")
- workflow.connect("validate", "handle_errors", condition="has_errors")
+ # Connect with switch routing
+ workflow.connect("route", "process_valid")
+ workflow.connect("route", "handle_errors")
  ```
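Editor's note: the `DataTransformer` nodes above take their logic as strings of Python lambdas. Following that same pattern, a record-filtering branch target might look like the sketch below; the `valid` field is illustrative, not a documented schema:

```python
from kailash.nodes.transform import DataTransformer

# Hypothetical filter following the string-lambda pattern shown above;
# assumes the incoming payload is a list of dicts carrying a "valid" flag.
filter_valid = DataTransformer(
    transformations=["lambda records: [r for r in records if r.get('valid')]"]
)
workflow.add_node("filter_valid", filter_valid)
```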

  #### Immutable State Management
  ```python
+ from kailash.workflow import Workflow
  from kailash.workflow.state import WorkflowStateWrapper
  from pydantic import BaseModel

@@ -308,6 +396,9 @@ class MyStateModel(BaseModel):
      status: str = "pending"
      nested: dict = {}

+ # Create workflow
+ workflow = Workflow("state_workflow", name="state_workflow")
+
  # Create and wrap state object
  state = MyStateModel()
  state_wrapper = workflow.create_state_wrapper(state)
@@ -324,8 +415,9 @@ updated_wrapper = state_wrapper.batch_update([
      (["status"], "processing")
  ])

- # Execute workflow with state management
- final_state, results = workflow.execute_with_state(state_model=state)
+ # Access the updated state
+ print(f"Updated counter: {updated_wrapper._state.counter}")
+ print(f"Updated status: {updated_wrapper._state.status}")
  ```
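Editor's note: `batch_update` above returns a new wrapper rather than mutating in place, which is why the example reads the values back from `updated_wrapper`. Conceptually this is the same copy-on-update move Pydantic itself provides; a minimal plain-Pydantic sketch of the idea (not the SDK's implementation):

```python
from pydantic import BaseModel

class MyStateModel(BaseModel):
    counter: int = 0
    status: str = "pending"

state = MyStateModel()

# model_copy(update=...) returns a new object; the original is untouched
updated = state.model_copy(update={"counter": 1, "status": "processing"})
assert state.counter == 0 and updated.counter == 1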

  #### Task Tracking
@@ -342,45 +434,75 @@ workflow = Workflow("sample_workflow", name="Sample Workflow")
  # Run workflow with tracking
  from kailash.runtime.local import LocalRuntime
  runtime = LocalRuntime()
- results, run_id = runtime.execute(workflow, task_manager=task_manager)
+ results, run_id = runtime.execute(workflow)

  # Query execution history
- runs = task_manager.list_runs(status="completed", limit=10)
- details = task_manager.get_run(run_id)
+ # Note: list_runs() may fail with timezone comparison errors in some cases
+ try:
+     # List all runs
+     all_runs = task_manager.list_runs()
+
+     # Filter by status
+     completed_runs = task_manager.list_runs(status="completed")
+     failed_runs = task_manager.list_runs(status="failed")
+
+     # Filter by workflow name
+     workflow_runs = task_manager.list_runs(workflow_name="sample_workflow")
+
+     # Process run information
+     for run in completed_runs[:5]:  # First 5 runs
+         print(f"Run {run.run_id[:8]}: {run.workflow_name} - {run.status}")
+
+ except Exception as e:
+     print(f"Error listing runs: {e}")
+     # Fallback: Access run details directly if available
+     if hasattr(task_manager, 'storage'):
+         run = task_manager.get_run(run_id)
  ```

  #### Local Testing
  ```python
  from kailash.runtime.local import LocalRuntime
+ from kailash.workflow import Workflow
+
+ # Create a test workflow
+ workflow = Workflow("test_workflow", name="test_workflow")

  # Create test runtime with debugging enabled
  runtime = LocalRuntime(debug=True)

  # Execute with test data
- test_data = {"customers": [...]}
- results = runtime.execute(workflow, inputs=test_data)
+ results, run_id = runtime.execute(workflow)

  # Validate results
- assert results["node_id"]["output_key"] == expected_value
+ assert isinstance(results, dict)
  ```
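Editor's note: the pattern above drops naturally into a pytest test; a minimal smoke-test sketch, assuming an empty workflow is valid to execute:

```python
from kailash.runtime.local import LocalRuntime
from kailash.workflow import Workflow

def test_workflow_executes():
    # Smoke test: execution should return a results dict and a run id
    workflow = Workflow("test_workflow", name="test_workflow")
    runtime = LocalRuntime(debug=True)

    results, run_id = runtime.execute(workflow)

    assert isinstance(results, dict)
    assert run_id is not None
```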

  #### Performance Monitoring & Real-time Dashboards
  ```python
  from kailash.visualization.performance import PerformanceVisualizer
  from kailash.visualization.dashboard import RealTimeDashboard, DashboardConfig
- from kailash.visualization.reports import WorkflowPerformanceReporter
+ from kailash.visualization.reports import WorkflowPerformanceReporter, ReportFormat
  from kailash.tracking import TaskManager
  from kailash.runtime.local import LocalRuntime
+ from kailash.workflow import Workflow
+ from kailash.nodes.transform import DataTransformer
+
+ # Create a workflow to monitor
+ workflow = Workflow("monitored_workflow", name="monitored_workflow")
+ node = DataTransformer(transformations=["lambda x: x"])
+ workflow.add_node("transform", node)

  # Run workflow with task tracking
+ # Note: Pass task_manager to execute() to enable performance tracking
  task_manager = TaskManager()
  runtime = LocalRuntime()
  results, run_id = runtime.execute(workflow, task_manager=task_manager)

  # Static performance analysis
+ from pathlib import Path
  perf_viz = PerformanceVisualizer(task_manager)
- outputs = perf_viz.create_run_performance_summary(run_id, output_dir="performance_report")
- perf_viz.compare_runs([run_id_1, run_id_2], output_path="comparison.png")
+ outputs = perf_viz.create_run_performance_summary(run_id, output_dir=Path("performance_report"))

  # Real-time monitoring dashboard
  config = DashboardConfig(
@@ -408,8 +530,7 @@ reporter = WorkflowPerformanceReporter(task_manager)
  report_path = reporter.generate_report(
      run_id,
      output_path="workflow_report.html",
-     format=ReportFormat.HTML,
-     compare_runs=[run_id_1, run_id_2]
+     format=ReportFormat.HTML
  )
  ```
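Editor's note: `ReportFormat` is imported alongside the reporter above, and the feature list elsewhere in this README advertises HTML, Markdown, and JSON reports. A hedged sketch of emitting the other formats; the `MARKDOWN` and `JSON` enum member names are assumed from those format names:

```python
# Assumed enum members, inferred from the advertised HTML/Markdown/JSON outputs
for fmt, suffix in [(ReportFormat.MARKDOWN, "md"), (ReportFormat.JSON, "json")]:
    reporter.generate_report(
        run_id,
        output_path=f"workflow_report.{suffix}",
        format=fmt,
    )
```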

@@ -466,6 +587,13 @@ api_client = RESTAPINode(
  #### Export Formats
  ```python
  from kailash.utils.export import WorkflowExporter, ExportConfig
+ from kailash.workflow import Workflow
+ from kailash.nodes.transform import DataTransformer
+
+ # Create a workflow to export
+ workflow = Workflow("export_example", name="export_example")
+ node = DataTransformer(transformations=["lambda x: x"])
+ workflow.add_node("transform", node)

  exporter = WorkflowExporter()

@@ -478,22 +606,147 @@ config = ExportConfig(
      include_metadata=True,
      container_tag="latest"
  )
- workflow.save("deployment.yaml", format="yaml")
+ workflow.save("deployment.yaml")
  ```

  ### 🎨 Visualization

  ```python
+ from kailash.workflow import Workflow
  from kailash.workflow.visualization import WorkflowVisualizer
+ from kailash.nodes.transform import DataTransformer
+
+ # Create a workflow to visualize
+ workflow = Workflow("viz_example", name="viz_example")
+ node = DataTransformer(transformations=["lambda x: x"])
+ workflow.add_node("transform", node)

- # Visualize workflow structure
+ # Generate Mermaid diagram (recommended for documentation)
+ mermaid_code = workflow.to_mermaid()
+ print(mermaid_code)
+
+ # Save as Mermaid markdown file
+ with open("workflow.md", "w") as f:
+     f.write(workflow.to_mermaid_markdown(title="My Workflow"))
+
+ # Or use matplotlib visualization
  visualizer = WorkflowVisualizer(workflow)
- visualizer.visualize(output_path="workflow.png")
+ visualizer.visualize()
+ visualizer.save("workflow.png", dpi=300)  # Save as PNG
+ ```
+
+ #### Hierarchical RAG (Retrieval-Augmented Generation)
+ ```python
+ from kailash.workflow import Workflow
+ from kailash.nodes.data.sources import DocumentSourceNode, QuerySourceNode
+ from kailash.nodes.data.retrieval import RelevanceScorerNode
+ from kailash.nodes.transform.chunkers import HierarchicalChunkerNode
+ from kailash.nodes.transform.formatters import (
+     ChunkTextExtractorNode,
+     QueryTextWrapperNode,
+     ContextFormatterNode,
+ )
+ from kailash.nodes.ai.llm_agent import LLMAgent
+ from kailash.nodes.ai.embedding_generator import EmbeddingGenerator
+
+ # Create hierarchical RAG workflow
+ workflow = Workflow(
+     workflow_id="hierarchical_rag_example",
+     name="Hierarchical RAG Workflow",
+     description="Complete RAG pipeline with embedding-based retrieval",
+     version="1.0.0"
+ )

- # Show in Jupyter notebook
- visualizer.show()
+ # Create data source nodes
+ doc_source = DocumentSourceNode()
+ query_source = QuerySourceNode()
+
+ # Create document processing pipeline
+ chunker = HierarchicalChunkerNode()
+ chunk_text_extractor = ChunkTextExtractorNode()
+ query_text_wrapper = QueryTextWrapperNode()
+
+ # Create embedding generators
+ chunk_embedder = EmbeddingGenerator(
+     provider="ollama",
+     model="nomic-embed-text",
+     operation="embed_batch"
+ )
+
+ query_embedder = EmbeddingGenerator(
+     provider="ollama",
+     model="nomic-embed-text",
+     operation="embed_batch"
+ )
+
+ # Create retrieval and formatting nodes
+ relevance_scorer = RelevanceScorerNode(similarity_method="cosine")
+ context_formatter = ContextFormatterNode()
+
+ # Create LLM agent for final answer generation
+ llm_agent = LLMAgent(
+     provider="ollama",
+     model="llama3.2",
+     temperature=0.7,
+     max_tokens=500
+ )
+
+ # Add all nodes to workflow
+ for node_id, node in [
+     ("doc_source", doc_source),
+     ("chunker", chunker),
+     ("query_source", query_source),
+     ("chunk_text_extractor", chunk_text_extractor),
+     ("query_text_wrapper", query_text_wrapper),
+     ("chunk_embedder", chunk_embedder),
+     ("query_embedder", query_embedder),
+     ("relevance_scorer", relevance_scorer),
+     ("context_formatter", context_formatter),
+     ("llm_agent", llm_agent)
+ ]:
+     workflow.add_node(node_id, node)
+
+ # Connect the workflow pipeline
+ # Document processing: docs → chunks → text → embeddings
+ workflow.connect("doc_source", "chunker", {"documents": "documents"})
+ workflow.connect("chunker", "chunk_text_extractor", {"chunks": "chunks"})
+ workflow.connect("chunk_text_extractor", "chunk_embedder", {"input_texts": "input_texts"})
+
+ # Query processing: query → text wrapper → embeddings
+ workflow.connect("query_source", "query_text_wrapper", {"query": "query"})
+ workflow.connect("query_text_wrapper", "query_embedder", {"input_texts": "input_texts"})
+
+ # Relevance scoring: chunks + embeddings → scored chunks
+ workflow.connect("chunker", "relevance_scorer", {"chunks": "chunks"})
+ workflow.connect("query_embedder", "relevance_scorer", {"embeddings": "query_embedding"})
+ workflow.connect("chunk_embedder", "relevance_scorer", {"embeddings": "chunk_embeddings"})
+
+ # Context formatting: relevant chunks + query → formatted context
+ workflow.connect("relevance_scorer", "context_formatter", {"relevant_chunks": "relevant_chunks"})
+ workflow.connect("query_source", "context_formatter", {"query": "query"})
+
+ # Final answer generation: formatted context → LLM response
+ workflow.connect("context_formatter", "llm_agent", {"messages": "messages"})
+
+ # Execute workflow
+ results, run_id = workflow.run()
+
+ # Access results
+ print("🎯 Top Relevant Chunks:")
+ for chunk in results["relevance_scorer"]["relevant_chunks"]:
+     print(f"  - {chunk['document_title']}: {chunk['relevance_score']:.3f}")
+
+ print("\n🤖 Final Answer:")
+ print(results["llm_agent"]["response"]["content"])
  ```

+ This example demonstrates:
+ - **Document chunking** with hierarchical structure
+ - **Vector embeddings** using Ollama's nomic-embed-text model
+ - **Semantic similarity** scoring with cosine similarity
+ - **Context formatting** for LLM input
+ - **Answer generation** using Ollama's llama3.2 model
+
  ## 💻 CLI Commands

  The SDK includes a comprehensive CLI for workflow management:
@@ -545,6 +798,45 @@ kailash/
  └── utils/ # Utilities and helpers
  ```

+ ### 🤖 Unified AI Provider Architecture
+
+ The SDK features a unified provider architecture for AI capabilities:
+
+ ```python
+ from kailash.nodes.ai import LLMAgent, EmbeddingGenerator
+
+ # Multi-provider LLM support
+ agent = LLMAgent()
+ result = agent.run(
+     provider="ollama",  # or "openai", "anthropic", "mock"
+     model="llama3.1:8b-instruct-q8_0",
+     messages=[{"role": "user", "content": "Explain quantum computing"}],
+     generation_config={"temperature": 0.7, "max_tokens": 500}
+ )
+
+ # Vector embeddings with the same providers
+ embedder = EmbeddingGenerator()
+ embedding = embedder.run(
+     provider="ollama",  # Same providers support embeddings
+     model="snowflake-arctic-embed2",
+     operation="embed_text",
+     input_text="Quantum computing uses quantum mechanics principles"
+ )
+
+ # Check available providers and capabilities
+ from kailash.nodes.ai.ai_providers import get_available_providers
+ providers = get_available_providers()
+ # Returns: {"ollama": {"available": True, "chat": True, "embeddings": True}, ...}
+ ```
+
+ **Supported AI Providers:**
+ - **Ollama**: Local LLMs with both chat and embeddings (llama3.1, mistral, etc.)
+ - **OpenAI**: GPT models and text-embedding-3 series
+ - **Anthropic**: Claude models (chat only)
+ - **Cohere**: Embedding models (embed-english-v3.0)
+ - **HuggingFace**: Sentence transformers and local models
+ - **Mock**: Testing provider with consistent outputs
+
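Editor's note: building on the `get_available_providers()` capability map shown above, provider selection can be made dynamic; a small sketch (the preference order is illustrative):

```python
from kailash.nodes.ai.ai_providers import get_available_providers

def pick_provider(capability, preferred=("ollama", "openai", "anthropic")):
    """Return the first preferred provider advertising the capability."""
    providers = get_available_providers()
    for name in preferred:
        info = providers.get(name, {})
        if info.get("available") and info.get(capability):
            return name
    return "mock"  # fall back to the testing provider

chat_provider = pick_provider("chat")         # e.g. "ollama"
embed_provider = pick_provider("embeddings")
```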
  ## 🧪 Testing

  The SDK is thoroughly tested with comprehensive test suites:

@@ -656,9 +948,9 @@ pre-commit run pytest-check
  - **Performance visualization dashboards**
  - **Real-time monitoring dashboard with WebSocket streaming**
  - **Comprehensive performance reports (HTML, Markdown, JSON)**
- - **100% test coverage (544 tests)**
+ - **89% test coverage (571 tests)**
  - **15 test categories all passing**
- - 21+ working examples
+ - 37 working examples

  </td>
  <td width="30%">
@@ -683,11 +975,17 @@ pre-commit run pytest-check
  </table>

  ### 🎯 Test Suite Status
- - **Total Tests**: 544 passing (100%)
+ - **Total Tests**: 571 passing (89%)
  - **Test Categories**: 15/15 at 100%
  - **Integration Tests**: 65 passing
- - **Examples**: 21/21 working
- - **Code Coverage**: Comprehensive
+ - **Examples**: 37/37 working
+ - **Code Coverage**: 89%
+
+ ## ⚠️ Known Issues
+
+ 1. **DateTime Comparison in `list_runs()`**: The `TaskManager.list_runs()` method may encounter timezone comparison errors between timezone-aware and timezone-naive datetime objects. Workaround: wrap calls to `list_runs()` in try/except blocks, or access run details directly via `get_run(run_id)`.
+
+ 2. **Performance Tracking**: To enable performance metrics collection, you must pass the `task_manager` parameter to the `runtime.execute()` method: `runtime.execute(workflow, task_manager=task_manager)`.

  ## 📄 License