kailash 0.2.0__py3-none-any.whl → 0.2.2__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,1614 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: kailash
3
- Version: 0.2.0
4
- Summary: Python SDK for the Kailash container-node architecture
5
- Home-page: https://github.com/integrum/kailash-python-sdk
6
- Author: Integrum
7
- Author-email: Integrum <info@integrum.com>
8
- Project-URL: Homepage, https://github.com/integrum/kailash-python-sdk
9
- Project-URL: Bug Tracker, https://github.com/integrum/kailash-python-sdk/issues
10
- Classifier: Development Status :: 3 - Alpha
11
- Classifier: Intended Audience :: Developers
12
- Classifier: Programming Language :: Python :: 3
13
- Classifier: Programming Language :: Python :: 3.11
14
- Classifier: Programming Language :: Python :: 3.12
15
- Requires-Python: >=3.11
16
- Description-Content-Type: text/markdown
17
- License-File: LICENSE
18
- Requires-Dist: networkx>=2.7
19
- Requires-Dist: pydantic>=1.9
20
- Requires-Dist: matplotlib>=3.5
21
- Requires-Dist: pyyaml>=6.0
22
- Requires-Dist: click>=8.0
23
- Requires-Dist: pytest>=8.3.5
24
- Requires-Dist: mcp[cli]>=1.9.2
25
- Requires-Dist: pandas>=2.2.3
26
- Requires-Dist: numpy>=2.2.5
27
- Requires-Dist: scipy>=1.15.3
28
- Requires-Dist: scikit-learn>=1.6.1
29
- Requires-Dist: requests>=2.32.3
30
- Requires-Dist: pytest-cov>=6.1.1
31
- Requires-Dist: isort>=6.0.1
32
- Requires-Dist: aiohttp>=3.12.4
33
- Requires-Dist: ruff>=0.11.12
34
- Requires-Dist: msal>=1.32.3
35
- Requires-Dist: sphinx>=8.2.3
36
- Requires-Dist: sphinx-rtd-theme>=3.0.2
37
- Requires-Dist: sphinx-copybutton>=0.5.2
38
- Requires-Dist: sphinxcontrib-mermaid>=1.0.0
39
- Requires-Dist: sphinx-autobuild>=2024.10.3
40
- Requires-Dist: autodoc>=0.5.0
41
- Requires-Dist: myst-parser>=4.0.1
42
- Requires-Dist: black>=25.1.0
43
- Requires-Dist: psutil>=7.0.0
44
- Requires-Dist: fastapi>=0.115.12
45
- Requires-Dist: uvicorn[standard]>=0.31.0
46
- Requires-Dist: pytest-asyncio>=1.0.0
47
- Requires-Dist: pre-commit>=4.2.0
48
- Requires-Dist: twine>=6.1.0
49
- Requires-Dist: ollama>=0.5.1
50
- Requires-Dist: sqlalchemy>=2.0.0
51
- Requires-Dist: websockets>=12.0
52
- Requires-Dist: httpx>=0.25.0
53
- Requires-Dist: python-jose>=3.5.0
54
- Requires-Dist: pytest-xdist>=3.6.0
55
- Requires-Dist: pytest-timeout>=2.3.0
56
- Requires-Dist: pytest-split>=0.9.0
57
- Provides-Extra: dev
58
- Requires-Dist: pytest>=7.0; extra == "dev"
59
- Requires-Dist: pytest-cov>=3.0; extra == "dev"
60
- Requires-Dist: black>=22.0; extra == "dev"
61
- Requires-Dist: isort>=5.10; extra == "dev"
62
- Requires-Dist: mypy>=0.9; extra == "dev"
63
- Dynamic: author
64
- Dynamic: home-page
65
- Dynamic: license-file
66
- Dynamic: requires-python
67
-
68
- # Kailash Python SDK
69
-
70
- <p align="center">
71
- <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/v/kailash.svg" alt="PyPI version"></a>
72
- <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/pyversions/kailash.svg" alt="Python versions"></a>
73
- <a href="https://pepy.tech/project/kailash"><img src="https://static.pepy.tech/badge/kailash" alt="Downloads"></a>
74
- <img src="https://img.shields.io/badge/license-MIT-green.svg" alt="MIT License">
75
- <img src="https://img.shields.io/badge/code%20style-black-000000.svg" alt="Code style: black">
76
- <img src="https://img.shields.io/badge/tests-734%20passing-brightgreen.svg" alt="Tests: 734 passing">
77
- <img src="https://img.shields.io/badge/coverage-100%25-brightgreen.svg" alt="Coverage: 100%">
78
- </p>
79
-
80
- <p align="center">
81
- <strong>A Pythonic SDK for the Kailash container-node architecture</strong>
82
- </p>
83
-
84
- <p align="center">
85
- Build workflows that seamlessly integrate with Kailash's production environment while maintaining the flexibility to prototype quickly and iterate locally.
86
- </p>
87
-
88
- ---
89
-
90
- ## ✨ Highlights
91
-
92
- - 🚀 **Rapid Prototyping**: Create and test workflows locally without containerization
93
- - 🏗️ **Architecture-Aligned**: Automatically ensures compliance with Kailash standards
94
- - 🔄 **Seamless Handoff**: Export prototypes directly to production-ready formats
95
- - 📊 **Real-time Monitoring**: Live dashboards with WebSocket streaming and performance metrics
96
- - 🧩 **Extensible**: Easy to create custom nodes for domain-specific operations
97
- - ⚡ **Fast Installation**: Uses `uv` for lightning-fast Python package management
98
- - 🤖 **AI-Powered**: Complete LLM agents, embeddings, and hierarchical RAG architecture
99
- - 🧠 **Retrieval-Augmented Generation**: Full RAG pipeline with intelligent document processing
100
- - 🌐 **REST API Wrapper**: Expose any workflow as a production-ready API in 3 lines
101
- - 🚪 **Multi-Workflow Gateway**: Manage multiple workflows through unified API with MCP integration
102
- - 🤖 **Self-Organizing Agents**: Autonomous agent pools with intelligent team formation and convergence detection
103
- - 🧠 **Agent-to-Agent Communication**: Shared memory pools and intelligent caching for coordinated multi-agent systems
104
- - 🔒 **Production Security**: Comprehensive security framework with path traversal prevention, code sandboxing, and audit logging
105
- - 🎨 **Visual Workflow Builder**: Kailash Workflow Studio - drag-and-drop interface for creating and managing workflows (coming soon)
106
- - 🔁 **Cyclic Workflows (v0.2.0)**: Universal Hybrid Cyclic Graph Architecture with 30,000+ iterations/second performance
107
- - 🛠️ **Developer Tools**: CycleAnalyzer, CycleDebugger, CycleProfiler for production-ready cyclic workflows
108
- - 📈 **High Performance**: Optimized execution engine supporting 100,000+ iteration workflows
109
-
110
- ## 🎯 Who Is This For?
111
-
112
- The Kailash Python SDK is designed for:
113
-
114
- - **AI Business Coaches (ABCs)** who need to prototype workflows quickly
115
- - **Data Scientists** building ML pipelines compatible with production infrastructure
116
- - **Engineers** who want to test Kailash workflows locally before deployment
117
- - **Teams** looking to standardize their workflow development process
118
-
119
- ## 🚀 Quick Start
120
-
121
- ### Installation
122
-
123
- **Requirements:** Python 3.11 or higher
124
-
125
- ```bash
126
- # Install uv if you haven't already
127
- curl -LsSf https://astral.sh/uv/install.sh | sh
128
-
129
- # For users: Install from PyPI
130
- pip install kailash
131
-
132
- # For developers: Clone and sync
133
- git clone https://github.com/integrum/kailash-python-sdk.git
134
- cd kailash-python-sdk
135
- uv sync
136
- ```
137
-
138
- ### Your First Workflow
139
-
140
- ```python
141
- from kailash.workflow import Workflow
142
- from kailash.nodes.data import CSVReaderNode
143
- from kailash.nodes.code import PythonCodeNode
144
- from kailash.runtime.local import LocalRuntime
145
- import pandas as pd
146
-
147
- # Create a workflow
148
- workflow = Workflow("customer_analysis", name="customer_analysis")
149
-
150
- # Add data reader
151
- reader = CSVReaderNode(file_path="customers.csv")
152
- workflow.add_node("read_customers", reader)
153
-
154
- # Add custom processing using Python code
155
- def analyze_customers(data):
156
- """Analyze customer data and compute metrics."""
157
- df = pd.DataFrame(data)
158
- # Convert total_spent to numeric
159
- df['total_spent'] = pd.to_numeric(df['total_spent'])
160
- return {
161
- "result": {
162
- "total_customers": len(df),
163
- "avg_spend": df["total_spent"].mean(),
164
- "top_customers": df.nlargest(10, "total_spent").to_dict("records")
165
- }
166
- }
167
-
168
- analyzer = PythonCodeNode.from_function(analyze_customers, name="analyzer")
169
- workflow.add_node("analyze", analyzer)
170
-
171
- # Connect nodes
172
- workflow.connect("read_customers", "analyze", {"data": "data"})
173
-
174
- # Run locally
175
- runtime = LocalRuntime()
176
- results, run_id = runtime.execute(workflow)
177
- print(f"Analysis complete! Results: {results}")
178
-
179
- # Export for production
180
- from kailash.utils.export import WorkflowExporter
181
- exporter = WorkflowExporter()
182
- workflow.save("customer_analysis.yaml", format="yaml")
183
- ```
184
-
185
- ### SharePoint Integration Example
186
-
187
- ```python
188
- from kailash.workflow import Workflow
189
- from kailash.nodes.data import SharePointGraphReader, CSVWriterNode
190
- import os
191
-
192
- # Create workflow for SharePoint file processing
193
- workflow = Workflow("sharepoint_processor", name="sharepoint_processor")
194
-
195
- # Configure SharePoint reader (using environment variables)
196
- sharepoint = SharePointGraphReader()
197
- workflow.add_node("read_sharepoint", sharepoint)
198
-
199
- # Process downloaded files
200
- csv_writer = CSVWriterNode(file_path="sharepoint_output.csv")
201
- workflow.add_node("save_locally", csv_writer)
202
-
203
- # Connect nodes
204
- workflow.connect("read_sharepoint", "save_locally")
205
-
206
- # Execute with credentials
207
- from kailash.runtime.local import LocalRuntime
208
-
209
- inputs = {
210
- "read_sharepoint": {
211
- "tenant_id": os.getenv("SHAREPOINT_TENANT_ID"),
212
- "client_id": os.getenv("SHAREPOINT_CLIENT_ID"),
213
- "client_secret": os.getenv("SHAREPOINT_CLIENT_SECRET"),
214
- "site_url": "https://yourcompany.sharepoint.com/sites/YourSite",
215
- "operation": "list_files",
216
- "library_name": "Documents"
217
- }
218
- }
219
-
220
- runtime = LocalRuntime()
221
- results, run_id = runtime.execute(workflow, inputs=inputs)
222
- ```
223
-
224
- ### Hierarchical RAG Example
225
-
226
- ```python
227
- from kailash.workflow import Workflow
228
- from kailash.nodes.ai.embedding_generator import EmbeddingGeneratorNode
229
- from kailash.nodes.ai.llm_agent import LLMAgentNode
230
- from kailash.nodes.data.sources import DocumentSourceNode, QuerySourceNode
231
- from kailash.nodes.data.retrieval import RelevanceScorerNode
232
- from kailash.nodes.transform.chunkers import HierarchicalChunkerNode
233
- from kailash.nodes.transform.formatters import (
234
- ChunkTextExtractorNode, QueryTextWrapperNode, ContextFormatterNode
235
- )
236
-
237
- # Create hierarchical RAG workflow
238
- workflow = Workflow("hierarchical_rag", name="Hierarchical RAG Workflow")
239
-
240
- # Data sources (autonomous - no external files needed)
241
- doc_source = DocumentSourceNode()
242
- query_source = QuerySourceNode()
243
-
244
- # Document processing pipeline
245
- chunker = HierarchicalChunkerNode()
246
- chunk_text_extractor = ChunkTextExtractorNode()
247
- query_text_wrapper = QueryTextWrapperNode()
248
-
249
- # AI processing with Ollama
250
- chunk_embedder = EmbeddingGeneratorNode(
251
- provider="ollama", model="nomic-embed-text", operation="embed_batch"
252
- )
253
- query_embedder = EmbeddingGeneratorNode(
254
- provider="ollama", model="nomic-embed-text", operation="embed_batch"
255
- )
256
-
257
- # Retrieval and response generation
258
- relevance_scorer = RelevanceScorerNode()
259
- context_formatter = ContextFormatterNode()
260
- llm_agent = LLMAgentNode(provider="ollama", model="llama3.2", temperature=0.7)
261
-
262
- # Add all nodes to workflow
263
- for name, node in {
264
- "doc_source": doc_source, "query_source": query_source,
265
- "chunker": chunker, "chunk_text_extractor": chunk_text_extractor,
266
- "query_text_wrapper": query_text_wrapper, "chunk_embedder": chunk_embedder,
267
- "query_embedder": query_embedder, "relevance_scorer": relevance_scorer,
268
- "context_formatter": context_formatter, "llm_agent": llm_agent
269
- }.items():
270
- workflow.add_node(name, node)
271
-
272
- # Connect the RAG pipeline
273
- workflow.connect("doc_source", "chunker", {"documents": "documents"})
274
- workflow.connect("chunker", "chunk_text_extractor", {"chunks": "chunks"})
275
- workflow.connect("chunk_text_extractor", "chunk_embedder", {"input_texts": "input_texts"})
276
- workflow.connect("query_source", "query_text_wrapper", {"query": "query"})
277
- workflow.connect("query_text_wrapper", "query_embedder", {"input_texts": "input_texts"})
278
- workflow.connect("chunker", "relevance_scorer", {"chunks": "chunks"})
279
- workflow.connect("query_embedder", "relevance_scorer", {"embeddings": "query_embedding"})
280
- workflow.connect("chunk_embedder", "relevance_scorer", {"embeddings": "chunk_embeddings"})
281
- workflow.connect("relevance_scorer", "context_formatter", {"relevant_chunks": "relevant_chunks"})
282
- workflow.connect("query_source", "context_formatter", {"query": "query"})
283
- workflow.connect("context_formatter", "llm_agent", {"messages": "messages"})
284
-
285
- # Execute the RAG workflow
286
- from kailash.runtime.local import LocalRuntime
287
- runtime = LocalRuntime()
288
- results, run_id = runtime.execute(workflow)
289
-
290
- print("RAG Response:", results["llm_agent"]["response"])
291
- ```
292
-
293
- ### Cyclic Workflows - Iterative Processing with Convergence
294
-
295
- Build workflows that iterate until a condition is met, perfect for optimization, retries, and ML training:
296
-
297
- ```python
298
- from kailash.workflow import Workflow
299
- from kailash.nodes.base_cycle_aware import CycleAwareNode
300
- from kailash.nodes.base import NodeParameter
301
- from kailash.runtime.local import LocalRuntime
302
- from typing import Any, Dict
303
-
304
- # Create a custom cycle-aware node for data quality improvement
305
- class DataQualityImproverNode(CycleAwareNode):
306
- def get_parameters(self) -> Dict[str, NodeParameter]:
307
- return {
308
- "data": NodeParameter(name="data", type=list, required=True),
309
- "target_quality": NodeParameter(name="target_quality", type=float, required=False, default=0.95)
310
- }
311
-
312
- def run(self, context: Dict[str, Any], **kwargs) -> Dict[str, Any]:
313
- """Iteratively improve data quality."""
314
- data = kwargs["data"]
315
- target_quality = kwargs.get("target_quality", 0.95)
316
-
317
- # Get current iteration and previous state
318
- iteration = self.get_iteration(context)
319
- prev_state = self.get_previous_state(context)
320
-
321
- # Calculate current quality
322
- quality = prev_state.get("quality", 0.5) + 0.1 # Improve by 10% each iteration
323
- quality = min(quality, 1.0)
324
-
325
- # Process data (simplified)
326
- processed_data = [item for item in data if item is not None]
327
-
328
- # Track quality history
329
- quality_history = self.accumulate_values(context, "quality_history", quality, max_history=10)
330
-
331
- # Detect convergence trend
332
- trend = self.detect_convergence_trend(context, "quality_history", window_size=5)
333
- converged = quality >= target_quality or (trend and trend["slope"] < 0.01)
334
-
335
- # Log progress
336
- self.log_cycle_info(context, f"Iteration {iteration}: Quality={quality:.2%}")
337
-
338
- # Save state for next iteration
339
- self.set_cycle_state({"quality": quality, "processed_count": len(processed_data)})
340
-
341
- return {
342
- "data": processed_data,
343
- "quality": quality,
344
- "converged": converged,
345
- "iteration": iteration,
346
- "history": quality_history
347
- }
348
-
349
- # Build cyclic workflow
350
- workflow = Workflow("quality_improvement", "Iterative Data Quality")
351
- workflow.add_node("improver", DataQualityImproverNode())
352
-
353
- # Create a cycle - node connects to itself!
354
- workflow.connect("improver", "improver",
355
- mapping={"data": "data"}, # Pass data to next iteration
356
- cycle=True, # This is a cycle
357
- max_iterations=20, # Safety limit
358
- convergence_check="converged == True") # Stop condition
359
-
360
- # Execute with automatic iteration management
361
- runtime = LocalRuntime()
362
- results, run_id = runtime.execute(workflow, parameters={
363
- "improver": {
364
- "data": [1, None, 3, None, 5, 6, None, 8, 9, 10],
365
- "target_quality": 0.9
366
- }
367
- })
368
-
369
- print(f"Converged after {results['improver']['iteration']} iterations")
370
- print(f"Final quality: {results['improver']['quality']:.2%}")
371
- print(f"Quality history: {results['improver']['history']}")
372
- ```
373
-
374
- ### NEW in v0.2.0: CycleBuilder API
375
-
376
- The new CycleBuilder API provides a fluent interface for creating cyclic workflows:
377
-
378
- ```python
379
- # Modern approach with CycleBuilder
380
- (workflow.create_cycle("optimization_loop")
381
-     .connect("gradient", "optimizer")
382
-     .connect("optimizer", "evaluator")
383
-     .connect("evaluator", "gradient")
384
-     .max_iterations(100)
385
-     .converge_when("loss < 0.01")
386
-     .early_stop_when("gradient_norm < 1e-6")
387
-     .checkpoint_every(10)
388
-     .build())
389
-
390
- # Developer tools for production workflows
391
- from kailash.workflow import CycleAnalyzer, CycleDebugger, CycleProfiler
392
-
393
- # Analyze cycle patterns
394
- analyzer = CycleAnalyzer(workflow)
395
- report = analyzer.analyze()
396
- print(f"Found {len(report.cycles)} cycles")
397
- print(f"Max depth: {report.max_cycle_depth}")
398
-
399
- # Debug with breakpoints
400
- debugger = CycleDebugger(workflow)
401
- debugger.set_breakpoint("optimizer", iteration=50)
402
- debugger.set_trace("gradient_norm", lambda x: x < 0.001)
403
-
404
- # Profile performance
405
- profiler = CycleProfiler(workflow)
406
- profile_data = profiler.profile(runtime, parameters)
407
- print(f"Bottleneck: {profile_data.bottleneck_node}")
408
- print(f"Iterations/sec: {profile_data.iterations_per_second}")
409
- ```
410
-
411
- #### Cyclic Workflow Features
412
-
413
- - **Built-in Iteration Management**: No manual loops or recursion needed
414
- - **State Persistence**: Maintain state across iterations with `get_previous_state()` and `set_cycle_state()`
415
- - **Convergence Detection**: Automatic trend analysis with `detect_convergence_trend()`
416
- - **Value Accumulation**: Track metrics over time with `accumulate_values()`
417
- - **Safety Limits**: Max iterations prevent infinite loops
418
- - **Performance**: Optimized execution with ~30,000 iterations/second
419
- - **Developer Tools**: CycleAnalyzer, CycleDebugger, CycleProfiler for production workflows
420
-
421
- Common cyclic patterns include:
422
- - **Retry with Backoff**: ETL pipelines with automatic retry (see the sketch after this list)
423
- - **Optimization Loops**: Iterative parameter tuning
424
- - **ML Training**: Training until accuracy threshold
425
- - **Polling**: API polling with rate limiting
426
- - **Stream Processing**: Windowed data processing
427
-
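- As a concrete illustration of the retry pattern above, here is a minimal sketch that reuses the `cycle=True` connection options from the earlier example. `FlakyFetchNode` is a hypothetical stand-in for any operation that may need retries, and its backoff value is only simulated:
-
- ```python
- from typing import Any, Dict
-
- from kailash.workflow import Workflow
- from kailash.nodes.base import NodeParameter
- from kailash.nodes.base_cycle_aware import CycleAwareNode
- from kailash.runtime.local import LocalRuntime
-
- class FlakyFetchNode(CycleAwareNode):
-     """Hypothetical node standing in for an unreliable API call."""
-
-     def get_parameters(self) -> Dict[str, NodeParameter]:
-         return {"url": NodeParameter(name="url", type=str, required=True)}
-
-     def run(self, context: Dict[str, Any], **kwargs) -> Dict[str, Any]:
-         attempt = self.get_iteration(context)
-         succeeded = attempt >= 2   # pretend the third attempt succeeds
-         backoff = 2 ** attempt     # exponential backoff value (simulated, no real sleep)
-         self.log_cycle_info(context, f"Attempt {attempt}: success={succeeded}, next wait={backoff}s")
-         return {"url": kwargs["url"], "succeeded": succeeded, "backoff_seconds": backoff}
-
- workflow = Workflow("retry_fetch", name="Retry with Backoff")
- workflow.add_node("fetch", FlakyFetchNode())
-
- # Retry the node against itself until it reports success, capped at 5 attempts
- workflow.connect("fetch", "fetch",
-                  mapping={"url": "url"},
-                  cycle=True,
-                  max_iterations=5,
-                  convergence_check="succeeded == True")
-
- runtime = LocalRuntime()
- results, run_id = runtime.execute(workflow, parameters={
-     "fetch": {"url": "https://example.com/data"}
- })
- print(results["fetch"])
- ```
-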
428
- ### Workflow API Wrapper - Expose Workflows as REST APIs
429
-
430
- Transform any Kailash workflow into a production-ready REST API in just 3 lines of code:
431
-
432
- ```python
433
- from kailash.api.workflow_api import WorkflowAPI
434
-
435
- # Take any workflow and expose it as an API
436
- api = WorkflowAPI(workflow)
437
- api.run(port=8000) # That's it! Your workflow is now a REST API
438
- ```
439
-
440
- #### Features
441
-
442
- - **Automatic REST Endpoints**:
443
- - `POST /execute` - Execute workflow with inputs
444
- - `GET /workflow/info` - Get workflow metadata
445
- - `GET /health` - Health check endpoint
446
- - Automatic OpenAPI docs at `/docs`
447
-
448
- - **Multiple Execution Modes** (a Python client sketch follows this list):
449
- ```python
450
- # Synchronous execution (wait for results)
451
- curl -X POST http://localhost:8000/execute \
452
- -d '{"inputs": {...}, "mode": "sync"}'
453
-
454
- # Asynchronous execution (get execution ID)
455
- curl -X POST http://localhost:8000/execute \
456
- -d '{"inputs": {...}, "mode": "async"}'
457
-
458
- # Check async status
459
- curl http://localhost:8000/status/{execution_id}
460
- ```
461
-
462
- - **Specialized APIs** for specific domains:
463
- ```python
464
- from kailash.api.workflow_api import create_workflow_api
465
-
466
- # Create a RAG-specific API with custom endpoints
467
- api = create_workflow_api(rag_workflow, api_type="rag")
468
- # Adds /documents and /query endpoints
469
- ```
470
-
471
- - **Production Ready**:
472
- ```python
473
- # Development
474
- api.run(reload=True, log_level="debug")
475
-
476
- # Production with SSL
477
- api.run(
478
- host="0.0.0.0",
479
- port=443,
480
- ssl_keyfile="key.pem",
481
- ssl_certfile="cert.pem",
482
- workers=4
483
- )
484
- ```
485
-
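- For reference, here is a minimal Python client sketch against the endpoints and execution modes listed above. It assumes the request and response shapes implied by the curl examples; in particular, the `execution_id` and `status` field names in the async flow are assumptions.
-
- ```python
- import time
-
- import requests
-
- BASE_URL = "http://localhost:8000"
-
- # Synchronous execution: the response body carries the workflow results directly
- sync_resp = requests.post(
-     f"{BASE_URL}/execute",
-     json={"inputs": {"read_customers": {}}, "mode": "sync"},
- )
- print(sync_resp.json())
-
- # Asynchronous execution: submit, then poll the status endpoint
- job = requests.post(f"{BASE_URL}/execute", json={"inputs": {}, "mode": "async"}).json()
- execution_id = job["execution_id"]  # assumed field name
-
- while True:
-     status = requests.get(f"{BASE_URL}/status/{execution_id}").json()
-     if status.get("status") in {"completed", "failed"}:  # assumed status values
-         break
-     time.sleep(1)
-
- print(status)
- ```
-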
486
- See the [API demo example](examples/integration_examples/integration_api_demo.py) for complete usage patterns.
487
-
488
- ### Multi-Workflow API Gateway - Manage Multiple Workflows
489
-
490
- Run multiple workflows through a single unified API gateway with dynamic routing and MCP integration:
491
-
492
- ```python
493
- from kailash.api.gateway import WorkflowAPIGateway
494
- from kailash.api.mcp_integration import MCPIntegration
495
-
496
- # Create gateway
497
- gateway = WorkflowAPIGateway(
498
- title="Enterprise Platform",
499
- description="Unified API for all workflows"
500
- )
501
-
502
- # Register multiple workflows
503
- gateway.register_workflow("sales", sales_workflow)
504
- gateway.register_workflow("analytics", analytics_workflow)
505
- gateway.register_workflow("reports", reporting_workflow)
506
-
507
- # Add AI-powered tools via MCP
508
- mcp = MCPIntegration("ai_tools")
509
- mcp.add_tool("analyze", analyze_function)
510
- mcp.add_tool("predict", predict_function)
511
- gateway.register_mcp_server("ai", mcp)
512
-
513
- # Run unified server
514
- gateway.run(port=8000)
515
- ```
516
-
517
- #### Gateway Features
518
-
519
- - **Unified Access Point**: All workflows accessible through one server (a client sketch follows this list)
520
- - `/sales/execute` - Execute sales workflow
521
- - `/analytics/execute` - Execute analytics workflow
522
- - `/workflows` - List all available workflows
523
- - `/health` - Check health of all services
524
-
525
- - **MCP Integration**: AI-powered tools available to all workflows
526
- ```python
527
- # Use MCP tools in workflows
528
- from kailash.api.mcp_integration import MCPToolNode
529
-
530
- tool_node = MCPToolNode(
531
- mcp_server="ai_tools",
532
- tool_name="analyze"
533
- )
534
- workflow.add_node("ai_analysis", tool_node)
535
- ```
536
-
537
- - **Flexible Deployment Patterns**:
538
- ```python
539
- # Pattern 1: Single Gateway (most cases)
540
- gateway.register_workflow("workflow1", wf1)
541
- gateway.register_workflow("workflow2", wf2)
542
-
543
- # Pattern 2: Hybrid (heavy workflows separate)
544
- gateway.register_workflow("light", light_wf)
545
- gateway.proxy_workflow("heavy", "http://gpu-service:8080")
546
-
547
- # Pattern 3: High Availability
548
- # Run multiple gateway instances behind load balancer
549
-
550
- # Pattern 4: Kubernetes
551
- # Deploy with horizontal pod autoscaling
552
- ```
553
-
554
- - **Production Features**:
555
- - WebSocket support for real-time updates
556
- - Health monitoring across all workflows
557
- - Dynamic workflow registration/unregistration
558
- - Built-in CORS and authentication support
559
-
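- A minimal client sketch against the gateway routes listed above (the response payload shapes are assumptions):
-
- ```python
- import requests
-
- BASE_URL = "http://localhost:8000"
-
- # Discover registered workflows and check overall health
- print(requests.get(f"{BASE_URL}/workflows").json())
- print(requests.get(f"{BASE_URL}/health").json())
-
- # Execute the sales workflow through its prefixed route
- result = requests.post(f"{BASE_URL}/sales/execute", json={"inputs": {}})
- print(result.json())
- ```
-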
560
- See the [Gateway examples](examples/integration_examples/gateway_comprehensive_demo.py) for complete implementation patterns.
561
-
562
- ### Self-Organizing Agent Pools - Autonomous Multi-Agent Systems
563
-
564
- Build intelligent agent systems that can autonomously form teams, share information, and solve complex problems collaboratively:
565
-
566
- ```python
567
- from kailash import Workflow
568
- from kailash.runtime import LocalRuntime
569
- from kailash.nodes.ai.intelligent_agent_orchestrator import (
570
- OrchestrationManagerNode,
571
- IntelligentCacheNode,
572
- ConvergenceDetectorNode
573
- )
574
- from kailash.nodes.ai.self_organizing import (
575
- AgentPoolManagerNode,
576
- TeamFormationNode,
577
- ProblemAnalyzerNode
578
- )
579
- from kailash.nodes.ai.a2a import SharedMemoryPoolNode, A2AAgentNode
580
-
581
- # Create self-organizing agent workflow
582
- workflow = Workflow("self_organizing_research")
583
-
584
- # Shared infrastructure
585
- memory_pool = SharedMemoryPoolNode(
586
- memory_size_limit=1000,
587
- attention_window=50
588
- )
589
- workflow.add_node("memory", memory_pool)
590
-
591
- # Intelligent caching to prevent redundant operations
592
- cache = IntelligentCacheNode(
593
- ttl=3600, # 1 hour cache
594
- similarity_threshold=0.8,
595
- max_entries=1000
596
- )
597
- workflow.add_node("cache", cache)
598
-
599
- # Problem analysis and team formation
600
- problem_analyzer = ProblemAnalyzerNode()
601
- team_former = TeamFormationNode(
602
- formation_strategy="capability_matching",
603
- optimization_rounds=3
604
- )
605
- workflow.add_node("analyzer", problem_analyzer)
606
- workflow.add_node("team_former", team_former)
607
-
608
- # Self-organizing agent pool
609
- pool_manager = AgentPoolManagerNode(
610
- max_active_agents=20,
611
- agent_timeout=120
612
- )
613
- workflow.add_node("pool", pool_manager)
614
-
615
- # Convergence detection for stopping criteria
616
- convergence = ConvergenceDetectorNode(
617
- quality_threshold=0.85,
618
- improvement_threshold=0.02,
619
- max_iterations=10
620
- )
621
- workflow.add_node("convergence", convergence)
622
-
623
- # Orchestration manager coordinates the entire system
624
- orchestrator = OrchestrationManagerNode(
625
- max_iterations=10,
626
- quality_threshold=0.85,
627
- parallel_execution=True
628
- )
629
- workflow.add_node("orchestrator", orchestrator)
630
-
631
- # Execute with complex business problem
632
- runtime = LocalRuntime()
633
- result, _ = runtime.execute(workflow, parameters={
634
- "orchestrator": {
635
- "query": "Analyze market trends and develop growth strategy for fintech",
636
- "agent_pool_size": 12,
637
- "mcp_servers": [
638
- {"name": "market_data", "command": "python", "args": ["-m", "market_mcp"]},
639
- {"name": "financial", "command": "python", "args": ["-m", "finance_mcp"]},
640
- {"name": "research", "command": "python", "args": ["-m", "research_mcp"]}
641
- ],
642
- "context": {
643
- "domain": "fintech",
644
- "depth": "comprehensive",
645
- "output_format": "strategic_report"
646
- }
647
- }
648
- })
649
-
650
- print(f"Solution Quality: {result['orchestrator']['quality_score']:.2%}")
651
- print(f"Agents Used: {result['orchestrator']['agents_deployed']}")
652
- print(f"Iterations: {result['orchestrator']['iterations_completed']}")
653
- print(f"Final Strategy: {result['orchestrator']['final_solution']['strategy']}")
654
- ```
655
-
656
- #### Key Self-Organizing Features
657
-
658
- - **Autonomous Team Formation**: Agents automatically form optimal teams based on:
659
- - Capability matching for skill-specific tasks
660
- - Swarm-based formation for exploration
661
- - Market-based allocation for resource constraints
662
- - Hierarchical organization for complex problems
663
-
664
- - **Intelligent Information Sharing**:
665
- - **SharedMemoryPoolNode**: Selective attention mechanisms for relevant information
666
- - **IntelligentCacheNode**: Semantic similarity detection prevents redundant operations
667
- - **A2AAgentNode**: Direct agent-to-agent communication with context awareness
668
-
669
- - **Convergence Detection**: Automatic termination when:
670
- - Solution quality exceeds threshold (e.g., 85% confidence)
671
- - Improvement rate drops below minimum (e.g., <2% per iteration)
672
- - Maximum iterations reached
673
- - Time limits exceeded
674
-
675
- - **MCP Integration**: Agents can access external tools and data sources:
676
- - File systems, databases, APIs
677
- - Web scraping and research tools
678
- - Specialized domain knowledge bases
679
- - Real-time data streams
680
-
681
- - **Performance Optimization**:
682
- - Multi-level caching strategies
683
- - Parallel agent execution
684
- - Resource management and monitoring
685
- - Cost tracking for API usage
686
-
687
- See the [Self-Organizing Agents examples](examples/integration_examples/) for complete implementation patterns and the [Agent Systems Guide](docs/guides/self_organizing_agents.rst) for detailed documentation.
688
-
689
- ### Zero-Code MCP Ecosystem - Visual Workflow Builder
690
-
691
- Build and deploy workflows through an interactive web interface without writing any code:
692
-
693
- ```python
694
- from kailash.api.gateway import WorkflowAPIGateway
695
- from kailash.api.mcp_integration import MCPServerRegistry
696
-
697
- # Run the MCP ecosystem demo
698
- # cd examples/integration_examples
699
- # ./run_ecosystem.sh
700
-
701
- # Or run the demo script directly from a shell:
702
- # python examples/integration_examples/mcp_ecosystem_demo.py
703
- ```
704
-
705
- #### Features
706
-
707
- - **Drag-and-Drop Builder**: Visual interface for creating workflows
708
- - Drag nodes from palette (CSV Reader, Python Code, JSON Writer, etc.)
709
- - Drop onto canvas to build workflows
710
- - Deploy with one click
711
-
712
- - **Live Dashboard**: Real-time monitoring and statistics
713
- - Connected MCP server status
714
- - Running workflow count
715
- - Execution logs with timestamps
716
-
717
- - **Pre-built Templates**: One-click deployment
718
- - GitHub → Slack Notifier
719
- - Data Processing Pipeline (CSV → Transform → JSON)
720
- - AI Research Assistant
721
-
722
- - **Technology Stack**: Lightweight and fast
723
- - Backend: FastAPI + Kailash SDK
724
- - Frontend: Vanilla HTML/CSS/JavaScript (no frameworks)
725
- - Zero build process required
726
-
727
- See the [MCP Ecosystem example](examples/integration_examples/) for the complete zero-code workflow deployment platform.
728
-
729
- ## 📚 Documentation
730
-
731
- | Resource | Description |
732
- |----------|-------------|
733
- | 📖 [User Guide](docs/user-guide.md) | Comprehensive guide for using the SDK |
734
- | 📋 [API Reference](docs/) | Detailed API documentation |
735
- | 🌐 [API Integration Guide](examples/API_INTEGRATION_README.md) | Complete API integration documentation |
736
- | 🎓 [Examples](examples/) | Working examples and tutorials |
737
- | 🤝 [Contributing](CONTRIBUTING.md) | Contribution guidelines |
738
-
739
- ## 🛠️ Features
740
-
741
- ### 📦 Pre-built Nodes
742
-
743
- The SDK includes a rich set of pre-built nodes for common operations:
744
-
745
- <table>
746
- <tr>
747
- <td width="50%">
748
-
749
- **Data Operations**
750
- - `CSVReaderNode` - Read CSV files
751
- - `JSONReaderNode` - Read JSON files
752
- - `DocumentSourceNode` - Sample document provider
753
- - `QuerySourceNode` - Sample query provider
754
- - `RelevanceScorerNode` - Multi-method similarity
755
- - `SQLDatabaseNode` - Query databases
756
- - `CSVWriterNode` - Write CSV files
757
- - `JSONWriterNode` - Write JSON files
758
-
759
- </td>
760
- <td width="50%">
761
-
762
- **Transform Nodes**
763
- - `PythonCodeNode` - Custom Python logic
764
- - `DataTransformer` - Transform data
765
- - `HierarchicalChunkerNode` - Document chunking
766
- - `ChunkTextExtractorNode` - Extract chunk text
767
- - `QueryTextWrapperNode` - Wrap queries for processing
768
- - `ContextFormatterNode` - Format LLM context
769
- - `Filter` - Filter records
770
- - `Aggregator` - Aggregate data
771
-
772
- **Logic Nodes**
773
- - `SwitchNode` - Conditional routing
774
- - `MergeNode` - Combine multiple inputs
775
- - `WorkflowNode` - Wrap workflows as reusable nodes
776
-
777
- </td>
778
- </tr>
779
- <tr>
780
- <td width="50%">
781
-
782
- **AI/ML Nodes**
783
- - `LLMAgentNode` - Multi-provider LLM with memory & tools
784
- - `EmbeddingGeneratorNode` - Vector embeddings with caching
785
- - `MCPClient/MCPServer` - Model Context Protocol
786
- - `TextClassifier` - Text classification
787
- - `SentimentAnalyzer` - Sentiment analysis
788
- - `NamedEntityRecognizer` - NER extraction
789
-
790
- **Self-Organizing Agent Nodes**
791
- - `SharedMemoryPoolNode` - Agent memory sharing
792
- - `A2AAgentNode` - Agent-to-agent communication
793
- - `A2ACoordinatorNode` - Multi-agent coordination
794
- - `IntelligentCacheNode` - Semantic caching system
795
- - `MCPAgentNode` - MCP-enabled agents
796
- - `QueryAnalysisNode` - Query complexity analysis
797
- - `OrchestrationManagerNode` - System orchestration
798
- - `ConvergenceDetectorNode` - Solution convergence
799
- - `AgentPoolManagerNode` - Agent pool management
800
- - `ProblemAnalyzerNode` - Problem decomposition
801
- - `TeamFormationNode` - Optimal team creation
802
- - `SolutionEvaluatorNode` - Multi-criteria evaluation
803
- - `SelfOrganizingAgentNode` - Adaptive individual agents
804
-
805
- </td>
806
- <td width="50%">
807
-
808
- **API Integration Nodes**
809
- - `HTTPRequestNode` - HTTP requests
810
- - `RESTAPINode` - REST API client
811
- - `GraphQLClientNode` - GraphQL queries
812
- - `OAuth2AuthNode` - OAuth 2.0 authentication
813
- - `RateLimitedAPINode` - Rate-limited API calls
814
-
815
- **Other Integration Nodes**
816
- - `KafkaConsumerNode` - Kafka streaming
817
- - `WebSocketNode` - WebSocket connections
818
- - `EmailNode` - Send emails
819
-
820
- **SharePoint Integration**
821
- - `SharePointGraphReader` - Read SharePoint files
822
- - `SharePointGraphWriter` - Upload to SharePoint
823
-
824
- **Real-time Monitoring**
825
- - `RealTimeDashboard` - Live workflow monitoring
826
- - `WorkflowPerformanceReporter` - Comprehensive reports
827
- - `SimpleDashboardAPI` - REST API for metrics
828
- - `DashboardAPIServer` - WebSocket streaming server
829
-
830
- </td>
831
- </tr>
832
- </table>
833
-
834
- ### 🔧 Core Capabilities
835
-
836
- #### Workflow Management
837
- ```python
838
- from kailash.workflow import Workflow
839
- from kailash.nodes.logic import SwitchNode
840
- from kailash.nodes.transform import DataTransformer
841
-
842
- # Create complex workflows with branching logic
843
- workflow = Workflow("data_pipeline", name="data_pipeline")
844
-
845
- # Add conditional branching with SwitchNode
846
- switch = SwitchNode()
847
- workflow.add_node("route", switch)
848
-
849
- # Different paths based on validation
850
- processor_a = DataTransformer(transformations=["lambda x: x"])
851
- error_handler = DataTransformer(transformations=["lambda x: {'error': str(x)}"])
852
- workflow.add_node("process_valid", processor_a)
853
- workflow.add_node("handle_errors", error_handler)
854
-
855
- # Connect with switch routing
856
- workflow.connect("route", "process_valid")
857
- workflow.connect("route", "handle_errors")
858
- ```
859
-
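- To bring the two branches back together, append the `MergeNode` listed under Logic Nodes. A minimal sketch, assuming its default constructor is sufficient:
-
- ```python
- from kailash.nodes.logic import MergeNode
-
- # Recombine the validated and error-handling branches into a single downstream stream
- merge = MergeNode()
- workflow.add_node("merge_results", merge)
- workflow.connect("process_valid", "merge_results")
- workflow.connect("handle_errors", "merge_results")
- ```
-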
860
- #### Hierarchical Workflow Composition
861
- ```python
862
- from kailash.workflow import Workflow
863
- from kailash.nodes.logic import WorkflowNode
864
- from kailash.runtime.local import LocalRuntime
865
-
866
- # Create a reusable data processing workflow
867
- inner_workflow = Workflow("data_processor", name="Data Processor")
868
- # ... add nodes to inner workflow ...
869
-
870
- # Wrap the workflow as a node
871
- processor_node = WorkflowNode(
872
- workflow=inner_workflow,
873
- name="data_processor"
874
- )
875
-
876
- # Use in a larger workflow
877
- main_workflow = Workflow("main", name="Main Pipeline")
878
- main_workflow.add_node("process", processor_node)
879
- main_workflow.add_node("analyze", analyzer_node)
880
-
881
- # Connect workflows
882
- main_workflow.connect("process", "analyze")
883
-
884
- # Execute - parameters automatically mapped to inner workflow
885
- runtime = LocalRuntime()
886
- results, _ = runtime.execute(main_workflow)
887
- ```
888
-
889
- #### Immutable State Management
890
- ```python
891
- from kailash.workflow import Workflow
892
- from kailash.workflow.state import WorkflowStateWrapper
893
- from pydantic import BaseModel
894
-
895
- # Define state model
896
- class MyStateModel(BaseModel):
897
- counter: int = 0
898
- status: str = "pending"
899
- nested: dict = {}
900
-
901
- # Create workflow
902
- workflow = Workflow("state_workflow", name="state_workflow")
903
-
904
- # Create and wrap state object
905
- state = MyStateModel()
906
- state_wrapper = workflow.create_state_wrapper(state)
907
-
908
- # Single path-based update
909
- updated_wrapper = state_wrapper.update_in(
910
- ["counter"],
911
- 42
912
- )
913
-
914
- # Batch update multiple fields atomically
915
- updated_wrapper = state_wrapper.batch_update([
916
- (["counter"], 10),
917
- (["status"], "processing")
918
- ])
919
-
920
- # Access the updated state
921
- print(f"Updated counter: {updated_wrapper._state.counter}")
922
- print(f"Updated status: {updated_wrapper._state.status}")
923
- ```
924
-
925
- #### Task Tracking
926
- ```python
927
- from kailash.tracking import TaskManager
928
-
929
- # Initialize task manager
930
- task_manager = TaskManager()
931
-
932
- # Create a sample workflow
933
- from kailash.workflow import Workflow
934
- workflow = Workflow("sample_workflow", name="Sample Workflow")
935
-
936
- # Run workflow with tracking
937
- from kailash.runtime.local import LocalRuntime
938
- runtime = LocalRuntime()
939
- results, run_id = runtime.execute(workflow)
940
-
941
- # Query execution history
942
- # Note: list_runs() may fail with timezone comparison errors in some cases
943
- try:
944
- # List all runs
945
- all_runs = task_manager.list_runs()
946
-
947
- # Filter by status
948
- completed_runs = task_manager.list_runs(status="completed")
949
- failed_runs = task_manager.list_runs(status="failed")
950
-
951
- # Filter by workflow name
952
- workflow_runs = task_manager.list_runs(workflow_name="sample_workflow")
953
-
954
- # Process run information
955
- for run in completed_runs[:5]: # First 5 runs
956
- print(f"Run {run.run_id[:8]}: {run.workflow_name} - {run.status}")
957
-
958
- except Exception as e:
959
- print(f"Error listing runs: {e}")
960
- # Fallback: Access run details directly if available
961
- if hasattr(task_manager, 'storage'):
962
- run = task_manager.get_run(run_id)
963
- ```
964
-
965
- #### Local Testing
966
- ```python
967
- from kailash.runtime.local import LocalRuntime
968
- from kailash.workflow import Workflow
969
-
970
- # Create a test workflow
971
- workflow = Workflow("test_workflow", name="test_workflow")
972
-
973
- # Create test runtime with debugging enabled
974
- runtime = LocalRuntime(debug=True)
975
-
976
- # Execute with test data
977
- results, run_id = runtime.execute(workflow)
978
-
979
- # Validate results
980
- assert isinstance(results, dict)
981
- ```
982
-
983
- #### Performance Monitoring & Real-time Dashboards
984
- ```python
985
- from kailash.visualization.performance import PerformanceVisualizer
986
- from kailash.visualization.dashboard import RealTimeDashboard, DashboardConfig
987
- from kailash.visualization.reports import WorkflowPerformanceReporter, ReportFormat
988
- from kailash.tracking import TaskManager
989
- from kailash.runtime.local import LocalRuntime
990
- from kailash.workflow import Workflow
991
- from kailash.nodes.transform import DataTransformer
992
-
993
- # Create a workflow to monitor
994
- workflow = Workflow("monitored_workflow", name="monitored_workflow")
995
- node = DataTransformer(transformations=["lambda x: x"])
996
- workflow.add_node("transform", node)
997
-
998
- # Run workflow with task tracking
999
- # Note: Pass task_manager to execute() to enable performance tracking
1000
- task_manager = TaskManager()
1001
- runtime = LocalRuntime()
1002
- results, run_id = runtime.execute(workflow, task_manager=task_manager)
1003
-
1004
- # Static performance analysis
1005
- from pathlib import Path
1006
- perf_viz = PerformanceVisualizer(task_manager)
1007
- outputs = perf_viz.create_run_performance_summary(run_id, output_dir=Path("performance_report"))
1008
-
1009
- # Real-time monitoring dashboard
1010
- config = DashboardConfig(
1011
- update_interval=1.0,
1012
- max_history_points=100,
1013
- auto_refresh=True,
1014
- theme="light"
1015
- )
1016
-
1017
- dashboard = RealTimeDashboard(task_manager, config)
1018
- dashboard.start_monitoring(run_id)
1019
-
1020
- # Add real-time callbacks
1021
- def on_metrics_update(metrics):
1022
- print(f"Tasks: {metrics.completed_tasks} completed, {metrics.active_tasks} active")
1023
-
1024
- dashboard.add_metrics_callback(on_metrics_update)
1025
-
1026
- # Generate live HTML dashboard
1027
- dashboard.generate_live_report("live_dashboard.html", include_charts=True)
1028
- dashboard.stop_monitoring()
1029
-
1030
- # Comprehensive performance reports
1031
- reporter = WorkflowPerformanceReporter(task_manager)
1032
- report_path = reporter.generate_report(
1033
- run_id,
1034
- output_path="workflow_report.html",
1035
- format=ReportFormat.HTML
1036
- )
1037
- ```
1038
-
1039
- **Real-time Dashboard Features**:
1040
- - ⚡ **Live Metrics Streaming**: Real-time task progress and resource monitoring
1041
- - 📊 **Interactive Charts**: CPU, memory, and throughput visualizations with Chart.js
1042
- - 🔌 **API Endpoints**: REST and WebSocket APIs for custom integrations
1043
- - 📈 **Performance Reports**: Multi-format reports (HTML, Markdown, JSON) with insights
1044
- - 🎯 **Bottleneck Detection**: Automatic identification of performance issues
1045
- - 📱 **Responsive Design**: Mobile-friendly dashboards with auto-refresh
1046
-
1047
- **Performance Metrics Collected**:
1048
- - **Execution Timeline**: Gantt charts showing node execution order and duration
1049
- - **Resource Usage**: Real-time CPU and memory consumption
1050
- - **I/O Analysis**: Read/write operations and data transfer volumes
1051
- - **Performance Heatmaps**: Identify bottlenecks across workflow runs
1052
- - **Throughput Metrics**: Tasks per minute and completion rates
1053
- - **Error Tracking**: Failed task analysis and error patterns
1054
-
1055
- #### API Integration
1056
- ```python
1057
- from kailash.nodes.api import (
1058
- HTTPRequestNode as RESTAPINode,
1059
- # OAuth2AuthNode,
1060
- # RateLimitedAPINode,
1061
- # RateLimitConfig
1062
- )
1063
-
1064
- # OAuth 2.0 authentication
1065
- # auth_node = OAuth2AuthNode(
1066
- # client_id="your_client_id",
1067
- # client_secret="your_client_secret",
1068
- # token_url="https://api.example.com/oauth/token"
1069
- # )
1070
-
1071
- # Rate-limited API client
1072
- rate_config = None # RateLimitConfig(
1073
- # max_requests=100,
1074
- # time_window=60.0,
1075
- # strategy="token_bucket"
1076
- # )
1077
-
1078
- api_client = RESTAPINode(
1079
- base_url="https://api.example.com"
1080
- # auth_node=auth_node
1081
- )
1082
-
1083
- # rate_limited_client = RateLimitedAPINode(
1084
- # wrapped_node=api_client,
1085
- # rate_limit_config=rate_config
1086
- # )
1087
- ```
1088
-
1089
- #### Export Formats
1090
- ```python
1091
- from kailash.utils.export import WorkflowExporter, ExportConfig
1092
- from kailash.workflow import Workflow
1093
- from kailash.nodes.transform import DataTransformer
1094
-
1095
- # Create a workflow to export
1096
- workflow = Workflow("export_example", name="export_example")
1097
- node = DataTransformer(transformations=["lambda x: x"])
1098
- workflow.add_node("transform", node)
1099
-
1100
- exporter = WorkflowExporter()
1101
-
1102
- # Export to different formats
1103
- workflow.save("workflow.yaml", format="yaml") # Kailash YAML format
1104
- workflow.save("workflow.json", format="json") # JSON representation
1105
-
1106
- # Export with custom configuration
1107
- config = ExportConfig(
1108
- include_metadata=True,
1109
- container_tag="latest"
1110
- )
1111
- workflow.save("deployment.yaml")
1112
- ```
1113
-
1114
- ### 🎨 Visualization
1115
-
1116
- ```python
1117
- from kailash.workflow import Workflow
1118
- from kailash.workflow.visualization import WorkflowVisualizer
1119
- from kailash.nodes.transform import DataTransformer
1120
-
1121
- # Create a workflow to visualize
1122
- workflow = Workflow("viz_example", name="viz_example")
1123
- node = DataTransformer(transformations=["lambda x: x"])
1124
- workflow.add_node("transform", node)
1125
-
1126
- # Generate Mermaid diagram (recommended for documentation)
1127
- mermaid_code = workflow.to_mermaid()
1128
- print(mermaid_code)
1129
-
1130
- # Save as Mermaid markdown file
1131
- with open("workflow.md", "w") as f:
1132
- f.write(workflow.to_mermaid_markdown(title="My Workflow"))
1133
-
1134
- # Or use matplotlib visualization
1135
- visualizer = WorkflowVisualizer(workflow)
1136
- visualizer.visualize()
1137
- visualizer.save("workflow.png", dpi=300) # Save as PNG
1138
- ```
1139
-
1140
- #### Hierarchical RAG (Retrieval-Augmented Generation)
1141
- ```python
1142
- from kailash.workflow import Workflow
1143
- from kailash.nodes.data.sources import DocumentSourceNode, QuerySourceNode
1144
- from kailash.nodes.data.retrieval import RelevanceScorerNode
1145
- from kailash.nodes.transform.chunkers import HierarchicalChunkerNode
1146
- from kailash.nodes.transform.formatters import (
1147
- ChunkTextExtractorNode,
1148
- QueryTextWrapperNode,
1149
- ContextFormatterNode,
1150
- )
1151
- from kailash.nodes.ai.llm_agent import LLMAgentNode
1152
- from kailash.nodes.ai.embedding_generator import EmbeddingGeneratorNode
1153
-
1154
- # Create hierarchical RAG workflow
1155
- workflow = Workflow(
1156
- workflow_id="hierarchical_rag_example",
1157
- name="Hierarchical RAG Workflow",
1158
- description="Complete RAG pipeline with embedding-based retrieval",
1159
- version="1.0.0"
1160
- )
1161
-
1162
- # Create data source nodes
1163
- doc_source = DocumentSourceNode()
1164
- query_source = QuerySourceNode()
1165
-
1166
- # Create document processing pipeline
1167
- chunker = HierarchicalChunkerNode()
1168
- chunk_text_extractor = ChunkTextExtractorNode()
1169
- query_text_wrapper = QueryTextWrapperNode()
1170
-
1171
- # Create embedding generators
1172
- chunk_embedder = EmbeddingGeneratorNode(
1173
- provider="ollama",
1174
- model="nomic-embed-text",
1175
- operation="embed_batch"
1176
- )
1177
-
1178
- query_embedder = EmbeddingGeneratorNode(
1179
- provider="ollama",
1180
- model="nomic-embed-text",
1181
- operation="embed_batch"
1182
- )
1183
-
1184
- # Create retrieval and formatting nodes
1185
- relevance_scorer = RelevanceScorerNode(similarity_method="cosine")
1186
- context_formatter = ContextFormatterNode()
1187
-
1188
- # Create LLM agent for final answer generation
1189
- llm_agent = LLMAgentNode(
1190
- provider="ollama",
1191
- model="llama3.2",
1192
- temperature=0.7,
1193
- max_tokens=500
1194
- )
1195
-
1196
- # Add all nodes to workflow
1197
- for node_id, node in [
1198
- ("doc_source", doc_source),
1199
- ("chunker", chunker),
1200
- ("query_source", query_source),
1201
- ("chunk_text_extractor", chunk_text_extractor),
1202
- ("query_text_wrapper", query_text_wrapper),
1203
- ("chunk_embedder", chunk_embedder),
1204
- ("query_embedder", query_embedder),
1205
- ("relevance_scorer", relevance_scorer),
1206
- ("context_formatter", context_formatter),
1207
- ("llm_agent", llm_agent)
1208
- ]:
1209
- workflow.add_node(node_id, node)
1210
-
1211
- # Connect the workflow pipeline
1212
- # Document processing: docs → chunks → text → embeddings
1213
- workflow.connect("doc_source", "chunker", {"documents": "documents"})
1214
- workflow.connect("chunker", "chunk_text_extractor", {"chunks": "chunks"})
1215
- workflow.connect("chunk_text_extractor", "chunk_embedder", {"input_texts": "input_texts"})
1216
-
1217
- # Query processing: query → text wrapper → embeddings
1218
- workflow.connect("query_source", "query_text_wrapper", {"query": "query"})
1219
- workflow.connect("query_text_wrapper", "query_embedder", {"input_texts": "input_texts"})
1220
-
1221
- # Relevance scoring: chunks + embeddings → scored chunks
1222
- workflow.connect("chunker", "relevance_scorer", {"chunks": "chunks"})
1223
- workflow.connect("query_embedder", "relevance_scorer", {"embeddings": "query_embedding"})
1224
- workflow.connect("chunk_embedder", "relevance_scorer", {"embeddings": "chunk_embeddings"})
1225
-
1226
- # Context formatting: relevant chunks + query → formatted context
1227
- workflow.connect("relevance_scorer", "context_formatter", {"relevant_chunks": "relevant_chunks"})
1228
- workflow.connect("query_source", "context_formatter", {"query": "query"})
1229
-
1230
- # Final answer generation: formatted context → LLM response
1231
- workflow.connect("context_formatter", "llm_agent", {"messages": "messages"})
1232
-
1233
- # Execute workflow
1234
- results, run_id = workflow.run()
1235
-
1236
- # Access results
1237
- print("🎯 Top Relevant Chunks:")
1238
- for chunk in results["relevance_scorer"]["relevant_chunks"]:
1239
- print(f" - {chunk['document_title']}: {chunk['relevance_score']:.3f}")
1240
-
1241
- print("\n🤖 Final Answer:")
1242
- print(results["llm_agent"]["response"]["content"])
1243
- ```
1244
-
1245
- This example demonstrates:
1246
- - **Document chunking** with hierarchical structure
1247
- - **Vector embeddings** using Ollama's nomic-embed-text model
1248
- - **Semantic similarity** scoring with cosine similarity
1249
- - **Context formatting** for LLM input
1250
- - **Answer generation** using Ollama's llama3.2 model
1251
-
1252
- ### 🔒 Access Control and Security
1253
-
1254
- The Kailash SDK provides comprehensive access control and security features for enterprise deployments:
1255
-
1256
- #### Role-Based Access Control (RBAC)
1257
- ```python
1258
- from kailash.access_control import UserContext, PermissionRule, NodePermission
1259
- from kailash.runtime.access_controlled import AccessControlledRuntime
1260
-
1261
- # Define user with roles
1262
- user = UserContext(
1263
- user_id="analyst_001",
1264
- tenant_id="company_abc",
1265
- email="analyst@company.com",
1266
- roles=["analyst", "viewer"]
1267
- )
1268
-
1269
- # Create secure runtime
1270
- runtime = AccessControlledRuntime(user_context=user)
1271
-
1272
- # Execute workflow with automatic permission checks
1273
- results, run_id = runtime.execute(workflow, parameters={})
1274
- ```
1275
-
1276
- #### Multi-Tenant Isolation
1277
- ```python
1278
- from kailash.access_control import get_access_control_manager, PermissionEffect, WorkflowPermission
1279
-
1280
- # Configure tenant-based access rules
1281
- acm = get_access_control_manager()
1282
- acm.enabled = True
1283
-
1284
- # Tenant isolation rule
1285
- acm.add_rule(PermissionRule(
1286
- id="tenant_isolation",
1287
- resource_type="workflow",
1288
- resource_id="customer_analytics",
1289
- permission=WorkflowPermission.EXECUTE,
1290
- effect=PermissionEffect.ALLOW,
1291
- tenant_id="company_abc" # Only this tenant can access
1292
- ))
1293
- ```
1294
-
1295
- #### Data Masking and Field Protection
1296
- ```python
1297
- from kailash.nodes.base_with_acl import add_access_control
1298
-
1299
- # Add access control to sensitive data nodes
1300
- secure_reader = add_access_control(
1301
- CSVReaderNode(file_path="customers.csv"),
1302
- enable_access_control=True,
1303
- required_permission=NodePermission.READ_OUTPUT,
1304
- mask_output_fields=["ssn", "phone"] # Mask for non-admin users
1305
- )
1306
-
1307
- workflow.add_node("secure_data", secure_reader)
1308
- ```
1309
-
1310
- #### Permission-Based Routing
1311
- ```python
1312
- # Different execution paths based on user permissions
1313
- from kailash.access_control import NodePermission
1314
-
1315
- # Admin users get full processing
1316
- admin_processor = PythonCodeNode.from_function(
1317
- lambda data: {"result": process_all_data(data)},
1318
- name="admin_processor"
1319
- )
1320
-
1321
- # Analyst users get limited processing
1322
- analyst_processor = PythonCodeNode.from_function(
1323
- lambda data: {"result": process_limited_data(data)},
1324
- name="analyst_processor"
1325
- )
1326
-
1327
- # Runtime automatically routes based on user permissions
1328
- workflow.add_node("admin_path", admin_processor)
1329
- workflow.add_node("analyst_path", analyst_processor)
1330
- ```
1331
-
1332
- **Security Features:**
1333
- - 🔐 **JWT Authentication**: Token-based authentication with refresh support
1334
- - 👥 **Multi-Tenant Isolation**: Complete data separation between tenants
1335
- - 🛡️ **Field-Level Security**: Mask sensitive data based on user roles
1336
- - 📊 **Audit Logging**: Complete access attempt logging for compliance
1337
- - 🚫 **Path Traversal Prevention**: Built-in protection against directory attacks
1338
- - 🏗️ **Backward Compatibility**: Existing workflows work unchanged
1339
- - ⚡ **Performance Optimized**: Minimal overhead with caching
1340
-
1341
- ## 💻 CLI Commands
1342
-
1343
- The SDK includes a comprehensive CLI for workflow management:
1344
-
1345
- ```bash
1346
- # Project initialization
1347
- kailash init my-project --template data-pipeline
1348
-
1349
- # Workflow operations
1350
- kailash validate workflow.yaml
1351
- kailash run workflow.yaml --inputs data.json
1352
- kailash export workflow.py --format kubernetes
1353
-
1354
- # Task management
1355
- kailash tasks list --status running
1356
- kailash tasks show run-123
1357
- kailash tasks cancel run-123
1358
-
1359
- # Development tools
1360
- kailash test workflow.yaml --data test_data.json
1361
- kailash debug workflow.yaml --breakpoint node-id
1362
- ```
1363
-
1364
- ## 🏗️ Architecture
1365
-
1366
- The SDK follows a clean, modular architecture:
1367
-
1368
- ```
1369
- kailash/
1370
- ├── nodes/ # Node implementations and base classes
1371
- │ ├── base.py # Abstract Node class
1372
- │ ├── data/ # Data I/O nodes
1373
- │ ├── transform/ # Transformation nodes
1374
- │ ├── logic/ # Business logic nodes
1375
- │ └── ai/ # AI/ML nodes
1376
- ├── workflow/ # Workflow management
1377
- │ ├── graph.py # DAG representation
1378
- │ └── visualization.py # Visualization tools
1379
- ├── visualization/ # Performance visualization
1380
- │ └── performance.py # Performance metrics charts
1381
- ├── runtime/ # Execution engines
1382
- │ ├── local.py # Local execution
1383
- │ └── docker.py # Docker execution (planned)
1384
- ├── tracking/ # Monitoring and tracking
1385
- │ ├── manager.py # Task management
1386
- │ ├── metrics_collector.py # Performance metrics
1387
- │ └── storage/ # Storage backends
1388
- ├── cli/ # Command-line interface
1389
- └── utils/ # Utilities and helpers
1390
- ```
1391
-
1392
- ### 🤖 Unified AI Provider Architecture
1393
-
1394
- The SDK features a unified provider architecture for AI capabilities:
1395
-
1396
- ```python
1397
- from kailash.nodes.ai import LLMAgentNode, EmbeddingGeneratorNode
1398
-
1399
- # Multi-provider LLM support
1400
- agent = LLMAgentNode()
1401
- result = agent.run(
1402
- provider="ollama", # or "openai", "anthropic", "mock"
1403
- model="llama3.1:8b-instruct-q8_0",
1404
- messages=[{"role": "user", "content": "Explain quantum computing"}],
1405
- generation_config={"temperature": 0.7, "max_tokens": 500}
1406
- )
1407
-
1408
- # Vector embeddings with the same providers
1409
- embedder = EmbeddingGeneratorNode()
1410
- embedding = embedder.run(
1411
- provider="ollama", # Same providers support embeddings
1412
- model="snowflake-arctic-embed2",
1413
- operation="embed_text",
1414
- input_text="Quantum computing uses quantum mechanics principles"
1415
- )
1416
-
1417
- # Check available providers and capabilities
1418
- from kailash.nodes.ai.ai_providers import get_available_providers
1419
- providers = get_available_providers()
1420
- # Returns: {"ollama": {"available": True, "chat": True, "embeddings": True}, ...}
1421
- ```
1422
-
1423
- **Supported AI Providers:**
1424
- - **Ollama**: Local LLMs with both chat and embeddings (llama3.1, mistral, etc.)
1425
- - **OpenAI**: GPT models and text-embedding-3 series
1426
- - **Anthropic**: Claude models (chat only)
1427
- - **Cohere**: Embedding models (embed-english-v3.0)
1428
- - **HuggingFace**: Sentence transformers and local models
1429
- - **Mock**: Testing provider with consistent outputs (see the sketch below)
1430
-
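- The mock provider is convenient for exercising AI nodes in unit tests without a model server. A minimal sketch, reusing the `run()` calls shown above (the model names are placeholders and the mocked response shape is not guaranteed):
-
- ```python
- from kailash.nodes.ai import LLMAgentNode, EmbeddingGeneratorNode
-
- # Chat completion without any external service
- agent = LLMAgentNode()
- reply = agent.run(
-     provider="mock",
-     model="mock-model",
-     messages=[{"role": "user", "content": "ping"}],
- )
-
- # Deterministic embeddings for assertions in tests
- embedder = EmbeddingGeneratorNode()
- vector = embedder.run(
-     provider="mock",
-     model="mock-embedder",
-     operation="embed_text",
-     input_text="ping",
- )
-
- print(reply)
- print(vector)
- ```
-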
1431
- ## 🧪 Testing
1432
-
1433
- The SDK is thoroughly tested with comprehensive test suites:
1434
-
1435
- ```bash
1436
- # Run all tests
1437
- uv run pytest
1438
-
1439
- # Run with coverage
1440
- uv run pytest --cov=kailash --cov-report=html
1441
-
1442
- # Run specific test categories
1443
- uv run pytest tests/unit/
1444
- uv run pytest tests/integration/
1445
- uv run pytest tests/e2e/
1446
- ```
1447
-
1448
- ## 🤝 Contributing
1449
-
1450
- We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
1451
-
1452
- ### Development Setup
1453
-
1454
- ```bash
1455
- # Clone the repository
1456
- git clone https://github.com/integrum/kailash-python-sdk.git
1457
- cd kailash-python-sdk
1458
-
1459
- # Install uv if you haven't already
1460
- curl -LsSf https://astral.sh/uv/install.sh | sh
1461
-
1462
- # Sync dependencies (creates venv automatically and installs everything)
1463
- uv sync
1464
-
1465
- # Run commands using uv (no need to activate venv)
1466
- uv run pytest
1467
- uv run kailash --help
1468
-
1469
- # Or activate the venv if you prefer
1470
- source .venv/bin/activate # On Windows: .venv\Scripts\activate
1471
-
1472
- # Install development dependencies
1473
- uv add --dev pre-commit detect-secrets doc8
1474
-
1475
- # Install Trivy (macOS with Homebrew)
1476
- brew install trivy
1477
-
1478
- # Set up pre-commit hooks
1479
- pre-commit install
1480
- pre-commit install --hook-type pre-push
1481
-
1482
- # Run initial setup (formats code and fixes issues)
1483
- pre-commit run --all-files
1484
- ```
1485
-
1486
- ### Code Quality & Pre-commit Hooks
1487
-
1488
- We use automated pre-commit hooks to ensure code quality:
1489
-
1490
- **Hooks Include:**
1491
- - **Black**: Code formatting
1492
- - **isort**: Import sorting
1493
- - **Ruff**: Fast Python linting
1494
- - **pytest**: Unit tests
1495
- - **Trivy**: Security vulnerability scanning
1496
- - **detect-secrets**: Secret detection
1497
- - **doc8**: Documentation linting
1498
- - **mypy**: Type checking
1499
-
1500
- **Manual Quality Checks:**
1501
- ```bash
1502
- # Format code
1503
- black src/ tests/
1504
- isort src/ tests/
1505
-
1506
- # Linting and fixes
1507
- ruff check src/ tests/ --fix
1508
-
1509
- # Type checking
1510
- mypy src/
1511
-
1512
- # Run all pre-commit hooks manually
1513
- pre-commit run --all-files
1514
-
1515
- # Run specific hooks
1516
- pre-commit run black
1517
- pre-commit run pytest-check
1518
- ```
1519
-
1520
- ## 📈 Project Status
1521
-
1522
- <table>
1523
- <tr>
1524
- <td width="40%">
1525
-
1526
- ### ✅ Completed
1527
- - Core node system with 66+ node types
1528
- - Workflow builder with DAG validation
1529
- - Local & async execution engines
1530
- - Task tracking with metrics
1531
- - Multiple storage backends
1532
- - Export functionality (YAML/JSON)
1533
- - CLI interface
1534
- - Immutable state management
1535
- - API integration with rate limiting
1536
- - OAuth 2.0 authentication
1537
- - Production security framework
1538
- - Path traversal prevention
1539
- - Code execution sandboxing
1540
- - Comprehensive security testing
1541
- - SharePoint Graph API integration
1542
- - **Self-organizing agent pools with 13 specialized nodes**
1543
- - **Agent-to-agent communication and shared memory**
1544
- - **Intelligent caching and convergence detection**
1545
- - **MCP integration for external tool access**
1546
- - **Multi-strategy team formation algorithms**
1547
- - **Real-time performance metrics collection**
1548
- - **Performance visualization dashboards**
1549
- - **Real-time monitoring dashboard with WebSocket streaming**
1550
- - **Comprehensive performance reports (HTML, Markdown, JSON)**
1551
- - **100% test coverage (591 tests)**
1552
- - **All test categories passing**
1553
- - 68 working examples
1554
-
1555
- </td>
1556
- <td width="30%">
1557
-
1558
- ### 🚧 In Progress
1559
- - **Kailash Workflow Studio** - Visual workflow builder UI
1560
- - React-based drag-and-drop interface
1561
- - Multi-tenant architecture with Docker
1562
- - WorkflowStudioAPI backend
1563
- - Real-time execution monitoring
1564
- - Performance optimizations
1565
- - Docker runtime finalization
1566
-
1567
- </td>
1568
- <td width="30%">
1569
-
1570
- ### 📋 Planned
1571
- - Cloud deployment templates
1572
- - Visual workflow editor
1573
- - Plugin system
1574
- - Additional integrations
1575
-
1576
- </td>
1577
- </tr>
1578
- </table>
1579
-
1580
- ### 🎯 Test Suite Status
1581
- - **Total Tests**: 591 passing (100%)
1582
- - **Test Categories**: All passing
1583
- - **Integration Tests**: All passing
1584
- - **Security Tests**: 10 consolidated comprehensive tests
1585
- - **Examples**: 68/68 working
1586
- - **Code Coverage**: 100%
1587
-
1588
- ## ⚠️ Known Issues
1589
-
1590
- 1. **DateTime Comparison in `list_runs()`**: The `TaskManager.list_runs()` method may encounter timezone comparison errors between timezone-aware and timezone-naive datetime objects. Workaround: Use try-catch blocks when calling `list_runs()` or access run details directly via `get_run(run_id)`.
1591
-
1592
- 2. **Performance Tracking**: To enable performance metrics collection, you must pass the `task_manager` parameter to the `runtime.execute()` method: `runtime.execute(workflow, task_manager=task_manager)`.
1593
-
1594
- ## 📄 License
1595
-
1596
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
1597
-
1598
- ## 🙏 Acknowledgments
1599
-
1600
- - The Integrum team for the Kailash architecture
1601
- - All contributors who have helped shape this SDK
1602
- - The Python community for excellent tools and libraries
1603
-
1604
- ## 📞 Support
1605
-
1606
- - 📋 [GitHub Issues](https://github.com/integrum/kailash-python-sdk/issues)
1607
- - 📧 Email: support@integrum.com
1608
- - 💬 Slack: [Join our community](https://integrum.slack.com/kailash-sdk)
1609
-
1610
- ---
1611
-
1612
- <p align="center">
1613
- Made with ❤️ by the Integrum Team
1614
- </p>