chuk-tool-processor 0.6.19__py3-none-any.whl → 0.6.21__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release.
- chuk_tool_processor/execution/strategies/__init__.py +6 -0
- chuk_tool_processor-0.6.21.dist-info/METADATA +1085 -0
- {chuk_tool_processor-0.6.19.dist-info → chuk_tool_processor-0.6.21.dist-info}/RECORD +5 -5
- chuk_tool_processor-0.6.19.dist-info/METADATA +0 -698
- {chuk_tool_processor-0.6.19.dist-info → chuk_tool_processor-0.6.21.dist-info}/WHEEL +0 -0
- {chuk_tool_processor-0.6.19.dist-info → chuk_tool_processor-0.6.21.dist-info}/top_level.txt +0 -0
@@ -0,0 +1,1085 @@
Metadata-Version: 2.4
Name: chuk-tool-processor
Version: 0.6.21
Summary: Async-native framework for registering, discovering, and executing tools referenced in LLM responses
Author-email: CHUK Team <chrishayuk@somejunkmailbox.com>
Maintainer-email: CHUK Team <chrishayuk@somejunkmailbox.com>
License: MIT
Keywords: llm,tools,async,ai,openai,mcp,model-context-protocol,tool-calling,function-calling
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Framework :: AsyncIO
Classifier: Typing :: Typed
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: chuk-mcp>=0.5.4
Requires-Dist: dotenv>=0.9.9
Requires-Dist: psutil>=7.0.0
Requires-Dist: pydantic>=2.11.3
Requires-Dist: uuid>=1.30

# CHUK Tool Processor

[PyPI](https://pypi.org/project/chuk-tool-processor/)
[Python versions](https://pypi.org/project/chuk-tool-processor/)
[License: MIT](LICENSE)

**The missing link between LLM tool calls and reliable execution.**

CHUK Tool Processor is a focused, production-ready framework that solves one problem exceptionally well: **processing tool calls from LLM outputs**. It's not a chatbot framework or LLM orchestration platform—it's the glue layer that bridges LLM responses and actual tool execution.

## The Problem

When you build LLM applications, you face a gap:

1. **LLM generates tool calls** in various formats (XML tags, OpenAI `tool_calls`, JSON)
2. **??? Mystery step ???** where you need to:
   - Parse those calls reliably
   - Handle timeouts, retries, failures
   - Cache expensive results
   - Rate limit API calls
   - Run untrusted code safely
   - Connect to external tool servers
   - Log everything for debugging
3. **Get results back** to continue the LLM conversation

Most frameworks give you steps 1 and 3, but step 2 is where the complexity lives. CHUK Tool Processor **is** step 2.

## Why chuk-tool-processor?

### It's a Building Block, Not a Framework

Unlike full-fledged LLM frameworks (LangChain, LlamaIndex, etc.), CHUK Tool Processor:

- ✅ **Does one thing well**: Process tool calls reliably
- ✅ **Plugs into any LLM app**: Works with any framework or no framework
- ✅ **Composable by design**: Stack strategies and wrappers like middleware
- ✅ **No opinions about your LLM**: Bring your own OpenAI, Anthropic, or local model
- ❌ **Doesn't manage conversations**: That's your job
- ❌ **Doesn't do prompt engineering**: Use whatever prompting you want
- ❌ **Doesn't bundle an LLM client**: Use any client library you prefer

### It's Built for Production

What separates research code from production code is how it handles the edges:

- **Timeouts**: Every tool execution has proper timeout handling
- **Retries**: Automatic retry with exponential backoff
- **Rate Limiting**: Global and per-tool rate limits with sliding windows
- **Caching**: Intelligent result caching with TTL
- **Error Handling**: Graceful degradation that never crashes your app
- **Observability**: Structured logging, metrics, request tracing
- **Safety**: Subprocess isolation for untrusted code

### It's About Stacks

CHUK Tool Processor uses a **composable stack architecture**:

```
┌─────────────────────────────────┐
│      Your LLM Application       │
│  (handles prompts, responses)   │
└────────────┬────────────────────┘
             │ tool calls
             ▼
┌─────────────────────────────────┐
│   Caching Wrapper               │ ← Cache expensive results
├─────────────────────────────────┤
│   Rate Limiting Wrapper         │ ← Prevent API abuse
├─────────────────────────────────┤
│   Retry Wrapper                 │ ← Handle transient failures
├─────────────────────────────────┤
│   Execution Strategy            │ ← How to run tools
│   • InProcess (fast)            │
│   • Subprocess (isolated)       │
├─────────────────────────────────┤
│   Tool Registry                 │ ← Your registered tools
└─────────────────────────────────┘
```

Each layer is **optional** and **configurable**. Mix and match what you need.
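Because each wrapper only decorates the layer below it, the stacking idea itself is ordinary function composition. Here is a minimal, library-free sketch of the concept (toy code, not the actual implementation):

```python
# Toy middleware stack: each layer wraps the next, mirroring the diagram above.
def make_stack(execute):
    def with_retry(call):
        def run(args):
            for attempt in (1, 2, 3):          # up to 3 attempts
                try:
                    return call(args)
                except RuntimeError:
                    if attempt == 3:
                        raise
        return run

    cache = {}
    def with_cache(call):
        def run(args):
            key = tuple(sorted(args.items()))
            if key not in cache:               # miss: execute and remember
                cache[key] = call(args)
            return cache[key]
        return run

    # Cache → Retry → tool
    return with_cache(with_retry(execute))

stack = make_stack(lambda args: {"result": args["a"] + args["b"]})
print(stack({"a": 2, "b": 3}))  # {'result': 5}
```

The second identical call would be served from the toy cache without re-running the tool.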

## Quick Start

### Installation

**Prerequisites:** Python 3.11+ • Works on macOS, Linux, Windows

```bash
# Using pip
pip install chuk-tool-processor

# Using uv (recommended)
uv pip install chuk-tool-processor

# Or from source
git clone https://github.com/chrishayuk/chuk-tool-processor.git
cd chuk-tool-processor
uv pip install -e .
```

### 3-Minute Example

Copy-paste this into a file and run it:

```python
import asyncio
from chuk_tool_processor.core.processor import ToolProcessor
from chuk_tool_processor.registry import initialize, register_tool

# Step 1: Define a tool
@register_tool(name="calculator")
class Calculator:
    async def execute(self, operation: str, a: float, b: float) -> dict:
        ops = {"add": a + b, "multiply": a * b, "subtract": a - b}
        if operation not in ops:
            raise ValueError(f"Unsupported operation: {operation}")
        return {"result": ops[operation]}

# Step 2: Process LLM output
async def main():
    await initialize()
    processor = ToolProcessor()

    # Your LLM returned this tool call
    llm_output = '<tool name="calculator" args=\'{"operation": "multiply", "a": 15, "b": 23}\'/>'

    # Process it
    results = await processor.process(llm_output)

    # Each result is a ToolExecutionResult with: tool, args, result, error, duration, cached
    # results[0].result contains the tool output
    # results[0].error contains any error message (None if successful)
    if results[0].error:
        print(f"Error: {results[0].error}")
    else:
        print(results[0].result)  # {'result': 345}

asyncio.run(main())
```

**That's it.** You now have production-ready tool execution with timeouts, retries, and caching.

> **Why not just use OpenAI tool calls?**
> OpenAI's function calling is great for parsing, but you still need to handle multiple formats (Anthropic XML, etc.), timeouts, retries, rate limits, caching, subprocess isolation, and connections to external MCP servers. CHUK Tool Processor **is** that missing middle layer.

## Choose Your Path

| Your Goal | What You Need | Where to Look |
|-----------|---------------|---------------|
| ☕ **Just process LLM tool calls** | Basic tool registration + processor | [3-Minute Example](#3-minute-example) |
| 🔌 **Connect to external tools** | MCP integration (HTTP/STDIO/SSE) | [MCP Integration](#5-mcp-integration-external-tools) |
| 🛡️ **Production deployment** | Timeouts, retries, rate limits, caching | [Production Configuration](#production-configuration) |
| 🔒 **Run untrusted code safely** | Subprocess isolation strategy | [Subprocess Strategy](#using-subprocess-strategy) |
| 📊 **Monitor and observe** | Structured logging and metrics | [Observability](#observability) |
| 🌊 **Stream incremental results** | StreamingTool pattern | [StreamingTool](#streamingtool-real-time-results) |

### Real-World Quick Start

Here are the most common patterns you'll use:

**Pattern 1: Local tools only**
```python
import asyncio
from chuk_tool_processor.core.processor import ToolProcessor
from chuk_tool_processor.registry import initialize, register_tool

@register_tool(name="my_tool")
class MyTool:
    async def execute(self, arg: str) -> dict:
        return {"result": f"Processed: {arg}"}

async def main():
    await initialize()
    processor = ToolProcessor()

    llm_output = '<tool name="my_tool" args=\'{"arg": "hello"}\'/>'
    results = await processor.process(llm_output)
    print(results[0].result)  # {'result': 'Processed: hello'}

asyncio.run(main())
```

**Pattern 2: Mix local + remote MCP tools (Notion)**
```python
import asyncio
from chuk_tool_processor.registry import initialize, register_tool
from chuk_tool_processor.mcp import setup_mcp_http_streamable

@register_tool(name="local_calculator")
class Calculator:
    async def execute(self, a: int, b: int) -> int:
        return a + b

async def main():
    # Register local tools first
    await initialize()

    # Then add Notion MCP tools (access_token comes from your OAuth flow)
    processor, manager = await setup_mcp_http_streamable(
        servers=[{
            "name": "notion",
            "url": "https://mcp.notion.com/mcp",
            "headers": {"Authorization": f"Bearer {access_token}"}
        }],
        namespace="notion",
        initialization_timeout=120.0
    )

    # Now you have both local and remote tools!
    results = await processor.process('''
        <tool name="local_calculator" args='{"a": 5, "b": 3}'/>
        <tool name="notion.search_pages" args='{"query": "project docs"}'/>
    ''')
    print(f"Local result: {results[0].result}")
    print(f"Notion result: {results[1].result}")

asyncio.run(main())
```

See `examples/notion_oauth.py` for the complete OAuth flow.

**Pattern 3: Local SQLite database via STDIO**
```python
import asyncio
import json
from chuk_tool_processor.mcp import setup_mcp_stdio

async def main():
    # Configure SQLite MCP server (runs locally)
    config = {
        "mcpServers": {
            "sqlite": {
                "command": "uvx",
                "args": ["mcp-server-sqlite", "--db-path", "./app.db"],
                "transport": "stdio"
            }
        }
    }

    with open("mcp_config.json", "w") as f:
        json.dump(config, f)

    processor, manager = await setup_mcp_stdio(
        config_file="mcp_config.json",
        servers=["sqlite"],
        namespace="db",
        initialization_timeout=120.0  # First run downloads the package
    )

    # Query your local database via MCP
    results = await processor.process(
        '<tool name="db.query" args=\'{"sql": "SELECT * FROM users LIMIT 10"}\'/>'
    )
    print(results[0].result)

asyncio.run(main())
```

See `examples/stdio_sqlite.py` for a complete working example.

## Core Concepts

### 1. Tool Registry

The **registry** is where you register tools for execution. Tools can be:

- **Simple classes** with an `async execute()` method
- **ValidatedTool** subclasses with Pydantic validation
- **StreamingTool** for real-time incremental results
- **Functions** registered via `register_fn_tool()`

```python
from chuk_tool_processor.registry import register_tool
from chuk_tool_processor.models.validated_tool import ValidatedTool
from pydantic import BaseModel, Field

@register_tool(name="weather")
class WeatherTool(ValidatedTool):
    class Arguments(BaseModel):
        location: str = Field(..., description="City name")
        units: str = Field("celsius", description="Temperature units")

    class Result(BaseModel):
        temperature: float
        conditions: str

    async def _execute(self, location: str, units: str) -> Result:
        # Your weather API logic here
        return self.Result(temperature=22.5, conditions="Sunny")
```

### 2. Execution Strategies

**Strategies** determine *how* tools run:

| Strategy | Use Case | Trade-offs |
|----------|----------|------------|
| **InProcessStrategy** | Fast, trusted tools | Speed ✅, Isolation ❌ |
| **SubprocessStrategy** | Untrusted or risky code | Isolation ✅, Speed ❌ |

```python
import asyncio
from chuk_tool_processor.core.processor import ToolProcessor
from chuk_tool_processor.execution.strategies.subprocess_strategy import SubprocessStrategy
from chuk_tool_processor.registry import get_default_registry

async def main():
    registry = await get_default_registry()
    processor = ToolProcessor(
        strategy=SubprocessStrategy(
            registry=registry,
            max_workers=4,
            default_timeout=30.0
        )
    )
    # Use processor...

asyncio.run(main())
```

### 3. Execution Wrappers (Middleware)

**Wrappers** add production features as composable layers:

```python
processor = ToolProcessor(
    enable_caching=True,        # Cache expensive calls
    cache_ttl=600,              # 10 minutes
    enable_rate_limiting=True,  # Prevent abuse
    global_rate_limit=100,      # 100 req/min globally
    enable_retries=True,        # Auto-retry failures
    max_retries=3               # Up to 3 attempts
)
```

The processor stacks them automatically: **Cache → Rate Limit → Retry → Strategy → Tool**
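To make `cache_ttl` concrete, here is a toy TTL cache (not the library's implementation) showing how a cached result expires after the configured number of seconds:

```python
import time

# Toy TTL cache illustrating what cache_ttl means: entries are served until
# their expiry time passes, then treated as misses.
class TTLCache:
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry_time, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or entry[0] < time.monotonic():
            return None   # missing or expired
        return entry[1]

    def put(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

cache = TTLCache(ttl_seconds=600)  # mirrors cache_ttl=600 above
cache.put(("calculator", "multiply", 15, 23), {"result": 345})
print(cache.get(("calculator", "multiply", 15, 23)))  # {'result': 345}
```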

### 4. Input Parsers (Plugins)

**Parsers** extract tool calls from various LLM output formats:

**XML Tags (Anthropic-style)**
```xml
<tool name="search" args='{"query": "Python"}'/>
```

**OpenAI `tool_calls` (JSON)**
```json
{
  "tool_calls": [
    {
      "type": "function",
      "function": {
        "name": "search",
        "arguments": "{\"query\": \"Python\"}"
      }
    }
  ]
}
```

**Direct JSON (array of calls)**
```json
[
  { "tool": "search", "arguments": { "query": "Python" } }
]
```

All formats work automatically—no configuration needed.

**Input Format Compatibility:**

| Format | Example | Use Case |
|--------|---------|----------|
| **XML Tool Tag** | `<tool name="search" args='{"q":"Python"}'/>` | Anthropic Claude, XML-based LLMs |
| **OpenAI tool_calls** | JSON object (above) | OpenAI GPT-4 function calling |
| **Direct JSON** | `[{"tool": "search", "arguments": {"q": "Python"}}]` | Generic API integrations |
| **Single dict** | `{"tool": "search", "arguments": {"q": "Python"}}` | Programmatic calls |
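To illustrate what a parser plugin actually does with the XML tool-tag format, here is a hypothetical mini-parser. The library ships its own parser plugins; this only sketches the extraction task they perform:

```python
import json
import re

# Hypothetical mini-parser for the XML tool-tag format shown above —
# not the library's parser, just the shape of the problem it solves.
TOOL_TAG = re.compile(r"""<tool\s+name="(?P<name>[^"]+)"\s+args='(?P<args>[^']*)'\s*/>""")

def extract_tool_calls(llm_output: str) -> list[dict]:
    calls = []
    for m in TOOL_TAG.finditer(llm_output):
        calls.append({"tool": m.group("name"), "arguments": json.loads(m.group("args"))})
    return calls

calls = extract_tool_calls('<tool name="search" args=\'{"query": "Python"}\'/>')
print(calls)  # [{'tool': 'search', 'arguments': {'query': 'Python'}}]
```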

### 5. MCP Integration (External Tools)

Connect to **remote tool servers** using the [Model Context Protocol](https://modelcontextprotocol.io). CHUK Tool Processor supports three transport mechanisms for different use cases:

#### HTTP Streamable (⭐ Recommended for Cloud Services)

Modern HTTP streaming transport for cloud-based MCP servers like Notion:

```python
from chuk_tool_processor.mcp import setup_mcp_http_streamable

# Connect to Notion MCP with OAuth
servers = [
    {
        "name": "notion",
        "url": "https://mcp.notion.com/mcp",
        "headers": {"Authorization": f"Bearer {access_token}"}
    }
]

processor, manager = await setup_mcp_http_streamable(
    servers=servers,
    namespace="notion",
    initialization_timeout=120.0,  # Some services need time to initialize
    enable_caching=True,
    enable_retries=True
)

# Use Notion tools through MCP
results = await processor.process(
    '<tool name="notion.search_pages" args=\'{"query": "meeting notes"}\'/>'
)
```

#### STDIO (Best for Local/On-Device Tools)

For running local MCP servers as subprocesses—great for databases, file systems, and local tools:

```python
from chuk_tool_processor.mcp import setup_mcp_stdio
import json

# Configure SQLite MCP server
config = {
    "mcpServers": {
        "sqlite": {
            "command": "uvx",
            "args": ["mcp-server-sqlite", "--db-path", "/path/to/database.db"],
            "env": {"MCP_SERVER_NAME": "sqlite"},
            "transport": "stdio"
        }
    }
}

# Save config to file
with open("mcp_config.json", "w") as f:
    json.dump(config, f)

# Connect to local SQLite server
processor, manager = await setup_mcp_stdio(
    config_file="mcp_config.json",
    servers=["sqlite"],
    namespace="db",
    initialization_timeout=120.0  # First run downloads packages
)

# Query your local database via MCP
results = await processor.process(
    '<tool name="db.query" args=\'{"sql": "SELECT * FROM users LIMIT 10"}\'/>'
)
```

#### SSE (Legacy Support)

For backward compatibility with older MCP servers using Server-Sent Events:

```python
from chuk_tool_processor.mcp import setup_mcp_sse

# Connect to Atlassian with OAuth via SSE
servers = [
    {
        "name": "atlassian",
        "url": "https://mcp.atlassian.com/v1/sse",
        "headers": {"Authorization": f"Bearer {access_token}"}
    }
]

processor, manager = await setup_mcp_sse(
    servers=servers,
    namespace="atlassian",
    initialization_timeout=120.0
)
```

**Transport Comparison:**

| Transport | Use Case | Real Examples |
|-----------|----------|---------------|
| **HTTP Streamable** | Cloud APIs, SaaS services | Notion (`mcp.notion.com`) |
| **STDIO** | Local tools, databases | SQLite (`mcp-server-sqlite`), Echo (`chuk-mcp-echo`) |
| **SSE** | Legacy cloud services | Atlassian (`mcp.atlassian.com`) |

**Relationship with [chuk-mcp](https://github.com/chrishayuk/chuk-mcp):**

- `chuk-mcp` is a low-level MCP protocol client (handles transports, protocol negotiation)
- `chuk-tool-processor` wraps `chuk-mcp` to integrate external tools into your execution pipeline
- You can use local tools, remote MCP tools, or both in the same processor

## Getting Started

### Creating Tools

CHUK Tool Processor supports multiple patterns for defining tools:

#### Simple Function-Based Tools

```python
from chuk_tool_processor.registry.auto_register import register_fn_tool
from datetime import datetime
from zoneinfo import ZoneInfo

def get_current_time(timezone: str = "UTC") -> str:
    """Get the current time in the specified timezone."""
    now = datetime.now(ZoneInfo(timezone))
    return now.strftime("%Y-%m-%d %H:%M:%S %Z")

# Register the function as a tool (sync — no await needed)
register_fn_tool(get_current_time, namespace="utilities")
```
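Type hints and the docstring are what make a plain function registrable as a tool. A small sketch, using only `inspect`, of the kind of metadata that can be derived from such a function (the real `register_fn_tool` internals may differ):

```python
import inspect

# Derive tool metadata from a plain function: its name, docstring, and
# typed parameters. This is illustrative only — not the library's code.
def describe_tool(fn):
    sig = inspect.signature(fn)
    params = {
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in sig.parameters.items()
    }
    return {"name": fn.__name__, "doc": fn.__doc__, "params": params}

def get_current_time(timezone: str = "UTC") -> str:
    """Get the current time in the specified timezone."""
    ...

print(describe_tool(get_current_time))
# {'name': 'get_current_time', 'doc': 'Get the current time in the specified timezone.', 'params': {'timezone': 'str'}}
```

This is why well-typed signatures and docstrings matter: they become the schema the LLM sees.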

#### ValidatedTool (Pydantic Type Safety)

For production tools, use Pydantic validation:

```python
from pydantic import BaseModel, Field
from chuk_tool_processor.models.validated_tool import ValidatedTool
from chuk_tool_processor.registry import register_tool

@register_tool(name="weather")
class WeatherTool(ValidatedTool):
    class Arguments(BaseModel):
        location: str = Field(..., description="City name")
        units: str = Field("celsius", description="Temperature units")

    class Result(BaseModel):
        temperature: float
        conditions: str

    async def _execute(self, location: str, units: str) -> Result:
        return self.Result(temperature=22.5, conditions="Sunny")
```

#### StreamingTool (Real-time Results)

For long-running operations that produce incremental results:

```python
from pydantic import BaseModel
from chuk_tool_processor.models import StreamingTool
from chuk_tool_processor.registry import register_tool

@register_tool(name="file_processor")
class FileProcessor(StreamingTool):
    class Arguments(BaseModel):
        file_path: str

    class Result(BaseModel):
        line: int
        content: str

    async def _stream_execute(self, file_path: str):
        with open(file_path) as f:
            for i, line in enumerate(f, 1):
                yield self.Result(line=i, content=line.strip())
```

**Consuming streaming results:**

```python
import asyncio
from chuk_tool_processor.core.processor import ToolProcessor
from chuk_tool_processor.registry import initialize

async def main():
    await initialize()
    processor = ToolProcessor()
    async for event in processor.astream('<tool name="file_processor" args=\'{"file_path":"README.md"}\'/>'):
        # 'event' is a streamed chunk (either your Result model instance or a dict)
        line = event["line"] if isinstance(event, dict) else getattr(event, "line", None)
        content = event["content"] if isinstance(event, dict) else getattr(event, "content", None)
        print(f"Line {line}: {content}")

asyncio.run(main())
```

### Using the Processor

#### Basic Usage

Call `await initialize()` once at startup to load your registry.

```python
import asyncio
from chuk_tool_processor.core.processor import ToolProcessor
from chuk_tool_processor.registry import initialize

async def main():
    await initialize()
    processor = ToolProcessor()
    llm_output = '<tool name="calculator" args=\'{"operation":"add","a":2,"b":3}\'/>'
    results = await processor.process(llm_output)
    for result in results:
        if result.error:
            print(f"Error: {result.error}")
        else:
            print(f"Success: {result.result}")

asyncio.run(main())
```

#### Production Configuration

```python
from chuk_tool_processor.core.processor import ToolProcessor

processor = ToolProcessor(
    # Execution settings
    default_timeout=30.0,
    max_concurrency=20,

    # Production features
    enable_caching=True,
    cache_ttl=600,
    enable_rate_limiting=True,
    global_rate_limit=100,
    enable_retries=True,
    max_retries=3
)
```

## Advanced Topics

### Using Subprocess Strategy

Use `SubprocessStrategy` for isolation and safety when running untrusted, third-party, or potentially unsafe code that shouldn't share a process with your main app:

```python
import asyncio
from chuk_tool_processor.core.processor import ToolProcessor
from chuk_tool_processor.execution.strategies.subprocess_strategy import SubprocessStrategy
from chuk_tool_processor.registry import get_default_registry

async def main():
    registry = await get_default_registry()
    processor = ToolProcessor(
        strategy=SubprocessStrategy(
            registry=registry,
            max_workers=4,
            default_timeout=30.0
        )
    )
    # Use processor...

asyncio.run(main())
```

### Real-World MCP Examples

#### Example 1: Notion Integration with OAuth

Complete OAuth flow connecting to Notion's MCP server:

```python
from chuk_tool_processor.mcp import setup_mcp_http_streamable

# After completing the OAuth flow (see examples/notion_oauth.py for the full flow)
processor, manager = await setup_mcp_http_streamable(
    servers=[{
        "name": "notion",
        "url": "https://mcp.notion.com/mcp",
        "headers": {"Authorization": f"Bearer {access_token}"}
    }],
    namespace="notion",
    initialization_timeout=120.0
)

# Get available Notion tools
tools = manager.get_all_tools()
print(f"Available tools: {[t['name'] for t in tools]}")

# Use Notion tools in your LLM workflow
results = await processor.process(
    '<tool name="notion.search_pages" args=\'{"query": "Q4 planning"}\'/>'
)
```

#### Example 2: Local SQLite Database Access

Run the SQLite MCP server locally for database operations:

```python
from chuk_tool_processor.mcp import setup_mcp_stdio
import json

# Configure SQLite server
config = {
    "mcpServers": {
        "sqlite": {
            "command": "uvx",
            "args": ["mcp-server-sqlite", "--db-path", "./data/app.db"],
            "transport": "stdio"
        }
    }
}

with open("mcp_config.json", "w") as f:
    json.dump(config, f)

# Connect to local database
processor, manager = await setup_mcp_stdio(
    config_file="mcp_config.json",
    servers=["sqlite"],
    namespace="db",
    initialization_timeout=120.0  # First run downloads mcp-server-sqlite
)

# Query your database via LLM
results = await processor.process(
    '<tool name="db.query" args=\'{"sql": "SELECT COUNT(*) FROM users"}\'/>'
)
```

#### Example 3: Simple STDIO Echo Server

Minimal example for testing STDIO transport:

```python
from chuk_tool_processor.mcp import setup_mcp_stdio
import json

# Configure echo server (great for testing)
config = {
    "mcpServers": {
        "echo": {
            "command": "uvx",
            "args": ["chuk-mcp-echo", "stdio"],
            "transport": "stdio"
        }
    }
}

with open("echo_config.json", "w") as f:
    json.dump(config, f)

processor, manager = await setup_mcp_stdio(
    config_file="echo_config.json",
    servers=["echo"],
    namespace="echo",
    initialization_timeout=60.0
)

# Test echo functionality
results = await processor.process(
    '<tool name="echo.echo" args=\'{"message": "Hello MCP!"}\'/>'
)
```

See `examples/notion_oauth.py`, `examples/stdio_sqlite.py`, and `examples/stdio_echo.py` for complete working implementations.

### Observability

#### Structured Logging

Enable JSON logging for production observability:

```python
import asyncio
from chuk_tool_processor.logging import setup_logging, get_logger

async def main():
    await setup_logging(
        level="INFO",
        structured=True,  # JSON output (structured=False for human-readable)
        log_file="tool_processor.log"
    )
    logger = get_logger("my_app")
    logger.info("logging ready")

asyncio.run(main())
```
|
|
794
|
+
|
|
795
|
+
When `structured=True`, logs are output as JSON. When `structured=False`, they're human-readable text.
|
|
796
|
+
|
|
797
|
+
Example JSON log output:
|
|
798
|
+
|
|
799
|
+
```json
|
|
800
|
+
{
|
|
801
|
+
"timestamp": "2025-01-15T10:30:45.123Z",
|
|
802
|
+
"level": "INFO",
|
|
803
|
+
"tool": "calculator",
|
|
804
|
+
"status": "success",
|
|
805
|
+
"duration_ms": 4.2,
|
|
806
|
+
"cached": false,
|
|
807
|
+
"attempts": 1
|
|
808
|
+
}
|
|
809
|
+
```
|
|
810
|
+
|
|
811
|
+
#### Automatic Metrics

Metrics are automatically collected for:
- ✅ Tool execution (success/failure rates, duration)
- ✅ Cache performance (hit/miss rates)
- ✅ Parser accuracy (which parsers succeeded)
- ✅ Retry attempts (how many retries per tool)

Record metrics programmatically:

```python
import asyncio
from chuk_tool_processor.logging import metrics

async def main():
    # Metrics are logged automatically, but you can also record them yourself
    await metrics.log_tool_execution(
        tool="custom_tool",
        success=True,
        duration=1.5,
        cached=False,
        attempts=1
    )

asyncio.run(main())
```

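If you ship the JSON log entries shown above to a log store, success rates and latencies can be rolled up from the same fields. A minimal offline sketch (the sample records and the aggregation logic are illustrative, not part of the library):

```python
# Aggregate records shaped like the JSON log entries above.
# The sample data here is made up for illustration.
records = [
    {"tool": "calculator", "status": "success", "duration_ms": 4.2},
    {"tool": "calculator", "status": "error", "duration_ms": 30000.0},
    {"tool": "search", "status": "success", "duration_ms": 120.5},
]

by_tool: dict[str, list[dict]] = {}
for rec in records:
    by_tool.setdefault(rec["tool"], []).append(rec)

summary = {}
for tool, recs in by_tool.items():
    ok = sum(1 for r in recs if r["status"] == "success")
    avg_ms = sum(r["duration_ms"] for r in recs) / len(recs)
    summary[tool] = {"success_rate": ok / len(recs), "avg_ms": round(avg_ms, 1)}

print(summary)
```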
### Error Handling

```python
results = await processor.process(llm_output)

for result in results:
    if result.error:
        print(f"Tool '{result.tool}' failed: {result.error}")
        print(f"Duration: {result.duration}s")
    else:
        print(f"Tool '{result.tool}' succeeded: {result.result}")
```

### Testing Tools

```python
import pytest
from chuk_tool_processor.core.processor import ToolProcessor
from chuk_tool_processor.registry import initialize

@pytest.mark.asyncio
async def test_calculator():
    await initialize()
    processor = ToolProcessor()

    results = await processor.process(
        '<tool name="calculator" args=\'{"operation": "add", "a": 5, "b": 3}\'/>'
    )

    assert results[0].result["result"] == 8
```

## Configuration

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `CHUK_TOOL_REGISTRY_PROVIDER` | `memory` | Registry backend |
| `CHUK_DEFAULT_TIMEOUT` | `30.0` | Default timeout (seconds) |
| `CHUK_LOG_LEVEL` | `INFO` | Logging level |
| `CHUK_STRUCTURED_LOGGING` | `true` | Enable JSON logging |
| `MCP_BEARER_TOKEN` | - | Bearer token for MCP SSE |

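These variables are read at startup. As a sketch of the equivalent manual parsing with the documented defaults (the parsing below is illustrative; the library reads these itself):

```python
import os

# Fall back to the defaults from the table above when a variable is unset.
registry_provider = os.environ.get("CHUK_TOOL_REGISTRY_PROVIDER", "memory")
default_timeout = float(os.environ.get("CHUK_DEFAULT_TIMEOUT", "30.0"))
log_level = os.environ.get("CHUK_LOG_LEVEL", "INFO")
structured = os.environ.get("CHUK_STRUCTURED_LOGGING", "true").lower() == "true"

print(registry_provider, default_timeout, log_level, structured)
```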
### ToolProcessor Options

```python
processor = ToolProcessor(
    default_timeout=30.0,        # Timeout per tool
    max_concurrency=10,          # Max concurrent executions
    enable_caching=True,         # Result caching
    cache_ttl=300,               # Cache TTL (seconds)
    enable_rate_limiting=False,  # Rate limiting
    global_rate_limit=None,      # Global cap (requests per minute)
    enable_retries=True,         # Auto-retry failures
    max_retries=3,               # Max retry attempts
    # Optional per-tool rate limits: {"tool.name": (requests, per_seconds)}
    tool_rate_limits=None
)
```

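To make the `enable_caching`/`cache_ttl` pair concrete, here is a stand-alone sketch of TTL-based result caching. This is not the library's internal implementation, just the semantics: a cached result is served until `ttl` seconds have elapsed.

```python
import time

# Illustrative TTL cache: entries expire ttl seconds after they are stored.
class TTLCache:
    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store: dict = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, time.monotonic())

cache = TTLCache(ttl=300)
cache.put(("calculator", "add,5,3"), 8)
print(cache.get(("calculator", "add,5,3")))  # 8 while within the TTL
```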
### Performance & Tuning

| Parameter | Default | When to Adjust |
|-----------|---------|----------------|
| `default_timeout` | `30.0` | Increase for slow tools (e.g., AI APIs) |
| `max_concurrency` | `10` | Increase for I/O-bound tools, decrease for CPU-bound |
| `enable_caching` | `True` | Keep on for deterministic tools |
| `cache_ttl` | `300` | Longer for stable data, shorter for real-time |
| `enable_rate_limiting` | `False` | Enable when hitting API rate limits |
| `global_rate_limit` | `None` | Set a global requests/minute cap across all tools |
| `enable_retries` | `True` | Disable for non-idempotent operations |
| `max_retries` | `3` | Increase for flaky external APIs |
| `tool_rate_limits` | `None` | Dict mapping tool name → (max_requests, window_seconds); overrides `global_rate_limit` per tool |

**Per-tool rate limiting example:**

```python
processor = ToolProcessor(
    enable_rate_limiting=True,
    global_rate_limit=100,  # 100 requests/minute across all tools
    tool_rate_limits={
        "notion.search_pages": (10, 60),  # 10 requests per 60 seconds
        "expensive_api": (5, 60),         # 5 requests per minute
        "local_tool": (1000, 60),         # 1000 requests per minute (local is fast)
    }
)
```

### Security Model

CHUK Tool Processor provides multiple layers of safety:

| Concern | Protection | Configuration |
|---------|------------|---------------|
| **Timeouts** | Every tool has a timeout | `default_timeout=30.0` |
| **Process Isolation** | Run tools in separate processes | `strategy=SubprocessStrategy()` |
| **Rate Limiting** | Prevent abuse and API overuse | `enable_rate_limiting=True` |
| **Input Validation** | Pydantic validation on arguments | Use `ValidatedTool` |
| **Error Containment** | Failures don't crash the processor | Built-in exception handling |
| **Retry Limits** | Prevent infinite retry loops | `max_retries=3` |

**Important Security Notes:**
- **Environment Variables**: Subprocess strategy inherits the parent process environment by default. For stricter isolation, use container-level controls (Docker, cgroups).
- **Network Access**: Tools inherit network access from the host. For network isolation, use OS-level sandboxing (containers, network namespaces, firewalls).
- **Resource Limits**: For hard CPU/memory caps, use OS-level controls (cgroups on Linux, Job Objects on Windows, or Docker resource limits).
- **Secrets**: Never injected automatically. Pass secrets explicitly via tool arguments or environment variables, and prefer scoped env vars for subprocess tools to minimize exposure.

Example security-focused setup for untrusted code:

```python
import asyncio
from chuk_tool_processor.core.processor import ToolProcessor
from chuk_tool_processor.execution.strategies.subprocess_strategy import SubprocessStrategy
from chuk_tool_processor.registry import get_default_registry

async def create_secure_processor():
    # Maximum isolation for untrusted code:
    # runs each tool in a separate process
    registry = await get_default_registry()

    processor = ToolProcessor(
        strategy=SubprocessStrategy(
            registry=registry,
            max_workers=4,
            default_timeout=10.0
        ),
        default_timeout=10.0,
        enable_rate_limiting=True,
        global_rate_limit=50,  # 50 requests/minute
        max_retries=2
    )
    return processor

# For even stricter isolation:
# - Run the entire processor inside a Docker container with resource limits
# - Use network policies to restrict outbound connections
# - Use read-only filesystems where possible
```

## Architecture Principles

1. **Composability**: Stack strategies and wrappers like middleware
2. **Async-First**: Built for `async/await` from the ground up
3. **Production-Ready**: Timeouts, retries, caching, rate limiting—all built-in
4. **Pluggable**: Parsers, strategies, transports—swap components as needed
5. **Observable**: Structured logging and metrics collection throughout

## Examples

Check out the [`examples/`](examples/) directory for complete working examples:

### Getting Started
- **Quick start**: `examples/quickstart_demo.py` - Basic tool registration and execution
- **Execution strategies**: `examples/execution_strategies_demo.py` - InProcess vs Subprocess
- **Production wrappers**: `examples/wrappers_demo.py` - Caching, retries, rate limiting
- **Streaming tools**: `examples/streaming_demo.py` - Real-time incremental results

### MCP Integration (Real-World)
- **Notion + OAuth**: `examples/notion_oauth.py` - Complete OAuth 2.1 flow with HTTP Streamable
  - Shows: Authorization Server discovery, client registration, PKCE flow, token exchange
- **SQLite Local**: `examples/stdio_sqlite.py` - Local database access via STDIO
  - Shows: Command/args passing, environment variables, file paths, initialization timeouts
- **Echo Server**: `examples/stdio_echo.py` - Minimal STDIO transport example
  - Shows: Simplest possible MCP integration for testing
- **Atlassian + OAuth**: `examples/atlassian_sse.py` - OAuth with SSE transport (legacy)

### Advanced MCP
- **HTTP Streamable**: `examples/mcp_http_streamable_example.py`
- **STDIO**: `examples/mcp_stdio_example.py`
- **SSE**: `examples/mcp_sse_example.py`
- **Plugin system**: `examples/plugins_builtins_demo.py`, `examples/plugins_custom_parser_demo.py`

## FAQ

**Q: What happens if a tool takes too long?**
A: The tool is cancelled after `default_timeout` seconds and returns an error result. The processor continues with other tools.

**Q: Can I mix local and remote (MCP) tools?**
A: Yes! Register local tools first, then use `setup_mcp_*` to add remote tools. They all work in the same processor.

**Q: How do I handle malformed LLM outputs?**
A: The processor is resilient—invalid tool calls are logged and return error results without crashing.

**Q: What about API rate limits?**
A: Use `enable_rate_limiting=True` and set `tool_rate_limits` per tool or `global_rate_limit` for all tools.

**Q: Can tools return files or binary data?**
A: Yes—tools can return any JSON-serializable data, including base64-encoded files, URLs, or structured data.

**Q: How do I test my tools?**
A: Use pytest with `@pytest.mark.asyncio`. See [Testing Tools](#testing-tools) for examples.

**Q: Does this work with streaming LLM responses?**
A: Yes—as tool calls appear in the stream, extract and process them. The processor handles partial/incremental tool call lists.

**Q: What's the difference between the InProcess and Subprocess strategies?**
A: InProcess is faster (same process); Subprocess is safer (isolated processes). Use InProcess for trusted code and Subprocess for untrusted code.

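For the streaming answer above, one workable pattern is to buffer chunks and hand complete `<tool .../>` tags to the processor as they close. The buffering helper below is an illustrative sketch (only the tag format comes from this document):

```python
import re

# Matches a self-closing <tool .../> tag; a partial tag at the end of the
# buffer will not match until the rest of it arrives.
TOOL_TAG = re.compile(r"<tool\b[^>]*/>")

def extract_complete_calls(buffer: str) -> tuple:
    """Return (finished tool tags, unconsumed remainder of the buffer)."""
    calls = TOOL_TAG.findall(buffer)
    last = 0
    for m in TOOL_TAG.finditer(buffer):
        last = m.end()
    return calls, buffer[last:]

# Simulated stream: the tool tag is split across two chunks.
chunks = ['Sure! <tool name="calc" ', 'args=\'{"a": 1}\'/> done']
buffer = ""
found = []
for chunk in chunks:
    buffer += chunk
    calls, buffer = extract_complete_calls(buffer)
    found.extend(calls)  # each batch could go to processor.process()

print(found)
```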
## Comparison with Other Tools

| Feature | chuk-tool-processor | LangChain Tools | OpenAI Tools | MCP SDK |
|---------|-------------------|-----------------|--------------|---------|
| **Async-native** | ✅ | ⚠️ Partial | ✅ | ✅ |
| **Process isolation** | ✅ SubprocessStrategy | ❌ | ❌ | ⚠️ |
| **Built-in retries** | ✅ | ❌ † | ❌ | ❌ |
| **Rate limiting** | ✅ | ❌ † | ⚠️ ‡ | ❌ |
| **Caching** | ✅ | ⚠️ † | ❌ ‡ | ❌ |
| **Multiple parsers** | ✅ (XML, OpenAI, JSON) | ⚠️ | ✅ | ✅ |
| **Streaming tools** | ✅ | ⚠️ | ⚠️ | ✅ |
| **MCP integration** | ✅ All transports | ❌ | ❌ | ✅ (protocol only) |
| **Zero-config start** | ✅ | ❌ | ✅ | ⚠️ |
| **Production-ready** | ✅ Timeouts, metrics | ⚠️ | ⚠️ | ⚠️ |

**Notes:**
- † LangChain offers caching and rate limiting through separate libraries (`langchain-cache`, external rate limiters), but they're not core features.
- ‡ OpenAI Tools can be combined with external rate limiters and caches, but tool execution itself doesn't include these features.

**When to use chuk-tool-processor:**
- You need production-ready tool execution (timeouts, retries, caching)
- You want to connect to MCP servers (local or remote)
- You need to run untrusted code safely (subprocess isolation)
- You're building a custom LLM application (not using a framework)

**When to use alternatives:**
- **LangChain**: You want a full-featured LLM framework with chains, agents, and memory
- **OpenAI Tools**: You only use OpenAI and don't need advanced execution features
- **MCP SDK**: You're building an MCP server, not a client

## Related Projects

- **[chuk-mcp](https://github.com/chrishayuk/chuk-mcp)**: Low-level Model Context Protocol client
  - Powers the MCP transport layer in chuk-tool-processor
  - Use it directly if you need protocol-level control
  - Use chuk-tool-processor if you want high-level tool execution

## Contributing & Support

- **GitHub**: [chrishayuk/chuk-tool-processor](https://github.com/chrishayuk/chuk-tool-processor)
- **Issues**: [Report bugs and request features](https://github.com/chrishayuk/chuk-tool-processor/issues)
- **Discussions**: [Community discussions](https://github.com/chrishayuk/chuk-tool-processor/discussions)
- **License**: MIT

---

**Remember**: CHUK Tool Processor is the missing link between LLM outputs and reliable tool execution. It's not trying to be everything—it's trying to be the best at one thing: processing tool calls in production.

Built with ❤️ by the CHUK AI team for the LLM tool integration community.