openbox-deepagent-sdk-python 0.1.0__py3-none-any.whl
- openbox_deepagent/__init__.py +91 -0
- openbox_deepagent/middleware.py +354 -0
- openbox_deepagent/middleware_factory.py +74 -0
- openbox_deepagent/middleware_hooks.py +783 -0
- openbox_deepagent/py.typed +0 -0
- openbox_deepagent/subagent_resolver.py +130 -0
- openbox_deepagent_sdk_python-0.1.0.dist-info/METADATA +739 -0
- openbox_deepagent_sdk_python-0.1.0.dist-info/RECORD +9 -0
- openbox_deepagent_sdk_python-0.1.0.dist-info/WHEEL +4 -0
|
@@ -0,0 +1,739 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: openbox-deepagent-sdk-python
|
|
3
|
+
Version: 0.1.0
|
|
4
|
+
Summary: OpenBox governance SDK for DeepAgents (langchain-ai/deepagents)
|
|
5
|
+
License: MIT
|
|
6
|
+
Requires-Python: >=3.11
|
|
7
|
+
Requires-Dist: langchain-core>=0.3.0
|
|
8
|
+
Requires-Dist: langchain>=0.3.0
|
|
9
|
+
Requires-Dist: langgraph>=0.2.0
|
|
10
|
+
Requires-Dist: openbox-langgraph-sdk-python>=0.1.0
|
|
11
|
+
Provides-Extra: deepagents
|
|
12
|
+
Requires-Dist: deepagents>=0.1.0; extra == 'deepagents'
|
|
13
|
+
Provides-Extra: dev
|
|
14
|
+
Requires-Dist: mypy>=1.10.0; extra == 'dev'
|
|
15
|
+
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
|
|
16
|
+
Requires-Dist: pytest>=8.0.0; extra == 'dev'
|
|
17
|
+
Requires-Dist: ruff>=0.6.0; extra == 'dev'
|
|
18
|
+
Description-Content-Type: text/markdown
|
|
19
|
+
|
|
20
|
+
# openbox-deepagent-sdk-python
|
|
21
|
+
|
|
22
|
+
|
|
25
|
+
|
|
26
|
+
Add real-time governance to any [DeepAgents](https://github.com/langchain-ai/deepagents) application — powered by [OpenBox](https://openbox.ai).
|
|
27
|
+
|
|
28
|
+
This package extends [`openbox-langgraph-sdk`](../sdk-langgraph-python) with three things DeepAgents specifically needs: **per-subagent policy targeting** (govern the `writer` subagent differently from `researcher`), **HITL conflict detection** (prevent clashes with DeepAgents' own `interrupt_on`), and **built-in `a2a` tool classification** for subagent dispatches.
|
|
29
|
+
|
|
30
|
+
> **New to OpenBox?** Start with the [`openbox-langgraph-sdk` README](../sdk-langgraph-python/README.md). It covers policies, guardrails, HITL, error handling, and debugging. This document covers only what's different or additional for DeepAgents.
|
|
31
|
+
|
|
32
|
+
---
|
|
33
|
+
|
|
34
|
+
## Table of Contents
|
|
35
|
+
|
|
36
|
+
- [How DeepAgents governance works](#how-deepagents-governance-works)
- [Installation](#installation)
- [Quickstart](#quickstart)
- [Configuration reference](#configuration-reference)
- [Governance features](#governance-features)
  - [Policies (OPA / Rego)](#policies-opa--rego)
  - [Per-subagent policies](#per-subagent-policies)
  - [Guardrails](#guardrails)
  - [Human-in-the-loop (HITL)](#human-in-the-loop-hitl)
  - [Behavior Rules (AGE)](#behavior-rules-age)
  - [Tool classification](#tool-classification)
- [Error handling](#error-handling)
- [Advanced usage](#advanced-usage)
- [Known limitations](#known-limitations)
- [Debugging](#debugging)
- [Contributing](#contributing)
|
|
52
|
+
|
|
53
|
+
---
|
|
54
|
+
|
|
55
|
+
## How DeepAgents governance works
|
|
56
|
+
|
|
57
|
+
DeepAgents dispatches subagents through the built-in `task` tool:
|
|
58
|
+
|
|
59
|
+
```python
task(description="Research quantum computing", subagent_type="researcher")
task(description="Write a technical report", subagent_type="writer")
```
|
|
63
|
+
|
|
64
|
+
The problem: subagents execute *inside* the `task` tool body, so their internal events are invisible to the outer LangGraph event stream. Only the `task` tool's `on_tool_start` is observable.
|
|
65
|
+
|
|
66
|
+
The SDK solves this by reading `subagent_type` from the `task` input before the call executes and embedding it as `__openbox` metadata in the governance event. Your Rego policy then has a clean, explicit handle to target specific subagent types:
|
|
67
|
+
|
|
68
|
+
```
Your agent                    SDK                                 OpenBox Core
──────────                    ───                                 ───────────
task(subagent_type="writer")
  │
  └─ on_tool_start ────────►  ActivityStarted                     Policy engine
                              activity_type="task"         ───►   input.activity_type == "task"
                              activity_input=[                    some item in input.activity_input
                                {description, subagent_type},     item["__openbox"].subagent_name
                                {__openbox: {                       == "writer"
                                  tool_type: "a2a",        ◄───   REQUIRE_APPROVAL
                                  subagent_name: "writer"
                                }}
                              ]
                                    ↑
                              enforce verdict
                              (block / pause for HITL approval)
```
|
|
86
|
+
|
|
87
|
+
Your graph code is untouched.
|
|
88
|
+
|
|
89
|
+
> **`create_openbox_deep_agent_handler` is synchronous.** Do not `await` it.
|
|
90
|
+
|
|
91
|
+
---
|
|
92
|
+
|
|
93
|
+
## Installation
|
|
94
|
+
|
|
95
|
+
```bash
pip install openbox-deepagent-sdk-python
```
|
|
98
|
+
|
|
99
|
+
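**Requirements:** Python 3.11+, `openbox-langgraph-sdk-python >= 0.1.0`, `langgraph >= 0.2`, and `deepagents`. The package metadata declares `deepagents` as an optional extra, so a plain install does not pull it in; use `pip install "openbox-deepagent-sdk-python[deepagents]"` if it is not already installed.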
|
|
100
|
+
|
|
101
|
+
---
|
|
102
|
+
|
|
103
|
+
## Quickstart
|
|
104
|
+
|
|
105
|
+
### 1. Create an agent in the dashboard
|
|
106
|
+
|
|
107
|
+
Sign in to [dashboard.openbox.ai](https://dashboard.openbox.ai), create an agent named `"ResearchBot"`, and copy your API key.
|
|
108
|
+
|
|
109
|
+
### 2. Export credentials
|
|
110
|
+
|
|
111
|
+
```bash
export OPENBOX_URL="https://core.openbox.ai"
export OPENBOX_API_KEY="obx_live_..."
```
|
|
115
|
+
|
|
116
|
+
### 3. Add OpenBox middleware to your agent
|
|
117
|
+
|
|
118
|
+
```python
import os
import asyncio
from deepagents import create_deep_agent
from langchain.chat_models import init_chat_model
from openbox_deepagent import create_openbox_middleware

# search_web, write_report, and export_data are your own tools, defined elsewhere.

# Create governance middleware
middleware = create_openbox_middleware(
    api_url=os.environ["OPENBOX_URL"],
    api_key=os.environ["OPENBOX_API_KEY"],
    agent_name="ResearchBot",  # must match the agent name in your dashboard
    known_subagents=["researcher", "analyst", "writer", "general-purpose"],
    tool_type_map={"search_web": "http", "export_data": "http"},
)

# Inject middleware into your DeepAgents graph; no wrapper needed
# IMPORTANT: do NOT pass interrupt_on if using OpenBox HITL (see HITL section)
agent = create_deep_agent(
    model=init_chat_model("openai:gpt-4o-mini", temperature=0),
    tools=[search_web, write_report, export_data],
    subagents=[
        {"name": "researcher", "description": "Web research.",
         "system_prompt": "You are a research assistant.", "tools": [search_web]},
        {"name": "writer", "description": "Drafting reports.",
         "system_prompt": "You are a professional writer.", "tools": [write_report]},
    ],
    middleware=[middleware],  # <-- governance injected here
)

async def main():
    # Invoke directly; no handler wrapper needed
    result = await agent.ainvoke(
        {"messages": [{"role": "user", "content": "Research recent LangGraph papers"}]},
        config={"configurable": {"thread_id": "session-001"}},
    )
    print(result["messages"][-1].content)

asyncio.run(main())
```
|
|
158
|
+
|
|
159
|
+
> **Migrating from `create_openbox_deep_agent_handler`?** The handler-based approach is deprecated. See [Legacy handler](#legacy-handler-based-approach) below.
|
|
160
|
+
|
|
161
|
+
---
|
|
162
|
+
|
|
163
|
+
## Configuration reference
|
|
164
|
+
|
|
165
|
+
`create_openbox_deep_agent_handler` accepts the following keyword arguments:
|
|
166
|
+
|
|
167
|
+
| Parameter | Type | Default | Description |
|---|---|---|---|
| `graph` | `CompiledGraph` | **required** | Compiled LangGraph graph from `create_deep_agent()` |
| `api_url` | `str` | **required** | Base URL of your OpenBox Core instance |
| `api_key` | `str` | **required** | API key (`obx_live_*` or `obx_test_*`) |
| `agent_name` | `str` | `None` | Agent name as configured in the dashboard. Used as `workflow_type` on all governance events; **must match exactly** for policies and Behavior Rules to fire |
| `known_subagents` | `list[str]` | `["general-purpose"]` | Subagent names from `create_deep_agent(subagents=[...])`. Always include `"general-purpose"` if the default subagent is active |
| `validate` | `bool` | `True` | Validate API key against server on startup |
| `on_api_error` | `str` | `"fail_open"` | `"fail_open"` (allow on error) or `"fail_closed"` (block on error) |
| `governance_timeout` | `float` | `30.0` | HTTP timeout in seconds for governance calls |
| `session_id` | `str` | `None` | Optional session identifier |
| `hitl` | `dict` | `{}` | Human-in-the-loop config (see [HITL](#human-in-the-loop-hitl)) |
| `guard_interrupt_on_conflict` | `bool` | `True` | Raise if `interrupt_on` and OpenBox HITL are both enabled |
| `send_chain_start_event` | `bool` | `True` | Send `WorkflowStarted` event |
| `send_chain_end_event` | `bool` | `True` | Send `WorkflowCompleted` event |
| `send_llm_start_event` | `bool` | `True` | Send `LLMStarted` event (enables prompt guardrails + PII redaction) |
| `send_llm_end_event` | `bool` | `True` | Send `LLMCompleted` event |
| `tool_type_map` | `dict[str, str]` | `{}` | Map tool names to semantic types for classification |
| `skip_chain_types` | `set[str]` | `set()` | Chain node names to skip governance for |
| `skip_tool_types` | `set[str]` | `set()` | Tool names to skip governance for entirely |
| `sqlalchemy_engine` | `Engine` | `None` | SQLAlchemy Engine instance to instrument for DB governance. Required when the engine is created before the middleware (see [Database governance](#database-governance)) |
|
|
188
|
+
|
|
189
|
+
---
|
|
190
|
+
|
|
191
|
+
## Governance features
|
|
192
|
+
|
|
193
|
+
### Policies (OPA / Rego)
|
|
194
|
+
|
|
195
|
+
Policies live in the OpenBox dashboard and are written in [Rego](https://www.openpolicyagent.org/docs/latest/policy-language/). Before every tool call the SDK sends an `ActivityStarted` event — your policy evaluates the payload and returns a decision.
|
|
196
|
+
|
|
197
|
+
**Fields available in `input`:**
|
|
198
|
+
|
|
199
|
+
| Field | Type | Description |
|---|---|---|
| `input.event_type` | `string` | `"ActivityStarted"` or `"ActivityCompleted"` |
| `input.activity_type` | `string` | Tool name (e.g. `"search_web"`, `"task"`) |
| `input.activity_input` | `array` | Tool arguments + optional `__openbox` metadata |
| `input.workflow_type` | `string` | Your `agent_name` |
| `input.workflow_id` | `string` | Session workflow ID |
| `input.trust_tier` | `int` | Agent trust tier (1–4) from dashboard |
| `input.hook_trigger` | `bool` | `true` when event is triggered by an outbound HTTP span |
|
|
208
|
+
|
|
209
|
+
**Example — block a restricted research topic:**
|
|
210
|
+
|
|
211
|
+
```rego
package org.openboxai.policy

import future.keywords.if
import future.keywords.in

default result = {"decision": "CONTINUE", "reason": null}

restricted_terms := {"nuclear weapon", "bioweapon", "chemical weapon", "malware synthesis"}

result := {"decision": "BLOCK", "reason": "Search blocked: restricted research topic."} if {
    input.event_type == "ActivityStarted"
    input.activity_type == "search_web"
    not input.hook_trigger
    count(input.activity_input) > 0
    entry := input.activity_input[0]
    is_object(entry)
    query := entry.query
    is_string(query)
    some term in restricted_terms
    contains(lower(query), term)
}
```
|
|
234
|
+
|
|
235
|
+
**Possible decisions:**
|
|
236
|
+
|
|
237
|
+
| Decision | Effect |
|---|---|
| `CONTINUE` | Tool executes normally |
| `BLOCK` | `GovernanceBlockedError` raised; tool does not execute |
| `REQUIRE_APPROVAL` | Agent pauses; human must approve or reject in dashboard |
| `HALT` | `GovernanceHaltError` raised; session terminated |
|
|
243
|
+
|
|
244
|
+
> **Always include `not input.hook_trigger`** in `BLOCK` and `REQUIRE_APPROVAL` rules. When a tool makes an outbound HTTP call, the SDK fires a second `ActivityStarted` with `hook_trigger: true`. Without this guard, the rule fires once for the tool call and once for every HTTP request it makes.
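
To sanity-check sample payloads before uploading a policy, the restricted-topic rule can be mirrored in plain Python. This is only an illustrative translation: `evaluate` is a hypothetical helper, not part of the SDK, and OPA remains the real decision engine. The sketch also shows what the `hook_trigger` guard does to the follow-up HTTP event.

```python
RESTRICTED_TERMS = {"nuclear weapon", "bioweapon", "chemical weapon", "malware synthesis"}


def evaluate(event: dict) -> dict:
    """Hypothetical Python mirror of the Rego rule; not an SDK or OPA API."""
    default = {"decision": "CONTINUE", "reason": None}
    if event.get("event_type") != "ActivityStarted":
        return default
    if event.get("activity_type") != "search_web":
        return default
    if event.get("hook_trigger"):  # same guard as `not input.hook_trigger`
        return default
    activity_input = event.get("activity_input") or []
    if not activity_input or not isinstance(activity_input[0], dict):
        return default
    query = activity_input[0].get("query")
    if not isinstance(query, str):
        return default
    if any(term in query.lower() for term in RESTRICTED_TERMS):
        return {"decision": "BLOCK", "reason": "Search blocked: restricted research topic."}
    return default


# The tool call itself, and the second event fired for its outbound HTTP request.
tool_call = {
    "event_type": "ActivityStarted",
    "activity_type": "search_web",
    "hook_trigger": False,
    "activity_input": [{"query": "Bioweapon synthesis routes"}],
}
http_span = {**tool_call, "hook_trigger": True}

print(evaluate(tool_call)["decision"])  # BLOCK
print(evaluate(http_span)["decision"])  # CONTINUE
```

With the guard in place, only the tool call is blocked; the HTTP-triggered event falls through to the default decision instead of firing the rule a second time.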
|
|
245
|
+
|
|
246
|
+
---
|
|
247
|
+
|
|
248
|
+
### Per-subagent policies
|
|
249
|
+
|
|
250
|
+
All subagent dispatches share the same `activity_type: "task"` — so a rule matching on `activity_type` can't distinguish a `writer` dispatch from a `researcher` one. The SDK appends a `__openbox` sentinel to `activity_input` to give your Rego policy that handle:
|
|
251
|
+
|
|
252
|
+
```json
"activity_input": [
  {
    "description": "Write a report on AI safety",
    "subagent_type": "writer"
  },
  {
    "__openbox": {
      "tool_type": "a2a",
      "subagent_name": "writer"
    }
  }
]
```
|
|
266
|
+
|
|
267
|
+
OpenBox Core forwards `activity_input` to OPA unchanged — **no Core changes needed**.
|
|
268
|
+
|
|
269
|
+
**Target a specific subagent:**
|
|
270
|
+
|
|
271
|
+
```rego
# All tasks dispatched to the writer subagent require human approval
result := {"decision": "REQUIRE_APPROVAL", "reason": "Writer subagent tasks require approval."} if {
    input.event_type == "ActivityStarted"
    input.activity_type == "task"
    not input.hook_trigger
    some item in input.activity_input
    meta := item["__openbox"]
    meta.subagent_name == "writer"
}
```
|
|
282
|
+
|
|
283
|
+
**Target all subagent dispatches (any type):**
|
|
284
|
+
|
|
285
|
+
```rego
result := {"decision": "BLOCK", "reason": "Subagent calls disabled outside business hours."} if {
    input.event_type == "ActivityStarted"
    input.activity_type == "task"
    not input.hook_trigger
    some item in input.activity_input
    item["__openbox"].tool_type == "a2a"
    # ... add time-based condition
}
```
|
|
295
|
+
|
|
296
|
+
> `subagent_name` is extracted from the `task` tool's `subagent_type` field automatically. If it's missing, the SDK falls back to `"general-purpose"` and logs a warning when `OPENBOX_DEBUG=1` is set.
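
The extraction and fallback described above can be pictured with a short sketch. `build_task_activity_input` and the logger name are hypothetical and only illustrate the documented behaviour; the SDK's internal implementation may differ.

```python
import logging
import os

logger = logging.getLogger("openbox.example")  # hypothetical logger name


def build_task_activity_input(tool_input: dict) -> list[dict]:
    """Hypothetical sketch: append an `__openbox` sentinel to a `task` call's input."""
    subagent_name = tool_input.get("subagent_type")
    if not isinstance(subagent_name, str) or not subagent_name:
        # Documented fallback when `subagent_type` is missing.
        subagent_name = "general-purpose"
        if os.environ.get("OPENBOX_DEBUG") == "1":
            logger.warning("task call has no subagent_type; using %r", subagent_name)
    sentinel = {"__openbox": {"tool_type": "a2a", "subagent_name": subagent_name}}
    return [dict(tool_input), sentinel]


payload = build_task_activity_input(
    {"description": "Write a report on AI safety", "subagent_type": "writer"}
)
print(payload[1])
```

The first element stays a copy of the original tool arguments, so existing rules that read `input.activity_input[0]` keep working alongside rules that search for the sentinel.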
|
|
297
|
+
|
|
298
|
+
---
|
|
299
|
+
|
|
300
|
+
### Guardrails
|
|
301
|
+
|
|
302
|
+
Guardrails screen LLM prompts before the model sees them. Configure them per agent in the dashboard.
|
|
303
|
+
|
|
304
|
+
Before each `ainvoke`, the SDK sends the user's message to Core as an `LLMStarted` event. A guardrail match has one of two effects:

- **PII redaction**: matched fields are redacted in place. The original text never reaches the model.
- **Content block**: `GuardrailsValidationError` is raised and the session halts before the graph starts.
|
|
308
|
+
|
|
309
|
+
Supported guardrail types:
|
|
310
|
+
|
|
311
|
+
| Type | ID | What it detects |
|---|---|---|
| PII detection | `1` | Names, emails, phone numbers, SSNs, credit cards |
| Content filter | `2` | Harmful or unsafe content categories |
| Toxicity | `3` | Toxic language |
| Ban words | `4` | Custom word/phrase blocklist |
|
|
317
|
+
|
|
318
|
+
> DeepAgents fires `on_chat_model_start` for every LLM call — including internal subagent calls that contain only system or tool messages. The SDK automatically skips those (no user-turn content = nothing to screen), so you won't hit guardrail parse errors from subagent-internal LLM invocations.
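
The skip rule in the note above amounts to a check for user-turn content. A minimal sketch, assuming messages are plain dicts with `role` and `content` keys (the real SDK works on LangChain message objects, and `has_user_turn` is a hypothetical name):

```python
def has_user_turn(messages: list[dict]) -> bool:
    """Return True if any message is a non-empty user turn worth screening."""
    for message in messages:
        if message.get("role") == "user" and str(message.get("content", "")).strip():
            return True
    return False


outer_call = [
    {"role": "system", "content": "You are a research assistant."},
    {"role": "user", "content": "Research recent LangGraph papers"},
]
subagent_internal_call = [
    {"role": "system", "content": "You are a professional writer."},
    {"role": "tool", "content": "search results ..."},
]

print(has_user_turn(outer_call))              # True
print(has_user_turn(subagent_internal_call))  # False
```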
|
|
319
|
+
|
|
320
|
+
---
|
|
321
|
+
|
|
322
|
+
### Human-in-the-loop (HITL)
|
|
323
|
+
|
|
324
|
+
When a policy returns `REQUIRE_APPROVAL`, the agent pauses and polls OpenBox until a human approves or rejects from the dashboard.
|
|
325
|
+
|
|
326
|
+
```python
governed = create_openbox_deep_agent_handler(
    graph=agent,
    api_url=os.environ["OPENBOX_URL"],
    api_key=os.environ["OPENBOX_API_KEY"],
    agent_name="ResearchBot",
    known_subagents=["researcher", "writer", "general-purpose"],
    hitl={
        "enabled": True,
        "poll_interval_ms": 5_000,  # check every 5 seconds
        "max_wait_ms": 300_000,     # timeout after 5 minutes
    },
)
```
|
|
340
|
+
|
|
341
|
+
**HITL config options:**
|
|
342
|
+
|
|
343
|
+
| Key | Type | Default | Description |
|---|---|---|---|
| `enabled` | `bool` | `False` | Enable HITL polling |
| `poll_interval_ms` | `int` | `5000` | How often to poll for a decision (ms) |
| `max_wait_ms` | `int` | `300000` | Total wait before `ApprovalTimeoutError` (ms) |
| `skip_tool_types` | `set[str]` | `set()` | Tools that never wait for HITL |
|
|
349
|
+
|
|
350
|
+
**On approval:** tool or subagent execution continues normally.
|
|
351
|
+
**On rejection:** `ApprovalRejectedError` is raised → re-raised as `GovernanceHaltError`.
|
|
352
|
+
**On timeout:** `ApprovalTimeoutError` is raised.
|
|
353
|
+
|
|
354
|
+
#### Conflict with DeepAgents `interrupt_on`
|
|
355
|
+
|
|
356
|
+
DeepAgents has its own interrupt mechanism via `interrupt_on` (`HumanInTheLoopMiddleware`). Using both OpenBox HITL and `interrupt_on` simultaneously causes double-pausing and unpredictable execution.
|
|
357
|
+
|
|
358
|
+
The SDK detects this at construction time and raises a `ValueError`:
|
|
359
|
+
|
|
360
|
+
```
[OpenBox] DeepAgents graph has interrupt_on configured AND OpenBox HITL is enabled.
These conflict — OpenBox must own the HITL flow.
Remove interrupt_on from create_deep_agent, or set guard_interrupt_on_conflict=False to suppress this check.
```
|
|
365
|
+
|
|
366
|
+
If you want OpenBox to own HITL (which gives you the full dashboard + audit trail), remove `interrupt_on`:
|
|
367
|
+
|
|
368
|
+
```python
# ✅ OpenBox owns HITL
agent = create_deep_agent(model="gpt-4o-mini", tools=[...], subagents=[...])

# ❌ conflict
agent = create_deep_agent(model="gpt-4o-mini", tools=[...], interrupt_on=["task"])
```
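
The construction-time check boils down to a three-way condition. The sketch below is an assumption-laden illustration: `check_hitl_conflict` is a hypothetical function that receives `interrupt_on` explicitly, whereas the SDK inspects the graph itself in a way this README does not describe.

```python
def check_hitl_conflict(
    interrupt_on,
    hitl: dict,
    guard_interrupt_on_conflict: bool = True,
) -> None:
    """Raise ValueError when both HITL mechanisms would be active at once."""
    openbox_hitl_enabled = bool(hitl.get("enabled", False))
    if guard_interrupt_on_conflict and interrupt_on and openbox_hitl_enabled:
        raise ValueError(
            "[OpenBox] DeepAgents graph has interrupt_on configured "
            "AND OpenBox HITL is enabled."
        )


# No interrupt_on configured: nothing to guard against.
check_hitl_conflict(interrupt_on=None, hitl={"enabled": True})

# Both mechanisms enabled: the guard raises.
try:
    check_hitl_conflict(interrupt_on=["task"], hitl={"enabled": True})
except ValueError as exc:
    print(f"conflict detected: {exc}")

# Guard disabled: the same configuration passes silently.
check_hitl_conflict(["task"], {"enabled": True}, guard_interrupt_on_conflict=False)
```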
|
|
375
|
+
|
|
376
|
+
---
|
|
377
|
+
|
|
378
|
+
### Behavior Rules (AGE)
|
|
379
|
+
|
|
380
|
+
> ⚠️ **Read [Known Limitations — Behavior Rules](#behavior-rules-count-task-dispatches-not-subagent-tool-calls) before setting these up.** The semantics are materially different from the Temporal SDK, especially for DeepAgents.
|
|
381
|
+
|
|
382
|
+
Behavior Rules detect patterns across tool call sequences within a session — rate limits, unusual sequences, repeated high-risk dispatches. They're configured in the dashboard and enforced by the OpenBox Activity Governance Engine (AGE).
|
|
383
|
+
|
|
384
|
+
The SDK instruments `httpx` at startup (one-time idempotent patch). Any `httpx` call a tool makes is captured as a span and attached to that tool's `ActivityCompleted` event.
|
|
385
|
+
|
|
386
|
+
---
|
|
387
|
+
|
|
388
|
+
### Tool classification
|
|
389
|
+
|
|
390
|
+
Map your non-subagent tools to semantic types so your Rego policies can target whole categories instead of listing every tool name.
|
|
391
|
+
|
|
392
|
+
```python
governed = create_openbox_deep_agent_handler(
    graph=agent,
    agent_name="ResearchBot",
    known_subagents=["researcher", "writer", "general-purpose"],
    tool_type_map={
        "search_web": "http",
        "export_data": "http",
        "query_db": "database",
    },
    # ... remaining arguments (api_url, api_key, and so on)
)
```
|
|
405
|
+
|
|
406
|
+
**Supported `tool_type` values:** `"http"`, `"database"`, `"builtin"`, `"a2a"`
|
|
407
|
+
|
|
408
|
+
> `"a2a"` is set automatically on every `task` call when `subagent_name` is resolved. Don't add `"task"` to `tool_type_map`.
|
|
409
|
+
|
|
410
|
+
When a type is set, the SDK appends an `__openbox` sentinel to `activity_input`:
|
|
411
|
+
|
|
412
|
+
```json
{"__openbox": {"tool_type": "http"}}
```
|
|
415
|
+
|
|
416
|
+
Rego can match on it:
|
|
417
|
+
|
|
418
|
+
```rego
result := {"decision": "REQUIRE_APPROVAL", "reason": "HTTP calls require approval."} if {
    input.event_type == "ActivityStarted"
    not input.hook_trigger
    some item in input.activity_input
    item["__openbox"].tool_type == "http"
}
```
|
|
426
|
+
|
|
427
|
+
---
|
|
428
|
+
|
|
429
|
+
### Database governance
|
|
430
|
+
|
|
431
|
+
The SDK instruments database operations via OpenTelemetry. Supported libraries: psycopg2, asyncpg, mysql, pymysql, sqlite3, pymongo, redis, sqlalchemy.
|
|
432
|
+
|
|
433
|
+
Install the instrumentor for your database:
|
|
434
|
+
|
|
435
|
+
```bash
pip install opentelemetry-instrumentation-sqlite3      # SQLite
pip install opentelemetry-instrumentation-psycopg2     # PostgreSQL
pip install opentelemetry-instrumentation-sqlalchemy   # SQLAlchemy ORM
```
|
|
440
|
+
|
|
441
|
+
**Important: initialization order.** If your database connection or SQLAlchemy engine is created **before** `create_openbox_middleware()`, pass the engine explicitly:
|
|
442
|
+
|
|
443
|
+
```python
import os

from langchain_community.utilities import SQLDatabase
from openbox_deepagent import create_openbox_middleware

# Engine created here (before middleware)
db = SQLDatabase.from_uri("sqlite:///Chinook.db")

# Pass engine so the SDK can instrument it retroactively
middleware = create_openbox_middleware(
    api_url=os.environ["OPENBOX_URL"],
    api_key=os.environ["OPENBOX_API_KEY"],
    agent_name="TextToSQL",
    sqlalchemy_engine=db._engine,  # <-- instrument existing engine
)
```
|
|
457
|
+
|
|
458
|
+
Without `sqlalchemy_engine=`, only engines created **after** middleware initialization are instrumented. Check startup logs for `Instrumented: sqlalchemy (existing engine)` vs `Instrumented: sqlalchemy (future engines)` to confirm.
|
|
459
|
+
|
|
460
|
+
---
|
|
461
|
+
|
|
462
|
+
## Error handling
|
|
463
|
+
|
|
464
|
+
All governance exceptions are importable from `openbox_deepagent`:
|
|
465
|
+
|
|
466
|
+
```python
from openbox_deepagent import (
    GovernanceBlockedError,
    GovernanceHaltError,
    GuardrailsValidationError,
    ApprovalRejectedError,
    ApprovalTimeoutError,
)

try:
    result = await governed.ainvoke({"messages": [...]}, config=...)
except GovernanceBlockedError as e:
    # Policy returned BLOCK: the tool or subagent dispatch did not execute
    print(f"Blocked: {e}")
except GovernanceHaltError as e:
    # Policy returned HALT, or a HITL decision was rejected/expired
    print(f"Session halted: {e}")
except GuardrailsValidationError as e:
    # Guardrail fired on the user prompt: the graph never started
    print(f"Guardrail: {e}")
except ApprovalRejectedError as e:
    print(f"Rejected by reviewer: {e}")
except ApprovalTimeoutError as e:
    print(f"HITL timed out: {e}")
```
|
|
491
|
+
|
|
492
|
+
| Exception | Raised when |
|---|---|
| `GovernanceBlockedError` | Policy returned `BLOCK` |
| `GovernanceHaltError` | Policy returned `HALT`, or a HITL decision was rejected or expired |
| `GuardrailsValidationError` | A guardrail fired on the user prompt |
| `ApprovalRejectedError` | A human explicitly rejected a `REQUIRE_APPROVAL` decision |
| `ApprovalTimeoutError` | HITL polling exceeded `max_wait_ms` |
| `OpenBoxAuthError` | API key is invalid or unauthorized |
|
|
500
|
+
|
|
501
|
+
---
|
|
502
|
+
|
|
503
|
+
## Advanced usage
|
|
504
|
+
|
|
505
|
+
### Streaming
|
|
506
|
+
|
|
507
|
+
Use `astream_governed` to stream graph updates with governance applied inline:
|
|
508
|
+
|
|
509
|
+
```python
async for chunk in governed.astream_governed(
    {"messages": [{"role": "user", "content": "Research quantum computing"}]},
    config={"configurable": {"thread_id": "session-001"}},
    stream_mode="values",
):
    process(chunk)
```
|
|
517
|
+
|
|
518
|
+
`astream` and `astream_events` are also available as thin proxies, useful for `langgraph dev` and other tooling that expects a `CompiledStateGraph`.
|
|
519
|
+
|
|
520
|
+
### `langgraph dev` compatibility
|
|
521
|
+
|
|
522
|
+
Export the governed handler as a module-level variable:
|
|
523
|
+
|
|
524
|
+
```python
# agent.py
import os
from deepagents import create_deep_agent
from openbox_deepagent import create_openbox_deep_agent_handler

_agent = create_deep_agent(model="...", tools=[...], subagents=[...])

graph = create_openbox_deep_agent_handler(
    graph=_agent,
    api_url=os.environ["OPENBOX_URL"],
    api_key=os.environ["OPENBOX_API_KEY"],
    agent_name="ResearchBot",
    known_subagents=["researcher", "writer", "general-purpose"],
)
```
|
|
540
|
+
|
|
541
|
+
```json
// langgraph.json
{
  "graphs": {
    "agent": "./agent.py:graph"
  }
}
```
|
|
549
|
+
|
|
550
|
+
### Multi-turn sessions
|
|
551
|
+
|
|
552
|
+
Use a stable `thread_id` across turns. The SDK generates a fresh `workflow_id` per call internally — your code just passes the same `thread_id`:
|
|
553
|
+
|
|
554
|
+
```python
config = {"configurable": {"thread_id": "user-42-session-7"}}

await governed.ainvoke({"messages": [{"role": "user", "content": "Research LangGraph"}]}, config=config)
await governed.ainvoke({"messages": [{"role": "user", "content": "Now write a report on it"}]}, config=config)
```
|
|
560
|
+
|
|
561
|
+
### Inspecting registered subagents
|
|
562
|
+
|
|
563
|
+
```python
print(governed.get_known_subagents())
# ['analyst', 'general-purpose', 'researcher', 'writer']
```
|
|
567
|
+
|
|
568
|
+
### `fail_closed` mode
|
|
569
|
+
|
|
570
|
+
```python
governed = create_openbox_deep_agent_handler(
    graph=agent,
    on_api_error="fail_closed",
    # ... remaining arguments (api_url, api_key, and so on)
)
```
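
The difference between the two `on_api_error` modes can be illustrated with a small wrapper. This is a sketch under stated assumptions: `decide` and `unreachable_core` are hypothetical names, and the SDK may surface a `fail_closed` failure as an exception rather than a `"BLOCK"` string.

```python
from typing import Callable


def decide(call_governance: Callable[[], str], on_api_error: str = "fail_open") -> str:
    """Return the governance decision, applying on_api_error if the call fails."""
    try:
        return call_governance()
    except Exception:
        # fail_open allows the action on error; fail_closed blocks it.
        return "BLOCK" if on_api_error == "fail_closed" else "CONTINUE"


def unreachable_core() -> str:
    raise ConnectionError("OpenBox Core unreachable")


print(decide(unreachable_core, "fail_open"))    # CONTINUE
print(decide(unreachable_core, "fail_closed"))  # BLOCK
```

`fail_open` favours availability when the governance service is down, while `fail_closed` favours enforcement; which is appropriate depends on how costly an ungoverned tool call would be.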
|
|
577
|
+
|
|
578
|
+
### Reducing governance noise from DeepAgents middleware nodes
|
|
579
|
+
|
|
580
|
+
`create_deep_agent()` fires `on_chain_start` for every middleware node. None of these need governance — suppress them:
|
|
581
|
+
|
|
582
|
+
```python
governed = create_openbox_deep_agent_handler(
    graph=agent,
    skip_chain_types={
        "model",
        "tools",
        "PatchToolCallsMiddleware.before_agent",
        "TodoListMiddleware.after_model",
        "FilesystemMiddleware.before_agent",
        "SummarizationMiddleware.before_agent",
        "AnthropicPromptCachingMiddleware.before_agent",
        "SubAgentMiddleware.before_agent",
        "MemoryMiddleware.before_agent",
        "SkillsMiddleware.before_agent",
    },
    # ... remaining arguments (api_url, api_key, and so on)
)
```
|
|
600
|
+
|
|
601
|
+
Skipping `"tools"` or `"model"` does **not** suppress tool or LLM governance — those events come from `on_tool_start` and `on_chat_model_start` inside those nodes, which are separate.
|
|
602
|
+
|
|
603
|
+
Run with `OPENBOX_DEBUG=1` and look for `[OBX_EVENT]` lines to find the exact node names your graph emits.
|
|
604
|
+
|
|
605
|
+
---
|
|
606
|
+
|
|
607
|
+
## Known limitations
|
|
608
|
+
|
|
609
|
+
These constraints come from how DeepAgents and LangGraph work at runtime. The base limitations (Behavior Rules, `httpx`-only spans, `ainvoke` session scoping) are covered in the [openbox-langgraph-sdk README](../sdk-langgraph-python/README.md#known-limitations). These are the DeepAgents-specific additions.
|
|
610
|
+
|
|
611
|
+
### Subagent internals are invisible to governance
|
|
612
|
+
|
|
613
|
+
Subagents execute *inside* the `task` tool body via `subagent.invoke()`. Their internal tool calls and LLM calls are not surfaced in the outer `astream_events` stream. From the governance layer, the `task` call is a single atomic unit.
|
|
614
|
+
|
|
615
|
+
**What this means concretely:**
|
|
616
|
+
- A `search_web` call made by the `researcher` subagent is not a separate `ActivityStarted` event, so you cannot write a Rego policy that targets it
- You cannot apply HITL to a tool call a subagent makes, only to the `task` dispatch itself
- The `ActivityCompleted` for `task` carries the final output, but not a breakdown of what the subagent did internally
|
|
619
|
+
|
|
620
|
+
**What you can govern:**
|
|
621
|
+
- Whether a specific subagent type is dispatched at all (`BLOCK` / `REQUIRE_APPROVAL` on `activity_type == "task"` with `subagent_name` matching)
- Patterns in how many times each subagent type is dispatched per session
|
|
623
|
+
|
|
624
|
+
**Workaround:** If you need to govern a specific tool call regardless of which subagent triggers it, add it to the outer agent's tool list as well. The outer agent's tool calls are fully governed.
|
|
625
|
+
|
|
626
|
+
**Contrast with Temporal:** In the Temporal SDK, every activity — including ones inside a "subagent" — runs as an independent governed unit. DeepAgents has no equivalent. Subagent execution is opaque to the outer event stream.
|
|
627
|
+
|
|
628
|
+
---
|
|
629
|
+
|
|
630
|
+
### Behavior Rules count `task` dispatches, not subagent-internal tool calls
|
|
631
|
+
|
|
632
|
+
The AGE sees `task(subagent_type="researcher")` as one `ActivityStarted` + one `ActivityCompleted`. The researcher then calling `search_web` five times internally is invisible.
|
|
633
|
+
|
|
634
|
+
A rule like "block if `search_web` exceeds 10 calls per session" only counts direct `search_web` calls from the outer agent — not from subagents.
|
|
635
|
+
|
|
636
|
+
**What works reliably for DeepAgents:**
|
|
637
|
+
- Rate-limiting `task` dispatches per subagent type (e.g. `researcher` dispatched more than 5 times)
|
|
638
|
+
- Rate-limiting total subagent dispatches
|
|
639
|
+
- Detecting unusual outer-agent tool sequences
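
The first two items amount to counting `task` dispatches. The sketch below shows that counting in plain Python, assuming one counter per invocation. `DispatchCounter` and its limits are hypothetical names for illustration; this is not how the AGE or the SDK implements Behavior Rules.

```python
from collections import Counter


class DispatchCounter:
    """Counts `task` dispatches per subagent type within one invocation."""

    def __init__(self, per_subagent_limit: int, total_limit: int) -> None:
        self.per_subagent_limit = per_subagent_limit
        self.total_limit = total_limit
        self.counts: Counter[str] = Counter()

    def record(self, subagent_name: str) -> bool:
        """Record one dispatch; return True while both limits still hold."""
        self.counts[subagent_name] += 1
        within_type = self.counts[subagent_name] <= self.per_subagent_limit
        within_total = sum(self.counts.values()) <= self.total_limit
        return within_type and within_total


counter = DispatchCounter(per_subagent_limit=5, total_limit=8)
results = [counter.record("researcher") for _ in range(6)]
print(results)  # [True, True, True, True, True, False]
```

The counter only ever receives subagent names from `task` dispatches, which is why tool calls made inside a subagent cannot contribute to any limit.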
|
|
640
|
+
|
|
641
|
+
---
|
|
642
|
+
|
|
643
|
+
### HTTP spans are captured for outer agent tools only
|
|
644
|
+
|
|
645
|
+
The `httpx` instrumentation captures calls made during outer agent tool execution. HTTP calls inside subagent tool bodies run in a separate async context and are not captured as spans on the `task` `ActivityCompleted`.
|
|
646
|
+
|
|
647
|
+
---
|
|
648
|
+
|
|
649
|
+
### Behavior Rules don't span `ainvoke` calls
|
|
650
|
+
|
|
651
|
+
Each `ainvoke` is a separate governance session with a new `workflow_id`. Behavior Rules track patterns **within a single invocation only**. Cross-turn pattern detection is not yet supported.
|
|
652
|
+
|
|
653
|
+
---
|
|
654
|
+
|
|
655
|
+
### Subagent model providers must be configured independently
|
|
656
|
+
|
|
657
|
+
Each subagent can specify its own `model`. If a subagent uses a different provider from the outer agent (e.g. outer agent uses OpenAI, subagent uses Anthropic), you need the corresponding API key configured in your environment. The SDK doesn't validate subagent model configs at startup.
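
Because nothing validates these configs for you, a small preflight check can catch a missing key before the first dispatch. The following is a sketch under stated assumptions: model identifiers use the `provider:model` form seen elsewhere in this README (e.g. `openai:gpt-4o-mini`), and the provider-to-variable mapping covers only the two providers named above. `PROVIDER_ENV_VARS`, `missing_provider_keys`, and the `anthropic:example-model` identifier are hypothetical and not part of the SDK.

```python
import os
from collections.abc import Iterable, Mapping

# Assumed mapping from provider prefix to the API-key environment variable.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def missing_provider_keys(
    model_ids: Iterable[str],
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the env vars that the given models need but `env` lacks."""
    missing: list[str] = []
    for model_id in model_ids:
        provider, sep, _ = model_id.partition(":")
        if not sep:
            continue  # no provider prefix, so there is nothing to check
        env_var = PROVIDER_ENV_VARS.get(provider)
        if env_var and not env.get(env_var) and env_var not in missing:
            missing.append(env_var)
    return missing


# Outer agent on OpenAI, one subagent on Anthropic (placeholder model name).
models = ["openai:gpt-4o-mini", "anthropic:example-model"]
print(missing_provider_keys(models, env={"OPENAI_API_KEY": "sk-placeholder"}))
# prints ['ANTHROPIC_API_KEY']
```

Calling it with the default `env` checks the real process environment at startup.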
|
|
658
|
+
|
|
659
|
+
---
|
|
660
|
+
|
|
661
|
+
## Debugging
|
|
662
|
+
|
|
663
|
+
```bash
|
|
664
|
+
OPENBOX_DEBUG=1 python agent.py
|
|
665
|
+
```
|
|
666
|
+
|
|
667
|
+
This enables two log streams:
|
|
668
|
+
|
|
669
|
+
**`[OBX_EVENT]`** — every raw LangGraph event (stderr):
|
|
670
|
+
```
|
|
671
|
+
[OBX_EVENT] on_chain_start name='LangGraph' node=None
|
|
672
|
+
[OBX_EVENT] on_chain_start name='SubAgentMiddleware...' node='SubAgentMiddleware...'
|
|
673
|
+
[OBX_EVENT] on_chat_model_start name='ChatOpenAI' node='model'
|
|
674
|
+
[OBX_EVENT] on_tool_start name='task' node='tools'
|
|
675
|
+
[OBX_EVENT] on_tool_start name='search_web' node='tools'
|
|
676
|
+
```
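
When the event stream is long, it can help to summarise it rather than read it line by line. The sketch below parses lines in the format shown above and counts `on_tool_start` events per tool. The regular expression is inferred from these sample lines only, so treat it as an assumption about the log format; `EVENT_LINE` and `tool_starts` are hypothetical helper names.

```python
import re
from collections import Counter
from collections.abc import Iterable

# Pattern inferred from the sample lines; the real format may carry more fields.
EVENT_LINE = re.compile(
    r"^\[OBX_EVENT\] (?P<event>\S+)\s+name='(?P<name>[^']*)'\s+node=(?P<node>'[^']*'|None)"
)


def tool_starts(lines: Iterable[str]) -> Counter[str]:
    """Count `on_tool_start` events per tool name."""
    counts: Counter[str] = Counter()
    for line in lines:
        match = EVENT_LINE.match(line.strip())
        if match and match["event"] == "on_tool_start":
            counts[match["name"]] += 1
    return counts


sample = [
    "[OBX_EVENT] on_chain_start name='LangGraph' node=None",
    "[OBX_EVENT] on_chat_model_start name='ChatOpenAI' node='model'",
    "[OBX_EVENT] on_tool_start name='task' node='tools'",
    "[OBX_EVENT] on_tool_start name='search_web' node='tools'",
]

counts = tool_starts(sample)
print(counts["task"], counts["search_web"])  # 1 1
```

Since this stream goes to stderr, `OPENBOX_DEBUG=1 python agent.py 2> events.log` captures it to a file whose lines can be passed to `tool_starts`.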
|
|
677
|
+
|
|
678
|
+
**`[OpenBox Debug]`** — every governance request and response (stdout):
|
|
679
|
+
```
|
|
680
|
+
[OpenBox Debug] governance request: {
|
|
681
|
+
"event_type": "ActivityStarted",
|
|
682
|
+
"activity_type": "task",
|
|
683
|
+
"workflow_type": "ResearchBot",
|
|
684
|
+
"activity_input": [
|
|
685
|
+
{"description": "Write a report on AI safety", "subagent_type": "writer"},
|
|
686
|
+
{"__openbox": {"tool_type": "a2a", "subagent_name": "writer"}}
|
|
687
|
+
],
|
|
688
|
+
"hook_trigger": false
|
|
689
|
+
}
|
|
690
|
+
[OpenBox Debug] governance response: { "verdict": "require_approval", "reason": "Writer tasks require approval." }
|
|
691
|
+
```
|
|
692
|
+
|
|
693
|
+
**If things aren't working, check for these:**
|
|
694
|
+
|
|
695
|
+
- `workflow_type` doesn't match your dashboard agent name → policies never fire
|
|
696
|
+
- `subagent_name` is `"general-purpose"` when you expected something else → `subagent_type` was missing from the `task` input; look for a `task tool input missing subagent_type` warning in the debug output
|
|
697
|
+
- A rule is double-triggering → you're missing `not input.hook_trigger` in your Rego
|
|
698
|
+
- Warning at startup about `known_subagents` → you passed an empty list; include at least `["general-purpose"]`
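
The first two checks can be run mechanically against a governance request copied from the `[OpenBox Debug]` output. This is a rough sketch, not an SDK utility: `diagnose` and `expected_agent_name` are hypothetical, and the request shape is taken from the sample above.

```python
from typing import Any


def diagnose(request: dict[str, Any], expected_agent_name: str) -> list[str]:
    """Return human-readable findings for a single governance request."""
    findings: list[str] = []
    if request.get("workflow_type") != expected_agent_name:
        findings.append("workflow_type does not match the dashboard agent name")

    entries = [e for e in request.get("activity_input", []) if isinstance(e, dict)]
    # The `__openbox` entry holds SDK metadata; the other entry is the tool input.
    meta = next((e["__openbox"] for e in entries if "__openbox" in e), {})
    tool_input = next((e for e in entries if "__openbox" not in e), {})

    fell_back = meta.get("subagent_name") == "general-purpose"
    if fell_back and "subagent_type" not in tool_input:
        findings.append("subagent_type missing from the task input")
    return findings


request = {
    "event_type": "ActivityStarted",
    "activity_type": "task",
    "workflow_type": "ResearchBot",
    "activity_input": [
        {"description": "Write a report on AI safety"},
        {"__openbox": {"tool_type": "a2a", "subagent_name": "general-purpose"}},
    ],
    "hook_trigger": False,
}

for finding in diagnose(request, expected_agent_name="ResearchAgent"):
    print(finding)
```

With this sample, both findings are reported: the agent name differs and the `task` input carries no `subagent_type`.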
|
|
699
|
+
|
|
700
|
+
---
|
|
701
|
+
|
|
702
|
+
## Legacy handler-based approach
|
|
703
|
+
|
|
704
|
+
> **Deprecated.** Use `create_openbox_middleware()` with `create_deep_agent(middleware=[...])` instead.
|
|
705
|
+
|
|
706
|
+
The handler-based approach wraps the compiled graph and processes events via `astream_events`:
|
|
707
|
+
|
|
708
|
+
```python
|
|
709
|
+
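import os

from deepagents import create_deep_agent
from langchain.chat_models import init_chat_model
from openbox_deepagent import create_openbox_deep_agent_handler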
|
|
710
|
+
|
|
711
|
+
agent = create_deep_agent(model=init_chat_model("openai:gpt-4o-mini"), tools=[...], subagents=[...])
|
|
712
|
+
|
|
713
|
+
# Deprecated — emits DeprecationWarning
|
|
714
|
+
governed = create_openbox_deep_agent_handler(
|
|
715
|
+
graph=agent,
|
|
716
|
+
api_url=os.environ["OPENBOX_URL"],
|
|
717
|
+
api_key=os.environ["OPENBOX_API_KEY"],
|
|
718
|
+
agent_name="ResearchBot",
|
|
719
|
+
known_subagents=["researcher", "analyst", "writer", "general-purpose"],
|
|
720
|
+
)
|
|
721
|
+
|
|
722
|
+
result = await governed.ainvoke(
|
|
723
|
+
{"messages": [{"role": "user", "content": "Research LangGraph papers"}]},
|
|
724
|
+
config={"configurable": {"thread_id": "session-001"}},
|
|
725
|
+
)
|
|
726
|
+
```
|
|
727
|
+
|
|
728
|
+
---
|
|
729
|
+
|
|
730
|
+
## Contributing
|
|
731
|
+
|
|
732
|
+
Contributions are welcome! Please open an issue before submitting a large pull request.
|
|
733
|
+
|
|
734
|
+
```bash
|
|
735
|
+
git clone https://github.com/openbox-ai/openbox-langchain-sdk
|
|
736
|
+
cd sdk-deepagent-python
|
|
737
|
+
uv sync --extra dev
|
|
738
|
+
uv run pytest tests/ -v
|
|
739
|
+
```
|