openbox-deepagent-sdk-python 0.1.0__py3-none-any.whl

@@ -0,0 +1,739 @@
1
+ Metadata-Version: 2.4
2
+ Name: openbox-deepagent-sdk-python
3
+ Version: 0.1.0
4
+ Summary: OpenBox governance SDK for DeepAgents (langchain-ai/deepagents)
5
+ License: MIT
6
+ Requires-Python: >=3.11
7
+ Requires-Dist: langchain-core>=0.3.0
8
+ Requires-Dist: langchain>=0.3.0
9
+ Requires-Dist: langgraph>=0.2.0
10
+ Requires-Dist: openbox-langgraph-sdk-python>=0.1.0
11
+ Provides-Extra: deepagents
12
+ Requires-Dist: deepagents>=0.1.0; extra == 'deepagents'
13
+ Provides-Extra: dev
14
+ Requires-Dist: mypy>=1.10.0; extra == 'dev'
15
+ Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
16
+ Requires-Dist: pytest>=8.0.0; extra == 'dev'
17
+ Requires-Dist: ruff>=0.6.0; extra == 'dev'
18
+ Description-Content-Type: text/markdown
19
+
20
+ # openbox-deepagent-sdk-python
21
+
22
+ [![PyPI](https://img.shields.io/pypi/v/openbox-deepagent-sdk-python)](https://pypi.org/project/openbox-deepagent-sdk-python/)
23
+ [![Python](https://img.shields.io/pypi/pyversions/openbox-deepagent-sdk-python)](https://pypi.org/project/openbox-deepagent-sdk-python/)
24
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
25
+
26
+ Add real-time governance to any [DeepAgents](https://github.com/langchain-ai/deepagents) application — powered by [OpenBox](https://openbox.ai).
27
+
28
+ This package extends [`openbox-langgraph-sdk`](../sdk-langgraph-python) with three things DeepAgents specifically needs: **per-subagent policy targeting** (govern the `writer` subagent differently from `researcher`), **HITL conflict detection** (prevent clashes with DeepAgents' own `interrupt_on`), and **built-in `a2a` tool classification** for subagent dispatches.
29
+
30
+ > **New to OpenBox?** Start with the [`openbox-langgraph-sdk` README](../sdk-langgraph-python/README.md). It covers policies, guardrails, HITL, error handling, and debugging. This document covers only what's different or additional for DeepAgents.
31
+
32
+ ---
33
+
34
+ ## Table of Contents
35
+
36
+ - [How DeepAgents governance works](#how-deepagents-governance-works)
37
+ - [Installation](#installation)
38
+ - [Quickstart](#quickstart)
39
+ - [Configuration reference](#configuration-reference)
40
+ - [Governance features](#governance-features)
41
+ - [Policies (OPA / Rego)](#policies-opa--rego)
42
+ - [Per-subagent policies](#per-subagent-policies)
43
+ - [Guardrails](#guardrails)
44
+ - [Human-in-the-loop (HITL)](#human-in-the-loop-hitl)
45
+ - [Behavior Rules (AGE)](#behavior-rules-age)
46
+ - [Tool classification](#tool-classification)
+ - [Database governance](#database-governance)
47
+ - [Error handling](#error-handling)
48
+ - [Advanced usage](#advanced-usage)
49
+ - [Known limitations](#known-limitations)
50
+ - [Debugging](#debugging)
+ - [Legacy handler-based approach](#legacy-handler-based-approach)
51
+ - [Contributing](#contributing)
52
+
53
+ ---
54
+
55
+ ## How DeepAgents governance works
56
+
57
+ DeepAgents dispatches subagents through the built-in `task` tool:
58
+
59
+ ```python
60
+ task(description="Research quantum computing", subagent_type="researcher")
61
+ task(description="Write a technical report", subagent_type="writer")
62
+ ```
63
+
64
+ The problem: subagents execute *inside* the `task` tool body, so their internal events are invisible to the outer LangGraph event stream. Only the `task` tool's `on_tool_start` is observable.
65
+
66
+ The SDK solves this by reading `subagent_type` from the `task` input before the call executes and embedding it as `__openbox` metadata in the governance event. Your Rego policy then has a clean, explicit handle to target specific subagent types:
67
+
68
+ ```
69
+ Your agent                     SDK                              OpenBox Core
70
+ ──────────                     ───                              ───────────
71
+ task(subagent_type="writer")
72
+
73
+   └─ on_tool_start ────────►   ActivityStarted                  Policy engine
74
+                                activity_type="task"      ───►   input.activity_type == "task"
75
+                                activity_input=[                 some item in input.activity_input
76
+                                  {description, subagent_type},  item["__openbox"].subagent_name
77
+                                  {__openbox: {                    == "writer"
78
+                                    tool_type: "a2a",     ◄───   REQUIRE_APPROVAL
79
+                                    subagent_name: "writer"
80
+                                  }}
81
+                                ]
82
+
83
+                                enforce verdict
84
+                                (block / pause for HITL approval)
85
+ ```
86
+
87
+ Your graph code is untouched.
88
+
89
+ > **`create_openbox_deep_agent_handler` is synchronous.** Do not `await` it.
90
+
91
+ ---
92
+
93
+ ## Installation
94
+
95
+ ```bash
96
+ pip install openbox-deepagent-sdk-python
97
+ ```
98
+
99
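+ **Requirements:** Python 3.11+, `openbox-langgraph-sdk-python >= 0.1.0`, `langgraph >= 0.2`, `langchain >= 0.3`, `langchain-core >= 0.3`. `deepagents >= 0.1.0` is declared as the optional `deepagents` extra, so install it separately or with `pip install "openbox-deepagent-sdk-python[deepagents]"`.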
100
+
101
+ ---
102
+
103
+ ## Quickstart
104
+
105
+ ### 1. Create an agent in the dashboard
106
+
107
+ Sign in to [dashboard.openbox.ai](https://dashboard.openbox.ai), create an agent named `"ResearchBot"`, and copy your API key.
108
+
109
+ ### 2. Export credentials
110
+
111
+ ```bash
112
+ export OPENBOX_URL="https://core.openbox.ai"
113
+ export OPENBOX_API_KEY="obx_live_..."
114
+ ```
115
+
116
+ ### 3. Add OpenBox middleware to your agent
117
+
118
+ ```python
119
+ import os
120
+ import asyncio
121
+ from deepagents import create_deep_agent
122
+ from langchain.chat_models import init_chat_model
123
+ from openbox_deepagent import create_openbox_middleware
124
+
125
+ # Create governance middleware
126
+ middleware = create_openbox_middleware(
127
+ api_url=os.environ["OPENBOX_URL"],
128
+ api_key=os.environ["OPENBOX_API_KEY"],
129
+ agent_name="ResearchBot", # must match the agent name in your dashboard
130
+ known_subagents=["researcher", "analyst", "writer", "general-purpose"],
131
+ tool_type_map={"search_web": "http", "export_data": "http"},
132
+ )
133
+
134
+ # Inject middleware into your DeepAgents graph — no wrapper needed
135
+ # IMPORTANT: do NOT pass interrupt_on if using OpenBox HITL (see HITL section)
136
+ agent = create_deep_agent(
137
+ model=init_chat_model("openai:gpt-4o-mini", temperature=0),
138
+ tools=[search_web, write_report, export_data],
139
+ subagents=[
140
+ {"name": "researcher", "description": "Web research.",
141
+ "system_prompt": "You are a research assistant.", "tools": [search_web]},
142
+ {"name": "writer", "description": "Drafting reports.",
143
+ "system_prompt": "You are a professional writer.", "tools": [write_report]},
144
+ ],
145
+ middleware=[middleware], # <-- governance injected here
146
+ )
147
+
148
+ async def main():
149
+     # Invoke directly — no handler wrapper needed
150
+     result = await agent.ainvoke(
151
+         {"messages": [{"role": "user", "content": "Research recent LangGraph papers"}]},
152
+         config={"configurable": {"thread_id": "session-001"}},
153
+     )
154
+     print(result["messages"][-1].content)
155
+
156
+ asyncio.run(main())
157
+ ```
158
+
159
+ > **Migrating from `create_openbox_deep_agent_handler`?** The handler-based approach is deprecated. See [Legacy handler](#legacy-handler-based-approach) below.
160
+
161
+ ---
162
+
163
+ ## Configuration reference
164
+
165
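+ `create_openbox_deep_agent_handler` accepts the following keyword arguments. The Quickstart and [Database governance](#database-governance) examples pass `api_url`, `api_key`, `agent_name`, `known_subagents`, `tool_type_map`, and `sqlalchemy_engine` to `create_openbox_middleware()` in the same way. `graph` applies only to the handler, because the middleware is passed to `create_deep_agent(middleware=[...])` instead of wrapping a compiled graph.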
166
+
167
+ | Parameter | Type | Default | Description |
168
+ |---|---|---|---|
169
+ | `graph` | `CompiledGraph` | **required** | Compiled LangGraph graph from `create_deep_agent()` |
170
+ | `api_url` | `str` | **required** | Base URL of your OpenBox Core instance |
171
+ | `api_key` | `str` | **required** | API key (`obx_live_*` or `obx_test_*`) |
172
+ | `agent_name` | `str` | `None` | Agent name as configured in the dashboard. Used as `workflow_type` on all governance events — **must match exactly** for policies and Behavior Rules to fire |
173
+ | `known_subagents` | `list[str]` | `["general-purpose"]` | Subagent names from `create_deep_agent(subagents=[...])`. Always include `"general-purpose"` if the default subagent is active |
174
+ | `validate` | `bool` | `True` | Validate API key against server on startup |
175
+ | `on_api_error` | `str` | `"fail_open"` | `"fail_open"` (allow on error) or `"fail_closed"` (block on error) |
176
+ | `governance_timeout` | `float` | `30.0` | HTTP timeout in seconds for governance calls |
177
+ | `session_id` | `str` | `None` | Optional session identifier |
178
+ | `hitl` | `dict` | `{}` | Human-in-the-loop config (see [HITL](#human-in-the-loop-hitl)) |
179
+ | `guard_interrupt_on_conflict` | `bool` | `True` | Raise if `interrupt_on` and OpenBox HITL are both enabled |
180
+ | `send_chain_start_event` | `bool` | `True` | Send `WorkflowStarted` event |
181
+ | `send_chain_end_event` | `bool` | `True` | Send `WorkflowCompleted` event |
182
+ | `send_llm_start_event` | `bool` | `True` | Send `LLMStarted` event (enables prompt guardrails + PII redaction) |
183
+ | `send_llm_end_event` | `bool` | `True` | Send `LLMCompleted` event |
184
+ | `tool_type_map` | `dict[str, str]` | `{}` | Map tool names to semantic types for classification |
185
+ | `skip_chain_types` | `set[str]` | `set()` | Chain node names to skip governance for |
186
+ | `skip_tool_types` | `set[str]` | `set()` | Tool names to skip governance for entirely |
187
+ | `sqlalchemy_engine` | `Engine` | `None` | SQLAlchemy Engine instance to instrument for DB governance. Required when the engine is created before the middleware (see [Database governance](#database-governance)) |
188
+
189
+ ---
190
+
191
+ ## Governance features
192
+
193
+ ### Policies (OPA / Rego)
194
+
195
+ Policies live in the OpenBox dashboard and are written in [Rego](https://www.openpolicyagent.org/docs/latest/policy-language/). Before every tool call the SDK sends an `ActivityStarted` event — your policy evaluates the payload and returns a decision.
196
+
197
+ **Fields available in `input`:**
198
+
199
+ | Field | Type | Description |
200
+ |---|---|---|
201
+ | `input.event_type` | `string` | `"ActivityStarted"` or `"ActivityCompleted"` |
202
+ | `input.activity_type` | `string` | Tool name (e.g. `"search_web"`, `"task"`) |
203
+ | `input.activity_input` | `array` | Tool arguments + optional `__openbox` metadata |
204
+ | `input.workflow_type` | `string` | Your `agent_name` |
205
+ | `input.workflow_id` | `string` | Session workflow ID |
206
+ | `input.trust_tier` | `int` | Agent trust tier (1–4) from dashboard |
207
+ | `input.hook_trigger` | `bool` | `true` when event is triggered by an outbound HTTP span |
208
+
209
+ **Example — block a restricted research topic:**
210
+
211
+ ```rego
212
+ package org.openboxai.policy
213
+
214
+ import future.keywords.if
215
+ import future.keywords.in
216
+
217
+ default result = {"decision": "CONTINUE", "reason": null}
218
+
219
+ restricted_terms := {"nuclear weapon", "bioweapon", "chemical weapon", "malware synthesis"}
220
+
221
+ result := {"decision": "BLOCK", "reason": "Search blocked: restricted research topic."} if {
222
+ input.event_type == "ActivityStarted"
223
+ input.activity_type == "search_web"
224
+ not input.hook_trigger
225
+ count(input.activity_input) > 0
226
+ entry := input.activity_input[0]
227
+ is_object(entry)
228
+ query := entry.query
229
+ is_string(query)
230
+ some term in restricted_terms
231
+ contains(lower(query), term)
232
+ }
233
+ ```
234
+
235
+ **Possible decisions:**
236
+
237
+ | Decision | Effect |
238
+ |---|---|
239
+ | `CONTINUE` | Tool executes normally |
240
+ | `BLOCK` | `GovernanceBlockedError` raised — tool does not execute |
241
+ | `REQUIRE_APPROVAL` | Agent pauses; human must approve or reject in dashboard |
242
+ | `HALT` | `GovernanceHaltError` raised — session terminated |
243
+
244
+ > **Always include `not input.hook_trigger`** in `BLOCK` and `REQUIRE_APPROVAL` rules. When a tool makes an outbound HTTP call, the SDK fires a second `ActivityStarted` with `hook_trigger: true`. Without this guard, the rule fires once for the tool call and once for every HTTP request it makes.
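
To reason about the `hook_trigger` guard locally, the restricted-topic rule above can be mirrored in plain Python. This is only an illustrative sketch: the `evaluate` function and the event dictionaries are hypothetical stand-ins for what OPA does with `input`, and they are not part of the SDK or OpenBox Core.

```python
RESTRICTED_TERMS = {"nuclear weapon", "bioweapon", "chemical weapon", "malware synthesis"}


def evaluate(event: dict) -> dict:
    """Hypothetical Python mirror of the Rego rule; not an SDK function."""
    default = {"decision": "CONTINUE", "reason": None}
    if event.get("event_type") != "ActivityStarted":
        return default
    if event.get("activity_type") != "search_web":
        return default
    # Mirrors `not input.hook_trigger`: ignore events raised by outbound HTTP spans.
    if event.get("hook_trigger"):
        return default
    inputs = event.get("activity_input") or []
    if not inputs or not isinstance(inputs[0], dict):
        return default
    query = inputs[0].get("query")
    if not isinstance(query, str):
        return default
    if any(term in query.lower() for term in RESTRICTED_TERMS):
        return {"decision": "BLOCK", "reason": "Search blocked: restricted research topic."}
    return default


# The tool call itself, followed by the second event fired for its HTTP request.
tool_call = {
    "event_type": "ActivityStarted",
    "activity_type": "search_web",
    "activity_input": [{"query": "Bioweapon synthesis routes"}],
    "hook_trigger": False,
}
http_span = {**tool_call, "hook_trigger": True}

print(evaluate(tool_call)["decision"])
print(evaluate(http_span)["decision"])
```

With the guard in place, only the first event is blocked; the HTTP-span event falls through to the default decision instead of triggering the rule a second time.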
245
+
246
+ ---
247
+
248
+ ### Per-subagent policies
249
+
250
+ All subagent dispatches share the same `activity_type: "task"` — so a rule matching on `activity_type` can't distinguish a `writer` dispatch from a `researcher` one. The SDK appends a `__openbox` sentinel to `activity_input` to give your Rego policy that handle:
251
+
252
+ ```json
253
+ "activity_input": [
254
+ {
255
+ "description": "Write a report on AI safety",
256
+ "subagent_type": "writer"
257
+ },
258
+ {
259
+ "__openbox": {
260
+ "tool_type": "a2a",
261
+ "subagent_name": "writer"
262
+ }
263
+ }
264
+ ]
265
+ ```
266
+
267
+ OpenBox Core forwards `activity_input` to OPA unchanged — **no Core changes needed**.
268
+
269
+ **Target a specific subagent:**
270
+
271
+ ```rego
272
+ # All tasks dispatched to the writer subagent require human approval
273
+ result := {"decision": "REQUIRE_APPROVAL", "reason": "Writer subagent tasks require approval."} if {
274
+ input.event_type == "ActivityStarted"
275
+ input.activity_type == "task"
276
+ not input.hook_trigger
277
+ some item in input.activity_input
278
+ meta := item["__openbox"]
279
+ meta.subagent_name == "writer"
280
+ }
281
+ ```
282
+
283
+ **Target all subagent dispatches (any type):**
284
+
285
+ ```rego
286
+ result := {"decision": "BLOCK", "reason": "Subagent calls disabled outside business hours."} if {
287
+ input.event_type == "ActivityStarted"
288
+ input.activity_type == "task"
289
+ not input.hook_trigger
290
+ some item in input.activity_input
291
+ item["__openbox"].tool_type == "a2a"
292
+ # ... add time-based condition
293
+ }
294
+ ```
295
+
296
+ > `subagent_name` is extracted from the `task` tool's `subagent_type` field automatically. If it's missing, the SDK falls back to `"general-purpose"` and logs a warning when `OPENBOX_DEBUG=1` is set.
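
The extraction and fallback described above can be pictured roughly as follows. This is a sketch under stated assumptions, not the SDK's actual code: `build_task_activity_input` is a hypothetical helper name, and the exact warning text and logger are illustrative.

```python
import logging
import os

logger = logging.getLogger("openbox.sketch")


def build_task_activity_input(tool_input: dict) -> list[dict]:
    """Hypothetical helper: append the `__openbox` sentinel to a `task` input."""
    subagent_name = tool_input.get("subagent_type")
    if not subagent_name:
        # Documented fallback when `subagent_type` is missing from the input.
        subagent_name = "general-purpose"
        if os.environ.get("OPENBOX_DEBUG") == "1":
            logger.warning("task tool input missing subagent_type")
    sentinel = {"__openbox": {"tool_type": "a2a", "subagent_name": subagent_name}}
    # Original arguments first, sentinel last, matching the JSON shape shown above.
    return [dict(tool_input), sentinel]


payload = build_task_activity_input(
    {"description": "Write a report on AI safety", "subagent_type": "writer"}
)
print(payload[-1])
```

Keeping the sentinel as a separate array element means the tool's own arguments are forwarded unchanged, which is why the Rego examples iterate with `some item in input.activity_input` rather than indexing a fixed position.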
297
+
298
+ ---
299
+
300
+ ### Guardrails
301
+
302
+ Guardrails screen LLM prompts before the model sees them. Configure them per agent in the dashboard.
303
+
304
+ Before each `ainvoke`, the SDK sends the user's message as an `LLMStarted` event to Core:
305
+
306
+ - **PII redaction** — matched fields are redacted in-place. The original text never reaches the model.
307
+ - **Content block** — `GuardrailsValidationError` is raised and the session halts before the graph starts.
308
+
309
+ Supported guardrail types:
310
+
311
+ | Type | ID | What it detects |
312
+ |---|---|---|
313
+ | PII detection | `1` | Names, emails, phone numbers, SSNs, credit cards |
314
+ | Content filter | `2` | Harmful or unsafe content categories |
315
+ | Toxicity | `3` | Toxic language |
316
+ | Ban words | `4` | Custom word/phrase blocklist |
317
+
318
+ > DeepAgents fires `on_chat_model_start` for every LLM call — including internal subagent calls that contain only system or tool messages. The SDK automatically skips those (no user-turn content = nothing to screen), so you won't hit guardrail parse errors from subagent-internal LLM invocations.
319
+
320
+ ---
321
+
322
+ ### Human-in-the-loop (HITL)
323
+
324
+ When a policy returns `REQUIRE_APPROVAL`, the agent pauses and polls OpenBox until a human approves or rejects from the dashboard.
325
+
326
+ ```python
327
+ governed = create_openbox_deep_agent_handler(
328
+ graph=agent,
329
+ api_url=os.environ["OPENBOX_URL"],
330
+ api_key=os.environ["OPENBOX_API_KEY"],
331
+ agent_name="ResearchBot",
332
+ known_subagents=["researcher", "writer", "general-purpose"],
333
+ hitl={
334
+ "enabled": True,
335
+ "poll_interval_ms": 5_000, # check every 5 seconds
336
+ "max_wait_ms": 300_000, # timeout after 5 minutes
337
+ },
338
+ )
339
+ ```
340
+
341
+ **HITL config options:**
342
+
343
+ | Key | Type | Default | Description |
344
+ |---|---|---|---|
345
+ | `enabled` | `bool` | `False` | Enable HITL polling |
346
+ | `poll_interval_ms` | `int` | `5000` | How often to poll for a decision (ms) |
347
+ | `max_wait_ms` | `int` | `300000` | Total wait before `ApprovalTimeoutError` (ms) |
348
+ | `skip_tool_types` | `set[str]` | `set()` | Tools that never wait for HITL |
349
+
350
+ - **On approval:** tool or subagent execution continues normally.
351
+ - **On rejection:** `ApprovalRejectedError` is raised → re-raised as `GovernanceHaltError`.
352
+ - **On timeout:** `ApprovalTimeoutError` is raised.
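
The interaction between `poll_interval_ms` and `max_wait_ms` can be sketched as a simple polling loop. Everything here is an assumption made for illustration: `wait_for_approval`, `fetch_decision`, and the two exception classes are local stand-ins, and the SDK's real polling logic may differ.

```python
import time
from typing import Callable, Optional


class ApprovalRejected(Exception):
    """Local stand-in for the SDK's ApprovalRejectedError."""


class ApprovalTimeout(Exception):
    """Local stand-in for the SDK's ApprovalTimeoutError."""


def wait_for_approval(
    fetch_decision: Callable[[], Optional[str]],
    poll_interval_ms: int = 5_000,
    max_wait_ms: int = 300_000,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll until a reviewer decides.

    `fetch_decision` is a hypothetical callable returning "approved",
    "rejected", or None while the request is still pending.
    """
    waited_ms = 0
    while True:
        decision = fetch_decision()
        if decision == "approved":
            return
        if decision == "rejected":
            raise ApprovalRejected("reviewer rejected the request")
        if waited_ms >= max_wait_ms:
            raise ApprovalTimeout(f"no decision after {max_wait_ms} ms")
        sleep(poll_interval_ms / 1000)
        waited_ms += poll_interval_ms


# Simulated reviewer who approves on the third poll; sleeps are recorded, not executed.
answers = iter([None, None, "approved"])
naps: list[float] = []
wait_for_approval(lambda: next(answers), poll_interval_ms=1_000, max_wait_ms=5_000, sleep=naps.append)
print(naps)
```

Injecting `sleep` keeps the sketch testable without real delays. With the defaults from the table, a pending request would be polled roughly every 5 seconds for up to 5 minutes.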
353
+
354
+ #### Conflict with DeepAgents `interrupt_on`
355
+
356
+ DeepAgents has its own interrupt mechanism via `interrupt_on` (`HumanInTheLoopMiddleware`). Using both OpenBox HITL and `interrupt_on` simultaneously causes double-pausing and unpredictable execution.
357
+
358
+ The SDK detects this at construction time and raises a `ValueError`:
359
+
360
+ ```
361
+ [OpenBox] DeepAgents graph has interrupt_on configured AND OpenBox HITL is enabled.
362
+ These conflict — OpenBox must own the HITL flow.
363
+ Remove interrupt_on from create_deep_agent, or set guard_interrupt_on_conflict=False to suppress this check.
364
+ ```
365
+
366
+ If you want OpenBox to own HITL (which gives you the full dashboard + audit trail), remove `interrupt_on`:
367
+
368
+ ```python
369
+ # ✅ OpenBox owns HITL
370
+ agent = create_deep_agent(model="gpt-4o-mini", tools=[...], subagents=[...])
371
+
372
+ # ❌ conflict
373
+ agent = create_deep_agent(model="gpt-4o-mini", tools=[...], interrupt_on={"task": True})
374
+ ```
375
+
376
+ ---
377
+
378
+ ### Behavior Rules (AGE)
379
+
380
+ > ⚠️ **Read [Known Limitations: Behavior Rules](#behavior-rules-count-task-dispatches-not-subagent-internal-tool-calls) before setting these up.** The semantics are materially different from the Temporal SDK, especially for DeepAgents.
381
+
382
+ Behavior Rules detect patterns across tool call sequences within a session — rate limits, unusual sequences, repeated high-risk dispatches. They're configured in the dashboard and enforced by the OpenBox Activity Governance Engine (AGE).
383
+
384
+ The SDK instruments `httpx` at startup (one-time idempotent patch). Any `httpx` call a tool makes is captured as a span and attached to that tool's `ActivityCompleted` event.
385
+
386
+ ---
387
+
388
+ ### Tool classification
389
+
390
+ Map your non-subagent tools to semantic types so your Rego policies can target whole categories instead of listing every tool name.
391
+
392
+ ```python
393
+ governed = create_openbox_deep_agent_handler(
394
+ graph=agent,
395
+ agent_name="ResearchBot",
396
+ known_subagents=["researcher", "writer", "general-purpose"],
397
+ tool_type_map={
398
+ "search_web": "http",
399
+ "export_data": "http",
400
+ "query_db": "database",
401
+ },
402
+     # ...remaining arguments such as api_url and api_key
403
+ )
404
+ ```
405
+
406
+ **Supported `tool_type` values:** `"http"`, `"database"`, `"builtin"`, `"a2a"`
407
+
408
+ > `"a2a"` is set automatically on every `task` call when `subagent_name` is resolved. Don't add `"task"` to `tool_type_map`.
409
+
410
+ When a type is set, the SDK appends an `__openbox` sentinel to `activity_input`:
411
+
412
+ ```json
413
+ {"__openbox": {"tool_type": "http"}}
414
+ ```
415
+
416
+ Rego can match on it:
417
+
418
+ ```rego
419
+ result := {"decision": "REQUIRE_APPROVAL", "reason": "HTTP calls require approval."} if {
420
+ input.event_type == "ActivityStarted"
421
+ not input.hook_trigger
422
+ some item in input.activity_input
423
+ item["__openbox"].tool_type == "http"
424
+ }
425
+ ```
426
+
427
+ ---
428
+
429
+ ### Database governance
430
+
431
+ The SDK instruments database operations via OpenTelemetry. Supported libraries: psycopg2, asyncpg, mysql, pymysql, sqlite3, pymongo, redis, sqlalchemy.
432
+
433
+ Install the instrumentor for your database:
434
+
435
+ ```bash
436
+ pip install opentelemetry-instrumentation-sqlite3 # SQLite
437
+ pip install opentelemetry-instrumentation-psycopg2 # PostgreSQL
438
+ pip install opentelemetry-instrumentation-sqlalchemy # SQLAlchemy ORM
439
+ ```
440
+
441
+ **Important: initialization order.** If your database connection or SQLAlchemy engine is created **before** `create_openbox_middleware()`, pass the engine explicitly:
442
+
443
+ ```python
444
+ from langchain_community.utilities import SQLDatabase
445
+
446
+ # Engine created here (before middleware)
447
+ db = SQLDatabase.from_uri("sqlite:///Chinook.db")
448
+
449
+ # Pass engine so the SDK can instrument it retroactively
450
+ middleware = create_openbox_middleware(
451
+ api_url=os.environ["OPENBOX_URL"],
452
+ api_key=os.environ["OPENBOX_API_KEY"],
453
+ agent_name="TextToSQL",
454
+ sqlalchemy_engine=db._engine, # <-- instrument existing engine
455
+ )
456
+ ```
457
+
458
+ Without `sqlalchemy_engine=`, only engines created **after** middleware initialization are instrumented. Check startup logs for `Instrumented: sqlalchemy (existing engine)` vs `Instrumented: sqlalchemy (future engines)` to confirm.
459
+
460
+ ---
461
+
462
+ ## Error handling
463
+
464
+ All governance exceptions are importable from `openbox_deepagent`:
465
+
466
+ ```python
467
+ from openbox_deepagent import (
468
+ GovernanceBlockedError,
469
+ GovernanceHaltError,
470
+ GuardrailsValidationError,
471
+ ApprovalRejectedError,
472
+ ApprovalTimeoutError,
473
+ )
474
+
475
+ try:
476
+     result = await governed.ainvoke({"messages": [...]}, config=...)
477
+ except GovernanceBlockedError as e:
478
+     # Policy returned BLOCK — tool or subagent dispatch did not execute
479
+     print(f"Blocked: {e}")
480
+ except GovernanceHaltError as e:
481
+     # Policy returned HALT, or a HITL decision was rejected/expired
482
+     print(f"Session halted: {e}")
483
+ except GuardrailsValidationError as e:
484
+     # Guardrail fired on the user prompt — graph never started
485
+     print(f"Guardrail: {e}")
486
+ except ApprovalRejectedError as e:
487
+     print(f"Rejected by reviewer: {e}")
488
+ except ApprovalTimeoutError as e:
489
+     print(f"HITL timed out: {e}")
490
+ ```
491
+
492
+ | Exception | Raised when |
493
+ |---|---|
494
+ | `GovernanceBlockedError` | Policy returned `BLOCK` |
495
+ | `GovernanceHaltError` | Policy returned `HALT`, or a HITL decision was rejected or expired |
496
+ | `GuardrailsValidationError` | A guardrail fired on the user prompt |
497
+ | `ApprovalRejectedError` | A human explicitly rejected a `REQUIRE_APPROVAL` decision |
498
+ | `ApprovalTimeoutError` | HITL polling exceeded `max_wait_ms` |
499
+ | `OpenBoxAuthError` | API key is invalid or unauthorized |
500
+
501
+ ---
502
+
503
+ ## Advanced usage
504
+
505
+ ### Streaming
506
+
507
+ Use `astream_governed` to stream graph updates with governance applied inline:
508
+
509
+ ```python
510
+ async for chunk in governed.astream_governed(
511
+     {"messages": [{"role": "user", "content": "Research quantum computing"}]},
512
+     config={"configurable": {"thread_id": "session-001"}},
513
+     stream_mode="values",
514
+ ):
515
+     process(chunk)
516
+ ```
517
+
518
+ `astream` and `astream_events` are also available as thin proxies, useful for `langgraph dev` and other tooling that expects a `CompiledStateGraph`.
519
+
520
+ ### `langgraph dev` compatibility
521
+
522
+ Export the governed handler as a module-level variable:
523
+
524
+ ```python
525
+ # agent.py
526
+ import os
527
+ from deepagents import create_deep_agent
528
+ from openbox_deepagent import create_openbox_deep_agent_handler
529
+
530
+ _agent = create_deep_agent(model="...", tools=[...], subagents=[...])
531
+
532
+ graph = create_openbox_deep_agent_handler(
533
+ graph=_agent,
534
+ api_url=os.environ["OPENBOX_URL"],
535
+ api_key=os.environ["OPENBOX_API_KEY"],
536
+ agent_name="ResearchBot",
537
+ known_subagents=["researcher", "writer", "general-purpose"],
538
+ )
539
+ ```
540
+
541
+ ```json
542
+ // langgraph.json
543
+ {
544
+ "graphs": {
545
+ "agent": "./agent.py:graph"
546
+ }
547
+ }
548
+ ```
549
+
550
+ ### Multi-turn sessions
551
+
552
+ Use a stable `thread_id` across turns. The SDK generates a fresh `workflow_id` per call internally — your code just passes the same `thread_id`:
553
+
554
+ ```python
555
+ config = {"configurable": {"thread_id": "user-42-session-7"}}
556
+
557
+ await governed.ainvoke({"messages": [{"role": "user", "content": "Research LangGraph"}]}, config=config)
558
+ await governed.ainvoke({"messages": [{"role": "user", "content": "Now write a report on it"}]}, config=config)
559
+ ```
560
+
561
+ ### Inspecting registered subagents
562
+
563
+ ```python
564
+ print(governed.get_known_subagents())
565
+ # ['analyst', 'general-purpose', 'researcher', 'writer']
566
+ ```
567
+
568
+ ### `fail_closed` mode
569
+
570
+ ```python
571
+ governed = create_openbox_deep_agent_handler(
572
+ graph=agent,
573
+ on_api_error="fail_closed",
574
+     # ...remaining arguments (api_url, api_key, agent_name, ...)
575
+ )
576
+ ```
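
The configuration table describes `"fail_open"` as "allow on error" and `"fail_closed"` as "block on error". A minimal sketch of that behavior, assuming a hypothetical `verdict_or_fallback` wrapper around the governance call (not an SDK function), might look like this:

```python
from typing import Callable


def verdict_or_fallback(call_governance: Callable[[], str], on_api_error: str = "fail_open") -> str:
    """Hypothetical sketch: pick a verdict when the governance call itself fails."""
    if on_api_error not in ("fail_open", "fail_closed"):
        raise ValueError(f"unsupported on_api_error: {on_api_error!r}")
    try:
        return call_governance()
    except Exception:
        # fail_open lets the tool call proceed; fail_closed blocks it.
        return "CONTINUE" if on_api_error == "fail_open" else "BLOCK"


def unreachable_core() -> str:
    # Simulates a governance request that never gets an answer.
    raise TimeoutError("governance call timed out")


print(verdict_or_fallback(unreachable_core, "fail_open"))
print(verdict_or_fallback(unreachable_core, "fail_closed"))
```

`fail_closed` trades availability for safety: an outage of the governance endpoint stops tool calls instead of letting them run ungoverned.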
577
+
578
+ ### Reducing governance noise from DeepAgents middleware nodes
579
+
580
+ `create_deep_agent()` fires `on_chain_start` for every middleware node. None of these need governance — suppress them:
581
+
582
+ ```python
583
+ governed = create_openbox_deep_agent_handler(
584
+ graph=agent,
585
+ skip_chain_types={
586
+ "model",
587
+ "tools",
588
+ "PatchToolCallsMiddleware.before_agent",
589
+ "TodoListMiddleware.after_model",
590
+ "FilesystemMiddleware.before_agent",
591
+ "SummarizationMiddleware.before_agent",
592
+ "AnthropicPromptCachingMiddleware.before_agent",
593
+ "SubAgentMiddleware.before_agent",
594
+ "MemoryMiddleware.before_agent",
595
+ "SkillsMiddleware.before_agent",
596
+ },
597
+     # ...remaining arguments (api_url, api_key, agent_name, ...)
598
+ )
599
+ ```
600
+
601
+ Skipping `"tools"` or `"model"` does **not** suppress tool or LLM governance — those events come from `on_tool_start` and `on_chat_model_start` inside those nodes, which are separate.
602
+
603
+ Run with `OPENBOX_DEBUG=1` and look for `[OBX_EVENT]` lines to find the exact node names your graph emits.
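
One way to turn that debug output into candidates for `skip_chain_types` is to collect the names of all `on_chain_start` events. The sketch below assumes the `[OBX_EVENT]` line layout shown in the Debugging section; `chain_node_names` is a hypothetical helper and the sample log lines are made up. Whether the SDK matches on `name` or `node` is not stated here, so treat the result as candidates to review.

```python
import re

# Assumed layout: [OBX_EVENT] <event> name='<name>' node=<node>
EVENT_LINE = re.compile(r"^\[OBX_EVENT\]\s+(?P<event>\w+)\s+name='(?P<name>[^']*)'")


def chain_node_names(log_text: str) -> set[str]:
    """Return the distinct `name` values of on_chain_start events in a debug log."""
    names: set[str] = set()
    for line in log_text.splitlines():
        match = EVENT_LINE.match(line.strip())
        if match and match.group("event") == "on_chain_start":
            names.add(match.group("name"))
    return names


# Made-up sample in the documented format; real logs come from stderr.
sample = """\
[OBX_EVENT] on_chain_start name='LangGraph' node=None
[OBX_EVENT] on_chain_start name='model' node='model'
[OBX_EVENT] on_chat_model_start name='ChatOpenAI' node='model'
[OBX_EVENT] on_tool_start name='task' node='tools'
"""
print(sorted(chain_node_names(sample)))
```

Tool and chat-model events are deliberately ignored, since skipping chain nodes does not affect tool or LLM governance.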
604
+
605
+ ---
606
+
607
+ ## Known limitations
608
+
609
+ These constraints come from how DeepAgents and LangGraph work at runtime. The base limitations (Behavior Rules, `httpx`-only spans, `ainvoke` session scoping) are covered in the [openbox-langgraph-sdk README](../sdk-langgraph-python/README.md#known-limitations). These are the DeepAgents-specific additions.
610
+
611
+ ### Subagent internals are invisible to governance
612
+
613
+ Subagents execute *inside* the `task` tool body via `subagent.invoke()`. Their internal tool calls and LLM calls are not surfaced in the outer `astream_events` stream. From the governance layer, the `task` call is a single atomic unit.
614
+
615
+ **What this means concretely:**
616
+ - A `search_web` call made by the `researcher` subagent is not a separate `ActivityStarted` event — you cannot write a Rego policy that targets it
617
+ - You cannot apply HITL to a tool call a subagent makes — only to the `task` dispatch itself
618
+ - The `ActivityCompleted` for `task` carries the final output, but not a breakdown of what the subagent did internally
619
+
620
+ **What you can govern:**
621
+ - Whether a specific subagent type is dispatched at all (`BLOCK` / `REQUIRE_APPROVAL` on `activity_type == "task"` with `subagent_name` matching)
622
+ - Patterns in how many times each subagent type is dispatched per session
623
+
624
+ **Workaround:** If you need to govern a specific tool call regardless of which subagent triggers it, add it to the outer agent's tool list as well. The outer agent's tool calls are fully governed.
625
+
626
+ **Contrast with Temporal:** In the Temporal SDK, every activity — including ones inside a "subagent" — runs as an independent governed unit. DeepAgents has no equivalent. Subagent execution is opaque to the outer event stream.
627
+
628
+ ---
629
+
630
+ ### Behavior Rules count `task` dispatches, not subagent-internal tool calls
631
+
632
+ The AGE sees `task(subagent_type="researcher")` as one `ActivityStarted` + one `ActivityCompleted`. The researcher then calling `search_web` five times internally is invisible.
633
+
634
+ A rule like "block if `search_web` exceeds 10 calls per session" only counts direct `search_web` calls from the outer agent — not from subagents.
635
+
636
+ **What works reliably for DeepAgents:**
637
+ - Rate-limiting `task` dispatches per subagent type (e.g. researcher called more than 5 times)
638
+ - Rate-limiting total subagent dispatches
639
+ - Detecting unusual outer-agent tool sequences
640
+
641
+ ---
642
+
643
+ ### HTTP spans are captured for outer agent tools only
644
+
645
+ The `httpx` instrumentation captures calls made during outer agent tool execution. HTTP calls inside subagent tool bodies run in a separate async context and are not captured as spans on the `task` `ActivityCompleted`.
646
+
647
+ ---
648
+
649
+ ### Behavior Rules don't span `ainvoke` calls
650
+
651
+ Each `ainvoke` is a separate governance session with a new `workflow_id`. Behavior Rules track patterns **within a single invocation only**. Cross-turn pattern detection is not yet supported.
652
+
653
+ ---
654
+
655
+ ### Subagent model providers must be configured independently
656
+
657
+ Each subagent can specify its own `model`. If a subagent uses a different provider from the outer agent (e.g. outer agent uses OpenAI, subagent uses Anthropic), you need the corresponding API key configured in your environment. The SDK doesn't validate subagent model configs at startup.
658
+
659
+ ---
660
+
661
+ ## Debugging
662
+
663
+ ```bash
664
+ OPENBOX_DEBUG=1 python agent.py
665
+ ```
666
+
667
+ This enables two log streams:
668
+
669
+ **`[OBX_EVENT]`** — every raw LangGraph event (stderr):
670
+ ```
671
+ [OBX_EVENT] on_chain_start name='LangGraph' node=None
672
+ [OBX_EVENT] on_chain_start name='SubAgentMiddleware...' node='SubAgentMiddleware...'
673
+ [OBX_EVENT] on_chat_model_start name='ChatOpenAI' node='model'
674
+ [OBX_EVENT] on_tool_start name='task' node='tools'
675
+ [OBX_EVENT] on_tool_start name='search_web' node='tools'
676
+ ```
677
+
678
+ **`[OpenBox Debug]`** — every governance request and response (stdout):
679
+ ```
680
+ [OpenBox Debug] governance request: {
681
+ "event_type": "ActivityStarted",
682
+ "activity_type": "task",
683
+ "workflow_type": "ResearchBot",
684
+ "activity_input": [
685
+ {"description": "Write a report on AI safety", "subagent_type": "writer"},
686
+ {"__openbox": {"tool_type": "a2a", "subagent_name": "writer"}}
687
+ ],
688
+ "hook_trigger": false
689
+ }
690
+ [OpenBox Debug] governance response: { "verdict": "require_approval", "reason": "Writer tasks require approval." }
691
+ ```
692
+
693
+ **If things aren't working, check for these:**
694
+
695
+ - `workflow_type` doesn't match your dashboard agent name → policies never fire
696
+ - `subagent_name` is `"general-purpose"` when you expected something else → `subagent_type` was missing from the `task` input; look for a `task tool input missing subagent_type` warning in the debug output
697
+ - A rule is double-triggering → you're missing `not input.hook_trigger` in your Rego
698
+ - Warning at startup about `known_subagents` → you passed an empty list; include at least `["general-purpose"]`
699
+
700
+ ---
701
+
702
+ ## Legacy handler-based approach
703
+
704
+ > **Deprecated.** Use `create_openbox_middleware()` with `create_deep_agent(middleware=[...])` instead.
705
+
706
+ The handler-based approach wraps the compiled graph and processes events via `astream_events`:
707
+
708
+ ```python
709
+ from openbox_deepagent import create_openbox_deep_agent_handler
710
+
711
+ agent = create_deep_agent(model=init_chat_model("openai:gpt-4o-mini"), tools=[...], subagents=[...])
712
+
713
+ # Deprecated — emits DeprecationWarning
714
+ governed = create_openbox_deep_agent_handler(
715
+ graph=agent,
716
+ api_url=os.environ["OPENBOX_URL"],
717
+ api_key=os.environ["OPENBOX_API_KEY"],
718
+ agent_name="ResearchBot",
719
+ known_subagents=["researcher", "analyst", "writer", "general-purpose"],
720
+ )
721
+
722
+ result = await governed.ainvoke(
723
+ {"messages": [{"role": "user", "content": "Research LangGraph papers"}]},
724
+ config={"configurable": {"thread_id": "session-001"}},
725
+ )
726
+ ```
727
+
728
+ ---
729
+
730
+ ## Contributing
731
+
732
+ Contributions are welcome! Please open an issue before submitting a large pull request.
733
+
734
+ ```bash
735
+ git clone https://github.com/openbox-ai/openbox-langchain-sdk
736
+ cd openbox-langchain-sdk/sdk-deepagent-python
737
+ uv sync --extra dev
738
+ uv run pytest tests/ -v
739
+ ```