lodestar-langgraph 0.3.0__tar.gz

@@ -0,0 +1,28 @@
1
+ node_modules/
2
+ .lodestar/
3
+ dist/
4
+ build/
5
+ _site/
6
+ *.tsbuildinfo
7
+
8
+ # Claude Code / agent-tool per-machine settings — keep `.claude/` tracked
9
+ # for the agent guidance and slash commands, but never commit the
10
+ # per-machine bash-permission allowlists.
11
+ .claude/settings.local.json
12
+
13
+ # Local-only working notes and scratch files — never committed.
14
+ .claude/local/
15
+
16
+ # Internal planning docs — kept local, not committed to this repo. The
17
+ # cast-build tooling under walkthrough/ stays tracked.
18
+ /docs/strategy/
19
+ /docs/internal/*
20
+ !/docs/internal/walkthrough/
21
+ /docs/internal/walkthrough/*
22
+ !/docs/internal/walkthrough/build-poison-cast.ts
23
+
24
+ # Python bytecode caches + build artifacts (runtimes/ — the LangGraph/CrewAI hooks)
25
+ __pycache__/
26
+ *.pyc
27
+ *.pyo
28
+ *.egg-info/
@@ -0,0 +1,89 @@
1
+ Metadata-Version: 2.4
2
+ Name: lodestar-langgraph
3
+ Version: 0.3.0
4
+ Summary: Govern a LangGraph agent's native tool calls with Lodestar — the thin native hook that remotes each tool call through the Lodestar Action Kernel over NDJSON-RPC (ADR-0024).
5
+ Project-URL: Homepage, https://qmilab.com/lodestar
6
+ Project-URL: Repository, https://github.com/qmilab/lodestar
7
+ Project-URL: Issues, https://github.com/qmilab/lodestar/issues
8
+ Author-email: QMI Lab <hello@qmilab.com>
9
+ License: Apache-2.0
10
+ Keywords: agents,ai-agents,governance,langgraph,lodestar,trust
11
+ Classifier: Intended Audience :: Developers
12
+ Classifier: License :: OSI Approved :: Apache Software License
13
+ Classifier: Programming Language :: Python :: 3
14
+ Requires-Python: >=3.10
15
+ Requires-Dist: lodestar-runtime-client==0.3.0
16
+ Provides-Extra: dev
17
+ Requires-Dist: langchain-core>=0.3.0; extra == 'dev'
18
+ Requires-Dist: langgraph>=0.2.0; extra == 'dev'
19
+ Requires-Dist: pytest>=8.0; extra == 'dev'
20
+ Provides-Extra: langgraph
21
+ Requires-Dist: langchain-core>=0.3.0; extra == 'langgraph'
22
+ Requires-Dist: langgraph>=0.2.0; extra == 'langgraph'
23
+ Description-Content-Type: text/markdown
24
+
25
+ # lodestar-langgraph
26
+
27
+ Govern a **LangGraph** agent's native tool calls with
28
+ [Lodestar](https://qmilab.com/lodestar) — the open epistemic-governance framework
29
+ for AI agents.
30
+
31
+ LangGraph runs tools natively in-process and does not speak MCP, so the MCP proxy
32
+ cannot wrap it. This package is the **thin native hook** (ADR-0024): it spawns the
33
+ TypeScript **governance-gate sidecar** (`lodestar runtime gate`) and remotes each
34
+ native tool call through the Lodestar Action Kernel over newline-delimited
35
+ JSON-RPC. The same machinery the MCP proxy runs — two-phase `propose → arbitrate →
36
+ execute`, the signed policy gate, cognitive-core ingestion (external-document
37
+ content can't auto-promote), sentinel arbitration, and the signed-approval L4 hold
38
+ path — now applies to LangGraph, with no change to the engine.
39
+
40
+ The tool body runs **only** inside the gate's execute phase, reached only after
41
+ the gate (and any approval hold) clears: "tools that do work before approval are
42
+ bugs" — across the Python↔TS boundary.
43
+
44
+ ## Install
45
+
46
+ ```bash
47
+ pip install "lodestar-langgraph[langgraph]"
48
+ # and the Lodestar CLI (Bun/npm), which provides `lodestar runtime gate`:
49
+ npm install -g @qmilab/lodestar-cli # or: bun add -g @qmilab/lodestar-cli
50
+ ```
51
+
52
+ ## Use
53
+
54
+ ```python
55
+ from langgraph.prebuilt import ToolNode
56
+ from lodestar_langgraph import GateClient, govern_tools, governed_call
57
+
58
+ with GateClient("runtime-gate.config.json") as gate:
59
+ governed = govern_tools(gate, my_tools) # register + wrap the toolset
60
+ llm = llm.bind_tools(governed) # the model only sees governed tools
61
+ tool_node = ToolNode(governed) # and so does the executor
62
+ # ... build and run your graph as usual; every tool call is now governed.
63
+
64
+ # A custom node invokes a governed tool through the helper, never raw:
65
+ result = governed_call(gate, "search_web", {"q": "lodestar"})
66
+ ```
67
+
68
+ The gate's config (`runtime-gate.config.json`) is a `RuntimeGateConfig` — the
69
+ signed policy document, approver keys, sentinel ids, persistence, and durable log
70
+ root all live there. The hook never holds credentials or policy.
71
+
72
+ ## Scope (honest, ADR-0004 lineage)
73
+
74
+ This is **governance over declared actions, not OS containment of the process.**
75
+ Raw I/O performed *outside* the tool abstraction — a custom node that calls
76
+ `requests.get()` directly instead of a registered tool — is outside the governed
77
+ surface, exactly as `guard.wrap()` and the MCP proxy only govern the tools they
78
+ are given. A call for an unregistered tool is **denied** (fail closed). Pair the
79
+ adapter with network/filesystem controls for defense in depth.
80
+
81
+ ## Holds
82
+
83
+ An L4 action the trust-ladder floor parks for approval is resolved by
84
+ block-polling the gate up to the deadline for a *signed* approval (`hold_wait_ms`)
85
+ — the headless default. For the idiomatic LangGraph `interrupt()` integration,
86
+ call `gate.govern(...)` directly, call `interrupt(...)` with the returned `action_id`
87
+ / `request_id`, and call `gate.resume(...)` on `Command(resume=…)`.
88
+
89
+ Apache-2.0. Part of the Lodestar monorepo (`runtimes/langgraph/`).
@@ -0,0 +1,65 @@
1
+ # lodestar-langgraph
2
+
3
+ Govern a **LangGraph** agent's native tool calls with
4
+ [Lodestar](https://qmilab.com/lodestar) — the open epistemic-governance framework
5
+ for AI agents.
6
+
7
+ LangGraph runs tools natively in-process and does not speak MCP, so the MCP proxy
8
+ cannot wrap it. This package is the **thin native hook** (ADR-0024): it spawns the
9
+ TypeScript **governance-gate sidecar** (`lodestar runtime gate`) and remotes each
10
+ native tool call through the Lodestar Action Kernel over newline-delimited
11
+ JSON-RPC. The same machinery the MCP proxy runs — two-phase `propose → arbitrate →
12
+ execute`, the signed policy gate, cognitive-core ingestion (external-document
13
+ content can't auto-promote), sentinel arbitration, and the signed-approval L4 hold
14
+ path — now applies to LangGraph, with no change to the engine.
15
+
16
+ The tool body runs **only** inside the gate's execute phase, reached only after
17
+ the gate (and any approval hold) clears: "tools that do work before approval are
18
+ bugs" — across the Python↔TS boundary.
19
+
20
+ ## Install
21
+
22
+ ```bash
23
+ pip install "lodestar-langgraph[langgraph]"
24
+ # and the Lodestar CLI (Bun/npm), which provides `lodestar runtime gate`:
25
+ npm install -g @qmilab/lodestar-cli # or: bun add -g @qmilab/lodestar-cli
26
+ ```
27
+
28
+ ## Use
29
+
30
+ ```python
31
+ from langgraph.prebuilt import ToolNode
32
+ from lodestar_langgraph import GateClient, govern_tools, governed_call
33
+
34
+ with GateClient("runtime-gate.config.json") as gate:
35
+ governed = govern_tools(gate, my_tools) # register + wrap the toolset
36
+ llm = llm.bind_tools(governed) # the model only sees governed tools
37
+ tool_node = ToolNode(governed) # and so does the executor
38
+ # ... build and run your graph as usual; every tool call is now governed.
39
+
40
+ # A custom node invokes a governed tool through the helper, never raw:
41
+ result = governed_call(gate, "search_web", {"q": "lodestar"})
42
+ ```
43
+
44
+ The gate's config (`runtime-gate.config.json`) is a `RuntimeGateConfig` — the
45
+ signed policy document, approver keys, sentinel ids, persistence, and durable log
46
+ root all live there. The hook never holds credentials or policy.
47
+
48
+ ## Scope (honest, ADR-0004 lineage)
49
+
50
+ This is **governance over declared actions, not OS containment of the process.**
51
+ Raw I/O performed *outside* the tool abstraction — a custom node that calls
52
+ `requests.get()` directly instead of a registered tool — is outside the governed
53
+ surface, exactly as `guard.wrap()` and the MCP proxy only govern the tools they
54
+ are given. A call for an unregistered tool is **denied** (fail closed). Pair the
55
+ adapter with network/filesystem controls for defense in depth.
56
+
57
+ ## Holds
58
+
59
+ An L4 action the trust-ladder floor parks for approval is resolved by
60
+ block-polling the gate up to the deadline for a *signed* approval (`hold_wait_ms`)
61
+ — the headless default. For the idiomatic LangGraph `interrupt()` integration,
62
+ call `gate.govern(...)` directly, call `interrupt(...)` with the returned `action_id`
63
+ / `request_id`, and call `gate.resume(...)` on `Command(resume=…)`.
64
+
65
+ Apache-2.0. Part of the Lodestar monorepo (`runtimes/langgraph/`).
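Reviewer note on the Holds section: the headless hold path can be sketched without a running gate. The block below is a minimal illustration, not the package's code. `StubGate`, `Denied`, and `call_through_gate` are hypothetical names, and the `"denied"` phase value is invented for the stub. Only the response keys (`phase`, `action_id`, `request_id`, `kind`, `reason`, `output`) and the `approval_timeout` kind are taken from the adapter in this diff.

```python
from typing import Any, Optional


class Denied(Exception):
    """Hypothetical stand-in for LodestarDenied."""

    def __init__(self, reason: str, kind: str, action_id: Optional[str] = None) -> None:
        super().__init__(reason)
        self.reason = reason
        self.kind = kind
        self.action_id = action_id


class StubGate:
    """Hypothetical in-memory stand-in for GateClient; every call is held."""

    def __init__(self, approve: bool) -> None:
        self.approve = approve
        self.body_runs = 0  # counts how often the tool body was executed

    def govern(self, tool: str, args: dict) -> dict:
        # Propose + arbitrate: the action parks for approval, nothing runs yet.
        return {"phase": "pending_approval", "action_id": "a-1", "request_id": "r-1"}

    def resume(self, action_id: str, request_id: str, wait_ms: int = 0) -> dict:
        if not self.approve:
            # Assumed shape of a timed-out hold; the phase name is made up here.
            return {
                "phase": "denied",
                "kind": "approval_timeout",
                "reason": "no signed approval before the deadline",
                "action_id": action_id,
            }
        self.body_runs += 1  # the body runs only after the hold clears
        return {"phase": "completed", "output": {"deployed": "prod"}}


def call_through_gate(gate: StubGate, tool: str, args: dict, hold_wait_ms: int = 2_000) -> Any:
    """Mirror of the two-phase flow: govern, wait out a hold, return or raise."""
    result = gate.govern(tool, args)
    if result.get("phase") == "pending_approval":
        result = gate.resume(
            str(result.get("action_id")),
            str(result.get("request_id")),
            wait_ms=hold_wait_ms,
        )
    if result.get("phase") == "completed":
        return result.get("output")
    raise Denied(
        str(result.get("reason") or "not allowed"),
        str(result.get("kind") or result.get("phase") or "denied"),
        result.get("action_id"),
    )
```

With `StubGate(approve=False)` the call raises and `body_runs` stays at zero, which is the "no work before approval" property the README describes.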
@@ -0,0 +1,38 @@
1
+ """lodestar-langgraph — govern a LangGraph agent's native tool calls with Lodestar.
2
+
3
+ The thin native hook of the runtime-adapter epic (ADR-0024). It spawns the
4
+ TypeScript governance-gate sidecar (``lodestar runtime gate``) and remotes each
5
+ native LangGraph tool call through the Action Kernel over NDJSON-RPC — so the
6
+ same two-phase execution, policy gate, cognitive-core ingestion, sentinel
7
+ arbitration, and signed-approval hold path the MCP proxy runs now apply to a
8
+ framework that does not speak MCP.
9
+
10
+ Quick start::
11
+
12
+ from lodestar_langgraph import GateClient, govern_tools, governed_call
13
+
14
+ with GateClient("runtime-gate.config.json") as gate:
15
+ governed = govern_tools(gate, my_tools)
16
+ llm = llm.bind_tools(governed)
17
+ tool_node = ToolNode(governed)
18
+ # ... build and run your graph as usual ...
19
+ """
20
+
21
+ from .adapter import (
22
+ DEFAULT_HOLD_WAIT_MS,
23
+ LodestarDenied,
24
+ govern_tools,
25
+ governed_call,
26
+ )
27
+ from lodestar_runtime_client import GateClient, GateError
28
+
29
+ __all__ = [
30
+ "GateClient",
31
+ "GateError",
32
+ "govern_tools",
33
+ "governed_call",
34
+ "LodestarDenied",
35
+ "DEFAULT_HOLD_WAIT_MS",
36
+ ]
37
+
38
+ __version__ = "0.3.0"
@@ -0,0 +1,169 @@
1
+ """LangGraph / LangChain integration for the Lodestar governance gate (ADR-0024).
2
+
3
+ The adapter governs the framework's **tool-invocation surface** and nothing
4
+ implicitly (ADR-0024 §3, one closed fail-closed surface):
5
+
6
+ * :func:`govern_tools` registers every bound tool with the gate and returns
7
+ *wrapped* tools to hand to ``bind_tools`` AND the prebuilt ``ToolNode`` — a
8
+ governed wrapper is the only object the agent ever holds for a governed
9
+ capability. The wrapper routes each call through the gate (``propose →
10
+ arbitrate``); only on an ``allow`` does the gate remote the body back to run.
11
+ * :func:`governed_call` is the helper a **custom node** uses to invoke a governed
12
+ tool — never a raw tool function.
13
+ * A call for a tool that was never registered is **denied by the gate** (fail
14
+ closed). Raw I/O performed outside the tool abstraction is outside the governed
15
+ surface, exactly as ``guard.wrap()`` and the MCP proxy only govern the tools
16
+ they are given — pair with network/filesystem controls for defense in depth.
17
+
18
+ Holds (an L4 action the trust-ladder floor parks for approval) are resolved by
19
+ **block-polling** the gate up to the deadline for a *signed* approval
20
+ (``hold_wait_ms``) — the headless default the ADR sanctions. For the idiomatic
21
+ LangGraph ``interrupt()`` integration, call :func:`GateClient.govern` directly
22
+ and call ``interrupt(...)`` with the returned ``action_id`` / ``request_id``, then
23
+ :func:`GateClient.resume` on ``Command(resume=…)``.
24
+ """
25
+
26
+ from __future__ import annotations
27
+
28
+ import asyncio
29
+ from typing import Any, Callable, Iterable, Optional
30
+
31
+ from lodestar_runtime_client import GateClient
32
+
33
+ # Default block-poll budget for a held action, in ms. Keep comfortably under the
34
+ # graph/client timeout; 0 means "don't wait" (surface the hold immediately).
35
+ DEFAULT_HOLD_WAIT_MS = 60_000
36
+
37
+
38
+ class LodestarDenied(Exception):
39
+ """A governed tool call was denied, held-then-timed-out, or failed.
40
+
41
+ ``kind`` is the machine tag from the gate (``policy_denied``,
42
+ ``approval_denied``, ``approval_timeout``, ``unregistered_tool``,
43
+ ``precondition_failed``, ``execution_failed``).
44
+ """
45
+
46
+ def __init__(self, reason: str, kind: str, action_id: Optional[str] = None) -> None:
47
+ super().__init__(reason)
48
+ self.reason = reason
49
+ self.kind = kind
50
+ self.action_id = action_id
51
+
52
+
53
+ def governed_call(
54
+ client: GateClient,
55
+ tool: str,
56
+ args: dict,
57
+ *,
58
+ hold_wait_ms: int = DEFAULT_HOLD_WAIT_MS,
59
+ ) -> Any:
60
+ """Invoke a governed tool through the gate and return its output.
61
+
62
+ Drives the full two-phase flow: ``govern``; on a hold, block-poll ``resume``
63
+ up to ``hold_wait_ms`` for a signed approval; on completion, return the tool
64
+ output. Raises :class:`LodestarDenied` on any non-completion (including an
65
+ unregistered tool — fail closed). This is the helper a custom LangGraph node
66
+ calls; never invoke a raw tool function from a custom node.
67
+ """
68
+ result = client.govern(tool, args)
69
+ if result.get("phase") == "pending_approval":
70
+ result = client.resume(
71
+ str(result.get("action_id")),
72
+ str(result.get("request_id")),
73
+ wait_ms=hold_wait_ms,
74
+ )
75
+ phase = result.get("phase")
76
+ if phase == "completed":
77
+ return result.get("output")
78
+ raise LodestarDenied(
79
+ str(result.get("reason") or "governed tool call was not allowed"),
80
+ str(result.get("kind") or phase or "denied"),
81
+ result.get("action_id"),
82
+ )
83
+
84
+
85
+ def govern_tools(
86
+ client: GateClient,
87
+ tools: Iterable[Any],
88
+ *,
89
+ hold_wait_ms: int = DEFAULT_HOLD_WAIT_MS,
90
+ on_denied: Optional[Callable[[LodestarDenied], Any]] = None,
91
+ ) -> list[Any]:
92
+ """Register and wrap a LangChain toolset for governance.
93
+
94
+ Returns governed ``StructuredTool``s to pass to BOTH ``llm.bind_tools(...)``
95
+ and the prebuilt ``ToolNode(...)``, so the agent never holds an ungoverned
96
+ handle. Each wrapper runs the call through the gate; the gate remotes the
97
+ *original* tool body back to run only inside its execute phase.
98
+
99
+ ``on_denied`` maps a :class:`LodestarDenied` to a tool return value (so a
100
+ ``ToolNode`` surfaces a re-plannable message rather than raising); the default
101
+ re-raises as a ``ToolException`` so the framework's own error handling
102
+ applies.
103
+ """
104
+ # Imported lazily so `from lodestar_langgraph import GateClient` works without
105
+ # langchain installed (the client is pure stdlib).
106
+ from langchain_core.tools import StructuredTool, ToolException
107
+
108
+ governed: list[Any] = []
109
+ for tool in tools:
110
+ name = tool.name
111
+ # Bind the ORIGINAL tool body for the gate's remoted execute. Using the
112
+ # original (not the wrapper) is what prevents recursion.
113
+ client.register_tool(name, _body_for(tool))
114
+ governed.append(_wrap_tool(client, tool, hold_wait_ms, on_denied, StructuredTool, ToolException))
115
+ return governed
116
+
117
+
118
+ def _body_for(tool: Any) -> Callable[[dict], dict]:
119
+ """A run_tool body that executes the real LangChain tool and wraps its result
120
+ into the gate's tool-result shape.
121
+
122
+ The gate remotes the body on a worker thread with no running event loop, so we
123
+ use the synchronous ``tool.invoke`` for a tool with a sync implementation, and
124
+ fall back to running the coroutine for an **async-only** tool (one defined with
125
+ a ``coroutine`` and no sync ``func``): for such a tool ``invoke`` raises
126
+ ``NotImplementedError``, so it must go through ``ainvoke``. This serves both
127
+ sync and async tools regardless of whether the graph drove ``invoke`` or
128
+ ``ainvoke``.
129
+ """
130
+
131
+ def body(args: dict) -> dict:
132
+ try:
133
+ output = tool.invoke(args)
134
+ except NotImplementedError:
135
+ # Async-only tool: run its coroutine in this (loop-less) worker thread.
136
+ output = asyncio.run(tool.ainvoke(args))
137
+ documents: list[dict] = []
138
+ # A tool may surface untrusted document content for external_document
139
+ # evidence by returning {"output": ..., "_lodestar_documents": [...]}.
140
+ if isinstance(output, dict) and "_lodestar_documents" in output:
141
+ documents = list(output.get("_lodestar_documents") or [])
142
+ output = output.get("output")
143
+ return {"output": output, "documents": documents}
144
+
145
+ return body
146
+
147
+
148
+ def _wrap_tool(
149
+ client: GateClient,
150
+ tool: Any,
151
+ hold_wait_ms: int,
152
+ on_denied: Optional[Callable[[LodestarDenied], Any]],
153
+ structured_tool_cls: Any,
154
+ tool_exception_cls: Any,
155
+ ) -> Any:
156
+ def governed_func(**kwargs: Any) -> Any:
157
+ try:
158
+ return governed_call(client, tool.name, kwargs, hold_wait_ms=hold_wait_ms)
159
+ except LodestarDenied as denied:
160
+ if on_denied is not None:
161
+ return on_denied(denied)
162
+ raise tool_exception_cls(f"[lodestar:{denied.kind}] {denied.reason}") from denied
163
+
164
+ return structured_tool_cls.from_function(
165
+ func=governed_func,
166
+ name=tool.name,
167
+ description=getattr(tool, "description", "") or tool.name,
168
+ args_schema=getattr(tool, "args_schema", None),
169
+ )
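Reviewer note on `_body_for`: the sync-then-async fallback and the `_lodestar_documents` unwrapping can be tried in isolation. The sketch below assumes a hypothetical `AsyncOnlyTool` class standing in for a LangChain tool that has a coroutine but no sync function, and a hypothetical `run_on_worker` helper standing in for the gate's loop-less worker thread. It uses only the standard library and is not the package's implementation.

```python
import asyncio
import threading


class AsyncOnlyTool:
    """Hypothetical stand-in for a tool defined with a coroutine only."""

    def invoke(self, args: dict) -> dict:
        raise NotImplementedError("no sync implementation")

    async def ainvoke(self, args: dict) -> dict:
        await asyncio.sleep(0)  # yield once, as a real async body would
        return {
            "output": {"fetched": args["url"]},
            "_lodestar_documents": [{"text": "page body", "source": args["url"]}],
        }


def run_body(tool, args: dict) -> dict:
    """Run the tool synchronously if possible, else drive its coroutine."""
    try:
        output = tool.invoke(args)
    except NotImplementedError:
        # Safe only because this thread has no running event loop.
        output = asyncio.run(tool.ainvoke(args))
    documents: list = []
    if isinstance(output, dict) and "_lodestar_documents" in output:
        # Split untrusted document content from the tool's actual output.
        documents = list(output.get("_lodestar_documents") or [])
        output = output.get("output")
    return {"output": output, "documents": documents}


def run_on_worker(tool, args: dict) -> dict:
    """Execute run_body on a separate thread, as the gate's remoting would."""
    box: dict = {}
    worker = threading.Thread(target=lambda: box.update(result=run_body(tool, args)))
    worker.start()
    worker.join()
    return box["result"]


result = run_on_worker(AsyncOnlyTool(), {"url": "https://example.test"})
```

`asyncio.run` raises `RuntimeError` when called from a thread that already has a running loop, so the fallback depends on the body being remoted to a loop-less thread, as the docstring states.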
@@ -0,0 +1,35 @@
1
+ [build-system]
2
+ requires = ["hatchling"]
3
+ build-backend = "hatchling.build"
4
+
5
+ [project]
6
+ name = "lodestar-langgraph"
7
+ version = "0.3.0"
8
+ description = "Govern a LangGraph agent's native tool calls with Lodestar — the thin native hook that remotes each tool call through the Lodestar Action Kernel over NDJSON-RPC (ADR-0024)."
9
+ readme = "README.md"
10
+ requires-python = ">=3.10"
11
+ license = { text = "Apache-2.0" }
12
+ authors = [{ name = "QMI Lab", email = "hello@qmilab.com" }]
13
+ keywords = ["ai-agents", "langgraph", "governance", "lodestar", "trust", "agents"]
14
+ classifiers = [
15
+ "License :: OSI Approved :: Apache Software License",
16
+ "Programming Language :: Python :: 3",
17
+ "Intended Audience :: Developers",
18
+ ]
19
+ # The Lodestar RPC client (spawns the TS gate, speaks NDJSON over stdio) is the
20
+ # pure-stdlib `lodestar-runtime-client`, shared across the runtime hooks (#128,
21
+ # ADR-0028) and pinned in lockstep with this package. The LangGraph/LangChain
22
+ # integration in `adapter` imports langchain lazily; install the `langgraph` extra.
23
+ dependencies = ["lodestar-runtime-client==0.3.0"]
24
+
25
+ [project.optional-dependencies]
26
+ langgraph = ["langgraph>=0.2.0", "langchain-core>=0.3.0"]
27
+ dev = ["langgraph>=0.2.0", "langchain-core>=0.3.0", "pytest>=8.0"]
28
+
29
+ [project.urls]
30
+ Homepage = "https://qmilab.com/lodestar"
31
+ Repository = "https://github.com/qmilab/lodestar"
32
+ Issues = "https://github.com/qmilab/lodestar/issues"
33
+
34
+ [tool.hatch.build.targets.wheel]
35
+ packages = ["lodestar_langgraph"]
@@ -0,0 +1,281 @@
1
+ #!/usr/bin/env python3
2
+ """End-to-end driver for the LangGraph runtime adapter (ADR-0024 §8).
3
+
4
+ Drives a REAL Python LangGraph loop through the `lodestar-langgraph` hook and the
5
+ TypeScript governance-gate sidecar, exercising the real-runtime cases the in-TS
6
+ `runtime-gate-enforces-two-phase` probe cannot:
7
+
8
+ 1. the prebuilt ``ToolNode`` runs only governed wrappers (a governed L1 call
9
+ executes; the body runs exactly once, remoted back from the gate);
10
+ 2. a custom node invokes a governed tool via ``governed_call``;
11
+ 3. async invocation (``ToolNode.ainvoke``);
12
+ 4. batch / parallel invocation (``ToolNode.batch``) — correlated correctly;
13
+ 5. an L4 tool is HELD (two-phase across the boundary): with no approver it
14
+ times out and the body NEVER runs;
15
+ 6. a tool that was never registered is DENIED — fail closed.
16
+
17
+ Spawns the gate via ``bun run <repo>/packages/cli/src/index.ts runtime gate``.
18
+ Invoked by the runtime-gated ``langgraph-tool-calls-are-governed`` probe, which
19
+ skips loudly when Python / LangGraph is absent. Exit 0 = pass, 1 = fail.
20
+ """
21
+
22
+ from __future__ import annotations
23
+
24
+ import asyncio
25
+ import json
26
+ import math
27
+ import sys
28
+ import tempfile
29
+ import time
30
+ from pathlib import Path
31
+
32
+ REPO_ROOT = Path(__file__).resolve().parents[3]
33
+ CLI_INDEX = REPO_ROOT / "packages" / "cli" / "src" / "index.ts"
34
+
35
+ # Prefer the INSTALLED hook so CI (which pip-installs runtimes/langgraph) actually
36
+ # exercises the packaged artifact and its pyproject exports. Only fall back to the
37
+ # source tree for a local run where the hook isn't installed.
38
+ try:
39
+ from lodestar_langgraph import ( # noqa: E402
40
+ GateClient,
41
+ GateError,
42
+ LodestarDenied,
43
+ govern_tools,
44
+ governed_call,
45
+ )
46
+ except ImportError:
47
+ # The hook's source __init__ imports lodestar_runtime_client (#128); put the
48
+ # shared client's source on the path too so the no-install fallback resolves it.
49
+ sys.path.insert(0, str(REPO_ROOT / "runtimes" / "runtime-client"))
50
+ sys.path.insert(0, str(REPO_ROOT / "runtimes" / "langgraph"))
51
+ from lodestar_langgraph import ( # noqa: E402
52
+ GateClient,
53
+ GateError,
54
+ LodestarDenied,
55
+ govern_tools,
56
+ governed_call,
57
+ )
58
+
59
+ try:
60
+ from langchain_core.messages import AIMessage
61
+ from langchain_core.tools import StructuredTool
62
+ from langgraph.graph import END, START, MessagesState, StateGraph
63
+ from langgraph.prebuilt import ToolNode
64
+ except Exception as exc: # pragma: no cover - the probe gates on import availability
65
+ print(f"SKIP: LangGraph/LangChain not importable: {exc}")
66
+ sys.exit(0)
67
+
68
+ # ── tool bodies (the REAL functions the gate remotes back to run) ─────────────
69
+ runs: dict[str, int] = {"echo": 0, "read_doc": 0, "deploy": 0, "fetch": 0}
70
+
71
+
72
+ def echo(msg: str) -> dict:
73
+ runs["echo"] += 1
74
+ return {"echo": msg}
75
+
76
+
77
+ def read_doc(path: str) -> dict:
78
+ runs["read_doc"] += 1
79
+ # Surface untrusted document content for external_document evidence.
80
+ return {"output": {"read": path}, "_lodestar_documents": [{"text": "untrusted file body", "source": path}]}
81
+
82
+
83
+ def deploy(target: str) -> dict:
84
+ runs["deploy"] += 1 # must stay 0 for a held L4 with no approver
85
+ return {"deployed": target}
86
+
87
+
88
+ async def fetch(url: str) -> dict:
89
+ # An ASYNC-ONLY tool body (coroutine, no sync impl): the gate remotes it on a
90
+ # loop-less worker thread, so the hook must run it via ainvoke, not sync invoke.
91
+ runs["fetch"] += 1
92
+ return {"fetched": url}
93
+
94
+
95
+ def nan_out(x: int) -> dict:
96
+ # A tool whose result carries a non-finite float — invalid JSON for the gate.
97
+ # The hook must reject it (→ tool_error → failed action), never emit `NaN`.
98
+ return {"output": {"value": float("nan")}}
99
+
100
+
101
+ def make_tool(fn) -> StructuredTool:
102
+ return StructuredTool.from_function(func=fn, name=fn.__name__, description=fn.__name__)
103
+
104
+
105
+ def make_async_tool(coro) -> StructuredTool:
106
+ return StructuredTool.from_function(coroutine=coro, name=coro.__name__, description=coro.__name__)
107
+
108
+
109
+ def tool_call(name: str, args: dict, call_id: str) -> AIMessage:
110
+ return AIMessage(content="", tool_calls=[{"name": name, "args": args, "id": call_id, "type": "tool_call"}])
111
+
112
+
113
+ failures: list[str] = []
114
+
115
+
116
+ def check(label: str, cond: bool, extra: str = "") -> None:
117
+ status = "PASS" if cond else "FAIL"
118
+ print(f" [{status}] {label}" + (f" — {extra}" if extra else ""))
119
+ if not cond:
120
+ failures.append(label)
121
+
122
+
123
+ def main() -> int:
124
+ with tempfile.TemporaryDirectory() as tmp:
125
+ log_root = str(Path(tmp) / "events")
126
+ config = {
127
+ "project_id": "langgraph-e2e",
128
+ "actor_id": "langgraph-agent",
129
+ "session_id": "auto",
130
+ "log_root": log_root,
131
+ "default_scope": {"level": "session", "identifier": "langgraph-e2e"},
132
+ "default_sensitivity": "internal",
133
+ "auto_approve_ceiling": 3,
134
+ # An L4 hold parks; with no approver it must time out fast here.
135
+ "approval_timeout_ms": 300,
136
+ "approvals": {"allow_unsigned": True},
137
+ "tool_defaults": {
138
+ "echo": {"required_trust_level": 1, "reversibility": "reversible", "sandbox": "read", "permissions": [], "blast_radius": "session"},
139
+ "read_doc": {"required_trust_level": 1, "reversibility": "reversible", "sandbox": "read", "permissions": [], "blast_radius": "session"},
140
+ "deploy": {"required_trust_level": 4, "reversibility": "irreversible", "sandbox": "controlled-shell", "permissions": [], "blast_radius": "external"},
141
+ "fetch": {"required_trust_level": 1, "reversibility": "reversible", "sandbox": "read", "permissions": [], "blast_radius": "session"},
142
+ "nan_out": {"required_trust_level": 1, "reversibility": "reversible", "sandbox": "read", "permissions": [], "blast_radius": "session"},
143
+ },
144
+ }
145
+ config_path = Path(tmp) / "runtime-gate.config.json"
146
+ config_path.write_text(json.dumps(config))
147
+
148
+ # P3: a sidecar that exits before `ready` (here: an invalid config the CLI
149
+ # rejects) must fail construction FAST with a useful message, not block the
150
+ # full ready timeout.
151
+ bad_config = Path(tmp) / "bad.config.json"
152
+ bad_config.write_text("{}") # missing required fields → the CLI exits 1
153
+ t0 = time.monotonic()
154
+ startup_err = None
155
+ try:
156
+ GateClient(str(bad_config), launcher=["bun", "run", str(CLI_INDEX)], ready_timeout_s=20)
157
+ except GateError as exc:
158
+ startup_err = str(exc)
159
+ elapsed = time.monotonic() - t0
160
+ check("P3: bad-config startup fails fast (not after the ready timeout)", startup_err is not None and elapsed < 10, f"{elapsed:.1f}s")
161
+ check("P3: the failure reports the gate exited before ready", startup_err is not None and "before signalling ready" in startup_err, str(startup_err))
162
+
163
+ with GateClient(str(config_path), launcher=["bun", "run", str(CLI_INDEX)]) as gate:
164
+ print("─" * 72)
165
+ print("langgraph-tool-calls-are-governed (real LangGraph ToolNode + hook + gate)")
166
+ print("─" * 72)
167
+
168
+ tools = [
169
+ make_tool(echo),
170
+ make_tool(read_doc),
171
+ make_tool(deploy),
172
+ make_async_tool(fetch),
173
+ make_tool(nan_out),
174
+ ]
175
+ governed = govern_tools(gate, tools, hold_wait_ms=2_000)
176
+ governed_by_name = {t.name: t for t in governed}
177
+ check("0: only governed wrappers are exposed", set(governed_by_name) == {"echo", "read_doc", "deploy", "fetch", "nan_out"}, str(set(governed_by_name)))
178
+
179
+ # Build a real compiled LangGraph with the prebuilt ToolNode. Driving
180
+ # the node through a compiled graph (rather than a bare ToolNode.invoke)
181
+ # is the faithful "LangGraph loop" — it provides the runtime config the
182
+ # node needs and is how an agent actually runs it.
183
+ tool_node = ToolNode(governed)
184
+ graph = StateGraph(MessagesState)
185
+ graph.add_node("tools", tool_node)
186
+ graph.add_edge(START, "tools")
187
+ graph.add_edge("tools", END)
188
+ app = graph.compile()
189
+
190
+ def last_content(out: dict) -> str:
191
+ return str(out["messages"][-1].content)
192
+
193
+ # 1. the compiled graph's ToolNode runs a governed L1 tool; body once.
194
+ out = app.invoke({"messages": [tool_call("echo", {"msg": "hi"}, "c1")]})
195
+ check("1: ToolNode (compiled graph) ran the governed echo", "hi" in last_content(out), last_content(out))
196
+ check("1: echo body ran exactly once", runs["echo"] == 1, str(runs["echo"]))
197
+
198
+ # 2. custom node via governed_call.
199
+ res = governed_call(gate, "echo", {"msg": "from-node"})
200
+ check("2: governed_call returned the tool output", res == {"echo": "from-node"}, str(res))
201
+ check("2: echo body ran again exactly once", runs["echo"] == 2, str(runs["echo"]))
202
+
203
+ # 3. async invocation through the compiled graph (ainvoke).
204
+ aout = asyncio.run(app.ainvoke({"messages": [tool_call("read_doc", {"path": "/x"}, "c2")]}))
205
+ check("3: ainvoke ran the governed read_doc", "read" in last_content(aout), last_content(aout))
206
+ check("3: read_doc body ran once (async)", runs["read_doc"] == 1, str(runs["read_doc"]))
207
+
208
+ # 4. batch / parallel invocation — each correlated to its own result.
209
+ before = runs["echo"]
210
+ batch_out = app.batch(
211
+ [
212
+ {"messages": [tool_call("echo", {"msg": "B1"}, "b1")]},
213
+ {"messages": [tool_call("echo", {"msg": "B2"}, "b2")]},
214
+ ]
215
+ )
216
+ contents = [last_content(o) for o in batch_out]
217
+ check("4: batch call B1 correlated", any("B1" in c for c in contents), str(contents))
218
+ check("4: batch call B2 correlated", any("B2" in c for c in contents), str(contents))
219
+ check("4: both batch bodies ran", runs["echo"] - before == 2, str(runs["echo"] - before))
220
+
221
+ # 5. L4 tool is HELD (two-phase across the boundary): with no approver
222
+ # it times out and the body NEVER runs.
223
+ deploy_before = runs["deploy"]
224
+ denied_kind = None
225
+ try:
226
+ governed_call(gate, "deploy", {"target": "prod"}, hold_wait_ms=2_000)
227
+ except LodestarDenied as denied:
228
+ denied_kind = denied.kind
229
+ check("5: L4 deploy was held then denied", denied_kind == "approval_timeout", str(denied_kind))
230
+ check("5: deploy body NEVER ran (no work before approval)", runs["deploy"] - deploy_before == 0, str(runs["deploy"] - deploy_before))
231
+
232
+ # 6. a tool that was never registered is denied — fail closed.
233
+ ghost_kind = None
234
+ try:
235
+ governed_call(gate, "never_registered", {})
236
+ except LodestarDenied as denied:
237
+ ghost_kind = denied.kind
238
+ check("6: unregistered tool denied (fail closed)", ghost_kind == "unregistered_tool", str(ghost_kind))
239
+
240
+ # 7. an ASYNC-ONLY tool body runs through the gate's remoted execute
241
+ # (the hook must ainvoke it on the loop-less worker thread, not the
242
+ # failing sync invoke path). Exercised via the compiled graph's
243
+ # ainvoke AND a direct governed_call.
244
+ afetch = asyncio.run(app.ainvoke({"messages": [tool_call("fetch", {"url": "https://x"}, "c3")]}))
245
+ check("7: async-only tool ran via ToolNode ainvoke", "fetched" in last_content(afetch), last_content(afetch))
246
+ res_async = governed_call(gate, "fetch", {"url": "https://y"})
247
+ check("7: async-only tool ran via governed_call", res_async == {"fetched": "https://y"}, str(res_async))
248
+ check("7: async tool body ran exactly twice", runs["fetch"] == 2, str(runs["fetch"]))
249
+
250
+ # 8. Non-finite floats are rejected before they corrupt the JSON wire,
251
+ # so a NaN in args or a tool result fails the call rather than hanging
252
+ # it (Python's json.dumps would otherwise emit invalid `NaN`).
253
+ arg_nan_err = None
254
+ try:
255
+ governed_call(gate, "echo", {"msg": math.nan})
256
+ except GateError:
257
+ arg_nan_err = "gate_error"
258
+ except LodestarDenied as denied:
259
+ arg_nan_err = denied.kind
260
+ check("8: a NaN argument is rejected, not silently hung", arg_nan_err is not None, str(arg_nan_err))
261
+
262
+ out_nan_err = None
263
+ try:
264
+ governed_call(gate, "nan_out", {"x": 1})
265
+ except LodestarDenied as denied:
266
+ out_nan_err = denied.kind
267
+ except GateError:
268
+ out_nan_err = "gate_error"
269
+ check("8: a NaN tool result fails the action, not silently hung", out_nan_err is not None, str(out_nan_err))
270
+
271
+ print("─" * 72)
272
+ if failures:
273
+ print(f"RESULT: FAIL ({len(failures)} check(s) failed)")
274
+ else:
275
+ print("RESULT: PASS — LangGraph native tool calls are governed end-to-end")
276
+ print("─" * 72)
277
+ return 1 if failures else 0
278
+
279
+
280
+ if __name__ == "__main__":
281
+ sys.exit(main())
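Reviewer note on check 8: the rejection itself happens in the shared client, which is not part of this diff, so the mechanism is not visible here. One standard-library way to get the behaviour the driver tests is `json.dumps(..., allow_nan=False)`. The sketch below assumes that approach; `encode_strict` is a hypothetical helper name.

```python
import json
import math


def encode_strict(message: dict) -> str:
    """Serialize one NDJSON line, refusing NaN and Infinity."""
    # allow_nan=False makes json.dumps raise ValueError on non-finite floats.
    return json.dumps(message, allow_nan=False) + "\n"


# Default behaviour: Python emits the bare token NaN, which is not valid JSON.
lenient = json.dumps({"value": math.nan})

try:
    encode_strict({"value": math.nan})
    rejected = False
except ValueError:
    rejected = True

# Finite values still encode normally.
line = encode_strict({"tool": "echo", "args": {"msg": "hi"}})
```

Failing at encode time turns a non-finite float into an ordinary error on the Python side, rather than a line the TypeScript gate cannot parse.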