guardloop 0.2.0__tar.gz → 0.3.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (47)
  1. guardloop-0.3.0/CHANGELOG.md +82 -0
  2. {guardloop-0.2.0 → guardloop-0.3.0}/PKG-INFO +90 -20
  3. guardloop-0.3.0/README.md +218 -0
  4. {guardloop-0.2.0 → guardloop-0.3.0}/docs/design.md +31 -2
  5. guardloop-0.3.0/docs/project-overview.md +599 -0
  6. guardloop-0.3.0/docs/pypi-publishing.md +61 -0
  7. guardloop-0.3.0/docs/roadmap.md +59 -0
  8. guardloop-0.3.0/examples/verifier_retry_loop.py +54 -0
  9. {guardloop-0.2.0 → guardloop-0.3.0}/pyproject.toml +6 -3
  10. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/__init__.py +22 -0
  11. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/context.py +11 -1
  12. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/exceptions.py +34 -0
  13. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/models.py +3 -0
  14. guardloop-0.3.0/src/guardloop/runtime.py +385 -0
  15. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/telemetry/conventions.py +30 -0
  16. guardloop-0.3.0/src/guardloop/verifier.py +248 -0
  17. guardloop-0.3.0/tests/test_verifier.py +537 -0
  18. {guardloop-0.2.0 → guardloop-0.3.0}/uv.lock +1 -1
  19. guardloop-0.2.0/README.md +0 -148
  20. guardloop-0.2.0/docs/pypi-publishing.md +0 -66
  21. guardloop-0.2.0/docs/roadmap.md +0 -27
  22. guardloop-0.2.0/src/guardloop/runtime.py +0 -190
  23. {guardloop-0.2.0 → guardloop-0.3.0}/.github/workflows/publish-pypi.yml +0 -0
  24. {guardloop-0.2.0 → guardloop-0.3.0}/.gitignore +0 -0
  25. {guardloop-0.2.0 → guardloop-0.3.0}/LICENSE +0 -0
  26. {guardloop-0.2.0 → guardloop-0.3.0}/examples/live_anthropic_basic.py +0 -0
  27. {guardloop-0.2.0 → guardloop-0.3.0}/examples/live_openai_basic.py +0 -0
  28. {guardloop-0.2.0 → guardloop-0.3.0}/examples/runaway_cost_prevention.py +0 -0
  29. {guardloop-0.2.0 → guardloop-0.3.0}/examples/tool_circuit_breaker.py +0 -0
  30. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/budget.py +0 -0
  31. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/circuit_breaker.py +0 -0
  32. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/pricing.py +0 -0
  33. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/providers/__init__.py +0 -0
  34. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/providers/anthropic.py +0 -0
  35. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/providers/openai.py +0 -0
  36. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/py.typed +0 -0
  37. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/telemetry/__init__.py +0 -0
  38. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/telemetry/tracer.py +0 -0
  39. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/tokenization.py +0 -0
  40. {guardloop-0.2.0 → guardloop-0.3.0}/src/guardloop/tools.py +0 -0
  41. {guardloop-0.2.0 → guardloop-0.3.0}/tests/__init__.py +0 -0
  42. {guardloop-0.2.0 → guardloop-0.3.0}/tests/fakes.py +0 -0
  43. {guardloop-0.2.0 → guardloop-0.3.0}/tests/test_budget.py +0 -0
  44. {guardloop-0.2.0 → guardloop-0.3.0}/tests/test_circuit_breaker.py +0 -0
  45. {guardloop-0.2.0 → guardloop-0.3.0}/tests/test_providers.py +0 -0
  46. {guardloop-0.2.0 → guardloop-0.3.0}/tests/test_runtime.py +0 -0
  47. {guardloop-0.2.0 → guardloop-0.3.0}/tests/test_telemetry.py +0 -0
guardloop-0.3.0/CHANGELOG.md
@@ -0,0 +1,82 @@
+ # Changelog
+
+ All notable changes to GuardLoop are documented here. The format is based on
+ [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project
+ follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html) (pre-1.0:
+ minor releases may include breaking changes).
+
+ ## [0.3.0] - 2026-05-10
+
+ ### Added
+
+ - **Verifier retry loop (Pillar 3 / self-healing).** After an agent finishes,
+   GuardLoop can run a chain of verifiers against the output; on rejection it
+   appends the verifier's feedback to `RunContext.retry_feedback` and re-invokes
+   the agent, bounded by `VerifierConfig.max_retries`. All attempts share the
+   same budget (cost / tokens / time / tool calls) and the run's single
+   `asyncio.timeout`, so a verifier loop cannot bypass any guardrail.
+ - New module `guardloop.verifier` with public exports: `Verifier` (callable
+   type alias — sync or async, returning `VerifierResult`, `bool`, or `None`),
+   `VerifierResult`, `VerifierContext`, `VerifierConfig`, and `VerifierChain`.
+ - Built-in rule-based verifier factories: `non_empty()`, `matches_regex(...)`,
+   `is_json_object(required_keys=...)`.
+ - `GuardLoop(verifiers=[...], verifier_config=VerifierConfig(...))` constructor
+   parameters and `GuardLoop.add_verifier(fn)`.
+ - `RunResult` fields: `verification_passed: bool | None`,
+   `verification_attempts: int`, `verification_feedback: list[str]`.
+ - `RunContext.retry_feedback: list[str]` and `RunContext.attempt: int`.
+ - New exceptions `VerificationFailed` (`terminated_reason="verification_failed"`,
+   raised only in strict mode) and `VerifierExecutionError`
+   (`terminated_reason="verifier_error"`, raised when a verifier itself throws).
+ - OpenTelemetry: `verifier_run <name>` child spans, `agent_run` attributes
+   `guardloop.verification.passed` / `guardloop.verification.attempts`, and
+   `guardloop.verification.failed` / `.retrying` / `.exhausted` span events.
+ - No-key demo `examples/verifier_retry_loop.py`.
+
+ ### Changed
+
+ - When verification ultimately fails (retries exhausted), `RunResult.success`
+   is `False` with `terminated_reason="verification_failed"`, but `output` still
+   holds the last attempt's text — consistent with how budget/timeout stops
+   report. Set `VerifierConfig(raise_on_failure=True)` for strict behavior
+   (surfaces a `VerificationFailed` with `output=None` and details in
+   `metadata`).
+ - `pyproject.toml`: `Changelog` URL now points at this file.
+
+ ## [0.2.0] - 2026
+
+ ### Added
+
+ - Per-tool circuit breakers with `closed` / `open` / `half_open` states, a
+   global default policy plus per-tool overrides, breaker state that persists on
+   the `GuardLoop` instance across runs, and `runtime.circuit_breaker_snapshots()`
+   / `runtime.reset_circuit_breakers()`.
+ - `ctx.call_tool(...)` / `ctx.wrap_tool(...)` route tool calls through the
+   breaker before the tool-call budget is incremented.
+ - `CircuitBreakerOpen` exception and circuit-breaker OpenTelemetry attributes
+   on tool spans.
+ - No-key demo `examples/tool_circuit_breaker.py`.
+
+ ## [0.1.0] - 2026
+
+ ### Added
+
+ - Async runtime wrapper: `GuardLoop.run(agent, ...)` returns a structured
+   `RunResult`; controlled stops become `success=False` with a
+   `terminated_reason` instead of raised exceptions.
+ - Hard budget caps for cost (`Decimal`), tokens, wall-clock time, and tool
+   calls, enforced pre-flight before each LLM request.
+ - Direct wrappers for `AsyncOpenAI.responses.create` and
+   `AsyncAnthropic.messages.create` with usage accounting and pricing.
+ - OpenTelemetry spans for agent runs, LLM calls, and tool calls (core depends
+   only on `opentelemetry-api`; exporters via the `otel` extra).
+ - Public exception hierarchy: `GuardLoopError`, `BudgetExceeded`,
+   `TokenLimitExceeded`, `ToolCallLimitExceeded`, `TimeLimitExceeded`,
+   `ModelPricingMissing`, `TokenLimitMissing`; `AgentRuntime` / `AgentRuntimeError`
+   compatibility aliases.
+ - No-key demo `examples/runaway_cost_prevention.py`; packaged and published to
+   PyPI via GitHub Actions OIDC Trusted Publishing.
+
+ [0.3.0]: https://github.com/awesome-pro/guardloop/releases/tag/v0.3.0
+ [0.2.0]: https://github.com/awesome-pro/guardloop/releases/tag/v0.2.0
+ [0.1.0]: https://github.com/awesome-pro/guardloop/releases/tag/v0.1.0
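As an illustration of the loop the 0.3.0 notes describe, here is a minimal, self-contained sketch of the verify-fix-retry pattern. `Ctx`, `Verdict`, and `run_with_verifiers` are hypothetical stand-ins, not GuardLoop's API; the real runtime additionally charges every attempt against one shared budget and runs the whole loop under a single timeout.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Verdict:  # hypothetical stand-in for a VerifierResult-style verdict
    passed: bool
    feedback: str = ""


@dataclass
class Ctx:  # hypothetical stand-in for a RunContext
    retry_feedback: list = field(default_factory=list)
    attempt: int = 0


async def run_with_verifiers(agent, verifiers, max_retries=2):
    """Run `agent`, verify its output, and retry with feedback on rejection."""
    ctx = Ctx()
    output = None
    for attempt in range(max_retries + 1):
        ctx.attempt = attempt
        output = await agent(ctx)
        failed = None
        for verify in verifiers:  # fail-fast chain: first rejection wins
            verdict = verify(output)
            if not verdict.passed:
                failed = verdict
                break
        if failed is None:
            return output, True, attempt + 1  # verification passed
        ctx.retry_feedback.append(failed.feedback)
    return output, False, max_retries + 1  # retries exhausted


def no_todo(output):
    if "TODO" in output:
        return Verdict(False, "Replace the TODO placeholder.")
    return Verdict(True)


async def agent(ctx):
    # A toy agent that self-corrects once it sees verifier feedback.
    if ctx.retry_feedback:
        return "Release notes: all items done."
    return "TODO: write release notes"


output, passed, attempts = asyncio.run(run_with_verifiers(agent, [no_todo]))
# passed is True and attempts == 2: the first attempt was rejected,
# the second read the feedback and fixed it.
```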
{guardloop-0.2.0 → guardloop-0.3.0}/PKG-INFO
@@ -1,17 +1,17 @@
  Metadata-Version: 2.4
  Name: guardloop
- Version: 0.2.0
- Summary: A production runtime guardrail for AI agents: budget caps, timeouts, tool limits, and OpenTelemetry traces.
+ Version: 0.3.0
+ Summary: A production runtime guardrail for AI agents: budget caps, timeouts, tool limits, circuit breakers, verifier retries, and OpenTelemetry traces.
  Project-URL: Homepage, https://github.com/awesome-pro/guardloop
  Project-URL: Documentation, https://github.com/awesome-pro/guardloop#readme
  Project-URL: Repository, https://github.com/awesome-pro/guardloop
  Project-URL: Issues, https://github.com/awesome-pro/guardloop/issues
- Project-URL: Changelog, https://github.com/awesome-pro/guardloop/releases
+ Project-URL: Changelog, https://github.com/awesome-pro/guardloop/blob/main/CHANGELOG.md
  Author-email: awesome-pro <147910430+awesome-pro@users.noreply.github.com>
  Maintainer-email: awesome-pro <147910430+awesome-pro@users.noreply.github.com>
  License-Expression: MIT
  License-File: LICENSE
- Keywords: agentic-ai,ai-agents,ai-safety,anthropic,circuit-breaker,llm,mlops,openai,opentelemetry,runtime-guardrails
+ Keywords: agentic-ai,ai-agents,ai-safety,anthropic,circuit-breaker,llm,mlops,openai,opentelemetry,retry,runtime-guardrails,self-healing,verifier
  Classifier: Development Status :: 3 - Alpha
  Classifier: Framework :: AsyncIO
  Classifier: Intended Audience :: Developers
@@ -42,12 +42,15 @@ Description-Content-Type: text/markdown

  GuardLoop is a production runtime guardrail for AI agents. It wraps model
  clients and tools with hard budget caps, timeout control, tool-call limits, and
- per-tool circuit breakers, with OpenTelemetry traces for every protected call.
- Runaway agent loops can be stopped before they burn through money, and flaky
- tools can be cut off before an agent retries them into a bigger incident.
+ per-tool circuit breakers, re-runs an agent against verifiers until the output
+ passes, and emits OpenTelemetry traces for every protected call. Runaway agent
+ loops can be stopped before they burn through money, flaky tools can be cut off
+ before an agent retries them into a bigger incident, and confidently-wrong
+ answers get a second pass.

- The v0.2 focus is intentionally sharp: **runtime guardrails for async Python
- agents** using direct OpenAI and Anthropic wrappers plus protected tool calls.
+ The v0.3 focus is intentionally sharp: **runtime guardrails for async Python
+ agents** — direct OpenAI and Anthropic wrappers, protected tool calls, per-tool
+ circuit breakers, and a verify-fix-retry loop.

  ```python
  from guardloop import (
@@ -56,6 +59,8 @@ from guardloop import (
      CircuitBreakerConfig,
      CircuitBreakerPolicy,
      RunContext,
+     VerifierConfig,
+     is_json_object,
  )

  runtime = GuardLoop(
@@ -71,13 +76,18 @@ runtime = GuardLoop(
              recovery_timeout_seconds=30,
          )
      ),
+     verifiers=[is_json_object(required_keys=["answer"])],
+     verifier_config=VerifierConfig(max_retries=2),
  )


  async def agent(ctx: RunContext, prompt: str) -> str:
+     instructions = prompt
+     if ctx.retry_feedback:
+         instructions += "\n\nFix the previous attempt: " + "; ".join(ctx.retry_feedback)
      response = await ctx.openai.responses.create(
          model="gpt-5.2",
-         input=prompt,
+         input=instructions,
          max_output_tokens=300,
      )
      return str(response.output_text)
@@ -98,16 +108,60 @@ flowchart LR
  U["User code"] --> R["GuardLoop"]
  R --> B["BudgetController"]
  R --> CB["CircuitBreakerRegistry"]
+ R --> V["VerifierChain"]
  R --> T["OpenTelemetry spans"]
  R --> C["RunContext"]
  C --> O["Wrapped OpenAI client"]
  C --> A["Wrapped Anthropic client"]
  C --> W["Wrapped tools"]
+ V -. "feedback on retry" .-> C
  ```

+ ## Verifier Retry Loop
+
+ Agents can return confidently wrong answers. Attach verifiers — plain callables,
+ sync or async — and GuardLoop runs them after the agent finishes. On rejection
+ it feeds the verifier's feedback into `ctx.retry_feedback` and re-invokes the
+ agent, up to `VerifierConfig.max_retries` times. Every attempt shares the same
+ budget and the run's timeout, so the retry loop can never spend past a cap.
+
+ ```python
+ from guardloop import GuardLoop, RunContext, VerifierConfig, VerifierContext, VerifierResult
+
+
+ def no_todo(output: object, ctx: VerifierContext) -> VerifierResult:
+     if "TODO" in str(output):
+         return VerifierResult(passed=False, feedback="Replace the TODO placeholder.")
+     return VerifierResult(passed=True)
+
+
+ runtime = GuardLoop(verifiers=[no_todo], verifier_config=VerifierConfig(max_retries=2))
+
+
+ async def agent(ctx: RunContext, task: str) -> str:
+     # On a retry, ctx.retry_feedback holds the verifier's complaints — read it.
+     ...
+
+
+ result = await runtime.run(agent, "draft the release notes")
+ print(result.verification_passed, result.verification_attempts, result.verification_feedback)
+ ```
+
+ Built-in rule-based verifiers ship in `guardloop`: `non_empty()`,
+ `matches_regex(...)`, `is_json_object(required_keys=...)`. By default an output
+ that fails every retry comes back as `success=False` with
+ `terminated_reason="verification_failed"` but with `output` still populated;
+ set `VerifierConfig(raise_on_failure=True)` for a hard stop.
+
+ ## Project Guide
+
+ For a deeper walkthrough of what has been implemented, how the code is
+ organized, and what the next roadmap goals are, read
+ [docs/project-overview.md](docs/project-overview.md).
+
  ## Install

- After the first PyPI release is published:
+ Install from PyPI:

  ```bash
  pip install guardloop
@@ -147,6 +201,15 @@ uv run python examples/tool_circuit_breaker.py
  This demo uses a failing fake tool. GuardLoop allows the first failures,
  opens the circuit breaker, then rejects the next call without invoking the tool.

+ ```bash
+ uv run python examples/verifier_retry_loop.py
+ ```
+
+ This demo's agent first returns a bad answer (a `TODO` placeholder, then
+ malformed JSON). A verifier chain rejects it with feedback, the agent reads
+ `ctx.retry_feedback` and self-corrects, and the run ends with
+ `verification_passed: true` after three attempts.
+
  ## Live Provider Smoke Tests

  ```bash
@@ -169,20 +232,27 @@ uv run ruff format --check .
  uv run pyright
  ```

- ## v0.2 Scope
+ ## v0.3 Scope

  - Async Python runtime with `src/` package layout.
  - Hard caps for cost, tokens, time, and tool calls.
- - Per-tool circuit breakers with closed, open, and half-open states.
- - Global default breaker policy plus per-tool overrides.
- - Direct wrappers for `AsyncOpenAI.responses.create`.
- - Direct wrappers for `AsyncAnthropic.messages.create`.
- - OpenTelemetry spans for agent runs, LLM calls, and tools.
+ - Per-tool circuit breakers with closed, open, and half-open states; global
+   default breaker policy plus per-tool overrides.
+ - Verify-fix-retry loop: sync or async output verifiers, fail-fast chains,
+   built-in rule-based verifiers, feedback into `ctx.retry_feedback`, and an
+   opt-in strict mode — all attempts share one budget and the run timeout.
+ - Direct wrappers for `AsyncOpenAI.responses.create` and
+   `AsyncAnthropic.messages.create`.
+ - OpenTelemetry spans for agent runs, LLM calls, tools, and verifiers.
  - Fake-client tests and demos that do not require API keys.

  ## Roadmap

- - v0.2: per-tool circuit breakers.
- - v0.3: verifier/self-healing retry loop.
+ - v0.2: per-tool circuit breakers.
+ - v0.3: verify-fix-retry loop.
  - v0.4: LangGraph and OpenAI Agents SDK adapters.
- - v0.5: Jaeger/Phoenix trace screenshots, blog post, and GitHub release.
+ - v0.5: Jaeger/Phoenix trace screenshots, demo video, and blog post.
+ - v0.6: persistent breaker state, YAML/TOML policy, multi-model pricing, loop detection.
+ - v1.0: stable API, changelog, docs site, release checklist.
+
+ See [docs/roadmap.md](docs/roadmap.md) for details.
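The default-versus-strict failure handling described in this README can be sketched as a small pattern. The class shapes below mirror the documented field names (`success`, `output`, `terminated_reason`, `verification_passed`, `VerificationFailed`), but the logic is illustrative, not GuardLoop's source.

```python
from dataclasses import dataclass
from typing import Any, Optional


class VerificationFailed(Exception):
    """Raised only in strict mode (raise_on_failure=True)."""


@dataclass
class RunResult:  # subset of the documented RunResult fields
    success: bool
    output: Any
    terminated_reason: Optional[str]
    verification_passed: Optional[bool]


def finish_exhausted(last_output, raise_on_failure: bool) -> RunResult:
    """What a run reports once verification retries are exhausted."""
    if raise_on_failure:
        # Strict mode: surface an exception; the output is withheld.
        raise VerificationFailed("verification failed after all retries")
    # Default: a soft stop; the last attempt's output is kept, success is False.
    return RunResult(
        success=False,
        output=last_output,
        terminated_reason="verification_failed",
        verification_passed=False,
    )


result = finish_exhausted("last draft", raise_on_failure=False)
# result.success is False, but result.output still holds "last draft"
```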
guardloop-0.3.0/README.md
@@ -0,0 +1,218 @@
+ # GuardLoop
+
+ GuardLoop is a production runtime guardrail for AI agents. It wraps model
+ clients and tools with hard budget caps, timeout control, tool-call limits, and
+ per-tool circuit breakers, re-runs an agent against verifiers until the output
+ passes, and emits OpenTelemetry traces for every protected call. Runaway agent
+ loops can be stopped before they burn through money, flaky tools can be cut off
+ before an agent retries them into a bigger incident, and confidently-wrong
+ answers get a second pass.
+
+ The v0.3 focus is intentionally sharp: **runtime guardrails for async Python
+ agents** — direct OpenAI and Anthropic wrappers, protected tool calls, per-tool
+ circuit breakers, and a verify-fix-retry loop.
+
+ ```python
+ from guardloop import (
+     GuardLoop,
+     BudgetConfig,
+     CircuitBreakerConfig,
+     CircuitBreakerPolicy,
+     RunContext,
+     VerifierConfig,
+     is_json_object,
+ )
+
+ runtime = GuardLoop(
+     budget=BudgetConfig(
+         cost_limit_usd="0.10",
+         token_limit=10_000,
+         time_limit_seconds=60,
+         tool_call_limit=20,
+     ),
+     circuit_breakers=CircuitBreakerConfig(
+         default=CircuitBreakerPolicy(
+             failure_threshold=3,
+             recovery_timeout_seconds=30,
+         )
+     ),
+     verifiers=[is_json_object(required_keys=["answer"])],
+     verifier_config=VerifierConfig(max_retries=2),
+ )
+
+
+ async def agent(ctx: RunContext, prompt: str) -> str:
+     instructions = prompt
+     if ctx.retry_feedback:
+         instructions += "\n\nFix the previous attempt: " + "; ".join(ctx.retry_feedback)
+     response = await ctx.openai.responses.create(
+         model="gpt-5.2",
+         input=instructions,
+         max_output_tokens=300,
+     )
+     return str(response.output_text)
+
+
+ result = await runtime.run(agent, "research agent runtime safety")
+ print(result.model_dump_json(indent=2))
+ ```
+
+ ## Why This Exists
+
+ Agents are loops around probabilistic systems. When they go wrong, they can call
+ the same model or tool repeatedly, spend unexpected money, and fail without a
+ clear trace. GuardLoop puts an explicit execution layer around that loop:
+
+ ```mermaid
+ flowchart LR
+ U["User code"] --> R["GuardLoop"]
+ R --> B["BudgetController"]
+ R --> CB["CircuitBreakerRegistry"]
+ R --> V["VerifierChain"]
+ R --> T["OpenTelemetry spans"]
+ R --> C["RunContext"]
+ C --> O["Wrapped OpenAI client"]
+ C --> A["Wrapped Anthropic client"]
+ C --> W["Wrapped tools"]
+ V -. "feedback on retry" .-> C
+ ```
+
+ ## Verifier Retry Loop
+
+ Agents can return confidently wrong answers. Attach verifiers — plain callables,
+ sync or async — and GuardLoop runs them after the agent finishes. On rejection
+ it feeds the verifier's feedback into `ctx.retry_feedback` and re-invokes the
+ agent, up to `VerifierConfig.max_retries` times. Every attempt shares the same
+ budget and the run's timeout, so the retry loop can never spend past a cap.
+
+ ```python
+ from guardloop import GuardLoop, RunContext, VerifierConfig, VerifierContext, VerifierResult
+
+
+ def no_todo(output: object, ctx: VerifierContext) -> VerifierResult:
+     if "TODO" in str(output):
+         return VerifierResult(passed=False, feedback="Replace the TODO placeholder.")
+     return VerifierResult(passed=True)
+
+
+ runtime = GuardLoop(verifiers=[no_todo], verifier_config=VerifierConfig(max_retries=2))
+
+
+ async def agent(ctx: RunContext, task: str) -> str:
+     # On a retry, ctx.retry_feedback holds the verifier's complaints — read it.
+     ...
+
+
+ result = await runtime.run(agent, "draft the release notes")
+ print(result.verification_passed, result.verification_attempts, result.verification_feedback)
+ ```
+
+ Built-in rule-based verifiers ship in `guardloop`: `non_empty()`,
+ `matches_regex(...)`, `is_json_object(required_keys=...)`. By default an output
+ that fails every retry comes back as `success=False` with
+ `terminated_reason="verification_failed"` but with `output` still populated;
+ set `VerifierConfig(raise_on_failure=True)` for a hard stop.
+
+ ## Project Guide
+
+ For a deeper walkthrough of what has been implemented, how the code is
+ organized, and what the next roadmap goals are, read
+ [docs/project-overview.md](docs/project-overview.md).
+
+ ## Install
+
+ Install from PyPI:
+
+ ```bash
+ pip install guardloop
+ ```
+
+ For local development:
+
+ ```bash
+ uv sync
+ ```
+
+ Optional OpenTelemetry exporters are available through the `otel` extra:
+
+ ```bash
+ pip install "guardloop[otel]"
+ ```
+
+ For local development with the extra:
+
+ ```bash
+ uv sync --extra otel
+ ```
+
+ ## Try the No-Key Demo
+
+ ```bash
+ uv run python examples/runaway_cost_prevention.py
+ ```
+
+ The demo uses a fake OpenAI-compatible client and intentionally loops forever.
+ GuardLoop stops it when the next model request would exceed the cost cap.
+
+ ```bash
+ uv run python examples/tool_circuit_breaker.py
+ ```
+
+ This demo uses a failing fake tool. GuardLoop allows the first failures,
+ opens the circuit breaker, then rejects the next call without invoking the tool.
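The demo behavior described above (allow the initial failures, open, then reject without invoking the tool) follows the classic breaker state machine. Below is a minimal sketch of that pattern with hypothetical names; it has none of GuardLoop's per-tool registry, `CircuitBreakerOpen` exception, or telemetry, and uses a plain `RuntimeError` for rejections.

```python
import time


class Breaker:
    """Minimal circuit breaker: closed -> open -> half_open -> closed/open."""

    def __init__(self, failure_threshold=3, recovery_timeout_seconds=30.0):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout_seconds
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, tool, *args, **kwargs):
        if self.state == "open":
            if time.monotonic() - self.opened_at < self.recovery_timeout:
                # Rejected up front: the tool is never invoked.
                raise RuntimeError("circuit open: call rejected")
            self.state = "half_open"  # probe one call after the recovery timeout
        try:
            result = tool(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.state == "half_open" or self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = time.monotonic()
            raise
        self.state = "closed"  # a success closes the breaker and resets the count
        self.failures = 0
        return result


calls = {"n": 0}


def flaky():
    calls["n"] += 1
    raise ValueError("tool crashed")


breaker = Breaker(failure_threshold=3, recovery_timeout_seconds=30)
for _ in range(3):
    try:
        breaker.call(flaky)
    except ValueError:
        pass  # real failures are counted and re-raised
# After three failures the breaker is open; the next call is rejected
# before the tool runs.
```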
+
+ ```bash
+ uv run python examples/verifier_retry_loop.py
+ ```
+
+ This demo's agent first returns a bad answer (a `TODO` placeholder, then
+ malformed JSON). A verifier chain rejects it with feedback, the agent reads
+ `ctx.retry_feedback` and self-corrects, and the run ends with
+ `verification_passed: true` after three attempts.
+
+ ## Live Provider Smoke Tests
+
+ ```bash
+ export OPENAI_API_KEY="..."
+ export ANTHROPIC_API_KEY="..."
+
+ uv run python examples/live_openai_basic.py
+ uv run python examples/live_anthropic_basic.py
+ ```
+
+ Both live examples can be customized with `OPENAI_MODEL` or `ANTHROPIC_MODEL`.
+
+ ## Quality Gates
+
+ ```bash
+ uv run pytest
+ uv run pytest --cov=guardloop
+ uv run ruff check .
+ uv run ruff format --check .
+ uv run pyright
+ ```
+
+ ## v0.3 Scope
+
+ - Async Python runtime with `src/` package layout.
+ - Hard caps for cost, tokens, time, and tool calls.
+ - Per-tool circuit breakers with closed, open, and half-open states; global
+   default breaker policy plus per-tool overrides.
+ - Verify-fix-retry loop: sync or async output verifiers, fail-fast chains,
+   built-in rule-based verifiers, feedback into `ctx.retry_feedback`, and an
+   opt-in strict mode — all attempts share one budget and the run timeout.
+ - Direct wrappers for `AsyncOpenAI.responses.create` and
+   `AsyncAnthropic.messages.create`.
+ - OpenTelemetry spans for agent runs, LLM calls, tools, and verifiers.
+ - Fake-client tests and demos that do not require API keys.
+
+ ## Roadmap
+
+ - v0.2: per-tool circuit breakers. ✅
+ - v0.3: verify-fix-retry loop. ✅
+ - v0.4: LangGraph and OpenAI Agents SDK adapters.
+ - v0.5: Jaeger/Phoenix trace screenshots, demo video, and blog post.
+ - v0.6: persistent breaker state, YAML/TOML policy, multi-model pricing, loop detection.
+ - v1.0: stable API, changelog, docs site, release checklist.
+
+ See [docs/roadmap.md](docs/roadmap.md) for details.
{guardloop-0.2.0 → guardloop-0.3.0}/docs/design.md
@@ -1,4 +1,4 @@
- # GuardLoop v0.2 Design
+ # GuardLoop Design

  GuardLoop is a wrapper, not an agent framework. A user passes an async agent
  callable to `runtime.run()`. The runtime creates a `RunContext` containing
@@ -40,9 +40,38 @@ rejections do not count as tool failures.
  Built-in prices are defaults, not truth forever. Callers can pass
  `ModelPricing` entries to override or add models as providers update pricing.

+ ## Verifier Retry Loop
+
+ Verifiers are stateless callables (sync or async) that judge an agent's output.
+ A `VerifierChain` runs them in order, fail-fast: the first failing verdict wins.
+ Anything not a `VerifierResult` is normalized (`True`/`None` -> passed,
+ `False` -> failed). If a verifier itself raises, that is a verifier bug, not the
+ agent's: the runtime surfaces it as `VerifierExecutionError`
+ (`terminated_reason="verifier_error"`) and does not retry.
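The normalization rule above can be written out explicitly. This is a sketch under the documented contract (verifiers return `VerifierResult`, `bool`, or `None`), not GuardLoop's source; the local `VerifierResult` dataclass and the `TypeError` for out-of-contract values are this sketch's assumptions.

```python
from dataclasses import dataclass


@dataclass
class VerifierResult:  # field shape as described in this design doc
    passed: bool
    feedback: str = ""


def normalize(verdict: object) -> VerifierResult:
    """Coerce a verifier's raw return value into a VerifierResult.

    True/None mean "passed"; False means "failed" with no feedback.
    A VerifierResult passes through unchanged.
    """
    if isinstance(verdict, VerifierResult):
        return verdict
    if verdict is False:
        return VerifierResult(passed=False)
    if verdict is True or verdict is None:
        return VerifierResult(passed=True)
    # The contract allows only the three types above; rejecting anything
    # else is this sketch's choice.
    raise TypeError(f"unsupported verifier return type: {type(verdict)!r}")
```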
+
+ The runtime owns the loop, not the agent. One `BudgetController` and one
+ `RunContext` flow through every attempt; the only mutation between attempts is
+ appending the failing verifier's feedback to `ctx.retry_feedback` (and bumping
+ `ctx.attempt`). The agent is re-invoked with the same `*args`/`**kwargs` and is
+ expected to read `ctx.retry_feedback` if it wants to self-correct. Because the
+ budget is shared and the whole loop sits inside the run's single
+ `asyncio.timeout()`, a verifier loop can never spend past a cap or outlive the
+ time limit.
+
+ When retries are exhausted: by default the runtime returns
+ `RunResult(success=False, terminated_reason="verification_failed",
+ verification_passed=False)` with `output` still set to the last attempt — the
+ agent produced an answer, it just isn't trusted. With
+ `VerifierConfig(raise_on_failure=True)` the runtime instead surfaces a
+ `VerificationFailed` (same `terminated_reason`, `output=None`, attempt count and
+ feedback in `metadata`).
+
  ## Telemetry

  Provider wrappers emit OpenTelemetry spans through a small conventions module.
  This keeps GenAI semantic convention names isolated while the standard evolves.
  Tool spans also include circuit breaker state, failure count, and whether a
- call was blocked.
+ call was blocked. Each verifier runs in a `verifier_run <name>` child span; the
+ root `agent_run` span carries `guardloop.verification.passed` /
+ `guardloop.verification.attempts` plus `guardloop.verification.failed`,
+ `.retrying`, and `.exhausted` events.