PyPI - loopgain - Versions diffs - 0.5.0__tar.gz → 0.5.2__tar.gz - Mend

loopgain 0.5.0.tar.gz → 0.5.2.tar.gz


Files changed (34)

{loopgain-0.5.0 → loopgain-0.5.2}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: loopgain
-Version: 0.5.0
+Version: 0.5.2
 Summary: An open-source cost controller for AI agent loops. Stops a loop when it has actually converged and rolls back before it degrades — replacing the max_iterations guess with a real-time loop-gain (Aβ) monitor with five named threshold bands and best-so-far rollback.
 Author-email: Dave Fitzsimmons <hello@loopgain.ai>
 License: Apache-2.0
@@ -53,19 +53,24 @@ Dynamic: license-file
 AI agent loops waste time and money when they don't know when to stop. LoopGain measures the loop in real time and stops it the moment it has actually converged — and rolls back before it degrades — instead of running to a fixed `max_iterations` cap.
-> **Across 2,000 paired trials over 10 cells**, LoopGain reduced total API spend by **92.8%** vs `max_iter=20`, dropped median wall-clock latency from 30.9s to 2.1s (**~15×**), preserved output quality on natural-distribution workloads (W1–W4: judge winrate 0.50–0.63, CI excluding null on most cells), and improved output quality on engineered-failure workloads (W5: winrate 0.92–0.95 across three adapters). Weighted-average pairwise preference for LG vs B20 across 1,800 judge comparisons: **0.678**. Zero of six kill criteria fired.
+> **Benchmark — 2,000 paired trials across 10 workload cells** ([run it yourself](https://github.com/loopgain-ai/loopgain-bench)):
+>
+> - **92.8% less API spend** than `max_iter=20` — $27.05 → $1.94 in total benchmark spend
+> - **~15× faster** — median wall-clock per trial 30.9s → 2.1s
+> - **Quality preserved, not traded for speed** — judge win-rate 0.50–0.63 on natural-distribution workloads (W1–W4, CI excluding null on most cells), 0.92–0.95 on engineered-failure workloads (W5); 0.678 weighted preference across 1,800 judge comparisons
+> - **Zero of six kill criteria fired** (all six pre-registered with thresholds before the run)
+**Honest limits, up front:** LoopGain detects *convergence, not correctness* — it knows when more iterations won't help, not whether the answer is right, and it's only as good as the verifier behind your error signal. [The full list of what it can't do →](#what-loopgain-does-and-doesnt-guarantee)
 [![PyPI](https://img.shields.io/pypi/v/loopgain.svg)](https://pypi.org/project/loopgain/)
 [![Python](https://img.shields.io/pypi/pyversions/loopgain.svg)](https://pypi.org/project/loopgain/)
 [![License](https://img.shields.io/badge/license-Apache_2.0-blue.svg)](LICENSE)
-[![Tests](https://img.shields.io/badge/tests-200%2B_passing-brightgreen.svg)](tests/)
+[![Tests](https://img.shields.io/badge/tests-190%2B_passing-brightgreen.svg)](tests/)
 **Home:** [loopgain.ai](https://loopgain.ai)
 Works for **any iterative AI workflow with a measurable error signal** — verify-revise loops, refinement passes, tool-use retry chains, RAG with self-correction, code-gen with linter feedback, multi-step reasoning loops. **Pre-built adapters for [LangGraph](#langgraph), [CrewAI](#crewai), [AutoGen](#autogen-v04), [LangChain](#langchain), [OpenAI Agents SDK](#openai-agents-sdk), and [Claude Agent SDK](#claude-agent-sdk)**; drop-in via the raw API for any custom stack. Pure Python, no runtime dependencies.
-**Keywords:** AI agent loops · agentic AI · infinite loop detection · divergence detection · early stopping · convergence · agent orchestration · LLM stability · generator-verifier-reviser · feedback-loop control.
 ---
 ## Why
@@ -176,13 +181,34 @@ This transforms divergence detection from "abort with garbage" into "abort with
 ---
+## See it across a fleet (optional dashboard)
+The library is the whole product locally — telemetry is opt-in and self-hostable. If you want a fleet view of every loop's stability, cost, and rollbacks across a team, there's a hosted dashboard fed by the [telemetry receiver](https://github.com/loopgain-ai/telemetry-receiver):
+[![LoopGain dashboard — loop health, convergence, waste, and rollbacks across a fleet](https://loopgain.ai/dashboard-demo.png)](https://dashboard.loopgain.ai/demo)
+**[Open the live demo →](https://dashboard.loopgain.ai/demo)** — no signup, real benchmark data.
+The receiver and dashboard are both open-source — self-host to keep telemetry entirely under your control.
+### Repositories
+| Repo | What it is |
+| --- | --- |
+| [**loopgain**](https://github.com/loopgain-ai/loopgain) | This library — the Apache-2.0 control loop (you are here) |
+| [**telemetry-receiver**](https://github.com/loopgain-ai/telemetry-receiver) | Cloudflare Worker that ingests anonymized loop telemetry |
+| [**dashboard**](https://github.com/loopgain-ai/dashboard) | The fleet dashboard — self-hostable |
+| [**loopgain-bench**](https://github.com/loopgain-ai/loopgain-bench) | The reproducible 2,000-trial benchmark behind the numbers above |
+---
 ## What LoopGain does and doesn't guarantee
-LoopGain saves money by stopping a loop once it stops improving — fewer iterations, fewer tokens. In our [public benchmark](https://github.com/loopgain-ai/loopgain-bench), that was a **92.8% median cut in API spend** vs `max_iterations=20`, with output quality preserved. Two honest limits:
+LoopGain saves money by stopping a loop once it stops improving — fewer iterations, fewer tokens. In our [public benchmark](https://github.com/loopgain-ai/loopgain-bench), that was a **92.8% cut in total API spend** vs `max_iterations=20`, with output quality preserved. Two honest limits:
 - **Savings depend on your workload.** Loops that usually succeed fast save the most (~96%); adversarial, failure-prone loops save less (~78–84%). The headline is a blend — run the benchmark on your own loops before quoting a number.
 - **LoopGain detects convergence, not correctness.** It stops when your error signal stops improving — which means more iterations won't help, *not* that the loop succeeded. On the benchmark this preserved quality (it rarely stopped early on a worse output; false-stop rate ≤4.5%), but a loop can stall with the error still above zero — a plateau at, say, 2 failing tests. So check `result.best_error` (or your own pass/fail) before you trust the output: if it plateaued short of your target, that's a quality gap LoopGain can't see, and a false stop that forces a rerun is the one way it eats into the savings. LoopGain decides *when to stop*; you decide *whether the answer is good enough*.
-- **LoopGain is only as right as your verifier.** It acts on the error signal you give it. If your verifier reports zero errors, LoopGain trusts that and stops — so a verifier with blind spots can report success on an answer that is still wrong, and LoopGain will confidently stop there. This is not the plateau case above: the error reads zero and the loop looks like a clean success, so neither LoopGain nor its convergence signal can flag it. The quality of the stop is bounded by the quality of the check behind your error signal. Pair LoopGain with the strongest verifier you can afford at the stop — executable tests over a sampled subset, a schema or type check over a vibe, a held-out check the loop didn't optimize against.
+- **LoopGain is only as right as your verifier.** It acts on the error signal you give it. If your verifier reports zero errors, LoopGain trusts that and stops — so a verifier with blind spots can report success on an answer that is still wrong, and LoopGain will confidently stop there. This is not the plateau case above: the error reads zero and the loop looks like a clean success, so neither LoopGain nor its convergence signal can flag it. The quality of the stop is bounded by the quality of the check behind your error signal. We measured this on the benchmark's code-gen workload: **4.5% of converged runs (16/355) passed every check the loop ran but failed the full held-out test suite** — and that's a floor, not a ceiling, because the in-loop verifier there was strong; a weaker verifier exposes more. (Distinct from the ≤4.5% false-stop rate above — the numbers coincide, the failure modes don't.) Pair LoopGain with the strongest verifier you can afford at the stop — executable tests over a sampled subset, a schema or type check over a vibe, a held-out check the loop didn't optimize against. **[How to design a strong verifier](https://loopgain.ai/blog/posts/how-to-design-a-strong-verifier/)** is a field guide to exactly this.
 ---
@@ -246,9 +272,9 @@ python3 -c "import keyring; keyring.set_password('loopgain', 'telemetry', input(
 # Then in code: keyring.get_password('loopgain', 'telemetry')
 ```
-What is sent: state transitions, Aβ summary (min/max/median), gain margin, rollback flag, iterations used, savings, library version, optional opaque `workload_id`, threshold config, hour-bucketed timestamp.
+What is sent: state transitions, Aβ summary (min/max/median), rollback flag, iterations used, savings, library version, optional opaque `workload_id`, threshold config, hour-bucketed timestamp — and, unless you pass `include_per_iteration=False`, a length-capped per-iteration trajectory (smoothed Aβ values and numeric error magnitudes; this is what drives the dashboard's convergence-profile scrubbing).
-**What is NEVER sent: prompts, completions, error contents, output buffer, individual Aβ values, or any customer identity beyond the bearer token.** Privacy contract is enforced by the payload-shape unit tests in `tests/test_telemetry.py`.
+**What is NEVER sent: prompts, completions, error contents, the output buffer, or any customer identity beyond the bearer token.** Numeric error *magnitudes* are sent (they're the loop-gain signal); error *contents* never are. Privacy contract is enforced by the payload-shape unit tests in `tests/test_telemetry.py`.
 The hosted endpoint at `telemetry.loopgain.ai` is one acceptable destination. The [receiver](https://github.com/loopgain-ai/telemetry-receiver) and [dashboard](https://github.com/loopgain-ai/dashboard) are both open-source — self-host to keep telemetry fully under your control.
@@ -507,7 +533,7 @@ This is alpha software. The API may break before 1.0 if production usage surface
 LoopGain applies the **Barkhausen stability criterion** (Heinrich Barkhausen, 1921 — the foundational result on when feedback amplifiers oscillate) to AI agent feedback loops. The criterion was originally a way to predict whether an electronic oscillator would sustain oscillation; it turns out to map cleanly onto any feedback loop you can attach an error signal to.
-The cleanest summary: an iterative AI loop with a measurable error signal is a feedback system. The ratio `E(n) / E(n-1)` is its empirical loop gain. The Barkhausen result tells you that loop gain less than 1 converges, equal to 1 oscillates, greater than 1 diverges. LoopGain operationalizes this: classifies the loop's current band, decides what to do, and tells you when you'll converge.
+The cleanest summary: an iterative AI loop with a measurable error signal is a feedback system. The ratio `E(n) / E(n-1)` is its empirical loop gain. The Barkhausen result tells you that loop gain less than 1 converges, equal to 1 oscillates, greater than 1 diverges. LoopGain operationalizes this: classifies the loop's current band, and decides what to do — stop, continue, or roll back to the best output seen so far.
 Loop types this applies to in practice:
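
Note on the hunk above: the ratio `E(n) / E(n-1)` and the less-than, equal-to, greater-than 1 reading can be illustrated with a few lines of Python. This is only a sketch of the idea, not LoopGain's implementation. The function names, the three-way labels, and the `tol` tolerance around 1 are assumptions made for illustration (the package itself describes five named threshold bands, which are not shown in this diff).

```python
from typing import List, Optional


def loop_gains(errors: List[float]) -> List[Optional[float]]:
    """Empirical loop gain E(n) / E(n-1) for each consecutive pair of errors.

    Returns None for a step whose previous error is 0, where the ratio
    is undefined.
    """
    gains: List[Optional[float]] = []
    for prev, cur in zip(errors, errors[1:]):
        gains.append(cur / prev if prev > 0 else None)
    return gains


def classify(gain: Optional[float], tol: float = 0.05) -> str:
    """Three-way reading of a single gain value (tol is a hypothetical knob)."""
    if gain is None:
        return "undefined"
    if gain < 1 - tol:
        return "converging"
    if gain > 1 + tol:
        return "diverging"
    return "oscillating"


def best_so_far(errors: List[float]) -> int:
    """Index of the lowest error seen, i.e. the iteration a rollback would keep."""
    return min(range(len(errors)), key=errors.__getitem__)


# Example error trajectory: improves, stalls, then degrades.
errors = [8.0, 4.0, 2.0, 2.0, 3.0]
gains = loop_gains(errors)
labels = [classify(g) for g in gains]
```

On this trajectory the last step has a gain above 1, so a controller following the README's description would roll back to the iteration returned by `best_so_far` rather than keep the degraded output.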

{loopgain-0.5.0 → loopgain-0.5.2}/README.md RENAMED

@@ -4,19 +4,24 @@
 AI agent loops waste time and money when they don't know when to stop. LoopGain measures the loop in real time and stops it the moment it has actually converged — and rolls back before it degrades — instead of running to a fixed `max_iterations` cap.
-> **Across 2,000 paired trials over 10 cells**, LoopGain reduced total API spend by **92.8%** vs `max_iter=20`, dropped median wall-clock latency from 30.9s to 2.1s (**~15×**), preserved output quality on natural-distribution workloads (W1–W4: judge winrate 0.50–0.63, CI excluding null on most cells), and improved output quality on engineered-failure workloads (W5: winrate 0.92–0.95 across three adapters). Weighted-average pairwise preference for LG vs B20 across 1,800 judge comparisons: **0.678**. Zero of six kill criteria fired.
+> **Benchmark — 2,000 paired trials across 10 workload cells** ([run it yourself](https://github.com/loopgain-ai/loopgain-bench)):
+>
+> - **92.8% less API spend** than `max_iter=20` — $27.05 → $1.94 in total benchmark spend
+> - **~15× faster** — median wall-clock per trial 30.9s → 2.1s
+> - **Quality preserved, not traded for speed** — judge win-rate 0.50–0.63 on natural-distribution workloads (W1–W4, CI excluding null on most cells), 0.92–0.95 on engineered-failure workloads (W5); 0.678 weighted preference across 1,800 judge comparisons
+> - **Zero of six kill criteria fired** (all six pre-registered with thresholds before the run)
+**Honest limits, up front:** LoopGain detects *convergence, not correctness* — it knows when more iterations won't help, not whether the answer is right, and it's only as good as the verifier behind your error signal. [The full list of what it can't do →](#what-loopgain-does-and-doesnt-guarantee)
 [![PyPI](https://img.shields.io/pypi/v/loopgain.svg)](https://pypi.org/project/loopgain/)
 [![Python](https://img.shields.io/pypi/pyversions/loopgain.svg)](https://pypi.org/project/loopgain/)
 [![License](https://img.shields.io/badge/license-Apache_2.0-blue.svg)](LICENSE)
-[![Tests](https://img.shields.io/badge/tests-200%2B_passing-brightgreen.svg)](tests/)
+[![Tests](https://img.shields.io/badge/tests-190%2B_passing-brightgreen.svg)](tests/)
 **Home:** [loopgain.ai](https://loopgain.ai)
 Works for **any iterative AI workflow with a measurable error signal** — verify-revise loops, refinement passes, tool-use retry chains, RAG with self-correction, code-gen with linter feedback, multi-step reasoning loops. **Pre-built adapters for [LangGraph](#langgraph), [CrewAI](#crewai), [AutoGen](#autogen-v04), [LangChain](#langchain), [OpenAI Agents SDK](#openai-agents-sdk), and [Claude Agent SDK](#claude-agent-sdk)**; drop-in via the raw API for any custom stack. Pure Python, no runtime dependencies.
-**Keywords:** AI agent loops · agentic AI · infinite loop detection · divergence detection · early stopping · convergence · agent orchestration · LLM stability · generator-verifier-reviser · feedback-loop control.
 ---
 ## Why
@@ -127,13 +132,34 @@ This transforms divergence detection from "abort with garbage" into "abort with
 ---
+## See it across a fleet (optional dashboard)
+The library is the whole product locally — telemetry is opt-in and self-hostable. If you want a fleet view of every loop's stability, cost, and rollbacks across a team, there's a hosted dashboard fed by the [telemetry receiver](https://github.com/loopgain-ai/telemetry-receiver):
+[![LoopGain dashboard — loop health, convergence, waste, and rollbacks across a fleet](https://loopgain.ai/dashboard-demo.png)](https://dashboard.loopgain.ai/demo)
+**[Open the live demo →](https://dashboard.loopgain.ai/demo)** — no signup, real benchmark data.
+The receiver and dashboard are both open-source — self-host to keep telemetry entirely under your control.
+### Repositories
+| Repo | What it is |
+| --- | --- |
+| [**loopgain**](https://github.com/loopgain-ai/loopgain) | This library — the Apache-2.0 control loop (you are here) |
+| [**telemetry-receiver**](https://github.com/loopgain-ai/telemetry-receiver) | Cloudflare Worker that ingests anonymized loop telemetry |
+| [**dashboard**](https://github.com/loopgain-ai/dashboard) | The fleet dashboard — self-hostable |
+| [**loopgain-bench**](https://github.com/loopgain-ai/loopgain-bench) | The reproducible 2,000-trial benchmark behind the numbers above |
+---
 ## What LoopGain does and doesn't guarantee
-LoopGain saves money by stopping a loop once it stops improving — fewer iterations, fewer tokens. In our [public benchmark](https://github.com/loopgain-ai/loopgain-bench), that was a **92.8% median cut in API spend** vs `max_iterations=20`, with output quality preserved. Two honest limits:
+LoopGain saves money by stopping a loop once it stops improving — fewer iterations, fewer tokens. In our [public benchmark](https://github.com/loopgain-ai/loopgain-bench), that was a **92.8% cut in total API spend** vs `max_iterations=20`, with output quality preserved. Two honest limits:
 - **Savings depend on your workload.** Loops that usually succeed fast save the most (~96%); adversarial, failure-prone loops save less (~78–84%). The headline is a blend — run the benchmark on your own loops before quoting a number.
 - **LoopGain detects convergence, not correctness.** It stops when your error signal stops improving — which means more iterations won't help, *not* that the loop succeeded. On the benchmark this preserved quality (it rarely stopped early on a worse output; false-stop rate ≤4.5%), but a loop can stall with the error still above zero — a plateau at, say, 2 failing tests. So check `result.best_error` (or your own pass/fail) before you trust the output: if it plateaued short of your target, that's a quality gap LoopGain can't see, and a false stop that forces a rerun is the one way it eats into the savings. LoopGain decides *when to stop*; you decide *whether the answer is good enough*.
-- **LoopGain is only as right as your verifier.** It acts on the error signal you give it. If your verifier reports zero errors, LoopGain trusts that and stops — so a verifier with blind spots can report success on an answer that is still wrong, and LoopGain will confidently stop there. This is not the plateau case above: the error reads zero and the loop looks like a clean success, so neither LoopGain nor its convergence signal can flag it. The quality of the stop is bounded by the quality of the check behind your error signal. Pair LoopGain with the strongest verifier you can afford at the stop — executable tests over a sampled subset, a schema or type check over a vibe, a held-out check the loop didn't optimize against.
+- **LoopGain is only as right as your verifier.** It acts on the error signal you give it. If your verifier reports zero errors, LoopGain trusts that and stops — so a verifier with blind spots can report success on an answer that is still wrong, and LoopGain will confidently stop there. This is not the plateau case above: the error reads zero and the loop looks like a clean success, so neither LoopGain nor its convergence signal can flag it. The quality of the stop is bounded by the quality of the check behind your error signal. We measured this on the benchmark's code-gen workload: **4.5% of converged runs (16/355) passed every check the loop ran but failed the full held-out test suite** — and that's a floor, not a ceiling, because the in-loop verifier there was strong; a weaker verifier exposes more. (Distinct from the ≤4.5% false-stop rate above — the numbers coincide, the failure modes don't.) Pair LoopGain with the strongest verifier you can afford at the stop — executable tests over a sampled subset, a schema or type check over a vibe, a held-out check the loop didn't optimize against. **[How to design a strong verifier](https://loopgain.ai/blog/posts/how-to-design-a-strong-verifier/)** is a field guide to exactly this.
 ---
@@ -197,9 +223,9 @@ python3 -c "import keyring; keyring.set_password('loopgain', 'telemetry', input(
 # Then in code: keyring.get_password('loopgain', 'telemetry')
 ```
-What is sent: state transitions, Aβ summary (min/max/median), gain margin, rollback flag, iterations used, savings, library version, optional opaque `workload_id`, threshold config, hour-bucketed timestamp.
+What is sent: state transitions, Aβ summary (min/max/median), rollback flag, iterations used, savings, library version, optional opaque `workload_id`, threshold config, hour-bucketed timestamp — and, unless you pass `include_per_iteration=False`, a length-capped per-iteration trajectory (smoothed Aβ values and numeric error magnitudes; this is what drives the dashboard's convergence-profile scrubbing).
-**What is NEVER sent: prompts, completions, error contents, output buffer, individual Aβ values, or any customer identity beyond the bearer token.** Privacy contract is enforced by the payload-shape unit tests in `tests/test_telemetry.py`.
+**What is NEVER sent: prompts, completions, error contents, the output buffer, or any customer identity beyond the bearer token.** Numeric error *magnitudes* are sent (they're the loop-gain signal); error *contents* never are. Privacy contract is enforced by the payload-shape unit tests in `tests/test_telemetry.py`.
 The hosted endpoint at `telemetry.loopgain.ai` is one acceptable destination. The [receiver](https://github.com/loopgain-ai/telemetry-receiver) and [dashboard](https://github.com/loopgain-ai/dashboard) are both open-source — self-host to keep telemetry fully under your control.
@@ -458,7 +484,7 @@ This is alpha software. The API may break before 1.0 if production usage surface
 LoopGain applies the **Barkhausen stability criterion** (Heinrich Barkhausen, 1921 — the foundational result on when feedback amplifiers oscillate) to AI agent feedback loops. The criterion was originally a way to predict whether an electronic oscillator would sustain oscillation; it turns out to map cleanly onto any feedback loop you can attach an error signal to.
-The cleanest summary: an iterative AI loop with a measurable error signal is a feedback system. The ratio `E(n) / E(n-1)` is its empirical loop gain. The Barkhausen result tells you that loop gain less than 1 converges, equal to 1 oscillates, greater than 1 diverges. LoopGain operationalizes this: classifies the loop's current band, decides what to do, and tells you when you'll converge.
+The cleanest summary: an iterative AI loop with a measurable error signal is a feedback system. The ratio `E(n) / E(n-1)` is its empirical loop gain. The Barkhausen result tells you that loop gain less than 1 converges, equal to 1 oscillates, greater than 1 diverges. LoopGain operationalizes this: classifies the loop's current band, and decides what to do — stop, continue, or roll back to the best output seen so far.
 Loop types this applies to in practice:
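
Note on the README hunks above: the headline percentages can be re-derived from the raw figures quoted in the same text ($27.05 and $1.94 total spend, 30.9s and 2.1s median latency, 16 of 355 converged runs). The short check below uses only those quoted numbers; the helper name `pct_reduction` is made up for this example.

```python
def pct_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Total benchmark API spend, max_iter=20 baseline vs LoopGain.
spend_cut = pct_reduction(27.05, 1.94)

# Median wall-clock per trial, baseline vs LoopGain.
speedup = 30.9 / 2.1

# Converged code-gen runs that failed the held-out test suite.
held_out_fail_pct = 16 / 355 * 100

print(f"spend cut: {spend_cut:.1f}%")
print(f"speedup: {speedup:.1f}x")
print(f"held-out failures: {held_out_fail_pct:.1f}%")
```

The spend figure rounds to 92.8%, which matches the 0.5.2 wording "cut in total API spend" (a ratio of totals) rather than the 0.5.0 wording "median cut". The speedup works out to roughly 14.7, consistent with the rounded "~15×".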

{loopgain-0.5.0 → loopgain-0.5.2}/loopgain/_version.py RENAMED

@@ -7,4 +7,4 @@ from here so the value never drifts between ``__version__`` and the
 ``pyproject.toml``) for each release.
 """
-__version__ = "0.5.0"
+__version__ = "0.5.2"

{loopgain-0.5.0 → loopgain-0.5.2}/loopgain/telemetry.py RENAMED

@@ -2,7 +2,7 @@
 Opt-in. Sends a single POST per loop run to a customer-configured endpoint.
 Privacy: only structural statistics — Aβ values, error magnitudes, state
-transitions, gain margin, rollback flag, library version, optional opaque
+transitions, rollback flag, library version, optional opaque
 workload/classification labels. Never sends prompts, completions, error
 contents (the textual content of failures), customer identity beyond the
 bearer token, or best-so-far outputs.
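
Note on the docstring above: a payload limited to "structural statistics" might look roughly like the sketch below. None of this is taken from `telemetry.py`. The function names, the dictionary keys, and the `FORBIDDEN_KEYS` set are hypothetical, chosen only to show an Aβ min/max/median summary, an hour-bucketed timestamp, and a payload-shape check of the kind the README attributes to `tests/test_telemetry.py`.

```python
from datetime import datetime, timezone
from statistics import median
from typing import Dict, List

# Hypothetical key names for content the docstring says is never sent.
FORBIDDEN_KEYS = {"prompt", "completion", "error_contents", "output"}


def hour_bucket(ts: datetime) -> str:
    """Truncate a timezone-aware timestamp to the start of its UTC hour."""
    utc = ts.astimezone(timezone.utc)
    return utc.replace(minute=0, second=0, microsecond=0).isoformat()


def build_payload(
    gains: List[float],
    rolled_back: bool,
    iterations_used: int,
    ts: datetime,
) -> Dict[str, object]:
    """Assemble a numbers-only summary of one loop run (illustrative shape)."""
    return {
        "ab_summary": {
            "min": min(gains),
            "max": max(gains),
            "median": median(gains),
        },
        "rollback": rolled_back,
        "iterations_used": iterations_used,
        "timestamp": hour_bucket(ts),
    }


payload = build_payload(
    gains=[0.5, 0.5, 1.0, 1.5],
    rolled_back=True,
    iterations_used=5,
    ts=datetime(2025, 1, 2, 13, 45, 59, tzinfo=timezone.utc),
)

# Payload-shape check: no key carrying textual content may appear.
leaked = FORBIDDEN_KEYS & payload.keys()
```

Bucketing the timestamp to the hour keeps the payload useful for fleet-level trends while making individual runs harder to pinpoint in time, which is presumably the intent behind the "hour-bucketed timestamp" mentioned in the README hunks.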

{loopgain-0.5.0 → loopgain-0.5.2}/loopgain.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: loopgain
-Version: 0.5.0
+Version: 0.5.2
 Summary: An open-source cost controller for AI agent loops. Stops a loop when it has actually converged and rolls back before it degrades — replacing the max_iterations guess with a real-time loop-gain (Aβ) monitor with five named threshold bands and best-so-far rollback.
 Author-email: Dave Fitzsimmons <hello@loopgain.ai>
 License: Apache-2.0
@@ -53,19 +53,24 @@ Dynamic: license-file
 AI agent loops waste time and money when they don't know when to stop. LoopGain measures the loop in real time and stops it the moment it has actually converged — and rolls back before it degrades — instead of running to a fixed `max_iterations` cap.
-> **Across 2,000 paired trials over 10 cells**, LoopGain reduced total API spend by **92.8%** vs `max_iter=20`, dropped median wall-clock latency from 30.9s to 2.1s (**~15×**), preserved output quality on natural-distribution workloads (W1–W4: judge winrate 0.50–0.63, CI excluding null on most cells), and improved output quality on engineered-failure workloads (W5: winrate 0.92–0.95 across three adapters). Weighted-average pairwise preference for LG vs B20 across 1,800 judge comparisons: **0.678**. Zero of six kill criteria fired.
+> **Benchmark — 2,000 paired trials across 10 workload cells** ([run it yourself](https://github.com/loopgain-ai/loopgain-bench)):
+>
+> - **92.8% less API spend** than `max_iter=20` — $27.05 → $1.94 in total benchmark spend
+> - **~15× faster** — median wall-clock per trial 30.9s → 2.1s
+> - **Quality preserved, not traded for speed** — judge win-rate 0.50–0.63 on natural-distribution workloads (W1–W4, CI excluding null on most cells), 0.92–0.95 on engineered-failure workloads (W5); 0.678 weighted preference across 1,800 judge comparisons
+> - **Zero of six kill criteria fired** (all six pre-registered with thresholds before the run)
+**Honest limits, up front:** LoopGain detects *convergence, not correctness* — it knows when more iterations won't help, not whether the answer is right, and it's only as good as the verifier behind your error signal. [The full list of what it can't do →](#what-loopgain-does-and-doesnt-guarantee)
 [![PyPI](https://img.shields.io/pypi/v/loopgain.svg)](https://pypi.org/project/loopgain/)
 [![Python](https://img.shields.io/pypi/pyversions/loopgain.svg)](https://pypi.org/project/loopgain/)
 [![License](https://img.shields.io/badge/license-Apache_2.0-blue.svg)](LICENSE)
-[![Tests](https://img.shields.io/badge/tests-200%2B_passing-brightgreen.svg)](tests/)
+[![Tests](https://img.shields.io/badge/tests-190%2B_passing-brightgreen.svg)](tests/)
 **Home:** [loopgain.ai](https://loopgain.ai)
 Works for **any iterative AI workflow with a measurable error signal** — verify-revise loops, refinement passes, tool-use retry chains, RAG with self-correction, code-gen with linter feedback, multi-step reasoning loops. **Pre-built adapters for [LangGraph](#langgraph), [CrewAI](#crewai), [AutoGen](#autogen-v04), [LangChain](#langchain), [OpenAI Agents SDK](#openai-agents-sdk), and [Claude Agent SDK](#claude-agent-sdk)**; drop-in via the raw API for any custom stack. Pure Python, no runtime dependencies.
-**Keywords:** AI agent loops · agentic AI · infinite loop detection · divergence detection · early stopping · convergence · agent orchestration · LLM stability · generator-verifier-reviser · feedback-loop control.
 ---
 ## Why
@@ -176,13 +181,34 @@ This transforms divergence detection from "abort with garbage" into "abort with
 ---
+## See it across a fleet (optional dashboard)
+The library is the whole product locally — telemetry is opt-in and self-hostable. If you want a fleet view of every loop's stability, cost, and rollbacks across a team, there's a hosted dashboard fed by the [telemetry receiver](https://github.com/loopgain-ai/telemetry-receiver):
+[![LoopGain dashboard — loop health, convergence, waste, and rollbacks across a fleet](https://loopgain.ai/dashboard-demo.png)](https://dashboard.loopgain.ai/demo)
+**[Open the live demo →](https://dashboard.loopgain.ai/demo)** — no signup, real benchmark data.
+The receiver and dashboard are both open-source — self-host to keep telemetry entirely under your control.
+### Repositories
+| Repo | What it is |
+| --- | --- |
+| [**loopgain**](https://github.com/loopgain-ai/loopgain) | This library — the Apache-2.0 control loop (you are here) |
+| [**telemetry-receiver**](https://github.com/loopgain-ai/telemetry-receiver) | Cloudflare Worker that ingests anonymized loop telemetry |
+| [**dashboard**](https://github.com/loopgain-ai/dashboard) | The fleet dashboard — self-hostable |
+| [**loopgain-bench**](https://github.com/loopgain-ai/loopgain-bench) | The reproducible 2,000-trial benchmark behind the numbers above |
+---
 ## What LoopGain does and doesn't guarantee
-LoopGain saves money by stopping a loop once it stops improving — fewer iterations, fewer tokens. In our [public benchmark](https://github.com/loopgain-ai/loopgain-bench), that was a **92.8% median cut in API spend** vs `max_iterations=20`, with output quality preserved. Two honest limits:
+LoopGain saves money by stopping a loop once it stops improving — fewer iterations, fewer tokens. In our [public benchmark](https://github.com/loopgain-ai/loopgain-bench), that was a **92.8% cut in total API spend** vs `max_iterations=20`, with output quality preserved. Two honest limits:
 - **Savings depend on your workload.** Loops that usually succeed fast save the most (~96%); adversarial, failure-prone loops save less (~78–84%). The headline is a blend — run the benchmark on your own loops before quoting a number.
 - **LoopGain detects convergence, not correctness.** It stops when your error signal stops improving — which means more iterations won't help, *not* that the loop succeeded. On the benchmark this preserved quality (it rarely stopped early on a worse output; false-stop rate ≤4.5%), but a loop can stall with the error still above zero — a plateau at, say, 2 failing tests. So check `result.best_error` (or your own pass/fail) before you trust the output: if it plateaued short of your target, that's a quality gap LoopGain can't see, and a false stop that forces a rerun is the one way it eats into the savings. LoopGain decides *when to stop*; you decide *whether the answer is good enough*.
-- **LoopGain is only as right as your verifier.** It acts on the error signal you give it. If your verifier reports zero errors, LoopGain trusts that and stops — so a verifier with blind spots can report success on an answer that is still wrong, and LoopGain will confidently stop there. This is not the plateau case above: the error reads zero and the loop looks like a clean success, so neither LoopGain nor its convergence signal can flag it. The quality of the stop is bounded by the quality of the check behind your error signal. Pair LoopGain with the strongest verifier you can afford at the stop — executable tests over a sampled subset, a schema or type check over a vibe, a held-out check the loop didn't optimize against.
+- **LoopGain is only as right as your verifier.** It acts on the error signal you give it. If your verifier reports zero errors, LoopGain trusts that and stops — so a verifier with blind spots can report success on an answer that is still wrong, and LoopGain will confidently stop there. This is not the plateau case above: the error reads zero and the loop looks like a clean success, so neither LoopGain nor its convergence signal can flag it. The quality of the stop is bounded by the quality of the check behind your error signal. We measured this on the benchmark's code-gen workload: **4.5% of converged runs (16/355) passed every check the loop ran but failed the full held-out test suite** — and that's a floor, not a ceiling, because the in-loop verifier there was strong; a weaker verifier exposes more. (Distinct from the ≤4.5% false-stop rate above — the numbers coincide, the failure modes don't.) Pair LoopGain with the strongest verifier you can afford at the stop — executable tests over a sampled subset, a schema or type check over a vibe, a held-out check the loop didn't optimize against. **[How to design a strong verifier](https://loopgain.ai/blog/posts/how-to-design-a-strong-verifier/)** is a field guide to exactly this.
 ---
@@ -246,9 +272,9 @@ python3 -c "import keyring; keyring.set_password('loopgain', 'telemetry', input(
 # Then in code: keyring.get_password('loopgain', 'telemetry')
 ```
-What is sent: state transitions, Aβ summary (min/max/median), gain margin, rollback flag, iterations used, savings, library version, optional opaque `workload_id`, threshold config, hour-bucketed timestamp.
+What is sent: state transitions, Aβ summary (min/max/median), rollback flag, iterations used, savings, library version, optional opaque `workload_id`, threshold config, hour-bucketed timestamp — and, unless you pass `include_per_iteration=False`, a length-capped per-iteration trajectory (smoothed Aβ values and numeric error magnitudes; this is what drives the dashboard's convergence-profile scrubbing).
-**What is NEVER sent: prompts, completions, error contents, output buffer, individual Aβ values, or any customer identity beyond the bearer token.** Privacy contract is enforced by the payload-shape unit tests in `tests/test_telemetry.py`.
+**What is NEVER sent: prompts, completions, error contents, the output buffer, or any customer identity beyond the bearer token.** Numeric error *magnitudes* are sent (they're the loop-gain signal); error *contents* never are. Privacy contract is enforced by the payload-shape unit tests in `tests/test_telemetry.py`.
 The hosted endpoint at `telemetry.loopgain.ai` is one acceptable destination. The [receiver](https://github.com/loopgain-ai/telemetry-receiver) and [dashboard](https://github.com/loopgain-ai/dashboard) are both open-source — self-host to keep telemetry fully under your control.
@@ -507,7 +533,7 @@ This is alpha software. The API may break before 1.0 if production usage surface
 LoopGain applies the **Barkhausen stability criterion** (Heinrich Barkhausen, 1921 — the foundational result on when feedback amplifiers oscillate) to AI agent feedback loops. The criterion was originally a way to predict whether an electronic oscillator would sustain oscillation; it turns out to map cleanly onto any feedback loop you can attach an error signal to.
-The cleanest summary: an iterative AI loop with a measurable error signal is a feedback system. The ratio `E(n) / E(n-1)` is its empirical loop gain. The Barkhausen result tells you that loop gain less than 1 converges, equal to 1 oscillates, greater than 1 diverges. LoopGain operationalizes this: classifies the loop's current band, decides what to do, and tells you when you'll converge.
+The cleanest summary: an iterative AI loop with a measurable error signal is a feedback system. The ratio `E(n) / E(n-1)` is its empirical loop gain. The Barkhausen result tells you that loop gain less than 1 converges, equal to 1 oscillates, greater than 1 diverges. LoopGain operationalizes this: classifies the loop's current band, and decides what to do — stop, continue, or roll back to the best output seen so far.
 Loop types this applies to in practice: