tightloop 0.1.0__py3-none-any.whl
- loop/__init__.py +40 -0
- loop/approval/__init__.py +87 -0
- loop/blueprints/__init__.py +3 -0
- loop/blueprints/testfix.py +117 -0
- loop/context/__init__.py +144 -0
- loop/core/__init__.py +0 -0
- loop/core/engine.py +515 -0
- loop/core/result.py +64 -0
- loop/core/state.py +143 -0
- loop/exit/__init__.py +60 -0
- loop/llm/__init__.py +70 -0
- loop/llm/anthropic.py +45 -0
- loop/llm/openai.py +55 -0
- loop/policy/__init__.py +96 -0
- loop/pricing.py +47 -0
- loop/progress/__init__.py +72 -0
- loop/tools/__init__.py +220 -0
- loop/trace/__init__.py +81 -0
- tightloop-0.1.0.dist-info/METADATA +439 -0
- tightloop-0.1.0.dist-info/RECORD +21 -0
- tightloop-0.1.0.dist-info/WHEEL +4 -0
@@ -0,0 +1,439 @@
Metadata-Version: 2.4
Name: tightloop
Version: 0.1.0
Summary: Production-grade loops for AI agents: a structured runtime for reliable, observable, governable agent loops.
Requires-Python: >=3.10
Requires-Dist: pydantic>=2.5
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.40; extra == 'anthropic'
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == 'dev'
Provides-Extra: openai
Requires-Dist: openai>=1.40; extra == 'openai'
Description-Content-Type: text/markdown

# Loop

> **Production-grade loops for AI agents.** A structured runtime that makes agent loops reliable, observable, and governable, so you stop reinventing retry logic, exit conditions, budget caps, and approval gates for every agent you build.

---

## Table of Contents

- [Why Loop?](#why-loop)
- [How It Works](#how-it-works)
- [Installation](#installation)
- [Quickstart](#quickstart)
- [The Safety Model](#the-safety-model)
- [Every Result Is Actionable](#every-result-is-actionable)
- [Recipes](#recipes)
  - [1. Fix failing tests in a real repo](#1-fix-failing-tests-in-a-real-repo)
  - [2. Resume after running out of budget](#2-resume-after-running-out-of-budget)
  - [3. Human approval gates](#3-human-approval-gates)
  - [4. Headless approvals (CI, bots, services)](#4-headless-approvals-ci-bots-services)
  - [5. Bring your own LLM](#5-bring-your-own-llm)
  - [6. Define progress for your own task](#6-define-progress-for-your-own-task)
  - [7. Watch the loop live](#7-watch-the-loop-live)
- [Configuration Reference](#configuration-reference)
- [Writing Tools](#writing-tools)
- [Architecture](#architecture)
- [Development](#development)
- [Troubleshooting](#troubleshooting)
- [Roadmap](#roadmap)
- [FAQ](#faq)

---

## Why Loop?

Every team building agents eventually rewrites the same plumbing:

| You keep rebuilding…       | Loop gives you…                                                                     |
| -------------------------- | ----------------------------------------------------------------------------------- |
| Retry / test-fix loops     | A structured **Observe → Plan → Act → Evaluate** engine                             |
| "Why won't it stop?"       | Declarative **exit conditions** + always-on iteration/token/time ceilings           |
| Surprise API bills         | **Token budgets** enforced *before* every action, so calls can't overshoot          |
| Agents spinning in circles | A **progress engine** that detects stagnation, repetition, and regressions          |
| Context window overflow    | **Managed context**: pinned facts, failed-approach registry, summaries              |
| "Just ask a human first"   | **Approval gates** with CLI, callback, and pause/resume-by-token flows              |
| Debugging from print()     | **Live JSONL traces** + `loop.explain()`: "why did it stop?" always has an answer   |

Loop is a **runtime layer, not a framework replacement**: it works with Anthropic, OpenAI, or any callable, and plugs into whatever stack you already have. It is *not* a model provider, vector DB, agent framework, or workflow engine.

## How It Works

Every loop runs the same auditable cycle. Hard ceilings are checked **before every action**, not just between iterations, so a loop can never overshoot its budget:

```mermaid
flowchart TD
    S([run]) --> C{ceilings OK?<br/>iterations · tokens · time · cost}
    C -- no --> BE([BUDGET_EXHAUSTED<br/>+ snapshot + resume handle])
    C -- yes --> O[Observe<br/>run tests, gather signals]
    O --> M{goal metric<br/>says success?}
    M -- yes --> OK([SUCCESS])
    M -- no --> P[Plan<br/>one LLM call, validated tool args]
    P --> G{approval<br/>required?}
    G -- denied --> AD([APPROVAL_DENIED])
    G -- pending --> AW([AWAITING_APPROVAL<br/>resume by token])
    G -- approved / not needed --> A[Act<br/>enforced timeouts]
    A --> E[Evaluate<br/>progress · repetition · regression]
    E --> X{exit condition hit?}
    X -- "no progress" --> NP([NO_PROGRESS])
    X -- no --> C
```

Everything the loop does is recorded as structured events, streamed live to JSONL and an optional callback. Nothing important hides inside prompts.

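The "check ceilings before every action" step in the cycle above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not tightloop's engine: the `Ceilings` class, its field names, `check()`, and `run()` are hypothetical, and only the default values (20 iterations, 500,000 tokens, 1800 s) are taken from the README.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional


@dataclass
class Ceilings:
    """Hypothetical budget tracker; not part of the tightloop API."""
    max_iterations: int = 20
    token_limit: int = 500_000
    wall_clock_s: float = 1800.0
    iterations: int = 0
    tokens_spent: int = 0
    started_at: float = field(default_factory=time.monotonic)

    def check(self) -> Optional[str]:
        """Return the name of the first exhausted ceiling, or None if all are OK."""
        if self.iterations >= self.max_iterations:
            return "max_iterations"
        if self.tokens_spent >= self.token_limit:
            return "token_limit"
        if time.monotonic() - self.started_at >= self.wall_clock_s:
            return "wall_clock_s"
        return None


def run(ceilings: Ceilings, actions: Iterable[Callable[[], int]]) -> str:
    """Run actions in order, checking every ceiling before each one."""
    for action in actions:
        breached = ceilings.check()
        if breached is not None:
            return f"BUDGET_EXHAUSTED: {breached}"
        ceilings.tokens_spent += action()  # each action reports the tokens it used
        ceilings.iterations += 1
    return "DONE"
```

Note that a pre-action check alone only stops the *next* action; keeping a single call from overshooting also requires capping that call's output size against the remaining budget.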
## Installation

Not yet published to PyPI (distribution name `tightloop`; import name is `loop`):

```bash
git clone <this-repo> && cd Loops
pip install -e .                 # core (pydantic only)
pip install -e ".[anthropic]"    # + Anthropic adapter
pip install -e ".[openai]"       # + OpenAI adapter
pip install -e ".[dev]"          # + pytest
```

**Requirements:** Python 3.10+. The only core dependency is `pydantic>=2.5`.

## Quickstart

```python
from loop import Loop, tool
from loop.llm.anthropic import AnthropicLLM  # or loop.llm.openai.OpenAILLM

@tool
def read_file(path: str) -> str:
    """Read a file."""
    with open(path) as f:
        return f.read()

@tool
def edit_file(path: str, content: str) -> str:
    """Overwrite a file."""
    with open(path, "w") as f:
        f.write(content)
    return f"wrote {path}"

loop = Loop(
    goal="Fix the failing tests",
    tools=[read_file, edit_file],
    llm=AnthropicLLM(),  # ANTHROPIC_API_KEY from env
)
result = loop.run()

print(result.status)              # SUCCESS, BUDGET_EXHAUSTED, NO_PROGRESS, ...
print(result.recommended_action)  # every status tells you what to do next
print(loop.explain().render())    # full "why did it stop" report
```

When it starts, the loop **announces its effective limits**, so safety is never silent:

```text
[loop] goal='Fix the failing tests' | limits: 20 iterations, 500,000 tokens, 1800s wall-clock
```

## The Safety Model

Three ceilings are **always on**; you cannot construct a loop without them:

| Ceiling          | Default         | What happens at the limit                                    |
| ---------------- | --------------- | ------------------------------------------------------------ |
| `max_iterations` | `20`            | `BUDGET_EXHAUSTED` + progress snapshot + resume handle       |
| `token_limit`    | `500,000`       | Same, and `max_tokens` is clamped so no call can overshoot   |
| `wall_clock_s`   | `1800` (30 min) | Same                                                         |

Plus, optionally:

- `cost_limit_usd`: a USD ceiling derived from a pricing table that carries an **as-of date**. Tokens are authoritative; if the table is stale (>90 days) you choose the behavior: `warn` (default), `token-only`, or `refuse`.
- `NoProgress(window=3)`: on by default; stops after 3 consecutive iterations of repeated/invalid actions with zero metric movement.

Infinite loops are impossible by default. Mysterious stops don't exist: hitting any ceiling returns a **resumable snapshot**, never an exception in your face.

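Two of the mechanisms above, clamping `max_tokens` to the remaining budget and the no-progress window, are easy to picture in code. The following is a simplified sketch, not the library's implementation: `clamp_max_tokens`, `NoProgressWindow`, and the notion of an "action fingerprint" string are assumptions made for illustration, and "no progress" is reduced here to "same action and same metric value for the whole window".

```python
from collections import deque


def clamp_max_tokens(requested: int, token_limit: int, spent: int, prompt_tokens: int) -> int:
    """Cap a call's output size so prompt + output cannot exceed the remaining budget."""
    remaining = token_limit - spent - prompt_tokens
    return max(0, min(requested, remaining))


class NoProgressWindow:
    """Hypothetical detector: stalled when the last `window` iterations repeat
    the same action fingerprint and the goal metric never moves."""

    def __init__(self, window: int = 3) -> None:
        self.window = window
        self.recent: deque = deque(maxlen=window)  # keeps only the newest entries

    def record(self, action_fingerprint: str, metric_value: float) -> None:
        self.recent.append((action_fingerprint, metric_value))

    def stalled(self) -> bool:
        if len(self.recent) < self.window:
            return False  # not enough history yet
        fingerprints = {fp for fp, _ in self.recent}
        values = {value for _, value in self.recent}
        return len(fingerprints) == 1 and len(values) == 1
```

With a 500,000-token limit, 499,000 tokens spent, and a 500-token prompt, a requested `max_tokens` of 4096 would be clamped to 500 in this sketch.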
## Every Result Is Actionable

`LoopResult` always carries `resumable` and `recommended_action`:

| Status              | Resumable | What to do                                             |
| ------------------- | :-------: | ------------------------------------------------------ |
| `SUCCESS`           |    no     | Done                                                   |
| `BUDGET_EXHAUSTED`  |    ✅     | Inspect snapshot → `Loop.resume(path, extend={...})`   |
| `NO_PROGRESS`       |    ✅     | Change tools/goal/limits, then resume                  |
| `PLAN_FAILED`       |    ✅     | Fix tool schemas or prompt, then resume                |
| `APPROVAL_DENIED`   |    ✅     | Adjust plan or policy, then resume                     |
| `AWAITING_APPROVAL` |    ✅     | Approve via token, then resume                         |
| `PENDING_EXPIRED`   |    ✅     | Resume to re-request approval                          |
| `ERROR`             |  depends  | `loop.explain()` has the answer                        |

## Recipes

### 1. Fix failing tests in a real repo

The flagship blueprint. Progress tracks **test identity, not counts**: if the agent fixes one test but breaks another, the trend flags `regressing` even though totals look flat:

```python
from loop import TestFixLoop
from loop.llm.anthropic import AnthropicLLM

result = TestFixLoop(
    llm=AnthropicLLM(),
    repo="path/to/repo",
    test_cmd="python -m pytest -q -rf --tb=short",
).run()
```

It ships with `run_tests` / `read_file` / `edit_file` tools (path-escape protected, stale-bytecode safe) and a pytest-aware goal metric.

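To see why identity matters more than counts, compare the *sets* of failing test IDs between two observations. This is only a sketch of the idea: `classify_trend` and the `improving` / `flat` labels are hypothetical, and only the `regressing` label appears in the README.

```python
def classify_trend(failing_before: set[str], failing_after: set[str]) -> str:
    """Compare failing-test IDs, not totals, between two observations."""
    newly_broken = failing_after - failing_before  # tests that used to pass
    newly_fixed = failing_before - failing_after   # tests that now pass
    if newly_broken:
        return "regressing"  # any new failure counts, even if totals are flat
    if newly_fixed:
        return "improving"
    return "flat"


before = {"test_login", "test_logout"}
after = {"test_logout", "test_signup"}  # one fixed, one broken: still 2 failures
print(len(before) == len(after), classify_trend(before, after))
```

A count-based metric would report no change here, while the set difference shows that `test_signup` started failing.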
### 2. Resume after running out of budget

```python
result = Loop(goal="...", tools=tools, llm=llm,
              token_limit=50_000, state_path="loop_state.json").run()

if result.status == "BUDGET_EXHAUSTED":
    print(result.reason)  # e.g. "token_limit (50,000) reached"
    result = Loop.resume(
        "loop_state.json", tools=tools, llm=llm,
        extend={"token_limit": 200_000},
    )
```

Resume is **deterministic**: context summaries and pinned facts are computed once, version-stamped, stored in state, and reused, never recomputed. If your tool schemas changed since the save, resume fails loudly (`SchemaChangedError`) unless you pass `allow_schema_change=True`.

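One plausible way to detect changed tool schemas on resume is to store a fingerprint of the schemas alongside the state and compare it later. The sketch below assumes that approach; it is not taken from the package source. `schema_fingerprint` and `check_resume` are hypothetical helpers, and the `SchemaChangedError` class here is a local stand-in that merely shares its name with the error mentioned above.

```python
import hashlib
import json


class SchemaChangedError(Exception):
    """Local stand-in for illustration; not imported from tightloop."""


def schema_fingerprint(tool_schemas: dict) -> str:
    """Hash a canonical JSON form so key order does not affect the result."""
    canonical = json.dumps(tool_schemas, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_resume(saved_fingerprint: str, current_schemas: dict,
                 allow_schema_change: bool = False) -> str:
    """Fail loudly when schemas differ from the saved ones, unless explicitly allowed."""
    current = schema_fingerprint(current_schemas)
    if current != saved_fingerprint and not allow_schema_change:
        raise SchemaChangedError("tool schemas changed since the state was saved")
    return current
```

Hashing a canonical serialization means reordering keys is harmless, while adding, removing, or retyping a parameter changes the fingerprint.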
### 3. Human approval gates

Gate any tool behind a human, with zero interrupt wiring:

```python
from loop import Loop, RequireApproval, CallbackApprovalRunner

loop = Loop(
    goal="Clean up the repo",
    tools=[delete_file, edit_file],
    llm=llm,
    policies=[RequireApproval({"delete_file"})],           # or a callable matcher
    approval_runner=CallbackApprovalRunner(notify_slack),  # 60s timeout, deny-on-exception
)
```

The callback receives a **frozen, read-only** `ApprovalRequest` (action, args, reason; never your full context). If it throws or times out, the answer is *deny*. Every approval decision is traced.

### 4. Headless approvals (CI, bots, services)

```mermaid
sequenceDiagram
    participant L as Loop
    participant S as state.json
    participant H as Human
    L->>L: plan: delete_file(...)
    L->>S: serialize state
    L-->>H: AWAITING_APPROVAL (token abc123, TTL 1h)
    H->>L: Loop.resume(path, approval={"token": "abc123", "approved": True})
    L->>L: re-observe first
    alt world unchanged
        L->>L: execute approved action, continue
    else preconditions changed
        L-->>H: AWAITING_APPROVAL (fresh token, approval invalidated)
    end
```

```python
from loop import HeadlessApprovalRunner

result = loop.run()  # -> AWAITING_APPROVAL, result.approval_token
# ... later, from anywhere:
result = Loop.resume("loop_state.json", tools=tools, llm=llm,
                     approval={"token": result.approval_token, "approved": True})
```

Approvals carry a TTL (default 1 h) and are bound to the action *and* the state of the world. If the situation changed while the approval sat in someone's queue, it's invalidated and re-requested, so you never approve yesterday's plan.

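Binding an approval to both the action and the observed world state can be illustrated with a hash over the three inputs plus an expiry time. This is a sketch under assumptions, not the package's mechanism: `bind`, `PendingApproval`, `validate`, and the `approved` / `stale` / `expired` / `denied` labels are all hypothetical names chosen for the example.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass


def bind(action: str, args: dict, observation: str) -> str:
    """Fingerprint the planned action together with the world state it was planned against."""
    payload = json.dumps({"action": action, "args": args, "obs": observation}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class PendingApproval:
    """Hypothetical record stored in state while the loop is paused."""
    token: str
    binding: str
    expires_at: float  # absolute time in seconds


def validate(pending: PendingApproval, token: str, action: str, args: dict,
             observation: str, now: float) -> str:
    """Decide whether a returned approval still applies."""
    if not hmac.compare_digest(token, pending.token):
        return "denied"   # wrong token
    if now >= pending.expires_at:
        return "expired"  # TTL elapsed
    if bind(action, args, observation) != pending.binding:
        return "stale"    # the world changed, so the approval must be re-requested
    return "approved"
```

Re-observing before validating is what makes the `stale` branch reachable: the fresh observation is hashed and compared against the one captured when approval was requested.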
### 5. Bring your own LLM

Anything that returns an `LLMResponse` works, including raw APIs, local models, and test fakes:

```python
from loop import CallableLLM, LLMResponse, ToolCallReq

def my_model(messages, tool_schemas) -> LLMResponse:
    out = my_inference_stack(messages, tool_schemas)
    return LLMResponse(text=out.text,
                       tool_calls=[ToolCallReq(name=c.name, args=c.args) for c in out.calls],
                       input_tokens=out.in_tok, output_tokens=out.out_tok)

loop = Loop(goal="...", tools=tools, llm=CallableLLM(my_model))
```

Provider quirks are normalized at the adapter boundary: hallucinated or malformed tool calls are validated against schemas and fed back to the model as structured errors (retry budget: 2). Three strikes end the iteration as `PLAN_INVALID`; two such iterations in a row exit with `PLAN_FAILED`. Nothing is ever silently dropped.

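The validate-and-feed-back cycle described above (one attempt plus a retry budget of 2) could look roughly like this. Everything in the sketch is an assumption made for illustration: the schema shape (parameter name to Python type), `validate_call`, and `plan_with_retries` are hypothetical, and the real library validates against richer schemas.

```python
from typing import Callable, Optional


def validate_call(call: dict, schemas: dict[str, dict[str, type]]) -> list[str]:
    """Return structured error strings for a proposed tool call (empty list = valid)."""
    name = call.get("name")
    if name not in schemas:
        return [f"unknown tool: {name!r}"]
    expected = schemas[name]
    args = call.get("args", {})
    errors = []
    for param, typ in expected.items():
        if param not in args:
            errors.append(f"missing argument: {param}")
        elif not isinstance(args[param], typ):
            errors.append(f"{param}: expected {typ.__name__}")
    for extra in sorted(set(args) - set(expected)):
        errors.append(f"unexpected argument: {extra}")
    return errors


def plan_with_retries(propose: Callable[[list[str]], dict],
                      schemas: dict[str, dict[str, type]],
                      retries: int = 2) -> tuple[str, Optional[dict]]:
    """Ask for a tool call, feeding validation errors back; give up after 1 + retries attempts."""
    feedback: list[str] = []
    for _ in range(1 + retries):
        call = propose(feedback)  # the model sees the previous attempt's errors
        feedback = validate_call(call, schemas)
        if not feedback:
            return "OK", call
    return "PLAN_INVALID", None
```

Here `propose` stands in for the LLM call; a test fake can simply return a scripted sequence of tool calls.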
### 6. Define progress for your own task

```python
from loop import GoalMetric, MetricSnapshot

class OpenTicketsMetric(GoalMetric):
    def measure(self, observation: str, state) -> MetricSnapshot:
        open_ids = parse_ticket_ids(observation)
        return MetricSnapshot(value=-float(len(open_ids)),
                              detail={"open": sorted(open_ids)})

    def is_success(self, snapshot) -> bool:
        return not snapshot.detail["open"]

loop = Loop(goal="Close all open tickets", tools=tools, llm=llm,
            observe=lambda state: ticket_system.report(),
            goal_metric=OpenTicketsMetric())
```

### 7. Watch the loop live

```python
loop = Loop(goal="...", tools=tools, llm=llm,
            trace_path="trace.jsonl",                 # live-appended JSONL
            on_event=lambda e: print(e["kind"], e))   # or push to your dashboard

loop.budget_report()     # itemized token accounting: pinned / summaries / verbatim / spent
loop.explain().render()  # markdown: status, reason, signals, full decision chain
```

```bash
tail -f trace.jsonl | jq .kind
# "loop.start" "iteration.start" "llm.call" "action.executed" "iteration.end" "loop.end"
```

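A live-appended JSONL trace with an optional callback needs very little machinery. The class below is a minimal sketch assuming one JSON object per line with a `kind` field, as in the `jq .kind` example; `JsonlTrace`, `emit`, and the `ts` field are hypothetical and do not describe the package's `TraceSink`.

```python
import json
import time
from typing import Callable, Optional


class JsonlTrace:
    """Hypothetical sink: append one JSON object per line and notify a callback."""

    def __init__(self, path: str, on_event: Optional[Callable[[dict], None]] = None) -> None:
        self.path = path
        self.on_event = on_event

    def emit(self, kind: str, **fields) -> dict:
        event = {"kind": kind, "ts": time.time(), **fields}
        # Open in append mode per event so `tail -f` sees each line as it is written.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")
        if self.on_event is not None:
            self.on_event(event)
        return event
```

Opening and closing the file for every event trades a little throughput for durability: each line is flushed before the loop moves on.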
## Configuration Reference

`Loop(...)` constructor (everything is optional except `goal`, `tools`, `llm`):

| Parameter             | Default               | What it does                                                          |
| --------------------- | --------------------- | --------------------------------------------------------------------- |
| `goal`                | *(required)*          | What the loop is trying to achieve (pinned into every prompt)         |
| `tools`               | *(required)*          | List of `@tool` functions / `Tool` objects                            |
| `llm`                 | *(required)*          | `AnthropicLLM()`, `OpenAILLM()`, or any `CallableLLM`                 |
| `observe`             | `None`                | `fn(state) -> str` run at the top of each iteration                   |
| `goal_metric`         | `None`                | `GoalMetric`; enables success detection + progress trends             |
| `policies`            | `[NoProgress(3)]`     | `NoProgress`, `CostLimit`, `RequireApproval`, or your own             |
| `exits`               | `[]`                  | Extra `Exit.success(...)`, `Exit.stagnation(...)`, etc.               |
| `max_iterations`      | `20`                  | Always-on ceiling                                                     |
| `token_limit`         | `500_000`             | Always-on ceiling; clamps per-call `max_tokens`                       |
| `wall_clock_s`        | `1800`                | Always-on ceiling                                                     |
| `cost_limit_usd`      | `None`                | Optional USD ceiling (tokens stay authoritative)                      |
| `pricing_staleness`   | `"warn"`              | `warn` / `token-only` / `refuse` when the pricing table is old        |
| `approval_runner`     | `CLIApprovalRunner()` | Or `CallbackApprovalRunner(fn)` / `HeadlessApprovalRunner()`          |
| `summarizer`          | `None`                | Cheaper LLM for history compression (deterministic fallback if unset) |
| `verbatim_window`     | `3`                   | Last K iterations kept verbatim in context                            |
| `max_tokens_per_call` | `4096`                | Per-LLM-call output cap (clamped to remaining budget)                 |
| `state_path`          | `None`                | Where to persist state (required for headless approvals)              |
| `trace_path`          | `None`                | Live JSONL event log                                                  |
| `on_event`            | `None`                | Callback for every trace event                                        |
| `quiet`               | `False`               | Suppress the startup limits announcement                              |

Methods: `loop.run()` · `Loop.resume(path, tools=, llm=, approval=, extend=, ...)` · `loop.explain()` · `loop.budget_report()`

## Writing Tools

Tools are plain Python functions. Schemas come from type hints and are **frozen for the loop's lifetime**:

```python
from loop import tool, run_command

@tool(timeout_s=30)  # enforced: result becomes "aborted" on breach
def lint(path: str, fix: bool = False) -> str:
    """Run the linter on a file."""
    res = run_command(["ruff", "check", path] + (["--fix"] if fix else []), timeout_s=25)
    return res.stdout
```

| Supported parameter types | Unsupported (fails **at registration**, never silently) |
| ------------------------- | -------------------------------------------------------- |
| `str` `int` `float` `bool` `list` `dict` `Optional[...]` `Literal[...]` `Enum` pydantic models | `Callable`, file handles, arbitrary classes, missing hints, `*args/**kwargs` |

Two execution modes:

- **Thread runner** (default): timeout marks the result `aborted`. Python threads can't be force-killed, so prefer the next option for anything long or untrusted.
- **`run_command(cmd, timeout_s=, cwd=)`**: subprocess with **SIGTERM → SIGKILL escalation**. Use this inside tools that shell out.

One rule: **no nested loops.** Calling `Loop.run()` inside a tool raises `NestedLoopError`; delegate sub-tasks via a tool that returns a result instead.

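The SIGTERM → SIGKILL escalation mentioned for `run_command` can be reproduced with the standard library alone. The function below is a sketch of that pattern, not the package's `run_command`: the name `run_with_escalation`, the `grace_s` parameter, and the returned tuple shape are assumptions. On POSIX, `Popen.terminate()` sends SIGTERM and `Popen.kill()` sends SIGKILL.

```python
import subprocess
import sys


def run_with_escalation(cmd: list[str], timeout_s: float, grace_s: float = 2.0):
    """Run cmd; on timeout ask politely (terminate), then force (kill).

    Returns (returncode, stdout, timed_out).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    try:
        out, _err = proc.communicate(timeout=timeout_s)
        return proc.returncode, out, False
    except subprocess.TimeoutExpired:
        proc.terminate()  # SIGTERM on POSIX: give the process a chance to clean up
        try:
            out, _err = proc.communicate(timeout=grace_s)
        except subprocess.TimeoutExpired:
            proc.kill()  # SIGKILL on POSIX: cannot be caught or ignored
            out, _err = proc.communicate()
        return proc.returncode, out, True


if __name__ == "__main__":
    print(run_with_escalation([sys.executable, "-c", "print('hi')"], timeout_s=10))
```

Unlike a timed-out thread, the child process is actually gone once this function returns, which is why shelling out suits long or untrusted work.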
## Architecture

```mermaid
flowchart LR
    subgraph engine ["loop.core (engine)"]
        E[Loop<br/>run / resume / ceilings]
        ST[(State<br/>serializable · versioned)]
        R[LoopResult]
    end
    LLM["loop.llm<br/>Anthropic · OpenAI · Callable"] --> E
    T["loop.tools<br/>schemas · validation · timeouts"] --> E
    P["loop.policy<br/>NoProgress · CostLimit · RequireApproval"] --> E
    X["loop.exit<br/>success · stagnation · limits"] --> E
    PR["loop.progress<br/>metrics · repetition · regression"] --> E
    CX["loop.context<br/>pinned facts · summaries"] --> E
    AP["loop.approval<br/>CLI · callback · headless"] --> E
    E --> TR["loop.trace<br/>JSONL · explain()"]
    E --> ST --> R
    B["loop.blueprints<br/>TestFixLoop"] -.extends.-> E
```

```text
src/loop/
├── core/        # engine.py (run/resume/ceilings/approvals), state.py, result.py
├── llm/         # LLMClient protocol, CallableLLM, anthropic.py, openai.py
├── tools/       # @tool, schema derivation, validation, run_command
├── policy/      # NoProgress, CostLimit, RequireApproval
├── exit/        # Exit.success / max_iterations / token_limit / stagnation
├── progress/    # GoalMetric, repetition fingerprints, regression detection
├── context/     # pinned facts, failed-approaches registry, stored summaries
├── approval/    # frozen ApprovalRequest, CLI/callback/headless runners
├── trace/       # TraceSink (live JSONL), explain()
├── blueprints/  # TestFixLoop + PytestFailureMetric
└── pricing.py   # dated pricing table, staleness policy
```

## Development

```bash
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest -q    # 23 tests, < 1s
```

The suite covers the design's release gates: budget preemption, deterministic resume, validation three-strikes, no-progress detection, the nested-loop guard, tool timeouts, frozen approvals, TTL expiry, stale-precondition invalidation, schema-change detection, and pricing staleness, plus an end-to-end `TestFixLoop` fixing a real failing pytest suite.

## Troubleshooting

| Symptom | Cause & fix |
| ------- | ----------- |
| `SchemaChangedError` on resume | Your tools changed since the state was saved. If that was intentional, pass `allow_schema_change=True` |
| `ArtifactDriftError` on resume | Stored summaries were made by a different engine/summarizer version; pass `allow_artifact_drift=True` to reuse them anyway |
| `LoopConfigError: headless approval requires state_path` | `HeadlessApprovalRunner` must serialize state to pause; pass `state_path="..."` |
| `UnsupportedTypeError` at startup | A tool parameter uses an unsupported hint; see the [type matrix](#writing-tools). This is deliberate: it fails at registration, never mid-run |
| Pricing staleness warning | The USD table is >90 days old. Tokens remain authoritative; choose `pricing_staleness="token-only"` or `"refuse"` to change behavior |
| Loop exits `NO_PROGRESS` "too early" | Read `loop.explain()`; it shows the repetition flags and flat-metric streak. Widen with `policies=[NoProgress(window=5)]` |
| Tool hangs past its timeout | Thread-runner results go `aborted` but the thread lingers (Python can't kill threads). Shell out via `run_command`, which escalates SIGTERM → SIGKILL |
| `NestedLoopError` | A tool tried to start a loop. Replace the inner loop with a tool that returns a result |
| Edits seem ignored when re-running Python tests | Stale `__pycache__` bytecode. `TestFixLoop.edit_file` already invalidates it; custom edit tools should too |

## Roadmap

- **v1.1 (committed):** async engine · OpenTelemetry exporter (firm requirement) · Refactor / PR-review / Bug-repro blueprints · webhook approvals
- **Naming:** ships as `tightloop` on PyPI with `import loop` for ergonomics. Note: PyPI's unrelated `loop` package also installs a `loop` module, so don't install both in one environment

## FAQ

**Is this an agent framework?** No. Loop is the *runtime layer* for the loop itself; it composes with whatever does your prompting, retrieval, and orchestration.

**Why did my loop stop?** `loop.explain().render()`. That question always having an answer is the core design goal.

**Can the LLM rate its own progress?** It can annotate the trace, but LLM self-assessment **cannot trigger exits** in v1; exits rely on hard signals (metrics, repetition, budgets) by design.

**What stops a runaway loop?** Three always-on ceilings, per-action budget checks, `max_tokens` clamping, and default no-progress detection. The quickstart announces all of them at start.
@@ -0,0 +1,21 @@
loop/__init__.py,sha256=jx6H_uGJMp0QpSXjxHKsvEGEntHHTDgP1Do5wbhY3IY,1464
loop/pricing.py,sha256=caUt7nCqC6w0v-rZ_iT2yRn85_ELqbWb5nX6zwOgX-4,1716
loop/approval/__init__.py,sha256=alJG0uGTjzSD2wIsU3rFOGUQw9HlC6tHjoa8GRIqqQ4,3019
loop/blueprints/__init__.py,sha256=KQZ85PB7otfuJDr0dgxrQgH87wOrvzVjbdWDpUkaI3k,104
loop/blueprints/testfix.py,sha256=nTxHmC8jvE8J0AtEv7cl37_ztcTeBKWp-u1lcE_DB2s,4526
loop/context/__init__.py,sha256=IhXjGSS7KL-FTNK96gAdRZzjpiHbH9_XvCSI0tdPUXU,6334
loop/core/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
loop/core/engine.py,sha256=KeGyxbCkQJTT-xaUNwDth_Hn7k_CNF9MTXDE_zJZDwo,22333
loop/core/result.py,sha256=rxQkZqn3mYCAdCk8rh1Kjo-RTJvjrjMR90cP3MlLsB4,2076
loop/core/state.py,sha256=sM99CcWbvJ5YKe8bDeY8u8463Y9tAQKh4N7XxgUXB50,4137
loop/exit/__init__.py,sha256=StR309AYZJrbw357zW5fshEterrqFQ-u5o10Fm5reJg,1986
loop/llm/__init__.py,sha256=r0PMZElEo-L6E4CCVcuwyzU8vPFOMO23nHJFwn0AD3w,1933
loop/llm/anthropic.py,sha256=p7PMOcu3Dlo5BsnAnL6EXi7Rod8UvBL2ctiniBoDX0o,1814
loop/llm/openai.py,sha256=QkCSw2svA--Dw46-9YHpyecE69DIiLhovFzEl5acnG8,2071
loop/policy/__init__.py,sha256=XwPxVds3I4LtYs3RnsQXtLpeGw6VKY9R-3VG3TjHdN4,3165
loop/progress/__init__.py,sha256=oQu1FJRCBc2TDF_dFEm015t_YlfkQvEEaPpP2S-Ga1U,2344
loop/tools/__init__.py,sha256=DvjfN5QtXQWgL8GUztYDZqZso5jRae1T8XnUxssMiYw,8167
loop/trace/__init__.py,sha256=Rd_5urETA_8lrbQdz_oG1YHJ0aeYdA4pK2bygWLvnaA,3188
tightloop-0.1.0.dist-info/METADATA,sha256=BnIqs3cuCiFB19smnxtDqvbFTAG7bSKOOMymTXMVh9M,22073
tightloop-0.1.0.dist-info/WHEEL,sha256=mffPy8wBnZQn2VnJUU5jE99KsxaSfiyMHV9Yt0aLVxs,87
tightloop-0.1.0.dist-info/RECORD,,