loopllm-0.7.0-py3-none-any.whl
- loopllm/__init__.py +69 -0
- loopllm/__main__.py +5 -0
- loopllm/adaptive_exit.py +78 -0
- loopllm/agent_loop.py +299 -0
- loopllm/cli.py +521 -0
- loopllm/elicitation.py +519 -0
- loopllm/engine.py +376 -0
- loopllm/evaluator_factory.py +72 -0
- loopllm/evaluators.py +419 -0
- loopllm/guards.py +254 -0
- loopllm/local_loop.py +273 -0
- loopllm/mcp_server.py +2657 -0
- loopllm/plan_registry.py +412 -0
- loopllm/priors.py +604 -0
- loopllm/provider.py +51 -0
- loopllm/providers/__init__.py +15 -0
- loopllm/providers/agent.py +64 -0
- loopllm/providers/mock.py +64 -0
- loopllm/providers/ollama.py +95 -0
- loopllm/providers/openrouter.py +101 -0
- loopllm/serve.py +297 -0
- loopllm/step_scorer.py +190 -0
- loopllm/store.py +1126 -0
- loopllm/tasks.py +599 -0
- loopllm-0.7.0.dist-info/METADATA +454 -0
- loopllm-0.7.0.dist-info/RECORD +29 -0
- loopllm-0.7.0.dist-info/WHEEL +4 -0
- loopllm-0.7.0.dist-info/entry_points.txt +3 -0
- loopllm-0.7.0.dist-info/licenses/LICENSE +21 -0
|
@@ -0,0 +1,454 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: loopllm
|
|
3
|
+
Version: 0.7.0
|
|
4
|
+
Summary: MCP server that observes every prompt, scores quality in real time, and closes the loop with iterative refinement. Built on FastMCP, SQLite, and Bayesian priors — no extra LLM required.
|
|
5
|
+
Project-URL: Homepage, https://github.com/azank1/loop-llm
|
|
6
|
+
Project-URL: Repository, https://github.com/azank1/loop-llm
|
|
7
|
+
Project-URL: Changelog, https://github.com/azank1/loop-llm/blob/main/CHANGELOG.md
|
|
8
|
+
Project-URL: Issues, https://github.com/azank1/loop-llm/issues
|
|
9
|
+
Author: azank1
|
|
10
|
+
License: MIT
|
|
11
|
+
License-File: LICENSE
|
|
12
|
+
Keywords: agent,agent-loop,bayesian,cursor,llm,mcp,prompt-engineering,thompson-sampling,vscode
|
|
13
|
+
Classifier: Development Status :: 4 - Beta
|
|
14
|
+
Classifier: Intended Audience :: Developers
|
|
15
|
+
Classifier: License :: OSI Approved :: MIT License
|
|
16
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
17
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
18
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
19
|
+
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
|
|
20
|
+
Classifier: Topic :: Software Development :: Libraries :: Python Modules
|
|
21
|
+
Requires-Python: >=3.11
|
|
22
|
+
Requires-Dist: structlog>=24.0.0
|
|
23
|
+
Provides-Extra: all
|
|
24
|
+
Requires-Dist: fastapi>=0.110.0; extra == 'all'
|
|
25
|
+
Requires-Dist: httpx>=0.27.0; extra == 'all'
|
|
26
|
+
Requires-Dist: mcp>=1.0.0; extra == 'all'
|
|
27
|
+
Requires-Dist: uvicorn>=0.29.0; extra == 'all'
|
|
28
|
+
Provides-Extra: dev
|
|
29
|
+
Requires-Dist: httpx>=0.27.0; extra == 'dev'
|
|
30
|
+
Requires-Dist: mcp>=1.0.0; extra == 'dev'
|
|
31
|
+
Requires-Dist: mypy>=1.10.0; extra == 'dev'
|
|
32
|
+
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
|
|
33
|
+
Requires-Dist: pytest>=8.0.0; extra == 'dev'
|
|
34
|
+
Requires-Dist: ruff>=0.4.0; extra == 'dev'
|
|
35
|
+
Provides-Extra: mcp
|
|
36
|
+
Requires-Dist: mcp>=1.0.0; extra == 'mcp'
|
|
37
|
+
Provides-Extra: ollama
|
|
38
|
+
Requires-Dist: httpx>=0.27.0; extra == 'ollama'
|
|
39
|
+
Provides-Extra: openrouter
|
|
40
|
+
Requires-Dist: httpx>=0.27.0; extra == 'openrouter'
|
|
41
|
+
Provides-Extra: serve
|
|
42
|
+
Requires-Dist: fastapi>=0.110.0; extra == 'serve'
|
|
43
|
+
Requires-Dist: httpx>=0.27.0; extra == 'serve'
|
|
44
|
+
Requires-Dist: uvicorn>=0.29.0; extra == 'serve'
|
|
45
|
+
Description-Content-Type: text/markdown
|
|
46
|
+
|
|
47
|
+
# PromptLoop
|
|
48
|
+
|
|
49
|
+
[](https://github.com/azank1/loop-llm)
|
|
50
|
+
|
|
51
|
+
[](https://github.com/azank1/loop-llm/actions/workflows/ci.yml)
|
|
52
|
+
[](https://opensource.org/licenses/MIT)
|
|
53
|
+
[](https://pypi.org/project/loopllm/)
|
|
54
|
+
[](https://github.com/azank1/loop-llm/tree/main/vscode-loopllm)
|
|
55
|
+
|
|
56
|
+
**A Bayesian MCP sidecar for your IDE agent** — observe prompts, refine outputs, and
|
|
57
|
+
stop agent loops using externally verified scores.
|
|
58
|
+
|
|
59
|
+
> The underlying CLI and tool API ship as `loopllm` (the original name). PromptLoop is the project brand. Current release: **v0.7.0**.
|
|
60
|
+
|
|
61
|
+
---
|
|
62
|
+
|
|
63
|
+
## What is PromptLoop?
|
|
64
|
+
|
|
65
|
+
PromptLoop sits between you and Cursor, VS Code Copilot, or any MCP client. It does
|
|
66
|
+
not replace your agent harness — it adds three capabilities on top:
|
|
67
|
+
|
|
68
|
+
1. **Prompt observer** — score every prompt across 5 dimensions, route to elicitation
|
|
69
|
+
or refinement, learn your preferences via online SGD.
|
|
70
|
+
2. **Refinement loop** — generate → evaluate → retry inside a single MCP tool call
|
|
71
|
+
using sampling (`loopllm_run_pipeline`, `loopllm_refine`).
|
|
72
|
+
3. **Conservative Dual-Verify agent loops** — agents submit step **artifacts**; the
|
|
73
|
+
server scores them through two independent channels and learns when to stop
|
|
74
|
+
(`loopllm_loop_start` / `loop_step` / `loop_end`).
|
|
75
|
+
|
|
76
|
+
All three layers share one Bayesian learning core (`AdaptivePriors` + SQLite) — no
|
|
77
|
+
training data, no PyTorch.
|
|
78
|
+
|
|
79
|
+
---
|
|
80
|
+
|
|
81
|
+
## System at a glance
|
|
82
|
+
|
|
83
|
+
```mermaid
|
|
84
|
+
flowchart TB
|
|
85
|
+
subgraph interfaces [Interfaces]
|
|
86
|
+
MCP[MCP 28 tools]
|
|
87
|
+
Ext[VS Code extension]
|
|
88
|
+
end
|
|
89
|
+
subgraph layers [Three layers]
|
|
90
|
+
L1[Layer 1: Prompt observer intercept + SGD]
|
|
91
|
+
L2[Layer 2: Refinement loop LoopedLLM]
|
|
92
|
+
L3[Layer 3: CDV agent loops loop_start step end]
|
|
93
|
+
end
|
|
94
|
+
subgraph learn [Learning]
|
|
95
|
+
Priors[AdaptivePriors SQLite v4]
|
|
96
|
+
end
|
|
97
|
+
interfaces --> layers
|
|
98
|
+
layers --> Priors
|
|
99
|
+
```
|
|
100
|
+
|
|
101
|
+
| Layer | Entry point | What it does |
|
|
102
|
+
|---|---|---|
|
|
103
|
+
| 1 — Observer | `loopllm_intercept` | Score, route, log; Thompson Sampling for questions |
|
|
104
|
+
| 2 — Refinement | `loopllm_run_pipeline` | Elicit → decompose → execute → verify via MCP sampling |
|
|
105
|
+
| 3 — CDV loops | `loopllm_loop_step(step_output=...)` | External dual-verify scoring → guards → Bayesian stop |
|
|
106
|
+
|
|
107
|
+
---
|
|
108
|
+
|
|
109
|
+
## Quickstart
|
|
110
|
+
|
|
111
|
+
```bash
|
|
112
|
+
git clone https://github.com/azank1/loop-llm # GitHub repo still named loop-llm
|
|
113
|
+
cd loop-llm
|
|
114
|
+
pip install -e ".[mcp]"
|
|
115
|
+
code . # or open in Cursor
|
|
116
|
+
```
|
|
117
|
+
|
|
118
|
+
`.vscode/mcp.json` and `.cursor/mcp.json` are committed — the MCP server is picked
|
|
119
|
+
up automatically. Verify on first load:
|
|
120
|
+
|
|
121
|
+
```
|
|
122
|
+
use loopllm_intercept with prompt: add retry logic to the download function
|
|
123
|
+
```
|
|
124
|
+
|
|
125
|
+
For iterative tasks, start a CDV loop:
|
|
126
|
+
|
|
127
|
+
```
|
|
128
|
+
use loopllm_loop_start with goal="make the failing test pass" task_type="bugfix"
|
|
129
|
+
```
|
|
130
|
+
|
|
131
|
+
---
|
|
132
|
+
|
|
133
|
+
## VS Code Extension
|
|
134
|
+
|
|
135
|
+
Install the companion extension for a live quality scratchpad and prompt history
|
|
136
|
+
dashboard directly in the sidebar.
|
|
137
|
+
|
|
138
|
+
<table>
|
|
139
|
+
<tr>
|
|
140
|
+
<td width="50%" valign="top">
|
|
141
|
+
|
|
142
|
+
**Prompt Lab** — live quality scratchpad
|
|
143
|
+
|
|
144
|
+

|
|
145
|
+
|
|
146
|
+
Scores on every keystroke (350 ms debounce). Grade badge, 5 dimension bars, issues + suggestions tags, Copy and Send to Chat.
|
|
147
|
+
|
|
148
|
+
</td>
|
|
149
|
+
<td width="50%" valign="top">
|
|
150
|
+
|
|
151
|
+
**History** — learning curve + metrics
|
|
152
|
+
|
|
153
|
+

|
|
154
|
+
|
|
155
|
+
Learning curve sparkline, grade distribution, SGD learned weights per dimension. Updates after every `loopllm_feedback` call.
|
|
156
|
+
|
|
157
|
+
</td>
|
|
158
|
+
</tr>
|
|
159
|
+
</table>
|
|
160
|
+
|
|
161
|
+
Build and install the extension from source:
|
|
162
|
+
|
|
163
|
+
```bash
|
|
164
|
+
cd vscode-loopllm
|
|
165
|
+
npm install
|
|
166
|
+
npx @vscode/vsce package # produces loopllm-prompt-gauge-0.1.0.vsix
|
|
167
|
+
code --install-extension loopllm-prompt-gauge-0.1.0.vsix
|
|
168
|
+
```
|
|
169
|
+
|
|
170
|
+
---
|
|
171
|
+
|
|
172
|
+
## Conservative Dual-Verify (CDV) — Layer 3
|
|
173
|
+
|
|
174
|
+
Most agent loops stop on a fixed `max_iterations` or let the agent self-grade when
|
|
175
|
+
it's "done." The former wastes tokens; the latter optimizes **reported** progress. v0.7 introduces
|
|
176
|
+
**Conservative Dual-Verify**: agents submit **step artifacts** (test logs, diffs,
|
|
177
|
+
summaries); the MCP server scores them through **two independent channels** and
|
|
178
|
+
feeds the **stricter** score into Bayesian stop/continue logic.
|
|
179
|
+
|
|
180
|
+
```python
|
|
181
|
+
channel_a = deterministic_evaluator.evaluate(step_output) # regex, JSON, completeness
|
|
182
|
+
channel_b = critic_sample(step_output, goal, criteria) # separate verifier call
|
|
183
|
+
final_score = min(channel_a, channel_b) # either channel can veto
|
|
184
|
+
```
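
The rule above can be made concrete with a minimal, self-contained sketch. This is not the `step_scorer.py` implementation: `pattern_channel` and `conservative_score` are hypothetical names, Channel A is reduced to a required-pattern check, and the Channel B score is passed in as a plain number instead of coming from a verifier call.

```python
import re


def pattern_channel(step_output: str, required_patterns: list[str]) -> tuple[float, list[str]]:
    """Hypothetical Channel A: fraction of required regex patterns found in the artifact."""
    if not required_patterns:
        return 1.0, []
    missing = [p for p in required_patterns if not re.search(p, step_output)]
    score = 1.0 - len(missing) / len(required_patterns)
    return score, [f"Required pattern not found: {p}" for p in missing]


def conservative_score(channel_a: float, channel_b: float) -> float:
    """Keep the stricter of the two channel scores, so either channel can veto."""
    return min(channel_a, channel_b)


a_score, deficiencies = pattern_channel("pytest: 3 failed, 12 passed", ["tests passed"])
final = conservative_score(a_score, 0.55)  # 0.55 stands in for a critic score
print(final, deficiencies)
```

Taking the minimum means a high critic score cannot compensate for a failed deterministic check, and vice versa.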
|
|
185
|
+
|
|
186
|
+
```
|
|
187
|
+
loopllm_loop_start(
|
|
188
|
+
goal="refactor module and make tests pass",
|
|
189
|
+
task_type="bugfix",
|
|
190
|
+
required_patterns=["tests passed"],
|
|
191
|
+
)
|
|
192
|
+
→ { suggested_budget: 3, quality_threshold: 0.8, evaluator_type: "composite" }
|
|
193
|
+
|
|
194
|
+
loopllm_loop_step(session_id, step_output="pytest: 3 failed, 12 passed")
|
|
195
|
+
→ {
|
|
196
|
+
decision: "continue",
|
|
197
|
+
score: 0.0,
|
|
198
|
+
channel_a_score: 0.0,
|
|
199
|
+
channel_b_score: 0.55,
|
|
200
|
+
score_source: "conservative_dual_verify",
|
|
201
|
+
deficiencies: ["Required pattern not found: tests passed"],
|
|
202
|
+
}
|
|
203
|
+
|
|
204
|
+
loopllm_loop_end(session_id) → learns optimal depth from verified trajectories
|
|
205
|
+
```
|
|
206
|
+
|
|
207
|
+
`loopllm_loop_step` returns `stop` when any guard fires: goal reached (verified score),
|
|
208
|
+
plateau, low Bayesian ROI, budget exhausted, timeout, token cap, or repeated output.
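
One way to picture a stop stack like this is as an ordered list of small checks, where the first one that fires ends the loop. The sketch below is an illustration under assumptions, not the contents of `guards.py`: `LoopState`, the guard functions, and the `eps` plateau tolerance are all hypothetical, and only four of the listed guards are modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class LoopState:
    """Hypothetical per-session state: one score and one artifact per step."""
    scores: list[float] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    budget: int = 3
    threshold: float = 0.8


Guard = Callable[[LoopState], Optional[str]]


def goal_reached(s: LoopState) -> Optional[str]:
    return "goal reached" if s.scores and s.scores[-1] >= s.threshold else None


def repeated_output(s: LoopState) -> Optional[str]:
    return "repeated output" if len(s.outputs) >= 2 and s.outputs[-1] == s.outputs[-2] else None


def plateau(s: LoopState, eps: float = 0.01) -> Optional[str]:
    # Assumed definition: the last two scores differ by less than eps.
    return "plateau" if len(s.scores) >= 2 and abs(s.scores[-1] - s.scores[-2]) < eps else None


def budget_exhausted(s: LoopState) -> Optional[str]:
    return "budget exhausted" if len(s.scores) >= s.budget else None


GUARDS: list[Guard] = [goal_reached, repeated_output, plateau, budget_exhausted]


def decide(state: LoopState) -> tuple[str, Optional[str]]:
    """Return ("stop", reason) for the first guard that fires, else ("continue", None)."""
    for guard in GUARDS:
        reason = guard(state)
        if reason is not None:
            return "stop", reason
    return "continue", None


print(decide(LoopState(scores=[0.45], outputs=["3 failed"])))
print(decide(LoopState(scores=[0.45, 0.85], outputs=["3 failed", "15 passed"])))
```

Keeping each guard as an independent function makes the order explicit and lets new stop conditions be appended without touching the others.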
|
|
209
|
+
|
|
210
|
+
See [`examples/agent_loop.py`](examples/agent_loop.py) for the library demo and
|
|
211
|
+
[`docs/demo/agent_loop_demo.md`](docs/demo/agent_loop_demo.md) for CDV via MCP.
|
|
212
|
+
|
|
213
|
+
```python
|
|
214
|
+
from loopllm import AdaptivePriors, AgentLoopController
|
|
215
|
+
|
|
216
|
+
controller = AgentLoopController(AdaptivePriors())
|
|
217
|
+
session = controller.start("fix flaky test", task_type="bugfix")
|
|
218
|
+
verdict = controller.step(session.session_id, score=0.9) # library API (pre-scored)
|
|
219
|
+
controller.end(session.session_id)
|
|
220
|
+
```
|
|
221
|
+
|
|
222
|
+

|
|
223
|
+
|
|
224
|
+

|
|
225
|
+
|
|
226
|
+
What a terminal run looks like (`python examples/agent_loop.py`):
|
|
227
|
+
|
|
228
|
+
```text
|
|
229
|
+
=== Loop (task_type=bugfix) ===
|
|
230
|
+
Suggested budget: 3 step(s) | threshold 0.80 | confidence 0.00 (from 0 past loops)
|
|
231
|
+
step 1 | 0.45 |######### | -> CONTINUE: step 1/3, score 0.450 below 0.80
|
|
232
|
+
step 2 | 0.85 |################# | -> STOP: Goal reached: 0.850 >= 0.80 at step 2
|
|
233
|
+
```
|
|
234
|
+
|
|
235
|
+
### Benchmark: adaptive vs fixed `max_iterations`
|
|
236
|
+
|
|
237
|
+
Reproducible simulation (`benchmarks/adaptive_vs_fixed.py`, seed=7, 300 test tasks,
|
|
238
|
+
threshold 0.80):
|
|
239
|
+
|
|
240
|
+
| Strategy | Mean steps | Mean final score | % reaching 0.80 | Wasted steps | Efficiency (reach/step) |
|
|
241
|
+
|---|---|---|---|---|---|
|
|
242
|
+
| fixed (budget=2) | 2.00 | 0.698 | 34.3% | 0.00 | 17.2 |
|
|
243
|
+
| fixed (budget=6) | 6.00 | 0.939 | 94.0% | 2.50 | 15.7 |
|
|
244
|
+
| threshold (reactive) | 3.56 | 0.852 | 100.0% | 0.00 | 28.1 |
|
|
245
|
+
| **adaptive (loopllm)** | **3.56** | **0.852** | **99.7%** | **0.00** | **28.0** |
|
|
246
|
+
|
|
247
|
+
**Adaptive uses ~41% fewer steps than a fixed 6-step budget** while reaching the bar
|
|
248
|
+
on 99.7% of tasks. Repro: `python benchmarks/adaptive_vs_fixed.py`.
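
The headline figure can be rechecked from the table alone. The short calculation below uses only the numbers printed above and assumes that the efficiency column is "% reaching 0.80" divided by "Mean steps", which matches all four rows to within rounding.

```python
# (mean steps, % reaching 0.80, reported efficiency), copied from the table above.
rows = {
    "fixed (budget=2)": (2.00, 34.3, 17.2),
    "fixed (budget=6)": (6.00, 94.0, 15.7),
    "threshold (reactive)": (3.56, 100.0, 28.1),
    "adaptive (loopllm)": (3.56, 99.7, 28.0),
}

# Relative step savings of the adaptive strategy against the fixed 6-step budget.
savings = 1 - rows["adaptive (loopllm)"][0] / rows["fixed (budget=6)"][0]
print(f"step savings vs fixed budget=6: {savings:.1%}")

# Recompute efficiency under the assumed definition and compare with the table.
for name, (steps, reach, reported) in rows.items():
    recomputed = reach / steps
    print(f"{name}: recomputed {recomputed:.2f}, reported {reported}")
```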
|
|
249
|
+
|
|
250
|
+
> Honest caveat: simulation with stated assumptions; measures *decision efficiency
|
|
251
|
+
> given a quality signal*, not absolute model quality.
|
|
252
|
+
|
|
253
|
+
---
|
|
254
|
+
|
|
255
|
+
## Prompt pipeline — Layer 2
|
|
256
|
+
|
|
257
|
+
`loopllm_run_pipeline` is the main entry point for observe → elicit → refine → verify:
|
|
258
|
+
|
|
259
|
+
```
|
|
260
|
+
loopllm_run_pipeline("add retry logic to the download function")
|
|
261
|
+
```
|
|
262
|
+
|
|
263
|
+
What happens inside:
|
|
264
|
+
1. **Score** the prompt across 5 dimensions (< 1 ms, deterministic)
|
|
265
|
+
2. **Elicit** clarifying questions if quality < 0.6 — Thompson Sampling on Beta priors
|
|
266
|
+
3. **Decompose** into subtasks if complexity > 0.5
|
|
267
|
+
4. **Execute** each subtask: `ctx.sample(prompt)` → evaluate → retry if below threshold
|
|
268
|
+
5. **Verify** the assembled output via a second `ctx.sample()` call
|
|
269
|
+
6. **Log** result to SQLite; update scoring weights via online SGD
|
|
270
|
+
|
|
271
|
+
Everything runs inline via MCP Sampling — no extra chat turns, no polling.
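
Step 4 (sample, evaluate, retry) is the core of the pipeline. As a rough sketch of that control flow, assuming a generic `sample` callable in place of `ctx.sample()` and a generic `evaluate` callable in place of the package's evaluators (`refine`, `fake_sample`, and `length_score` are hypothetical names, and the retry-prompt wording is invented for illustration):

```python
from typing import Callable


def refine(
    prompt: str,
    sample: Callable[[str], str],
    evaluate: Callable[[str], float],
    threshold: float = 0.8,
    max_attempts: int = 3,
) -> tuple[str, float, int]:
    """Generate, score, and retry until the score clears the threshold or attempts run out."""
    best_output, best_score = "", -1.0
    current_prompt = prompt
    for attempt in range(1, max_attempts + 1):
        output = sample(current_prompt)
        score = evaluate(output)
        if score > best_score:
            best_output, best_score = output, score
        if score >= threshold:
            return best_output, best_score, attempt
        # Feed the shortfall back into the next prompt.
        current_prompt = f"{prompt}\n\nPrevious attempt scored {score:.2f}; improve it."
    return best_output, best_score, max_attempts


calls: list[str] = []


def fake_sample(p: str) -> str:
    """Stand-in for a model call: returns a longer answer on each call."""
    calls.append(p)
    return "retry " * len(calls)


def length_score(output: str) -> float:
    """Stand-in evaluator: longer is better, capped at 1.0."""
    return min(len(output) / 18, 1.0)


output, score, attempts = refine("add retry logic", fake_sample, length_score)
print(attempts, score)
```

Tracking the best output separately means the caller still gets the strongest attempt when no attempt reaches the threshold.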
|
|
272
|
+
|
|
273
|
+
---
|
|
274
|
+
|
|
275
|
+
## How it works — Layer 1
|
|
276
|
+
|
|
277
|
+
```
|
|
278
|
+
You type a prompt
|
|
279
|
+
↓
|
|
280
|
+
loopllm_intercept ← scores across 5 dimensions (< 1 ms, deterministic)
|
|
281
|
+
↓
|
|
282
|
+
route decision
|
|
283
|
+
< 0.4 → elicitation ← Thompson Sampling picks the highest-gain question
|
|
284
|
+
0.4–0.6 → elicit + refine
|
|
285
|
+
≥ 0.6 → refine directly
|
|
286
|
+
↓
|
|
287
|
+
loopllm_refine (if needed)
|
|
288
|
+
→ ctx.sample(prompt) ← MCP Sampling: calls host LLM mid-execution
|
|
289
|
+
→ evaluate output ← deterministic evaluators (length, regex, JSON schema)
|
|
290
|
+
→ if score < threshold: ctx.sample(improved prompt)
|
|
291
|
+
↓
|
|
292
|
+
result logged to ~/.loopllm/store.db
|
|
293
|
+
↓
|
|
294
|
+
loopllm_feedback (optional rating 1–5) → SGD updates dimension weights
|
|
295
|
+
```
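
The route decision in the diagram reduces to two thresholds. A minimal sketch, where `route` is a hypothetical helper and the returned labels are descriptive strings, not values from the package:

```python
def route(quality: float) -> str:
    """Map a composite prompt score in [0, 1] to the route shown in the diagram."""
    if quality < 0.4:
        return "elicitation"
    if quality < 0.6:
        return "elicit + refine"
    return "refine"


for q in (0.25, 0.4, 0.6, 0.9):
    print(q, route(q))
```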
|
|
296
|
+
|
|
297
|
+
**Scoring dimensions** (each 0–1, composited by learned weights into grade A–F):
|
|
298
|
+
|
|
299
|
+
| Dimension | What it catches |
|
|
300
|
+
|---|---|
|
|
301
|
+
| Specificity | Vague, generic requests |
|
|
302
|
+
| Constraint Clarity | Missing format, length, or rule requirements |
|
|
303
|
+
| Context Completeness | No background or goal stated |
|
|
304
|
+
| Ambiguity | Unclear references, pronouns without antecedents |
|
|
305
|
+
| Format Specification | No output format specified |
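
To illustrate how five dimension scores combine into one number, the sketch below takes a weighted average using the default weights listed under "Learning math". It assumes every dimension is already oriented so that higher is better; `composite` and the example scores are hypothetical, and the mapping from composite score to letter grade is not specified here, so it is left out.

```python
DEFAULT_WEIGHTS = {
    "specificity": 0.25,
    "constraint_clarity": 0.20,
    "context_completeness": 0.20,
    "ambiguity": 0.20,
    "format_spec": 0.15,
}


def composite(scores: dict[str, float], weights: dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted average of per-dimension scores, each in [0, 1]."""
    total = sum(weights.values())
    return sum(weights[d] * scores[d] for d in weights) / total


example = {
    "specificity": 0.8,
    "constraint_clarity": 0.5,
    "context_completeness": 0.6,
    "ambiguity": 0.9,
    "format_spec": 0.4,
}
print(round(composite(example), 2))
```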
|
|
306
|
+
|
|
307
|
+
**MCP Sampling** — generation tools call `ctx.sample()` to invoke the host agent's LLM
|
|
308
|
+
inline. Falls back to `agent_execute` passthrough if the client doesn't declare sampling.
|
|
309
|
+
|
|
310
|
+
<details>
|
|
311
|
+
<summary>Learning math (SGD, Thompson Sampling, Bayesian priors)</summary>
|
|
312
|
+
|
|
313
|
+
### Online Gradient Descent on scoring weights
|
|
314
|
+
|
|
315
|
+
Default weights: `{specificity: 0.25, constraint_clarity: 0.20, context_completeness: 0.20, ambiguity: 0.20, format_spec: 0.15}`. Each `loopllm_feedback(rating)` runs one SGD step; weights clip to $[0.05, 0.50]$ and renormalise. Persisted in `learned_weights` (schema v4).
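
A single update of this kind might look like the sketch below. Only the clip range and the renormalisation come from the text; the squared-error loss, the learning rate, and the mapping of a 1–5 rating onto [0, 1] are assumptions made for illustration, and `sgd_step` is a hypothetical name.

```python
def sgd_step(
    weights: dict[str, float],
    dim_scores: dict[str, float],
    rating: int,
    lr: float = 0.05,
    lo: float = 0.05,
    hi: float = 0.50,
) -> dict[str, float]:
    """One gradient step on 0.5 * (predicted - target)^2, then clip and renormalise."""
    target = (rating - 1) / 4  # assumed mapping of a 1-5 rating onto [0, 1]
    predicted = sum(weights[d] * dim_scores[d] for d in weights)
    error = predicted - target
    # d(loss)/d(w_d) = error * score_d
    updated = {d: w - lr * error * dim_scores[d] for d, w in weights.items()}
    clipped = {d: min(hi, max(lo, w)) for d, w in updated.items()}
    total = sum(clipped.values())
    return {d: w / total for d, w in clipped.items()}


weights = {
    "specificity": 0.25,
    "constraint_clarity": 0.20,
    "context_completeness": 0.20,
    "ambiguity": 0.20,
    "format_spec": 0.15,
}
scores = {
    "specificity": 0.8,
    "constraint_clarity": 0.5,
    "context_completeness": 0.6,
    "ambiguity": 0.9,
    "format_spec": 0.4,
}
new_weights = sgd_step(weights, scores, rating=5)
print({d: round(w, 4) for d, w in new_weights.items()})
```

With a rating above the predicted quality, dimensions that scored high gain relative weight and those that scored low lose it. Note that renormalising after clipping can move a weight slightly outside the clip range again; how the package handles that edge is not stated here.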
|
|
316
|
+
|
|
317
|
+
### Thompson Sampling for question ordering
|
|
318
|
+
|
|
319
|
+
Each question type maintains $\text{Beta}(\alpha, \beta)$; the pipeline draws $s_i \sim \text{Beta}(\alpha_i, \beta_i)$ and picks $\arg\max_i s_i$.
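
In code, one Thompson Sampling draw is a few lines with the standard library. The question-type names and their Beta parameters below are made up for illustration, and `pick_question` is a hypothetical helper.

```python
import random
from collections import Counter


def pick_question(priors: dict[str, tuple[float, float]], rng: random.Random) -> str:
    """Draw one sample per question type from its Beta prior and return the argmax."""
    draws = {name: rng.betavariate(a, b) for name, (a, b) in priors.items()}
    return max(draws, key=draws.get)


# Hypothetical priors as (alpha, beta): "format" has helped often, "scope" rarely.
priors = {"format": (8.0, 2.0), "scope": (2.0, 8.0), "audience": (1.0, 1.0)}

rng = random.Random(0)
counts = Counter(pick_question(priors, rng) for _ in range(2000))
print(counts.most_common())
```

Because the choice is a random draw, question types with weak or uncertain priors are still asked occasionally, which is what lets their priors keep updating.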
|
|
320
|
+
|
|
321
|
+
### Beta-Binomial Bayesian priors
|
|
322
|
+
|
|
323
|
+
Per-(task\_type, model) convergence priors drive adaptive exit in `adaptive_exit.py` via `BetaPrior.prob_above(threshold)`.
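
Assuming `prob_above(threshold)` means the posterior probability mass above the threshold, $P(p > t)$ for $p \sim \text{Beta}(\alpha, \beta)$, it can be approximated with the standard library alone. The sketch below is not the `priors.py` code: it integrates the Beta density with a midpoint rule and assumes $\alpha, \beta \ge 1$ and $0 \le t < 1$.

```python
import math


def beta_pdf(x: float, a: float, b: float) -> float:
    """Density of Beta(a, b) at x in (0, 1), computed in log space for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def prob_above(a: float, b: float, threshold: float, steps: int = 10_000) -> float:
    """Approximate P(p > threshold) by midpoint-rule integration over (threshold, 1)."""
    width = (1.0 - threshold) / steps
    return sum(beta_pdf(threshold + (i + 0.5) * width, a, b) for i in range(steps)) * width


# Uniform prior: exactly 20% of the mass lies above 0.8.
print(round(prob_above(1.0, 1.0, 0.8), 4))
# A prior after several successes puts more mass above the same threshold.
print(prob_above(9.0, 3.0, 0.8) > prob_above(1.0, 1.0, 0.8))
```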
|
|
324
|
+
|
|
325
|
+
### Welford online variance
|
|
326
|
+
|
|
327
|
+
`NormalPrior` tracks running mean/variance with optional exponential decay ($\lambda = 0.95$).
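
For reference, the plain Welford update looks like the sketch below. `RunningStats` is a hypothetical class, not `NormalPrior`, and the exponential-decay variant mentioned above is deliberately left out.

```python
class RunningStats:
    """Welford's online algorithm: mean and variance in a single pass, O(1) memory."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)  # uses the mean before and after the update

    @property
    def variance(self) -> float:
        """Sample variance (n - 1 denominator); 0.0 until two values have been seen."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(value)
print(stats.mean, stats.variance)
```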
|
|
328
|
+
|
|
329
|
+
</details>
|
|
330
|
+
|
|
331
|
+
---
|
|
332
|
+
|
|
333
|
+
## Install as a package
|
|
334
|
+
|
|
335
|
+
```bash
|
|
336
|
+
pip install "loopllm[mcp]"
|
|
337
|
+
```
|
|
338
|
+
|
|
339
|
+
Add `.vscode/mcp.json` to your project:
|
|
340
|
+
|
|
341
|
+
```json
|
|
342
|
+
{
|
|
343
|
+
"servers": {
|
|
344
|
+
"loopllm": {
|
|
345
|
+
"type": "stdio",
|
|
346
|
+
"command": "loopllm",
|
|
347
|
+
"args": ["mcp-server", "--provider", "agent"]
|
|
348
|
+
}
|
|
349
|
+
}
|
|
350
|
+
}
|
|
351
|
+
```
|
|
352
|
+
|
|
353
|
+
Cursor users: `.cursor/mcp.json` uses `"mcpServers"` as the top-level key.
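
If both editors are in use, the Cursor file can be derived from the VS Code one by swapping the top-level key. The snippet below is a small sketch of that rewrite; it copies the server entry unchanged, and whether Cursor needs or ignores the `"type"` field is not stated here, so treat that as an assumption to verify.

```python
import json

# Same content as the .vscode/mcp.json shown above.
vscode_config = {
    "servers": {
        "loopllm": {
            "type": "stdio",
            "command": "loopllm",
            "args": ["mcp-server", "--provider", "agent"],
        }
    }
}

# Cursor expects the same entries under "mcpServers".
cursor_config = {"mcpServers": vscode_config["servers"]}
print(json.dumps(cursor_config, indent=2))
```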
|
|
354
|
+
|
|
355
|
+
---
|
|
356
|
+
|
|
357
|
+
## Tools (28)
|
|
358
|
+
|
|
359
|
+
| Tool | What it does |
|
|
360
|
+
|---|---|
|
|
361
|
+
| `loopllm_run_pipeline` | **Layer 2.** Elicit → decompose → execute → verify in one call |
|
|
362
|
+
| `loopllm_intercept` | **Layer 1.** Score + route a prompt; logs to history |
|
|
363
|
+
| `loopllm_gauge` | Instant quality bars, no DB write |
|
|
364
|
+
| `loopllm_refine` | Score → sample → retry loop via MCP Sampling |
|
|
365
|
+
| `loopllm_plan_tasks` | Decompose a goal into ordered subtasks via MCP Sampling |
|
|
366
|
+
| `loopllm_verify_output` | Keyword pre-check + deep sample against quality criteria |
|
|
367
|
+
| `loopllm_elicitation_start/answer/finish` | Multi-turn clarifying question session |
|
|
368
|
+
| `loopllm_plan_register` | Create a confidence-gated plan saved to SQLite |
|
|
369
|
+
| `loopllm_plan_next` | Advance to next task; returns `needs_replan` if quality dropped |
|
|
370
|
+
| `loopllm_plan_update` | Record task scores; recalculates rolling confidence |
|
|
371
|
+
| `loopllm_plan_list` | Dashboard: all plans with gauges and task counts |
|
|
372
|
+
| `loopllm_plan_delete` | Remove a completed or abandoned plan |
|
|
373
|
+
| `loopllm_context_history` | Browse prompt history with sparklines |
|
|
374
|
+
| `loopllm_context_clear` | Wipe prompt history (scoped or all) |
|
|
375
|
+
| `loopllm_prompt_stats` | Prompting quality trend and learning curve |
|
|
376
|
+
| `loopllm_feedback` | Rate a response (1–5); triggers SGD weight update |
|
|
377
|
+
| `loopllm_suggest_config` | Bayesian-optimal loop config for a task type |
|
|
378
|
+
| `loopllm_loop_start` | **Layer 3.** Begin CDV agent loop; returns learned budget + verifier recipe |
|
|
379
|
+
| `loopllm_loop_step` | Submit step artifact for CDV; returns continue/stop + channel scores |
|
|
380
|
+
| `loopllm_loop_end` | Close loop and learn optimal depth from verified trajectories |
|
|
381
|
+
| `loopllm_loop_status` | Inspect an active agent-loop session |
|
|
382
|
+
| `loopllm_classify_task` | Label a prompt's task type |
|
|
383
|
+
| `loopllm_analyze_prompt` | Generate clarifying questions ranked by Thompson-sampled gain |
|
|
384
|
+
| `loopllm_list_tasks` | List tasks from the persistent store |
|
|
385
|
+
| `loopllm_show_task` | Detail view for a single task |
|
|
386
|
+
| `loopllm_report` | Learned weights, Bayesian priors, question effectiveness stats |
|
|
387
|
+
|
|
388
|
+
Plans and learned weights persist to `~/.loopllm/store.db` (schema v4).
|
|
389
|
+
|
|
390
|
+
---
|
|
391
|
+
|
|
392
|
+
## Local models (no MCP)
|
|
393
|
+
|
|
394
|
+
```bash
|
|
395
|
+
pip install "loopllm[serve]"
|
|
396
|
+
loopllm serve --port 8765 # REST scoring middleware
|
|
397
|
+
```
|
|
398
|
+
|
|
399
|
+
```python
|
|
400
|
+
from loopllm.local_loop import LocalModelLoop
|
|
401
|
+
|
|
402
|
+
loop = LocalModelLoop(
|
|
403
|
+
base_url="http://localhost:11434",
|
|
404
|
+
model="llama3.2",
|
|
405
|
+
score_url="http://localhost:8765/score",
|
|
406
|
+
quality_threshold=0.80,
|
|
407
|
+
max_retries=3,
|
|
408
|
+
)
|
|
409
|
+
result = loop.run("Write a Python function to parse JSON safely.")
|
|
410
|
+
print(result.output, result.best_score, result.converged)
|
|
411
|
+
```
|
|
412
|
+
|
|
413
|
+
---
|
|
414
|
+
|
|
415
|
+
## Contributing
|
|
416
|
+
|
|
417
|
+
```bash
|
|
418
|
+
git clone https://github.com/azank1/loop-llm
|
|
419
|
+
cd loop-llm
|
|
420
|
+
pip install -e ".[dev]"
|
|
421
|
+
python -m pytest tests/ -q # 219 tests (215 pass, 4 skipped), ~2s
|
|
422
|
+
```
|
|
423
|
+
|
|
424
|
+
See [CONTRIBUTING.md](CONTRIBUTING.md) for branch naming (`az/<type>/<short>`) and checks.
|
|
425
|
+
|
|
426
|
+
**Key files:**
|
|
427
|
+
- `src/loopllm/mcp_server.py` — 28 MCP tools + MCP Sampling helpers
|
|
428
|
+
- `src/loopllm/step_scorer.py` — Conservative Dual-Verify scoring
|
|
429
|
+
- `src/loopllm/guards.py` — composable agent-loop stop stack
|
|
430
|
+
- `src/loopllm/agent_loop.py` — adaptive agent-loop controller
|
|
431
|
+
- `src/loopllm/evaluator_factory.py` — build evaluators for CDV Channel A
|
|
432
|
+
- `src/loopllm/priors.py` — Beta/Normal priors, Welford, Thompson Sampling
|
|
433
|
+
- `src/loopllm/store.py` — SQLite persistence (schema v4)
|
|
434
|
+
- `src/loopllm/engine.py` — core refinement loop (`LoopedLLM`)
|
|
435
|
+
|
|
436
|
+
PRs welcome. Add tests for new tools in `tests/`.
|
|
437
|
+
|
|
438
|
+
---
|
|
439
|
+
|
|
440
|
+
## Environment variables
|
|
441
|
+
|
|
442
|
+
| Variable | Default | Description |
|
|
443
|
+
|---|---|---|
|
|
444
|
+
| `LOOPLLM_PROVIDER` | `agent` | `agent`, `ollama`, or `openrouter` |
|
|
445
|
+
| `LOOPLLM_MODEL` | `agent` | Model identifier (ignored in agent mode) |
|
|
446
|
+
| `LOOPLLM_DB` | `~/.loopllm/store.db` | SQLite store path |
|
|
447
|
+
| `OLLAMA_HOST` | `http://localhost:11434` | Ollama base URL |
|
|
448
|
+
| `OPENROUTER_API_KEY` | — | OpenRouter API key |
|
|
449
|
+
|
|
450
|
+
---
|
|
451
|
+
|
|
452
|
+
## License
|
|
453
|
+
|
|
454
|
+
MIT
|
|
@@ -0,0 +1,29 @@
|
|
|
1
|
+
loopllm/__init__.py,sha256=3wm-aKz4OzUP7MeQ_hKcRmgo2MLhu9lrgA1xS8fednQ,1681
|
|
2
|
+
loopllm/__main__.py,sha256=BgvbAP-wgYew4vciRFhUiHNmPzxE0FuH_86lkOXCVf4,123
|
|
3
|
+
loopllm/adaptive_exit.py,sha256=-GQscZgdA6IlbY1d81ZERcnylCevhubl_0eOEAsZU-k,2579
|
|
4
|
+
loopllm/agent_loop.py,sha256=0rLCF9AJ4I-J2kZTPp1To7wxqLJzshox6mNY-ttr7e8,10642
|
|
5
|
+
loopllm/cli.py,sha256=ERQnHrQgH6UsgBxe5PxterXOCWxD5nyqfBoIhSKZOOE,17225
|
|
6
|
+
loopllm/elicitation.py,sha256=1KEsCYtJo04Ec2A2UymnGL7yW6vz-th9AXVT5zUqd8k,18776
|
|
7
|
+
loopllm/engine.py,sha256=nvjvzmuuNJYDvmvYFL4kM11fLt5galmj7R5Ej1JtuBY,12531
|
|
8
|
+
loopllm/evaluator_factory.py,sha256=YcC0IjXZCitWbe8ayGmvQzZ7vSY3wprJntj2ZvBmbAA,2522
|
|
9
|
+
loopllm/evaluators.py,sha256=Sw2BZE6r9gaIh1Q6XIgjUxD21cx53bwLhxymX6xWW0A,14591
|
|
10
|
+
loopllm/guards.py,sha256=f9BP7CFDBeJ67nnViRO3Eno4aKDQejuQXgvgLR2mYdk,8164
|
|
11
|
+
loopllm/local_loop.py,sha256=qhmiYJWV_wjz8wV7aUBLl0gj7c0Malpxp8VcdqIfrb4,9216
|
|
12
|
+
loopllm/mcp_server.py,sha256=oRa3CGiCL4lbgUYcZ2iDOLUHjdF1N3BvSTFlZbySH6k,95082
|
|
13
|
+
loopllm/plan_registry.py,sha256=QfUFXq4u8sfHpCj8F8Q8i6jDwZpasOa0ojvwBYFMOXc,14262
|
|
14
|
+
loopllm/priors.py,sha256=T-yfAw2dW9rvPm66IM7XmNrEY3KjzzyoijyCtRtz8do,21888
|
|
15
|
+
loopllm/provider.py,sha256=f2-3tHekfpGNMVijkCFE3LWQSByR5qGwIWWVFwSShBg,1325
|
|
16
|
+
loopllm/serve.py,sha256=As87i2ZQE_C8BZtesB9sgIJKXtMS-cskj_oj3n5y5qw,10504
|
|
17
|
+
loopllm/step_scorer.py,sha256=1Ri772Pm3HQPMVITkLtriKpaN-BwidMJ6Ju2ZTHN14M,6401
|
|
18
|
+
loopllm/store.py,sha256=suTuPgmJ8KztytRROPoqATYUw47WR_ipfktUixPfJ34,40369
|
|
19
|
+
loopllm/tasks.py,sha256=2s6iyw0yXwYNJdgeHKVYizL222-7GIJC7I3aI3YzBhw,20100
|
|
20
|
+
loopllm/providers/__init__.py,sha256=nZFSTrfMMrnNrc0nsQpCPazoYD66TaY_UdzJsSPPwhU,477
|
|
21
|
+
loopllm/providers/agent.py,sha256=c7vMAwO6M_Ugp5PhHrki4DYUki5pGvwMzyC2PO5_f2Y,2294
|
|
22
|
+
loopllm/providers/mock.py,sha256=RwCKszYcY1bXtXqynUaqOD0gYmqjyLEglZMAprJ94yg,1951
|
|
23
|
+
loopllm/providers/ollama.py,sha256=CzL3duYPNqZyK2Unb3tgWJk48jZ0VsWi-2N8FsbYrn8,2575
|
|
24
|
+
loopllm/providers/openrouter.py,sha256=MYlMNu5jqja7p5eUtWAy8ndxXB2GSMkpGZCB0XLHj9M,2946
|
|
25
|
+
loopllm-0.7.0.dist-info/METADATA,sha256=Cy--DZAlEoPyvu4HOJDQMY6bXDmal_HdB6DUvlhw_Y4,16949
|
|
26
|
+
loopllm-0.7.0.dist-info/WHEEL,sha256=mffPy8wBnZQn2VnJUU5jE99KsxaSfiyMHV9Yt0aLVxs,87
|
|
27
|
+
loopllm-0.7.0.dist-info/entry_points.txt,sha256=F0f7rnBmau5Kokr1uxLQTqEivm49mcyJ-n_fQZ6e2uo,83
|
|
28
|
+
loopllm-0.7.0.dist-info/licenses/LICENSE,sha256=f_3XPTAAaue7S4bfYnZHN8lqSI4i1M6W1D_Khgirr2Y,1063
|
|
29
|
+
loopllm-0.7.0.dist-info/RECORD,,
|
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 azank1
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|