daveloop 1.4.0__py3-none-any.whl → 1.5.0__py3-none-any.whl
- daveloop-1.5.0.dist-info/METADATA +392 -0
- daveloop-1.5.0.dist-info/RECORD +7 -0
- {daveloop-1.4.0.dist-info → daveloop-1.5.0.dist-info}/WHEEL +1 -1
- daveloop.py +265 -14
- daveloop-1.4.0.dist-info/METADATA +0 -391
- daveloop-1.4.0.dist-info/RECORD +0 -7
- {daveloop-1.4.0.dist-info → daveloop-1.5.0.dist-info}/entry_points.txt +0 -0
- {daveloop-1.4.0.dist-info → daveloop-1.5.0.dist-info}/top_level.txt +0 -0
@@ -0,0 +1,392 @@
+Metadata-Version: 2.1
+Name: daveloop
+Version: 1.5.0
+Summary: Self-healing debug agent powered by Claude Code CLI
+Home-page: https://github.com/davebruzil/DaveLoop
+Author: Dave Bruzil
+Keywords: debugging ai claude automation agent
+Classifier: Development Status :: 4 - Beta
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.7
+Classifier: Programming Language :: Python :: 3.8
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Topic :: Software Development :: Debuggers
+Classifier: Topic :: Software Development :: Quality Assurance
+Requires-Python: >=3.7
+Description-Content-Type: text/markdown
+
+# DaveLoop
+
+<img width="842" height="258" alt="DaveLoop Banner" src="https://github.com/user-attachments/assets/97212a83-6eb9-43ed-95c7-ec236718ee16" />
+
+### The agent that doesn't quit until the job is done.
+
+**DaveLoop** is a self-healing autonomous agent powered by LLM-driven iterative reasoning. It was designed for debugging. Then it started building features, writing test suites, fixing production workflows, and improving its own source code -- all without being asked.
+
+You give it a problem. It reasons, hypothesizes, executes, verifies, and loops until the problem is gone. No hand-holding. No copy-pasting context between retries. No pressing "approve" every 10 seconds. It just works.
+
+```bash
+pip install daveloop
+```
+
+---
+
+## What DaveLoop Has Actually Done
+
+This isn't a toy demo. These are real, logged, verifiable results from DaveLoop running autonomously.
+
+### It Upgraded Itself
+
+DaveLoop was given a feature request file and told to add capabilities to its own codebase. It read its own source code, understood the architecture, and implemented 4 major features:
+
+- **TaskQueue** -- multi-bug sequential processing with status tracking
+- **Session Memory** -- persistent history across runs via `.daveloop_history.json`
+- **InputMonitor** -- real-time interrupt system (type `wait` mid-execution to redirect it)
+- **Multi-task CLI** -- queue multiple bugs in one command
+
+It verified the syntax, tested the integration, and resolved. One session. Its own codebase. No guidance.
+
+### It Built a Complete Mobile Test Suite From Scratch
+
+Pointed at an Android Dog Dating app on an emulator, DaveLoop:
+
+1. Located and installed the APK autonomously
+2. Dumped the UI hierarchy with `uiautomator` to understand the screen layout
+3. Read the Kotlin source files to understand data models and navigation
+4. Discovered the Room DB seeds 10 mock dog profiles on fresh install
+5. **Wrote 3 comprehensive Maestro YAML test flows:**
+   - Swipe card functionality (left/right swipe with profile progression)
+   - Matches vs Chat screen separation (distinct bottom nav destinations)
+   - Chat recipient verification (message send and display)
+6. Ran all tests 3 consecutive times -- all passed
+
+**One iteration. Zero human input. From APK to passing test suite.**
+
+### It Debugged a Production n8n Workflow
+
+A WhatsApp webhook integration was returning 500 errors on every request. Hebrew and English messages both failing. DaveLoop:
+
+- Traced the data flow through the entire n8n workflow JSON
+- Found Bug #1: incorrect nested path `webhookData.data.messages` instead of `webhookData.data.messages.message`
+- Found Bug #2: the `If2` node was reading from MongoDB output (`$json.body`) instead of referencing the webhook node directly
+- Fixed both, preserving the Hebrew goodbye detection logic that was actually correct
+- Generated a test script and deployment guide
+
+Two bugs, both data structure traversal errors buried in a multi-node workflow. Found and fixed.
+
+### It Solved a Django ORM Bug in One Iteration
+
+Running against [SWE-bench](https://www.swebench.com/), DaveLoop resolved `django__django-13321` -- a real bug from the Django issue tracker -- in a single iteration. 5 minutes from start to `[DAVELOOP:RESOLVED]`.
+
+```json
+{
+  "instance_id": "django__django-13321",
+  "repo": "django/django",
+  "resolved": true,
+  "iterations": 1
+}
+```
+
+### It Fixed a Bug in tqdm (47k+ GitHub Stars)
+
+A `ZeroDivisionError` when `total=0`. DaveLoop explored the entire tqdm codebase, traced all division operations across multiple files, identified that `if total:` was treating `0` the same as `None`, and applied a targeted fix. One iteration.
+
+---
+
+## How It Works
+
+```
+You describe the bug
+        |
+        v
+DaveLoop injects a reasoning protocol into the LLM
+        |
+        v
+The agent analyzes: KNOWN / UNKNOWN / HYPOTHESIS / NEXT ACTION
+        |
+        v
+Executes the fix, runs verification
+        |
+        v
+Not fixed? Loop again with full context (--continue)
+        |
+        v
+Fixed? --> [DAVELOOP:RESOLVED]
+Stuck? --> [DAVELOOP:BLOCKED] (documents what it tried)
+```
+
+The key insight: even the best coding agents sometimes need multiple attempts for complex bugs. DaveLoop automates that retry loop with persistent context and structured reasoning, so each iteration builds on everything the agent already learned.
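The retry loop described in this README section can be sketched in a few lines. This is a minimal illustration, not DaveLoop's actual control flow: `run_agent` and `fake_agent` are hypothetical stand-ins for whatever invokes the coding agent, and only the two signal strings come from the package itself.

```python
from typing import Callable, Optional, Tuple

SIGNAL_RESOLVED = "[DAVELOOP:RESOLVED]"
SIGNAL_BLOCKED = "[DAVELOOP:BLOCKED]"


def retry_loop(run_agent: Callable[[str, bool], str], bug: str,
               max_iterations: Optional[int] = None) -> Tuple[str, int]:
    """Call run_agent until its output contains a terminal signal.

    run_agent(prompt, continue_session) is a hypothetical callable that
    returns the agent's text output for one iteration.
    """
    iteration = 0
    while max_iterations is None or iteration < max_iterations:
        iteration += 1
        # The first call starts a session; later calls continue it.
        output = run_agent(bug, iteration > 1)
        if SIGNAL_RESOLVED in output:
            return "RESOLVED", iteration
        if SIGNAL_BLOCKED in output:
            return "BLOCKED", iteration
    return "MAX_ITERATIONS", iteration


def fake_agent(prompt: str, continue_session: bool) -> str:
    # Toy agent: fails on the first call, resolves on the continued one.
    return SIGNAL_RESOLVED if continue_session else "still failing"


result = retry_loop(fake_agent, "ZeroDivisionError when total=0", max_iterations=5)
print(result)  # ('RESOLVED', 2)
```

Passing `max_iterations=None` mirrors the "unlimited until resolved" default mentioned in the README, so a real caller would probably want an explicit cap.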
+
+---
+
+## The 4-Level Reasoning Protocol
+
+
+
+Every iteration, DaveLoop forces the agent to state:
+
+| Level | Purpose |
+|-------|---------|
+| **KNOWN** | What facts have been established so far |
+| **UNKNOWN** | What gaps remain |
+| **HYPOTHESIS** | A testable guess about the root cause |
+| **NEXT ACTION** | The concrete step to test that hypothesis |
+
+This prevents random shotgun debugging. Each iteration's KNOWN section grows. The UNKNOWN list shrinks or shifts focus. You can watch the agent systematically close in on the bug in real time.
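One way to picture the four levels is as a small record that each iteration fills in. The sketch below is an assumption for illustration only: the `ReasoningBlock` class and its `render` method are hypothetical names, and the README does not say how DaveLoop represents the protocol internally. The example values are taken from the tqdm case described earlier.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReasoningBlock:
    """Hypothetical container for one iteration's reasoning state."""
    known: List[str] = field(default_factory=list)
    unknown: List[str] = field(default_factory=list)
    hypothesis: str = ""
    next_action: str = ""

    def render(self) -> str:
        # One labelled line per level, in the order the protocol lists them.
        return "\n".join([
            "KNOWN: " + "; ".join(self.known),
            "UNKNOWN: " + "; ".join(self.unknown),
            "HYPOTHESIS: " + self.hypothesis,
            "NEXT ACTION: " + self.next_action,
        ])


block = ReasoningBlock(
    known=["ZeroDivisionError is raised when total=0"],
    unknown=["which division uses total as the denominator"],
    hypothesis="`if total:` treats 0 the same as None",
    next_action="trace every division that involves total",
)
print(block.render())
```

Keeping `known` and `unknown` as lists makes it easy to see the first grow and the second shrink from one iteration to the next.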
+
+---
+
+## Install
+
+```bash
+pip install daveloop
+```
+
+Zero external dependencies. Python 3.7+ and an LLM CLI agent.
+
+---
+
+## Use It Straight From Chat
+
+After installing, paste the following block into your global agent instructions file (the markdown file your coding agent reads for project instructions). This lets you trigger DaveLoop by just talking naturally in chat -- no terminal switching needed.
+
+<details>
+<summary><b>Click to expand -- copy this into your global agent instructions file</b></summary>
+
+```markdown
+## DaveLoop - Self-Healing Debug Agent
+
+DaveLoop is a Python wrapper that runs the coding agent in a loop until bugs are resolved. It keeps iterating until the issue is fixed.
+
+### Basic Commands
+
+\`\`\`bash
+# Basic usage - describe the bug directly
+daveloop "your bug description here"
+
+# Specify working directory
+daveloop -d /path/to/project "bug description"
+
+# Set max iterations (default is unlimited until resolved)
+daveloop -m 10 "bug description"
+
+# Read bug description from a file
+daveloop -f error.txt
+
+# Verbose output for debugging
+daveloop -v "bug description"
+\`\`\`
+
+### Options
+
+| Flag | Description |
+|------|-------------|
+| `-h, --help` | Show help message |
+| `-f, --file FILE` | Read bug description from file |
+| `-d, --dir DIR` | Working directory |
+| `-m, --max-iterations N` | Maximum iterations before stopping |
+| `-v, --verbose` | Enable verbose output |
+
+### How It Works
+
+1. DaveLoop sends your bug description to the coding agent
+2. The agent analyzes, hypothesizes, and attempts fixes
+3. Runs verification (build/tests)
+4. If not resolved, DaveLoop loops back with updated context
+5. Continues until `[DAVELOOP:RESOLVED]` or max iterations reached
+
+### Giving DaveLoop a Command via Chat
+
+When the user says "daveloop this" or "run daveloop" with a task, run:
+
+\`\`\`bash
+daveloop "the bug description here"
+
+# Or with a specific project directory:
+daveloop -d /path/to/project "the bug description"
+\`\`\`
+```
+
+</details>
+
+Once that's in your agent instructions file, just say:
+
+```
+daveloop this: "mongodb connection error in the lookup node"
+```
+
+or
+
+```
+run daveloop on the jwt validation bug
+```
+
+The agent picks it up and runs DaveLoop automatically. No special syntax.
+
+---
+
+## Usage
+
+### Give it a bug
+
+```bash
+daveloop "routes/order.ts has a race condition on wallet balance. two concurrent orders overdraw the account"
+```
+
+### Give it a file
+
+```bash
+daveloop --file bug-report.txt
+```
+
+### Queue multiple bugs
+
+```bash
+daveloop "fix the login crash" "fix payment validation" "add dark mode toggle"
+```
+
+### Point it at a project
+
+```bash
+daveloop --dir /path/to/project "the bug description"
+```
+
+### Limit iterations
+
+```bash
+daveloop --max-iterations 10 "fix the bug"
+```
+
+### Mobile testing mode (Maestro)
+
+```bash
+daveloop --maestro "write UI tests for the onboarding flow"
+```
+
+### Web testing mode (Playwright)
+
+```bash
+daveloop --web "test the checkout flow end to end"
+```
+
+---
+
+## Interactive Controls
+
+DaveLoop doesn't lock you out. While it's running, type:
+
+| Command | What happens |
+|---------|-------------|
+| `wait` / `pause` | Stops the current iteration. You type a correction. It resumes with your new context. |
+| `add` | Queue a new bug without stopping the current one |
+| `done` | Graceful exit, saves history |
+| `Ctrl+C` | Kill it |
+
+The `wait` command is the standout feature. When you see the agent going down the wrong path, type `wait`, tell it what you know, and it course-corrects with full context preserved.
+
+---
+
+## Three Modes
+
+### Standard Mode (default)
+Classic iterative debugging. Reads code, makes hypotheses, applies fixes, verifies.
+
+### Maestro Mode (`--maestro`)
+Autonomous mobile UI testing. Auto-detects devices/emulators, installs APKs, explores UI hierarchy, writes Maestro YAML test flows, and verifies with 3 consecutive passes.
+
+### Web Mode (`--web`)
+Playwright-based web testing. Detects your framework (React, Next.js, Vue, etc.), installs Playwright, starts dev servers, and tests with human-like interactions -- real mouse movements, drags, hovers, not just DOM manipulation.
+
+---
+
+## Session Memory
+
+DaveLoop remembers. It saves a history of past sessions in `.daveloop_history.json` and loads that context into future runs. If it fixed a similar bug before, it knows.
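A minimal sketch of how such a history file could be read and written. The file name and the `write_text(json.dumps(..., indent=2), encoding="utf-8")` call match what the `daveloop.py` diff shows for `save_history`; the `load_history` helper and the `{"sessions": [...]}` layout are assumptions made for this example, not the package's confirmed schema.

```python
import json
import tempfile
from pathlib import Path

HISTORY_FILE = ".daveloop_history.json"


def load_history(working_dir: str) -> dict:
    """Return saved history, or an empty one if the file is missing or corrupt."""
    path = Path(working_dir) / HISTORY_FILE
    if not path.exists():
        return {"sessions": []}  # assumed layout
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return {"sessions": []}


def save_history(working_dir: str, history_data: dict) -> None:
    """Write history as indented JSON in the working directory."""
    path = Path(working_dir) / HISTORY_FILE
    path.write_text(json.dumps(history_data, indent=2), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    history = load_history(tmp)  # nothing saved yet
    history["sessions"].append(
        {"bug": "tqdm ZeroDivisionError", "outcome": "RESOLVED", "iterations": 1}
    )
    save_history(tmp, history)
    reloaded = load_history(tmp)

print(len(reloaded["sessions"]))  # 1
```

Falling back to an empty history on a corrupt file is a design choice of this sketch; it keeps a damaged history from blocking a new run.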
+
+---
+
+## SWE-bench Integration
+
+Test DaveLoop against real-world bugs from open source projects:
+
+```bash
+daveloop-swebench --file django_hash_task.json --max-iterations 15
+```
+
+Pre-configured tasks from Django, Pytest, SymPy, and Sklearn included.
+
+---
+
+## Battle-Tested On
+
+| Domain | What it solved |
+|--------|---------------|
+| **Security** | Juice-Shop race conditions, NoSQL injection, ReDoS, path traversal |
+| **Backend** | Django ORM bugs, session handling crashes |
+| **Workflow Automation** | n8n webhook failures, MongoDB connection errors, multi-node data flow bugs |
+| **Testing Frameworks** | Pytest AST rewriting issues, Material-UI flaky visual regression tests |
+| **Libraries** | tqdm ZeroDivisionError, SymPy C-code generation |
+| **Mobile** | Android Maestro test suite generation from scratch |
+| **Self** | Added features to its own codebase autonomously |
+
+---
+
+## Writing Good Bug Descriptions
+
+More context = fewer iterations.
+
+**Vague (works, but slow):**
+```bash
+daveloop "fix the bug"
+```
+
+**Specific (fast):**
+```bash
+daveloop "RACE CONDITION: routes/order.ts lines 139-148. Balance check at 141 before decrement at 142. Two concurrent $100 orders both pass and overdraw to -$100. Need atomic check+decrement."
+```
+
+Include: bug type, file location, reproduction steps, root cause if known, fix direction if you have one.
+
+---
+
+## Logs
+
+Every session is fully logged:
+
+```
+logs/
+  20260131_142120_iteration_01.log   <- what it tried
+  20260131_142120_iteration_02.log   <- what it tried next
+  20260131_142120_summary.md         <- overview
+```
+
+Every reasoning block, every file read, every edit, every command. Full audit trail.
+
+---
+
+## Why DaveLoop Exists
+
+Some bugs don't fall in one shot. Race conditions. Multi-file refactors. Subtle logic errors buried in nested data structures. Production workflows with 12 interconnected nodes.
+
+DaveLoop wraps your coding agent in a persistence layer -- structured reasoning, iterative context, session memory, and the stubbornness to keep going until the job is done or honestly admit it's stuck.
+
+It started as a debug loop. It turned out to be something more.
+
+---
+
+## License
+
+MIT
+
+---
+
+**Built by [Dave Bruzil](https://github.com/davebruzil)**
+
+```bash
+pip install daveloop
+```
@@ -0,0 +1,7 @@
+daveloop.py,sha256=ge03G7KAmYwCZllCFztGlfpuwbhzo70Wdaki0loSaO0,66852
+daveloop_swebench.py,sha256=iD9AU3XRiMQpt7TknFNlvnmPCNp64V-JaTfqTFgsGBM,15996
+daveloop-1.5.0.dist-info/METADATA,sha256=FyXtTgGZmdn8jooM6q6MWEhD5Rs0Mhk65jVOj7iYIAk,12842
+daveloop-1.5.0.dist-info/WHEEL,sha256=hPN0AlP2dZM_3ZJZWP4WooepkmU9wzjGgCLCeFjkHLA,92
+daveloop-1.5.0.dist-info/entry_points.txt,sha256=QcFAZgFrDfPtIikNQb7eW9DxOpBK7T-qWrKqbGAS9Ww,86
+daveloop-1.5.0.dist-info/top_level.txt,sha256=36DiYt70m4DIK8t7IhV_y6hAzUIyeb5-qDUf3-gbDdg,27
+daveloop-1.5.0.dist-info/RECORD,,
daveloop.py
CHANGED
@@ -29,6 +29,12 @@ SIGNAL_RESOLVED = "[DAVELOOP:RESOLVED]"
 SIGNAL_BLOCKED = "[DAVELOOP:BLOCKED]"
 SIGNAL_CLARIFY = "[DAVELOOP:CLARIFY]"
 
+# Allowed tools for Claude Code CLI
+# Default: no Task tool (prevents recursive sub-agent spawning)
+ALLOWED_TOOLS_DEFAULT = "Bash,Read,Write,Edit,Glob,Grep"
+# Swarm mode: Task tool enabled for controlled sub-agent spawning
+ALLOWED_TOOLS_SWARM = "Bash,Read,Write,Edit,Glob,Grep,Task"
+
 # ============================================================================
 # ANSI Color Codes
 # ============================================================================
@@ -268,6 +274,128 @@ class TaskQueue:
         print()
 
 
+# ============================================================================
+# Swarm Budget
+# ============================================================================
+class SwarmBudget:
+    """Tracks and enforces sub-agent spawn budget for swarm mode."""
+
+    def __init__(self, max_spawns: int = 5, max_depth: int = 1):
+        self.max_spawns = max_spawns
+        self.max_depth = max_depth
+        self.spawn_count = 0
+        self.active_agents = 0
+        self.completed_agents = 0
+
+    def can_spawn(self) -> bool:
+        """Check if spawning another sub-agent is within budget."""
+        return self.spawn_count < self.max_spawns
+
+    def record_spawn(self, description: str):
+        """Record a sub-agent spawn."""
+        self.spawn_count += 1
+        self.active_agents += 1
+        print(f" {C.BRIGHT_CYAN}[Swarm]{C.RESET} Sub-agent {self.spawn_count}/{self.max_spawns}: {description}")
+
+    def record_completion(self):
+        """Record a sub-agent completion."""
+        self.active_agents -= 1
+        self.completed_agents += 1
+
+    def budget_exhausted_message(self) -> str:
+        """Return message when budget is exhausted."""
+        return (
+            f"Sub-agent budget exhausted ({self.spawn_count}/{self.max_spawns}). "
+            f"Complete remaining work directly without spawning more sub-agents."
+        )
+
+    def summary(self) -> dict:
+        """Return budget tracking summary."""
+        return {
+            "total_spawned": self.spawn_count,
+            "completed": self.completed_agents,
+            "budget": self.max_spawns,
+        }
+
+
+# ============================================================================
+# Token Tracker
+# ============================================================================
+class TokenTracker:
+    """Tracks token usage across API turns in a DaveLoop session."""
+
+    def __init__(self):
+        self.total_input = 0
+        self.total_output = 0
+        self.turn_count = 0
+        self.peak_input = 0
+        self.peak_output = 0
+        self.peak_total = 0
+        self.per_tool = {}  # tool_name -> {"input": int, "output": int, "count": int}
+        self._current_tool = None  # Track which tool is active for per-tool attribution
+        self._turn_input = 0  # Accumulate within a turn for per-tool attribution
+        self._turn_output = 0
+
+    def set_current_tool(self, tool_name: str):
+        """Set the currently active tool for per-tool token attribution."""
+        self._current_tool = tool_name
+
+    def record_usage(self, input_tokens: int, output_tokens: int):
+        """Record token usage from an API turn."""
+        self.total_input += input_tokens
+        self.total_output += output_tokens
+        self.turn_count += 1
+
+        turn_total = input_tokens + output_tokens
+        if turn_total > self.peak_total:
+            self.peak_total = turn_total
+            self.peak_input = input_tokens
+            self.peak_output = output_tokens
+
+        # Attribute to current tool if one is active
+        if self._current_tool:
+            if self._current_tool not in self.per_tool:
+                self.per_tool[self._current_tool] = {"input": 0, "output": 0, "count": 0}
+            self.per_tool[self._current_tool]["input"] += input_tokens
+            self.per_tool[self._current_tool]["output"] += output_tokens
+            self.per_tool[self._current_tool]["count"] += 1
+
+    @property
+    def total_tokens(self) -> int:
+        return self.total_input + self.total_output
+
+    def summary(self) -> dict:
+        """Return a dict with all token stats."""
+        return {
+            "input_tokens": self.total_input,
+            "output_tokens": self.total_output,
+            "total_tokens": self.total_tokens,
+            "turn_count": self.turn_count,
+            "peak_turn": {
+                "input": self.peak_input,
+                "output": self.peak_output,
+                "total": self.peak_total,
+            },
+            "per_tool": dict(self.per_tool),
+        }
+
+    def summary_line(self) -> str:
+        """Return a one-line summary string for display."""
+        return (
+            f"Tokens: {self.total_input:,} in / {self.total_output:,} out / "
+            f"{self.total_tokens:,} total ({self.turn_count} turns)"
+        )
+
+    def verbose_turn_line(self, input_tokens: int, output_tokens: int) -> str:
+        """Return a per-turn detail line for --show-tokens mode."""
+        total = input_tokens + output_tokens
+        tool_info = f" [{self._current_tool}]" if self._current_tool else ""
+        return (
+            f" Turn {self.turn_count}: {input_tokens:,} in / {output_tokens:,} out / "
+            f"{total:,} total{tool_info}"
+        )
+
+
 # ============================================================================
 # Session Memory
 # ============================================================================
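The budget check added in this hunk is a plain counter comparison. The sketch below is a condensed restatement for illustration only: it keeps `max_spawns`, `spawn_count`, `can_spawn` and `record_spawn` from the diff, drops the colored `print`, and omits `max_depth`, which the hunks shown here store but do not check. The task descriptions are made up.

```python
class SwarmBudget:
    """Condensed restatement of the spawn counter from the diff (no printing)."""

    def __init__(self, max_spawns: int = 5):
        self.max_spawns = max_spawns
        self.spawn_count = 0

    def can_spawn(self) -> bool:
        # A spawn is allowed only while the counter is below the cap.
        return self.spawn_count < self.max_spawns

    def record_spawn(self) -> None:
        self.spawn_count += 1


budget = SwarmBudget(max_spawns=2)
decisions = []
for description in ["scan routes", "scan models", "scan tests"]:  # made-up tasks
    if budget.can_spawn():
        budget.record_spawn()
        decisions.append((description, "spawned"))
    else:
        decisions.append((description, "refused"))

print(decisions)
# [('scan routes', 'spawned'), ('scan models', 'spawned'), ('scan tests', 'refused')]
```

Because the counter never decreases, the budget here is a lifetime cap per `SwarmBudget` instance and not a limit on concurrently active sub-agents.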
@@ -294,16 +422,21 @@ def save_history(working_dir: str, history_data: dict):
     history_file.write_text(json.dumps(history_data, indent=2), encoding="utf-8")
 
 
-def summarize_session(bug: str, outcome: str, iterations: int) -> dict:
+def summarize_session(bug: str, outcome: str, iterations: int, token_tracker: "TokenTracker" = None) -> dict:
     """Return a dict summarizing a session."""
     now = datetime.now()
-    return {
+    entry = {
         "session_id": now.strftime("%Y%m%d_%H%M%S"),
         "bug": bug,
         "outcome": outcome,
         "iterations": iterations,
         "timestamp": now.isoformat()
     }
+    if token_tracker and token_tracker.turn_count > 0:
+        entry["tokens_in"] = token_tracker.total_input
+        entry["tokens_out"] = token_tracker.total_output
+        entry["tokens_total"] = token_tracker.total_tokens
+    return entry
 
 
 def format_history_context(sessions: list) -> str:
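The token fields are only written when a tracker saw at least one turn, so history entries saved by 1.4.0 and by 1.5.0 can coexist in one file. A small sketch of that behavior follows. `summarize_session` is restated from the hunk above; the `SimpleNamespace` stand-in for `TokenTracker` and the `token_suffix` helper are assumptions for the example, the latter modeled on the `token_str` expression in `print_history_box`.

```python
from datetime import datetime
from types import SimpleNamespace


def summarize_session(bug, outcome, iterations, token_tracker=None):
    """Restated from the diff: token keys are added only when usage was recorded."""
    now = datetime.now()
    entry = {
        "session_id": now.strftime("%Y%m%d_%H%M%S"),
        "bug": bug,
        "outcome": outcome,
        "iterations": iterations,
        "timestamp": now.isoformat(),
    }
    if token_tracker and token_tracker.turn_count > 0:
        entry["tokens_in"] = token_tracker.total_input
        entry["tokens_out"] = token_tracker.total_output
        entry["tokens_total"] = token_tracker.total_tokens
    return entry


def token_suffix(entry: dict) -> str:
    # Hypothetical helper: entries without token data produce no suffix.
    tokens_total = entry.get("tokens_total")
    return f" · {tokens_total:,} tok" if tokens_total else ""


# Stand-in object exposing only the attributes summarize_session reads.
tracker = SimpleNamespace(turn_count=3, total_input=1200, total_output=300, total_tokens=1500)

old_style = summarize_session("login crash", "RESOLVED", 2)
new_style = summarize_session("login crash", "RESOLVED", 2, tracker)

print(repr(token_suffix(old_style)))  # ''
print(repr(token_suffix(new_style)))  # ' · 1,500 tok'
```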
@@ -329,10 +462,12 @@ def print_history_box(sessions: list):
         outcome = s.get("outcome", "UNKNOWN")
         bug = s.get("bug", "unknown")[:55]
         iters = s.get("iterations", "?")
+        tokens_total = s.get("tokens_total")
+        token_str = f" · {tokens_total:,} tok" if tokens_total else ""
         if outcome == "RESOLVED":
-            print(f" {C.BRIGHT_GREEN}✓{C.RESET} {C.WHITE}{bug}{C.RESET} {C.DIM}({iters} iter){C.RESET}")
+            print(f" {C.BRIGHT_GREEN}✓{C.RESET} {C.WHITE}{bug}{C.RESET} {C.DIM}({iters} iter{token_str}){C.RESET}")
         else:
-            print(f" {C.BRIGHT_RED}✗{C.RESET} {C.WHITE}{bug}{C.RESET} {C.DIM}({iters} iter){C.RESET}")
+            print(f" {C.BRIGHT_RED}✗{C.RESET} {C.WHITE}{bug}{C.RESET} {C.DIM}({iters} iter{token_str}){C.RESET}")
     print()
 
 
@@ -426,7 +561,7 @@ class InputMonitor:
     Call resume_reading() after the main thread is done with input().
     """
 
-    VALID_COMMANDS = ("wait", "pause", "add", "done")
+    VALID_COMMANDS = ("wait", "pause", "add", "done", "stop")
 
     def __init__(self):
         self._command = None
@@ -554,12 +689,17 @@ def find_claude_cli():
     return None
 
 
-def run_claude_code(prompt: str, working_dir: str = None, continue_session: bool = False, stream: bool = True, timeout: int = DEFAULT_TIMEOUT, input_monitor=None) -> str:
+def run_claude_code(prompt: str, working_dir: str = None, continue_session: bool = False, stream: bool = True, timeout: int = DEFAULT_TIMEOUT, input_monitor=None, swarm_mode: bool = False, swarm_budget_max: int = 5, swarm_depth_max: int = 1, token_tracker: "TokenTracker" = None, show_tokens: bool = False) -> str:
     """Execute Claude Code CLI with the given prompt.
 
     If stream=True, output is printed in real-time and also returned.
     timeout is in seconds (default 600 = 10 minutes).
     input_monitor: optional InputMonitor to check for user commands during execution.
+    swarm_mode: if True, enables Task tool for sub-agent spawning.
+    swarm_budget_max: max sub-agents per session in swarm mode.
+    swarm_depth_max: max sub-agent depth in swarm mode.
+    token_tracker: optional TokenTracker to accumulate token usage from the stream.
+    show_tokens: if True, print per-turn token usage during execution.
     """
     claude_cmd = find_claude_cli()
     if not claude_cmd:
@@ -578,7 +718,8 @@ def run_claude_code(prompt: str, working_dir: str = None, continue_session: bool
     if continue_session:
         cmd.append("--continue")
 
-
+    allowed = ALLOWED_TOOLS_SWARM if swarm_mode else ALLOWED_TOOLS_DEFAULT
+    cmd.extend(["-p", "--verbose", "--output-format", "stream-json", "--allowedTools", allowed])
 
     try:
         if stream:
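The effect of this hunk is that the tool allow-list now depends on `swarm_mode`. A minimal sketch of the resulting argument list follows. The two constants and the flag sequence are copied from the diff; the `build_command` helper, the way `cmd` is initialized, and the literal `"claude"` executable name are assumptions, since the diff does not show those lines.

```python
from typing import List

ALLOWED_TOOLS_DEFAULT = "Bash,Read,Write,Edit,Glob,Grep"
ALLOWED_TOOLS_SWARM = "Bash,Read,Write,Edit,Glob,Grep,Task"


def build_command(claude_cmd: str, continue_session: bool, swarm_mode: bool) -> List[str]:
    """Assemble CLI arguments; only swarm mode adds the Task tool."""
    cmd = [claude_cmd]  # assumed starting point; not shown in the diff
    if continue_session:
        cmd.append("--continue")
    allowed = ALLOWED_TOOLS_SWARM if swarm_mode else ALLOWED_TOOLS_DEFAULT
    cmd.extend(["-p", "--verbose", "--output-format", "stream-json",
                "--allowedTools", allowed])
    return cmd


default_cmd = build_command("claude", continue_session=False, swarm_mode=False)
swarm_cmd = build_command("claude", continue_session=True, swarm_mode=True)

print(default_cmd[-1])  # Bash,Read,Write,Edit,Glob,Grep
print(swarm_cmd[-1])    # Bash,Read,Write,Edit,Glob,Grep,Task
```

Passing the arguments as a list, with no shell string, avoids quoting problems with the comma-separated tool names.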
@@ -602,6 +743,9 @@ def run_claude_code(prompt: str, working_dir: str = None, continue_session: bool
             # Track start time
             start_time = time.time()
 
+            # Swarm budget tracking (only active in swarm mode)
+            swarm_budget = SwarmBudget(max_spawns=swarm_budget_max, max_depth=swarm_depth_max) if swarm_mode else None
+
             # Read and display JSON stream output
             output_lines = []
             full_text = []
@@ -617,6 +761,19 @@ def run_claude_code(prompt: str, working_dir: str = None, continue_session: bool
                     msg_type = data.get("type", "")
 
 
+                    # Extract token usage from any message that has it
+                    if token_tracker:
+                        usage = (data.get("message", {}).get("usage")
+                                 or data.get("usage")
+                                 or None)
+                        if usage and isinstance(usage, dict):
+                            inp = usage.get("input_tokens", 0)
+                            outp = usage.get("output_tokens", 0)
+                            if inp or outp:
+                                token_tracker.record_usage(inp, outp)
+                                if show_tokens:
+                                    print(f" {C.DIM}{token_tracker.verbose_turn_line(inp, outp)}{C.RESET}")
+
                     # Handle different message types
                     if msg_type == "assistant":
                         # Assistant text message
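The usage lookup in this hunk tries `message.usage` first and falls back to a top-level `usage` key. The sketch below isolates that lookup so it can be run on its own. The `extract_usage` helper is a hypothetical name, and the sample lines are invented to exercise each branch; they are not captured Claude Code output. Unlike the diff, the sketch also tolerates a `message` value that is not a dict.

```python
import json
from typing import Optional, Tuple


def extract_usage(line: str) -> Optional[Tuple[int, int]]:
    """Return (input_tokens, output_tokens) from one JSON line, or None."""
    try:
        data = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    message = data.get("message")
    # Prefer usage nested under "message", then fall back to the top level.
    nested = message.get("usage") if isinstance(message, dict) else None
    usage = nested or data.get("usage")
    if not isinstance(usage, dict):
        return None
    inp = usage.get("input_tokens", 0)
    outp = usage.get("output_tokens", 0)
    return (inp, outp) if (inp or outp) else None


# Invented sample events, shaped only to hit each branch of the lookup.
sample_lines = [
    '{"type": "assistant", "message": {"usage": {"input_tokens": 120, "output_tokens": 30}}}',
    '{"type": "result", "usage": {"input_tokens": 10, "output_tokens": 5}}',
    '{"type": "system"}',
    'not json',
]

totals = [0, 0]
for line in sample_lines:
    usage = extract_usage(line)
    if usage:
        totals[0] += usage[0]
        totals[1] += usage[1]

print(totals)  # [130, 35]
```

Every event that carries usage is counted once. Whether the real stream repeats the same usage across event types is not established by the diff, so the totals may be worth checking against a known run.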
@@ -631,6 +788,8 @@ def run_claude_code(prompt: str, working_dir: str = None, continue_session: bool
|
|
|
631
788
|
elif block.get("type") == "tool_use":
|
|
632
789
|
# Tool being called - show what Claude is doing
|
|
633
790
|
tool_name = block.get("name", "unknown")
|
|
791
|
+
if token_tracker:
|
|
792
|
+
token_tracker.set_current_tool(tool_name)
|
|
634
793
|
tool_input = block.get("input", {})
|
|
635
794
|
|
|
636
795
|
# Format tool call based on type
|
|
@@ -660,6 +819,18 @@ def run_claude_code(prompt: str, working_dir: str = None, continue_session: bool
|
|
|
660
819
|
elif tool_name == "Task":
|
|
661
820
|
desc = tool_input.get("description", "")
|
|
662
821
|
tool_display = f"{C.BRIGHT_BLUE}Task{C.RESET}({C.WHITE}{desc}{C.RESET})"
|
|
822
|
+
# Swarm budget enforcement
|
|
823
|
+
if swarm_budget:
|
|
824
|
+
if not swarm_budget.can_spawn():
|
|
825
|
+
print(f" {C.BRIGHT_YELLOW}[Swarm] Budget exhausted. Terminating to restart without Task tool.{C.RESET}")
|
|
826
|
+
process.terminate()
|
|
827
|
+
try:
|
|
828
|
+
process.wait(timeout=10)
|
|
829
|
+
except Exception:
|
|
830
|
+
process.kill()
|
|
831
|
+
return '\n'.join(full_text) + "\n[DAVELOOP:SWARM_BUDGET_EXHAUSTED]"
|
|
832
|
+
else:
|
|
833
|
+
swarm_budget.record_spawn(desc)
|
|
663
834
|
else:
|
|
664
835
|
tool_display = f"{C.BRIGHT_BLUE}{tool_name}{C.RESET}"
|
|
665
836
|
|
|
@@ -677,6 +848,8 @@ def run_claude_code(prompt: str, working_dir: str = None, continue_session: bool
|
|
|
677
848
|
elif msg_type == "tool_use":
|
|
678
849
|
# Tool being used - show what Claude is doing
|
|
679
850
|
tool_name = data.get("name", "unknown")
|
|
851
|
+
if token_tracker:
|
|
852
|
+
token_tracker.set_current_tool(tool_name)
|
|
680
853
|
tool_input = data.get("input", {})
|
|
681
854
|
|
|
682
855
|
# Format tool call based on type
|
|
@@ -706,6 +879,18 @@ def run_claude_code(prompt: str, working_dir: str = None, continue_session: bool
|
|
|
706
879
|
elif tool_name == "Task":
|
|
707
880
|
desc = tool_input.get("description", "")
|
|
708
881
|
tool_display = f"{C.BRIGHT_BLUE}Task{C.RESET}({C.WHITE}{desc}{C.RESET})"
|
|
882
|
+
# Swarm budget enforcement
|
|
883
|
+
if swarm_budget:
|
|
884
|
+
if not swarm_budget.can_spawn():
|
|
885
|
+
print(f" {C.BRIGHT_YELLOW}[Swarm] Budget exhausted. Terminating to restart without Task tool.{C.RESET}")
|
|
886
|
+
process.terminate()
|
|
887
|
+
try:
|
|
888
|
+
process.wait(timeout=10)
|
|
889
|
+
except Exception:
|
|
890
|
+
process.kill()
|
|
891
|
+
return '\n'.join(full_text) + "\n[DAVELOOP:SWARM_BUDGET_EXHAUSTED]"
|
|
892
|
+
else:
|
|
893
|
+
swarm_budget.record_spawn(desc)
|
|
709
894
|
else:
|
|
710
895
|
tool_display = f"{C.BRIGHT_BLUE}{tool_name}{C.RESET}"
|
|
711
896
|
|
|
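Note on the two hunks above: the same budget gate is added to both the assistant-block branch and the top-level `tool_use` branch. A minimal sketch of that gate as a single function, assuming a stand-in budget object. `SpawnBudget` and `on_task_call` are hypothetical names and are not part of daveloop.py; only `can_spawn()`, `record_spawn()` and the sentinel string come from the diff.

```python
from typing import List, Optional

# Marker string copied from the diff.
SENTINEL = "[DAVELOOP:SWARM_BUDGET_EXHAUSTED]"


class SpawnBudget:
    """Hypothetical stand-in for the swarm_budget object used in the diff."""

    def __init__(self, max_spawns: int) -> None:
        self.max_spawns = max_spawns
        self.spawned: List[str] = []

    def can_spawn(self) -> bool:
        # True while fewer sub-agents were recorded than the budget allows.
        return len(self.spawned) < self.max_spawns

    def record_spawn(self, description: str) -> None:
        self.spawned.append(description)


def on_task_call(budget: Optional[SpawnBudget], desc: str,
                 full_text: List[str]) -> Optional[str]:
    """Return sentinel-terminated output when the budget is spent, else None."""
    if budget is None:
        return None  # swarm mode off: nothing to enforce
    if not budget.can_spawn():
        # The real code terminates the Claude subprocess before returning.
        return "\n".join(full_text) + "\n" + SENTINEL
    budget.record_spawn(desc)
    return None
```

A shared helper along these lines might remove the duplication between the two branches; whether that fits the rest of `run_claude_code` cannot be judged from this diff alone.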
@@ -859,6 +1044,14 @@ def main():
|
|
|
859
1044
|
parser.add_argument("-v", "--verbose", action="store_true", help="Verbose output")
|
|
860
1045
|
parser.add_argument("--maestro", action="store_true", help="Enable Maestro mobile testing mode")
|
|
861
1046
|
parser.add_argument("--web", action="store_true", help="Enable Playwright web UI testing mode")
|
|
1047
|
+
parser.add_argument("--swarm", action="store_true",
|
|
1048
|
+
help="Enable swarm mode: DaveLoop can spawn sub-agents via Task tool")
|
|
1049
|
+
parser.add_argument("--swarm-budget", type=int, default=5,
|
|
1050
|
+
help="Max sub-agents per DaveLoop worker in swarm mode (default: 5)")
|
|
1051
|
+
parser.add_argument("--swarm-depth", type=int, default=1, choices=[1, 2],
|
|
1052
|
+
help="Max sub-agent depth in swarm mode (default: 1, no recursive spawning)")
|
|
1053
|
+
parser.add_argument("--show-tokens", action="store_true",
|
|
1054
|
+
help="Show verbose per-turn token usage during execution")
|
|
862
1055
|
|
|
863
1056
|
args = parser.parse_args()
|
|
864
1057
|
|
|
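Note on the hunk above: the four new flags can be exercised in isolation. A minimal sketch, assuming a standalone parser that mirrors only the added arguments. The `build_parser` helper is hypothetical and the help strings are shortened; flag names, types, defaults and choices are taken from the diff.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical helper: mirrors only the arguments added in 1.5.0.
    parser = argparse.ArgumentParser(prog="daveloop")
    parser.add_argument("--swarm", action="store_true",
                        help="Enable swarm mode")
    parser.add_argument("--swarm-budget", type=int, default=5,
                        help="Max sub-agents per worker (default: 5)")
    parser.add_argument("--swarm-depth", type=int, default=1, choices=[1, 2],
                        help="Max sub-agent depth (default: 1)")
    parser.add_argument("--show-tokens", action="store_true",
                        help="Show per-turn token usage")
    return parser


# No flags: swarm stays off and the numeric defaults apply.
defaults = build_parser().parse_args([])

# Explicit values: dashes in flag names become underscores on the namespace.
custom = build_parser().parse_args(
    ["--swarm", "--swarm-budget", "3", "--swarm-depth", "2"]
)
```

Because `--swarm-depth` declares `choices=[1, 2]`, any other value makes argparse print a usage error and exit instead of reaching the main loop.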
@@ -908,6 +1101,11 @@ def main():
|
|
|
908
1101
|
print_status("Tasks", str(len(bug_descriptions)), C.WHITE)
|
|
909
1102
|
mode_name = "Maestro Mobile Testing" if args.maestro else "Playwright Web Testing" if args.web else "Autonomous"
|
|
910
1103
|
print_status("Mode", mode_name, C.WHITE)
|
|
1104
|
+
if args.swarm:
|
|
1105
|
+
print_status("Swarm", f"ENABLED (budget: {args.swarm_budget}, depth: {args.swarm_depth})", C.BRIGHT_CYAN)
|
|
1106
|
+
print_status("Tools", ALLOWED_TOOLS_SWARM, C.WHITE)
|
|
1107
|
+
else:
|
|
1108
|
+
print_status("Tools", ALLOWED_TOOLS_DEFAULT, C.WHITE)
|
|
911
1109
|
print(f"{C.BRIGHT_BLUE}└{'─' * 70}┘{C.RESET}")
|
|
912
1110
|
|
|
913
1111
|
# Build task queue
|
|
@@ -917,7 +1115,7 @@ def main():
|
|
|
917
1115
|
|
|
918
1116
|
# Print controls hint
|
|
919
1117
|
print(f"\n{C.BRIGHT_BLUE}{C.BOLD}┌─ CONTROLS {'─' * 58}┐{C.RESET}")
|
|
920
|
-
print(f"{C.BRIGHT_BLUE}│{C.RESET} Type while running: {C.BRIGHT_WHITE}wait{C.RESET} {C.DIM}·{C.RESET} {C.BRIGHT_WHITE}pause{C.RESET} {C.DIM}·{C.RESET} {C.BRIGHT_WHITE}add{C.RESET} {C.DIM}·{C.RESET} {C.BRIGHT_WHITE}done{C.RESET}
|
|
1118
|
+
print(f"{C.BRIGHT_BLUE}│{C.RESET} Type while running: {C.BRIGHT_WHITE}wait{C.RESET} {C.DIM}·{C.RESET} {C.BRIGHT_WHITE}pause{C.RESET} {C.DIM}·{C.RESET} {C.BRIGHT_WHITE}add{C.RESET} {C.DIM}·{C.RESET} {C.BRIGHT_WHITE}done{C.RESET} {C.DIM}·{C.RESET} {C.BRIGHT_WHITE}stop{C.RESET} {C.BRIGHT_BLUE}│{C.RESET}")
|
|
921
1119
|
print(f"{C.BRIGHT_BLUE}└{'─' * 70}┘{C.RESET}")
|
|
922
1120
|
|
|
923
1121
|
# Start input monitor
|
|
@@ -929,6 +1127,9 @@ def main():
|
|
|
929
1127
|
if history_data["sessions"]:
|
|
930
1128
|
history_context = "\n\n" + format_history_context(history_data["sessions"])
|
|
931
1129
|
|
|
1130
|
+
# Session-wide token tracking (aggregates across all tasks)
|
|
1131
|
+
session_token_tracker = TokenTracker()
|
|
1132
|
+
|
|
932
1133
|
# === OUTER LOOP: iterate over tasks ===
|
|
933
1134
|
while True:
|
|
934
1135
|
task = task_queue.next()
|
|
@@ -1004,6 +1205,7 @@ Then fix it. Use the reasoning protocol before each action.
|
|
|
1004
1205
|
"""
|
|
1005
1206
|
|
|
1006
1207
|
iteration_history = []
|
|
1208
|
+
task_token_tracker = TokenTracker()
|
|
1007
1209
|
|
|
1008
1210
|
# === INNER LOOP: iterations for current task ===
|
|
1009
1211
|
for iteration in range(1, args.max_iterations + 1):
|
|
@@ -1027,11 +1229,20 @@ Then fix it. Use the reasoning protocol before each action.
|
|
|
1027
1229
|
full_prompt, working_dir,
|
|
1028
1230
|
continue_session=continue_session,
|
|
1029
1231
|
stream=True, timeout=args.timeout,
|
|
1030
|
-
input_monitor=input_monitor
|
|
1232
|
+
input_monitor=input_monitor,
|
|
1233
|
+
swarm_mode=args.swarm,
|
|
1234
|
+
swarm_budget_max=args.swarm_budget,
|
|
1235
|
+
swarm_depth_max=args.swarm_depth,
|
|
1236
|
+
token_tracker=task_token_tracker,
|
|
1237
|
+
show_tokens=args.show_tokens
|
|
1031
1238
|
)
|
|
1032
1239
|
|
|
1033
1240
|
print(f"\n{C.BRIGHT_BLUE} {'─' * 70}{C.RESET}")
|
|
1034
1241
|
|
|
1242
|
+
# Print token usage summary for this iteration
|
|
1243
|
+
if task_token_tracker.turn_count > 0:
|
|
1244
|
+
print(f" {C.BRIGHT_CYAN}⊛ {task_token_tracker.summary_line()}{C.RESET}")
|
|
1245
|
+
|
|
1035
1246
|
# Save log
|
|
1036
1247
|
save_log(iteration, output, session_id)
|
|
1037
1248
|
iteration_history.append(output)
|
|
@@ -1085,22 +1296,34 @@ Continue the current debugging task. Use the reasoning protocol before each acti
|
|
|
1085
1296
|
elif user_cmd == "done":
|
|
1086
1297
|
# Clean exit
|
|
1087
1298
|
input_monitor.stop()
|
|
1088
|
-
session_entry = summarize_session(bug_input, "DONE_BY_USER", iteration)
|
|
1299
|
+
session_entry = summarize_session(bug_input, "DONE_BY_USER", iteration, task_token_tracker)
|
|
1089
1300
|
history_data["sessions"].append(session_entry)
|
|
1090
1301
|
save_history(working_dir, history_data)
|
|
1091
1302
|
print(f"\n {C.GREEN}✓{C.RESET} Session saved. Exiting by user request.")
|
|
1092
1303
|
return 0
|
|
1093
1304
|
|
|
1305
|
+
elif user_cmd == "stop":
|
|
1306
|
+
# Boris-commanded stop - terminate this iteration immediately
|
|
1307
|
+
print(f"\n {C.BRIGHT_RED}{C.BOLD} ■ STOPPED BY BORIS{C.RESET}")
|
|
1308
|
+
print(f"{C.BRIGHT_RED} {'─' * 70}{C.RESET}")
|
|
1309
|
+
input_monitor.stop()
|
|
1310
|
+
session_entry = summarize_session(bug_input, "STOPPED_BY_BORIS", iteration, task_token_tracker)
|
|
1311
|
+
history_data["sessions"].append(session_entry)
|
|
1312
|
+
save_history(working_dir, history_data)
|
|
1313
|
+
return 1
|
|
1314
|
+
|
|
1094
1315
|
# Check exit condition
|
|
1095
1316
|
signal, should_exit = check_exit_condition(output)
|
|
1096
1317
|
|
|
1097
1318
|
if should_exit:
|
|
1098
1319
|
if signal == "RESOLVED":
|
|
1099
1320
|
print_success_box("")
|
|
1321
|
+
if task_token_tracker.turn_count > 0:
|
|
1322
|
+
print(f" {C.BRIGHT_CYAN}⊛ {task_token_tracker.summary_line()}{C.RESET}")
|
|
1100
1323
|
print(f" {C.DIM}Session: {session_id}{C.RESET}")
|
|
1101
1324
|
print(f" {C.DIM}Logs: {LOG_DIR}{C.RESET}\n")
|
|
1102
1325
|
task_queue.mark_done()
|
|
1103
|
-
session_entry = summarize_session(bug_input, "RESOLVED", iteration)
|
|
1326
|
+
session_entry = summarize_session(bug_input, "RESOLVED", iteration, task_token_tracker)
|
|
1104
1327
|
history_data["sessions"].append(session_entry)
|
|
1105
1328
|
save_history(working_dir, history_data)
|
|
1106
1329
|
break # Move to next task
|
|
@@ -1126,14 +1349,14 @@ Continue debugging with this information. Use the reasoning protocol before each
|
|
|
1126
1349
|
print_status("Logs", str(LOG_DIR), C.WHITE)
|
|
1127
1350
|
print()
|
|
1128
1351
|
task_queue.mark_failed()
|
|
1129
|
-
session_entry = summarize_session(bug_input, "BLOCKED", iteration)
|
|
1352
|
+
session_entry = summarize_session(bug_input, "BLOCKED", iteration, task_token_tracker)
|
|
1130
1353
|
history_data["sessions"].append(session_entry)
|
|
1131
1354
|
save_history(working_dir, history_data)
|
|
1132
1355
|
break # Move to next task
|
|
1133
1356
|
else:
|
|
1134
1357
|
print_error_box(f"Error occurred: {signal}")
|
|
1135
1358
|
task_queue.mark_failed()
|
|
1136
|
-
session_entry = summarize_session(bug_input, "ERROR", iteration)
|
|
1359
|
+
session_entry = summarize_session(bug_input, "ERROR", iteration, task_token_tracker)
|
|
1137
1360
|
history_data["sessions"].append(session_entry)
|
|
1138
1361
|
save_history(working_dir, history_data)
|
|
1139
1362
|
break # Move to next task
|
|
@@ -1173,15 +1396,33 @@ Use the reasoning protocol before each action.
|
|
|
1173
1396
|
# Max iterations reached for this task (for-else)
|
|
1174
1397
|
print_warning_box(f"Max iterations ({args.max_iterations}) reached for current task")
|
|
1175
1398
|
task_queue.mark_failed()
|
|
1176
|
-
session_entry = summarize_session(bug_input, "MAX_ITERATIONS", args.max_iterations)
|
|
1399
|
+
session_entry = summarize_session(bug_input, "MAX_ITERATIONS", args.max_iterations, task_token_tracker)
|
|
1177
1400
|
history_data["sessions"].append(session_entry)
|
|
1178
1401
|
save_history(working_dir, history_data)
|
|
1179
1402
|
|
|
1403
|
+
# Aggregate task tokens into session-level tracker
|
|
1404
|
+
if task_token_tracker.turn_count > 0:
|
|
1405
|
+
session_token_tracker.total_input += task_token_tracker.total_input
|
|
1406
|
+
session_token_tracker.total_output += task_token_tracker.total_output
|
|
1407
|
+
session_token_tracker.turn_count += task_token_tracker.turn_count
|
|
1408
|
+
if task_token_tracker.peak_total > session_token_tracker.peak_total:
|
|
1409
|
+
session_token_tracker.peak_total = task_token_tracker.peak_total
|
|
1410
|
+
session_token_tracker.peak_input = task_token_tracker.peak_input
|
|
1411
|
+
session_token_tracker.peak_output = task_token_tracker.peak_output
|
|
1412
|
+
for tool, stats in task_token_tracker.per_tool.items():
|
|
1413
|
+
if tool not in session_token_tracker.per_tool:
|
|
1414
|
+
session_token_tracker.per_tool[tool] = {"input": 0, "output": 0, "count": 0}
|
|
1415
|
+
session_token_tracker.per_tool[tool]["input"] += stats["input"]
|
|
1416
|
+
session_token_tracker.per_tool[tool]["output"] += stats["output"]
|
|
1417
|
+
session_token_tracker.per_tool[tool]["count"] += stats["count"]
|
|
1418
|
+
|
|
1180
1419
|
# Save iteration summary for this task
|
|
1181
1420
|
LOG_DIR.mkdir(exist_ok=True)
|
|
1182
1421
|
summary = f"# DaveLoop Session {session_id}\n\n"
|
|
1183
1422
|
summary += f"Bug: {bug_input[:200]}...\n\n"
|
|
1184
1423
|
summary += f"Iterations: {len(iteration_history)}\n\n"
|
|
1424
|
+
if task_token_tracker.turn_count > 0:
|
|
1425
|
+
summary += f"Token Usage: {task_token_tracker.summary_line()}\n\n"
|
|
1185
1426
|
summary += "## Iteration History\n\n"
|
|
1186
1427
|
for i, hist in enumerate(iteration_history, 1):
|
|
1187
1428
|
summary += f"### Iteration {i}\n```\n{hist[:500]}...\n```\n\n"
|
|
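Note on the hunk above: the session-level aggregation is written inline in `main()`. A minimal sketch of the same folding logic as a function, assuming a stand-in dataclass. `Totals` and `merge_into` are hypothetical names; the real `TokenTracker` is defined outside this part of the diff, and only the field names below are taken from it.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Totals:
    """Hypothetical stand-in with only the TokenTracker fields the diff touches."""
    total_input: int = 0
    total_output: int = 0
    turn_count: int = 0
    peak_total: int = 0
    peak_input: int = 0
    peak_output: int = 0
    per_tool: Dict[str, Dict[str, int]] = field(default_factory=dict)


def merge_into(session: Totals, task: Totals) -> None:
    """Fold one task's counters into the session-wide totals."""
    if task.turn_count == 0:
        return  # nothing was recorded for this task
    session.total_input += task.total_input
    session.total_output += task.total_output
    session.turn_count += task.turn_count
    if task.peak_total > session.peak_total:
        # Copy all three peak fields from the same tracker, as the diff does.
        session.peak_total = task.peak_total
        session.peak_input = task.peak_input
        session.peak_output = task.peak_output
    for tool, stats in task.per_tool.items():
        # Create a fresh bucket so the task's own dicts are never aliased.
        bucket = session.per_tool.setdefault(
            tool, {"input": 0, "output": 0, "count": 0}
        )
        for key in ("input", "output", "count"):
            bucket[key] += stats[key]
```

Sums accumulate across tasks, while the peak fields keep the single largest value seen, so the session peak is a maximum rather than a total.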
@@ -1203,6 +1444,16 @@ Use the reasoning protocol before each action.
|
|
|
1203
1444
|
print(f" {C.DIM}○ {desc}{C.RESET}")
|
|
1204
1445
|
print()
|
|
1205
1446
|
|
|
1447
|
+
# Print session-wide token usage
|
|
1448
|
+
if session_token_tracker.turn_count > 0:
|
|
1449
|
+
print(f" {C.BRIGHT_CYAN}⊛ {session_token_tracker.summary_line()}{C.RESET}")
|
|
1450
|
+
if session_token_tracker.per_tool:
|
|
1451
|
+
print(f" {C.DIM} Per tool:{C.RESET}")
|
|
1452
|
+
for tool, stats in sorted(session_token_tracker.per_tool.items(), key=lambda x: x[1]["input"] + x[1]["output"], reverse=True):
|
|
1453
|
+
tool_total = stats["input"] + stats["output"]
|
|
1454
|
+
print(f" {C.DIM} {tool}: {stats['input']:,} in / {stats['output']:,} out / {tool_total:,} total ({stats['count']} calls){C.RESET}")
|
|
1455
|
+
print()
|
|
1456
|
+
|
|
1206
1457
|
print(f" {C.DIM}Session: {session_id}{C.RESET}")
|
|
1207
1458
|
print(f" {C.DIM}Logs: {LOG_DIR}{C.RESET}\n")
|
|
1208
1459
|
|
|
daveloop-1.4.0.dist-info/METADATA
DELETED
|
@@ -1,391 +0,0 @@
|
|
|
1
|
-
Metadata-Version: 2.4
|
|
2
|
-
Name: daveloop
|
|
3
|
-
Version: 1.4.0
|
|
4
|
-
Summary: Self-healing debug agent powered by Claude Code CLI
|
|
5
|
-
Home-page: https://github.com/davebruzil/DaveLoop
|
|
6
|
-
Author: Dave Bruzil
|
|
7
|
-
Keywords: debugging ai claude automation agent
|
|
8
|
-
Classifier: Development Status :: 4 - Beta
|
|
9
|
-
Classifier: Intended Audience :: Developers
|
|
10
|
-
Classifier: License :: OSI Approved :: MIT License
|
|
11
|
-
Classifier: Programming Language :: Python :: 3
|
|
12
|
-
Classifier: Programming Language :: Python :: 3.7
|
|
13
|
-
Classifier: Programming Language :: Python :: 3.8
|
|
14
|
-
Classifier: Programming Language :: Python :: 3.9
|
|
15
|
-
Classifier: Programming Language :: Python :: 3.10
|
|
16
|
-
Classifier: Programming Language :: Python :: 3.11
|
|
17
|
-
Classifier: Programming Language :: Python :: 3.12
|
|
18
|
-
Classifier: Topic :: Software Development :: Debuggers
|
|
19
|
-
Classifier: Topic :: Software Development :: Quality Assurance
|
|
20
|
-
Requires-Python: >=3.7
|
|
21
|
-
Description-Content-Type: text/markdown
|
|
22
|
-
Dynamic: author
|
|
23
|
-
Dynamic: classifier
|
|
24
|
-
Dynamic: description
|
|
25
|
-
Dynamic: description-content-type
|
|
26
|
-
Dynamic: home-page
|
|
27
|
-
Dynamic: keywords
|
|
28
|
-
Dynamic: requires-python
|
|
29
|
-
Dynamic: summary
|
|
30
|
-
|
|
31
|
-
# DaveLoop
|
|
32
|
-
|
|
33
|
-

|
|
34
|
-
|
|
35
|
-
**DaveLoop** is a Claude CLI based debug agent designed to solve bugs in cases where Claude Code fails to one-shot them.
|
|
36
|
-
It feeds itself bugs iteratively until it resolves them or gets blocked. Each iteration builds on the previous one with
|
|
37
|
-
full context thanks to the `--continue` flag.
|
|
38
|
-
|
|
39
|
-
---
|
|
40
|
-
|
|
41
|
-
## How It Works
|
|
42
|
-
|
|
43
|
-
1. You give it a bug description
|
|
44
|
-
2. It analyzes, makes a hypothesis, and tries a fix
|
|
45
|
-
3. If not fixed, it loops again with new context
|
|
46
|
-
4. Keeps going until it outputs `[DAVELOOP:RESOLVED]`
|
|
47
|
-
|
|
48
|
-
---
|
|
49
|
-
|
|
50
|
-
## Why Use It
|
|
51
|
-
|
|
52
|
-
- Claude Code sometimes needs multiple attempts to fix complex bugs
|
|
53
|
-
- Race conditions, subtle logic errors, multi-file refactors
|
|
54
|
-
- You dont want to manually copy-paste context every iteration
|
|
55
|
-
- Autonomous operation means you dont need to press enter for permissions all the time
|
|
56
|
-
|
|
57
|
-
---
|
|
58
|
-
|
|
59
|
-
## Key Features
|
|
60
|
-
|
|
61
|
-
- **Persistent Context** - uses `claude --continue` so it remembers everything
|
|
62
|
-
- **Exit Signals** - explicitly tells you when done or blocked
|
|
63
|
-
- **Real-time Streaming** - watch it think and debug live
|
|
64
|
-
- **Pragmatic Exits** - if environment is broken, it documents the fix and exits
|
|
65
|
-
- **4-Level Reasoning** - KNOWN, UNKNOWN, HYPOTHESIS, and WHY
|
|
66
|
-
|
|
67
|
-
---
|
|
68
|
-
|
|
69
|
-
## The 4-Level Reasoning Protocol
|
|
70
|
-
|
|
71
|
-

|
|
72
|
-
|
|
73
|
-
The reasoning protocol forces systematic debugging:
|
|
74
|
-
|
|
75
|
-
1. **Prevents random changes** - cant just try stuff without stating why
|
|
76
|
-
2. **Builds knowledge incrementally** - each iterations KNOWN grows
|
|
77
|
-
3. **Explicit about uncertainty** - UNKNOWN list gets smaller or changes focus
|
|
78
|
-
4. **Testable hypotheses** - you can verify if the guess matches symptoms
|
|
79
|
-
5. **Clear action items** - NEXT ACTION is concrete and measurable
|
|
80
|
-
|
|
81
|
-
---
|
|
82
|
-
## INSTALL
|
|
83
|
-
### Via pip
|
|
84
|
-
```bash
|
|
85
|
-
pip install daveloop
|
|
86
|
-
```
|
|
87
|
-
|
|
88
|
-
## How to Use
|
|
89
|
-
|
|
90
|
-
### Basic Usage
|
|
91
|
-
|
|
92
|
-
```bash
|
|
93
|
-
python daveloop.py "your bug description here"
|
|
94
|
-
```
|
|
95
|
-
|
|
96
|
-
**Example:**
|
|
97
|
-
```bash
|
|
98
|
-
python daveloop.py "routes/order.ts has a race condition on wallet balance. two concurrent orders can overdraw the account"
|
|
99
|
-
```
|
|
100
|
-
|
|
101
|
-
---
|
|
102
|
-
|
|
103
|
-
### From a File
|
|
104
|
-
|
|
105
|
-
If you have a detailed bug report:
|
|
106
|
-
|
|
107
|
-
```bash
|
|
108
|
-
python daveloop.py --file bug-report.txt
|
|
109
|
-
```
|
|
110
|
-
|
|
111
|
-
The file should contain the bug description. Can be as detailed as you want - stack traces, error logs, reproduction steps, whatever.
|
|
112
|
-
|
|
113
|
-
---
|
|
114
|
-
|
|
115
|
-
### From Claude Code Chat
|
|
116
|
-
|
|
117
|
-
Just talk naturally to Claude Code:
|
|
118
|
-
|
|
119
|
-
```
|
|
120
|
-
"daveloop this: mongodb connection error in lookup artist node"
|
|
121
|
-
```
|
|
122
|
-
|
|
123
|
-
Or:
|
|
124
|
-
|
|
125
|
-
```
|
|
126
|
-
"run daveloop on the jwt validation bug"
|
|
127
|
-
```
|
|
128
|
-
|
|
129
|
-
Claude Code will automatically run:
|
|
130
|
-
|
|
131
|
-
```bash
|
|
132
|
-
python daveloop.py "mongodb connection error in lookup artist node"
|
|
133
|
-
```
|
|
134
|
-
|
|
135
|
-
No special commands needed. Just mention "daveloop" and describe the bug.
|
|
136
|
-
|
|
137
|
-
---
|
|
138
|
-
|
|
139
|
-
### With Options
|
|
140
|
-
|
|
141
|
-
```bash
|
|
142
|
-
# Custom working directory
|
|
143
|
-
python daveloop.py "fix the bug" --dir /path/to/your/project
|
|
144
|
-
|
|
145
|
-
# Limit iterations (default is 20)
|
|
146
|
-
python daveloop.py "fix the bug" --max-iterations 10
|
|
147
|
-
|
|
148
|
-
# All together
|
|
149
|
-
python daveloop.py --file bug.txt --dir ./my-app --max-iterations 15
|
|
150
|
-
```
|
|
151
|
-
|
|
152
|
-
---
|
|
153
|
-
|
|
154
|
-
## If Claude CLI Not Found
|
|
155
|
-
|
|
156
|
-
DaveLoop automatically searches for Claude CLI in common locations. But if you get:
|
|
157
|
-
|
|
158
|
-
```
|
|
159
|
-
ERROR: Claude CLI not found!
|
|
160
|
-
|
|
161
|
-
Please install Claude Code CLI or set CLAUDE_CLI_PATH environment variable:
|
|
162
|
-
Windows: set CLAUDE_CLI_PATH=C:\path\to\claude.cmd
|
|
163
|
-
Linux/Mac: export CLAUDE_CLI_PATH=/path/to/claude
|
|
164
|
-
|
|
165
|
-
Install from: https://github.com/anthropics/claude-code
|
|
166
|
-
```
|
|
167
|
-
|
|
168
|
-
### Option 1: Set Environment Variable (Recommended)
|
|
169
|
-
|
|
170
|
-
Find where Claude CLI is installed:
|
|
171
|
-
|
|
172
|
-
**Windows:**
|
|
173
|
-
```cmd
|
|
174
|
-
where claude.cmd
|
|
175
|
-
```
|
|
176
|
-
|
|
177
|
-
**Linux/Mac:**
|
|
178
|
-
```bash
|
|
179
|
-
which claude
|
|
180
|
-
```
|
|
181
|
-
|
|
182
|
-
Then set the path:
|
|
183
|
-
|
|
184
|
-
**Windows (temporary - current session):**
|
|
185
|
-
```cmd
|
|
186
|
-
set CLAUDE_CLI_PATH=C:\Users\YourName\AppData\Roaming\npm\claude.cmd
|
|
187
|
-
```
|
|
188
|
-
|
|
189
|
-
**Windows (permanent):**
|
|
190
|
-
```cmd
|
|
191
|
-
setx CLAUDE_CLI_PATH "C:\Users\YourName\AppData\Roaming\npm\claude.cmd"
|
|
192
|
-
```
|
|
193
|
-
|
|
194
|
-
**Linux/Mac (add to ~/.bashrc or ~/.zshrc):**
|
|
195
|
-
```bash
|
|
196
|
-
export CLAUDE_CLI_PATH=/usr/local/bin/claude
|
|
197
|
-
```
|
|
198
|
-
|
|
199
|
-
---
|
|
200
|
-
|
|
201
|
-
### Option 2: Add to PATH
|
|
202
|
-
|
|
203
|
-
**Windows:**
|
|
204
|
-
1. Search for "environment variables" in start menu
|
|
205
|
-
2. Click "Environment Variables" button
|
|
206
|
-
3. Under "User variables", find "Path"
|
|
207
|
-
4. Add the directory containing claude.cmd
|
|
208
|
-
5. Restart terminal
|
|
209
|
-
|
|
210
|
-
**Linux/Mac:**
|
|
211
|
-
```bash
|
|
212
|
-
# Add to ~/.bashrc or ~/.zshrc
|
|
213
|
-
export PATH="$PATH:/path/to/claude/directory"
|
|
214
|
-
```
|
|
215
|
-
|
|
216
|
-
---
|
|
217
|
-
|
|
218
|
-
### Option 3: Install/Reinstall Claude CLI
|
|
219
|
-
|
|
220
|
-
If Claude CLI isnt installed:
|
|
221
|
-
|
|
222
|
-
```bash
|
|
223
|
-
npm install -g @anthropics/claude-code
|
|
224
|
-
```
|
|
225
|
-
|
|
226
|
-
After setting the path, run DaveLoop again.
|
|
227
|
-
|
|
228
|
-
---
|
|
229
|
-
|
|
230
|
-
## What Happens When You Run It
|
|
231
|
-
|
|
232
|
-
1. **Banner shows up** - you see the DAVELOOP ASCII art
|
|
233
|
-
2. **Session info** - working dir, max iterations, prompt loaded
|
|
234
|
-
3. **Bug report** - your description echoed back
|
|
235
|
-
4. **Iteration 1 starts** - progress bar shows up
|
|
236
|
-
5. **You see the reasoning** - KNOWN, UNKNOWN, HYPOTHESIS, NEXT ACTION
|
|
237
|
-
6. **You see the actions** - file reads, edits, bash commands
|
|
238
|
-
7. **Iteration completes** - either continues or exits
|
|
239
|
-
|
|
240
|
-
---
|
|
241
|
-
|
|
242
|
-
## Reading the Output
|
|
243
|
-
|
|
244
|
-
Output is color coded:
|
|
245
|
-
|
|
246
|
-
- **Blue** - reasoning blocks and actions
|
|
247
|
-
- **White** - normal text and code
|
|
248
|
-
- **Dim** - less important details
|
|
249
|
-
- **Green** - success messages
|
|
250
|
-
- **Red** - errors
|
|
251
|
-
- **Yellow** - warnings
|
|
252
|
-
|
|
253
|
-
Key things to watch for:
|
|
254
|
-
|
|
255
|
-
- **Reasoning blocks** - shows how its thinking through the problem
|
|
256
|
-
- **Tool usage** - what files its reading/editing
|
|
257
|
-
- **Exit signals** - `[Exit signal detected: RESOLVED]` means its done
|
|
258
|
-
|
|
259
|
-
---
|
|
260
|
-
|
|
261
|
-
## When It Finishes
|
|
262
|
-
|
|
263
|
-
### Three Possible Outcomes:
|
|
264
|
-
|
|
265
|
-
**1. Success - Bug is Fixed**
|
|
266
|
-
|
|
267
|
-
★ ★ ★ BUG SUCCESSFULLY RESOLVED ★ ★ ★
|
|
268
|
-
|
|
269
|
-
Bug fixed in 3 iteration(s)!
|
|
270
|
-
```
|
|
271
|
-
|
|
272
|
-
**2. Blocked - Needs Human Help**
|
|
273
|
-
```
|
|
274
|
-
ERROR: Claude is blocked - needs human help
|
|
275
|
-
```
|
|
276
|
-
|
|
277
|
-
Check the logs to see what it tried. Usually means:
|
|
278
|
-
- Environment issues (missing dependencies)
|
|
279
|
-
- Need clarification on requirements
|
|
280
|
-
- Need access to external systems
|
|
281
|
-
|
|
282
|
-
**3. Max Iterations - Ran Out of Attempts**
|
|
283
|
-
```
|
|
284
|
-
WARNING: Max iterations (20) reached without resolution
|
|
285
|
-
```
|
|
286
|
-
|
|
287
|
-
Check logs in `logs/` directory. Either:
|
|
288
|
-
- Increase max iterations
|
|
289
|
-
- Provide more context about the bug
|
|
290
|
-
- Manually help it past a blocker
|
|
291
|
-
|
|
292
|
-
---
|
|
293
|
-
|
|
294
|
-
## Logs Location
|
|
295
|
-
|
|
296
|
-
Every session creates logs:
|
|
297
|
-
|
|
298
|
-
```
|
|
299
|
-
logs/
|
|
300
|
-
20240127_143022_iteration_01.log <- first attempt
|
|
301
|
-
20240127_143022_iteration_02.log <- second attempt
|
|
302
|
-
20240127_143022_summary.md <- overview
|
|
303
|
-
```
|
|
304
|
-
|
|
305
|
-
Session ID format: `YYYYMMDD_HHMMSS`
|
|
306
|
-
|
|
307
|
-
Useful for:
|
|
308
|
-
- Seeing what the agent tried
|
|
309
|
-
- Debugging why it got stuck
|
|
310
|
-
- Understanding its reasoning process
|
|
311
|
-
- Proving to your team that the AI actually fixed the bug
|
|
312
|
-
|
|
313
|
-
---
|
|
314
|
-
|
|
315
|
-
## Tips for Good Bug Descriptions
|
|
316
|
-
|
|
317
|
-
**Bad:**
|
|
318
|
-
```bash
|
|
319
|
-
python daveloop.py "fix the bug"
|
|
320
|
-
```
|
|
321
|
-
Too vague. What bug? Where?
|
|
322
|
-
|
|
323
|
-
**Better:**
|
|
324
|
-
```bash
|
|
325
|
-
python daveloop.py "wallet balance goes negative when two users checkout simultaneously"
|
|
326
|
-
```
|
|
327
|
-
Has symptom and context.
|
|
328
|
-
|
|
329
|
-
**Best:**
|
|
330
|
-
```bash
|
|
331
|
-
python daveloop.py "RACE CONDITION: routes/order.ts wallet payment (lines 139-148). Balance check at line 141 happens BEFORE decrement
|
|
332
|
-
at line 142. Two concurrent $100 orders both pass the check and overdraw wallet to -$100. Need atomic check+decrement."
|
|
333
|
-
```
|
|
334
|
-
|
|
335
|
-
Has:
|
|
336
|
-
- Bug type (race condition)
|
|
337
|
-
- Location (file and lines)
|
|
338
|
-
- Reproduction (concurrent orders)
|
|
339
|
-
- Root cause (non-atomic operations)
|
|
340
|
-
- Suggested fix direction (atomic operation)
|
|
341
|
-
|
|
342
|
-
**More context = faster resolution = fewer iterations**
|
|
343
|
-
|
|
344
|
-
---
|
|
345
|
-
|
|
346
|
-
## Interrupting the Agent
|
|
347
|
-
|
|
348
|
-
If you need to stop it:
|
|
349
|
-
- Press `Ctrl+C` once - graceful shutdown
|
|
350
|
-
- Press `Ctrl+C` twice - force kill
|
|
351
|
-
|
|
352
|
-
Logs are saved even if interrupted.
|
|
353
|
-
|
|
354
|
-
---
|
|
355
|
-
|
|
356
|
-
## Testing Before Production
|
|
357
|
-
|
|
358
|
-
Run on test bugs first:
|
|
359
|
-
|
|
360
|
-
```bash
|
|
361
|
-
# Simple test
|
|
362
|
-
python daveloop.py "create a file test.txt with 'hello world' and output [DAVELOOP:RESOLVED]"
|
|
363
|
-
```
|
|
364
|
-
|
|
365
|
-
Should resolve in 1-2 iterations. If it works, youre good to go.
|
|
366
|
-
|
|
367
|
-
---
|
|
368
|
-
|
|
369
|
-
## Using with SWE-bench
|
|
370
|
-
|
|
371
|
-
For testing against real-world benchmark bugs:
|
|
372
|
-
|
|
373
|
-
```bash
|
|
374
|
-
python daveloop_swebench.py --file django_hash_task.json --max-iterations 15
|
|
375
|
-
```
|
|
376
|
-
|
|
377
|
-
Comes with pre-configured bugs from:
|
|
378
|
-
- Django ORM
|
|
379
|
-
- Pytest AST rewriting
|
|
380
|
-
- SymPy code generation
|
|
381
|
-
- Sklearn edge cases
|
|
382
|
-
|
|
383
|
-
---
|
|
384
|
-
|
|
385
|
-
## Tested On
|
|
386
|
-
|
|
387
|
-
- Juice-Shop security vulnerabilities (race conditions, NoSQL injection, ReDoS, path traversal)
|
|
388
|
-
- SWE-bench real-world bugs (Django ORM, Pytest AST, SymPy C-code generation)
|
|
389
|
-
- Production n8n workflow errors (MongoDB connection, webhook failures)
|
|
390
|
-
|
|
391
|
-
**Success rate significantly higher than one-shot attempts because of the iterative + reasoning approach**
|
daveloop-1.4.0.dist-info/RECORD
DELETED
|
@@ -1,7 +0,0 @@
|
|
|
1
|
-
daveloop.py,sha256=vO_mKj_kSciLmupY_GAw3qkRp4Axo6rsrTJx-lhFIZc,53540
|
|
2
|
-
daveloop_swebench.py,sha256=iD9AU3XRiMQpt7TknFNlvnmPCNp64V-JaTfqTFgsGBM,15996
|
|
3
|
-
daveloop-1.4.0.dist-info/METADATA,sha256=KFXheqH4I1_XexxlhPXlXnMRNzeYEVDnDKCGydjFqEg,10463
|
|
4
|
-
daveloop-1.4.0.dist-info/WHEEL,sha256=wUyA8OaulRlbfwMtmQsvNngGrxQHAvkKcvRmdizlJi0,92
|
|
5
|
-
daveloop-1.4.0.dist-info/entry_points.txt,sha256=QcFAZgFrDfPtIikNQb7eW9DxOpBK7T-qWrKqbGAS9Ww,86
|
|
6
|
-
daveloop-1.4.0.dist-info/top_level.txt,sha256=36DiYt70m4DIK8t7IhV_y6hAzUIyeb5-qDUf3-gbDdg,27
|
|
7
|
-
daveloop-1.4.0.dist-info/RECORD,,
|
|
File without changes
|
|
File without changes
|