looped 0.1.0
- checksums.yaml +7 -0
- data/PLAN.md +856 -0
- data/README.md +340 -0
- data/docs/self-improving-coding-agent.md +374 -0
- data/exe/looped +115 -0
- data/lib/looped/agent.rb +188 -0
- data/lib/looped/application.rb +252 -0
- data/lib/looped/judge.rb +90 -0
- data/lib/looped/memory.rb +96 -0
- data/lib/looped/optimizer.rb +267 -0
- data/lib/looped/signatures.rb +40 -0
- data/lib/looped/state.rb +120 -0
- data/lib/looped/tools/read_file.rb +35 -0
- data/lib/looped/tools/run_command.rb +56 -0
- data/lib/looped/tools/search_code.rb +38 -0
- data/lib/looped/tools/write_file.rb +37 -0
- data/lib/looped/types.rb +53 -0
- data/lib/looped/version.rb +6 -0
- data/lib/looped.rb +100 -0
- data/looped.gemspec +47 -0
- metadata +246 -0
data/README.md
ADDED
@@ -0,0 +1,340 @@
# Looped

A self-improving coding agent that learns from its own performance.

[RubyGems](https://rubygems.org/gems/looped)
[GitHub Actions](https://github.com/vicentereig/looped/actions)

```
                    ┌─────────────────────────────────────┐
                    │           YOU / YOUR APP            │
                    │    "Write a fibonacci function"     │
                    └──────────────┬──────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                                LOOPED::AGENT                                 │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │                   ReAct Loop (Think → Act → Observe)                   │  │
│  │                                                                        │  │
│  │      ┌──────────────┐    ┌──────────────┐    ┌──────────────┐          │  │
│  │      │    THINK     │───▶│     ACT      │───▶│   OBSERVE    │──┐       │  │
│  │      │  Plan next   │    │  Use tools   │    │   Process    │  │       │  │
│  │      │     step     │    │              │    │   results    │  │       │  │
│  │      └──────────────┘    └──────────────┘    └──────────────┘  │       │  │
│  │             ▲                                                  │       │  │
│  │             └──────────────────────────────────────────────────┘       │  │
│  │                         (repeat until solved)                          │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
│          Tools: read_file │ write_file │ search_code │ run_command           │
└──────────────────────────────────────────────────────────────────────────────┘
                                   │
                                   ▼ solution
┌──────────────────────────────────────────────────────────────────────────────┐
│                                LOOPED::JUDGE                                 │
│                                                                              │
│            Evaluates: Correctness • Code Quality • Best Practices            │
│          Returns: Score (0-10) • Pass/Fail • Critique • Suggestions          │
└──────────────────────────────────────────────────────────────────────────────┘
                                   │
                ┌──────────────────┴──────────────────┐
                │                                     │
                ▼                                     ▼
        ┌────────────────┐                  ┌────────────────────┐
        │   Return to    │                  │   Persist Result   │
        │    User/App    │                  │    to Training     │
        │   with score   │                  │       Buffer       │
        └────────────────┘                  └─────────┬──────────┘
                                                      │
┌─────────────────────────────────────────────────────┼────────────────────────┐
│  ~/.looped/                                         │                        │
│                                                     ▼                        │
│  instructions.json ◀──────────────────┐   training_buffer.json               │
│  (current best prompts)               │     (recent results)                 │
│                                       │             │                        │
│  history/                             │             │ when buffer >= 5       │
│  (archived training data)             │             ▼                        │
│                                ┌──────┴───────────────────────┐              │
│                                │      LOOPED::OPTIMIZER       │              │
│                                │      (Background Async)      │              │
│                                │                              │              │
│                                │  1. Build training examples  │              │
│                                │  2. Run GEPA optimization    │              │
│                                │  3. Extract better prompts   │              │
│                                │  4. Hot-swap instructions    │              │
│                                └──────────────────────────────┘              │
└──────────────────────────────────────────────────────────────────────────────┘
             │
             │  Agent detects file change
             │  and hot-reloads improved
             ▼  instructions automatically
┌─────────────────────────┐
│   Next task benefits    │
│  from learned prompts   │
└─────────────────────────┘
```

## Why Looped?

**The Problem:** LLM-based coding agents use static prompts. When they fail or produce suboptimal code, there's no mechanism to learn from those mistakes. You're stuck manually tweaking prompts.

**The Solution:** Looped creates a feedback loop where the agent:
1. Attempts coding tasks using ReAct (Reasoning + Acting)
2. Gets scored by an LLM judge on correctness, quality, and best practices
3. Stores results as training data
4. Automatically optimizes its own prompts using GEPA (Genetic-Pareto prompt optimization)
5. Hot-swaps improved prompts without restart

**The Outcome:** An agent that gets measurably better at coding tasks over time, with improvements persisted across sessions.

## Benefits

| Benefit | Description |
|---------|-------------|
| **Self-Improving** | Automatically learns from successes and failures; no manual prompt engineering |
| **Measurable Progress** | Every solution is scored; track improvement over generations |
| **Persistent Learning** | Optimized prompts saved to disk; improvements survive restarts |
| **Zero Configuration** | Works out of the box once `OPENAI_API_KEY` is set; optimization runs transparently in the background |
| **Flexible Models** | Use different OpenAI models for agent, judge, and optimizer |

## Expected Outcomes

When you use Looped over time:

1. **Generation 0 (Initial):** Agent uses default DSPy.rb ReAct prompts
2. **After ~5 tasks:** First optimization cycle runs, prompts refined based on judge feedback
3. **After ~20 tasks:** Multiple optimization generations; agent develops task-specific reasoning patterns
4. **Ongoing:** Continuous improvement as more diverse tasks provide richer training signal

Check progress anytime with the `status` command:
```
looped> status
=== Optimization Status ===
Generation: 3
Best Score: 8.7/10
Training Buffer: 2/5 results
Last Updated: 2024-11-28 12:30:00
```

## Overview

**Looped** is a Ruby gem that creates a coding agent capable of:

1. **Executing coding tasks** using DSPy.rb ReAct with file operations, code search, and command execution
2. **Evaluating its own work** with an LLM-as-judge that scores solutions
3. **Continuously improving** by optimizing prompts with GEPA running in the background
4. **Persisting learning** to disk (`~/.looped/`) for cross-session improvement

## Installation

Add to your Gemfile:

```ruby
gem 'looped'
```

Or install directly:

```bash
gem install looped
```

## Quick Start

### Interactive Mode

```bash
# Set your API key
export OPENAI_API_KEY=sk-...

# Start the interactive agent
looped
```

```
╦  ┌─┐┌─┐┌─┐┌─┐┌┬┐
║  │ ││ │├─┘├┤  ││
╩═╝└─┘└─┘┴  └─┘─┴┘

Self-improving coding agent powered by DSPy.rb + GEPA

Type a coding task and press Enter. Type 'quit' to exit.
Type 'status' to see optimization status.

looped> Write a Ruby function that calculates fibonacci numbers

=== Result ===

Score: 8.5/10

Solution:
def fibonacci(n)
  return n if n <= 1
  fibonacci(n - 1) + fibonacci(n - 2)
end

Feedback:
Score: 8.5/10
Status: PASSED
...
```

### Single Task Mode

```bash
# Run a single task and exit
looped "Fix the syntax error in main.rb"

# With custom model
looped -m openai/gpt-4o "Refactor the User class"

# With context file
looped -c README.md "Add installation instructions"
```

### Programmatic Usage

```ruby
require 'looped'

# Single task execution
result = Looped.execute(
  task: 'Write a function that reverses a string',
  context: 'Use Ruby best practices'
)

puts result.solution  # => "def reverse_string(str)..."
puts result.score     # => 8.5
puts result.feedback  # => "Score: 8.5/10..."

# Or use the agent directly
agent = Looped::Agent.new(model: 'openai/gpt-4o')
result = agent.run(task: 'Create a binary search function')
```

## Available Tools

The agent has access to four tools:

| Tool | Description |
|------|-------------|
| `read_file` | Read contents of a file |
| `write_file` | Write content to a file |
| `search_code` | Search for patterns in code using ripgrep |
| `run_command` | Execute shell commands |

## CLI Reference

```
looped [options] [task]

Options:
  -m, --model MODEL             Agent model (default: openai/gpt-4o-mini)
  -j, --judge-model MODEL       Judge model for evaluation
  -r, --reflection-model MODEL  GEPA reflection model
  -i, --max-iterations N        Max ReAct iterations (default: 10)
  -c, --context FILE            Load context from file
      --no-optimizer            Disable background optimizer
  -v, --version                 Show version
  -h, --help                    Show this help message

Interactive Commands:
  <task>     Execute a coding task
  status     Show optimization status
  history    Show recent task history
  help       Show help message
  quit       Exit the application
```

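To make the option table concrete, here is a minimal sketch of how flags like these could be parsed with Ruby's standard `OptionParser`. The helper name `parse_looped_args` and the option hash keys are assumptions for illustration; the gem's actual `exe/looped` may be organized differently.

```ruby
require 'optparse'

# Hypothetical helper (not part of the gem): parses the documented flags
# into a plain Hash, using the defaults listed in the CLI reference.
def parse_looped_args(argv)
  options = { model: 'openai/gpt-4o-mini', max_iterations: 10, optimizer: true }

  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: looped [options] [task]'
    opts.on('-m', '--model MODEL', 'Agent model') { |v| options[:model] = v }
    opts.on('-j', '--judge-model MODEL', 'Judge model') { |v| options[:judge_model] = v }
    opts.on('-r', '--reflection-model MODEL', 'GEPA reflection model') { |v| options[:reflection_model] = v }
    opts.on('-i', '--max-iterations N', Integer, 'Max ReAct iterations') { |v| options[:max_iterations] = v }
    opts.on('-c', '--context FILE', 'Load context from file') { |v| options[:context_file] = v }
    # --[no-]optimizer yields false when invoked as --no-optimizer
    opts.on('--[no-]optimizer', 'Enable or disable the background optimizer') { |v| options[:optimizer] = v }
  end

  # parse returns whatever is left after the options: the task words, if any
  rest = parser.parse(argv)
  options[:task] = rest.join(' ') unless rest.empty?
  options
end
```

With no task argument the `:task` key stays unset, which is how a caller could decide to fall back to interactive mode.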
## Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `OPENAI_API_KEY` | OpenAI API key | (required) |
| `LOOPED_MODEL` | Default agent model | `openai/gpt-4o-mini` |
| `LOOPED_JUDGE_MODEL` | Default judge model | `openai/gpt-4o-mini` |
| `LOOPED_STORAGE_DIR` | Storage directory | `~/.looped` |

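The table above can be read as a small lookup with defaults. The following is a sketch only: `LoopedConfig` and `load_looped_config` are hypothetical names, not the gem's API, and the gem may resolve these variables differently.

```ruby
# Hypothetical value object; the gem's real configuration class may differ.
LoopedConfig = Struct.new(:api_key, :model, :judge_model, :storage_dir, keyword_init: true)

# Resolves settings from an ENV-like object, applying the documented defaults.
def load_looped_config(env = ENV)
  api_key = env['OPENAI_API_KEY']
  raise ArgumentError, 'OPENAI_API_KEY is required' if api_key.nil? || api_key.empty?

  LoopedConfig.new(
    api_key: api_key,
    model: env.fetch('LOOPED_MODEL', 'openai/gpt-4o-mini'),
    judge_model: env.fetch('LOOPED_JUDGE_MODEL', 'openai/gpt-4o-mini'),
    # expand_path turns the default "~/.looped" into an absolute path
    storage_dir: File.expand_path(env.fetch('LOOPED_STORAGE_DIR', '~/.looped'))
  )
end
```

Passing a plain Hash instead of `ENV` keeps the lookup easy to exercise in tests.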
### Supported Models

Currently supports OpenAI models via `dspy-openai`:

```ruby
'openai/gpt-4o'
'openai/gpt-4o-mini'
'openai/o3-mini'
```

## How It Works

### 1. Task Execution

When you give Looped a task, it uses a ReAct (Reasoning + Acting) loop:

1. **Think**: Analyze the task and plan approach
2. **Act**: Use tools (read/write files, search, run commands)
3. **Observe**: Process tool outputs
4. Repeat until solution is found

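The Think, Act, Observe cycle can be sketched in plain Ruby. This is an illustration under stated assumptions, not the gem's implementation (which delegates to DSPy.rb ReAct): `Step`, `react_loop`, and the lambda-based planner and tools are hypothetical stand-ins for the LLM and the real tool classes.

```ruby
# Hypothetical step record: the planner returns either a tool call or a final answer.
Step = Struct.new(:thought, :tool, :args, :answer, keyword_init: true)

# Runs Think -> Act -> Observe until the planner returns an answer
# or the iteration budget runs out (10 mirrors the CLI default).
def react_loop(task, planner:, tools:, max_iterations: 10)
  observations = []
  max_iterations.times do
    step = planner.call(task, observations)   # Think: decide the next move
    return step.answer if step.answer

    tool = tools.fetch(step.tool)             # Act: look up and invoke the tool
    observations << tool.call(**step.args)    # Observe: record the tool output
  end
  nil # no solution within the iteration budget
end

# Stub tool and planner standing in for the real file tools and the LLM.
tools = { 'read_file' => ->(path:) { "contents of #{path}" } }
planner = lambda do |_task, observations|
  if observations.empty?
    Step.new(thought: 'Inspect the file first', tool: 'read_file', args: { path: 'main.rb' })
  else
    Step.new(thought: 'Enough context', answer: "done after #{observations.size} observation(s)")
  end
end

puts react_loop('Fix the syntax error in main.rb', planner: planner, tools: tools)
# prints: done after 1 observation(s)
```

Returning `nil` when the budget is exhausted is a design choice of this sketch; it lets the caller distinguish "gave up" from a real answer.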
### 2. Evaluation

Every solution is evaluated by an LLM judge that scores:
- Correctness
- Code quality
- Best practices adherence

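As a rough picture of what a judge result might carry, the sketch below models the fields named in the architecture diagram (score from 0 to 10, pass/fail, critique, suggestions). `Judgment`, `build_judgment`, the `FAILED` label, and especially `PASS_THRESHOLD` are assumptions; the README does not state the actual cutoff or type names.

```ruby
# Hypothetical result type mirroring the judge outputs in the diagram.
Judgment = Struct.new(:score, :passed, :critique, :suggestions, keyword_init: true) do
  # Renders feedback in the "Score: .../10" and "Status: ..." style of the README output.
  def to_feedback
    [
      "Score: #{score}/10",
      "Status: #{passed ? 'PASSED' : 'FAILED'}",
      critique,
      *suggestions.map { |s| "- #{s}" }
    ].join("\n")
  end
end

# Assumed cutoff for illustration only; not a documented value.
PASS_THRESHOLD = 7.0

# Clamps the raw score into the 0-10 range and derives pass/fail from it.
def build_judgment(raw_score, critique:, suggestions: [])
  score = raw_score.to_f.clamp(0.0, 10.0)
  Judgment.new(score: score, passed: score >= PASS_THRESHOLD, critique: critique, suggestions: suggestions)
end
```

Clamping guards against a judge model returning an out-of-range number, which is a common failure mode when scores come from free-form LLM output.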
### 3. Learning

Results are stored in a training buffer. When the buffer holds 5 or more results, the background optimizer:

1. Collects training examples
2. Runs GEPA (Genetic-Pareto) optimization
3. Generates improved instructions
4. Hot-swaps the agent's prompts

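The buffer-then-optimize trigger can be sketched as follows. This is a simplified, synchronous illustration: `TrainingBuffer` is a hypothetical class, the threshold of 5 comes from the architecture diagram ("when buffer >= 5"), and clearing the buffer after handing off a batch is an assumption. The gem itself runs the optimizer asynchronously in the background.

```ruby
# Hypothetical in-memory buffer; the gem persists its buffer to training_buffer.json.
class TrainingBuffer
  THRESHOLD = 5 # from the diagram: "when buffer >= 5"

  def initialize(&on_full)
    @results = []
    @on_full = on_full
  end

  def size
    @results.size
  end

  # Adds a result; once the threshold is reached, hands the batch to the
  # optimizer callback and starts a fresh buffer (assumed behavior).
  def add(task:, solution:, score:)
    @results << { task: task, solution: solution, score: score }
    return unless @results.size >= THRESHOLD

    batch = @results
    @results = []
    @on_full&.call(batch)
  end
end
```

A caller would pass a block that kicks off optimization, for example `TrainingBuffer.new { |batch| optimize(batch) }`, where `optimize` is whatever entry point the optimizer exposes.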
### 4. Persistence

All state is persisted to `~/.looped/`:
- `instructions.json` - Current optimized prompts
- `training_buffer.json` - Recent results for learning
- `history/` - Archived training data

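One way to picture the hot-reload of `instructions.json` is a store that re-reads the file only when its modification time changes. The sketch below makes several assumptions: `InstructionStore` is a hypothetical class, the JSON contents are treated as an opaque Hash because the file's schema is not documented here, and mtime comparison is just one possible change-detection strategy.

```ruby
require 'json'
require 'fileutils'

# Hypothetical store: reloads instructions.json only when its mtime changes.
class InstructionStore
  def initialize(dir)
    @path = File.join(dir, 'instructions.json')
    @mtime = nil
    @instructions = {}
  end

  # Writes to a temp file and renames it, so a reader never sees a half-written file.
  def save(instructions)
    FileUtils.mkdir_p(File.dirname(@path))
    tmp = "#{@path}.tmp"
    File.write(tmp, JSON.pretty_generate(instructions))
    File.rename(tmp, @path)
  end

  # Returns the current instructions, re-reading the file if it changed on disk.
  def current
    return @instructions unless File.exist?(@path)

    mtime = File.mtime(@path)
    if mtime != @mtime
      @instructions = JSON.parse(File.read(@path))
      @mtime = mtime
    end
    @instructions
  end
end
```

Calling `current` before each task is enough for the agent to pick up prompts written by the optimizer. On filesystems with coarse timestamp resolution, two writes in quick succession can share an mtime, so a content hash or generation counter would be a more robust signal.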
## Development

```bash
# Clone the repo
git clone https://github.com/vicentereig/looped.git
cd looped

# Install dependencies
bundle install

# Run tests
bundle exec rspec

# Run with local changes
bundle exec exe/looped
```

## Dependencies

- [dspy](https://rubygems.org/gems/dspy) - Prompt engineering framework
- [dspy-openai](https://rubygems.org/gems/dspy-openai) - OpenAI adapter
- [gepa](https://rubygems.org/gems/gepa) - Genetic prompt optimization
- [async](https://rubygems.org/gems/async) - Concurrent execution
- [sorbet-runtime](https://rubygems.org/gems/sorbet-runtime) - Type checking

## License

MIT License - see [LICENSE.txt](LICENSE.txt)

## Contributing

1. Fork it
2. Create your feature branch (`git checkout -b feature/my-feature`)
3. Commit your changes (`git commit -am 'Add my feature'`)
4. Push to the branch (`git push origin feature/my-feature`)
5. Create a Pull Request