looped 0.1.0

data/PLAN.md ADDED
@@ -0,0 +1,856 @@
1
+ # Looped - Self-Improving Coding Agent
2
+
3
+ ## Overview
4
+
5
+ **looped** is a standalone Ruby gem that provides a **self-improving coding agent**:
6
+ 1. Uses **DSPy.rb ReAct** with coding tools for controlled, auditable actions
7
+ 2. Implements **ephemeral memory** with context engineering: full action records are stored, but prompts receive only the 10 most recent actions, with results truncated to 500 characters
8
+ 3. **Continuously evolves prompts** using GEPA running as a background async task
9
+ 4. Evaluates with **LLM-as-judge** (configurable model)
10
+ 5. **Persists state to disk** (~/.looped/) for cross-session learning
11
+ 6. **Full Sorbet type annotations** throughout
12
+
13
+ **Dependency**: `dspy-rb` gem (including `dspy-gepa`)
14
+
15
+ ## Architecture
16
+
17
+ ```
18
+ ┌─────────────────────────────────────────────────────────────────┐
19
+ │ Foreground │
20
+ │ ┌───────────────────────────────────────────────────────────┐ │
21
+ │ │ Looped::Agent │ │
22
+ │ │ - Loads current best instructions from disk │ │
23
+ │ │ - Handles coding tasks with DSPy::ReAct + tools │ │
24
+ │ │ - Writes results to training buffer │ │
25
+ │ └───────────────────────────────────────────────────────────┘ │
26
+ └─────────────────────────────────────────────────────────────────┘
27
+
28
+ ▼ writes
29
+ ┌─────────────────────────────────────────────────────────────────┐
30
+ │ ~/.looped/ │
31
+ │ ├── instructions.json # Current best instructions │
32
+ │ ├── frontier.json # Pareto frontier state │
33
+ │ ├── training_buffer.json # Recent task results for learning │
34
+ │ └── history/ # Historical training data │
35
+ └─────────────────────────────────────────────────────────────────┘
36
+
37
+ ▲ reads/updates
38
+ ┌─────────────────────────────────────────────────────────────────┐
39
+ │ Background (Async Task) │
40
+ │ ┌───────────────────────────────────────────────────────────┐ │
41
+ │ │ Looped::Optimizer │ │
42
+ │ │ - Monitors training buffer for new results │ │
43
+ │ │ - Runs GEPA reflection cycles when buffer has data │ │
44
+ │ │ - Evaluates candidates against validation set │ │
45
+ │ │ - Hot-swaps instructions.json when improvement found │ │
46
+ │ └───────────────────────────────────────────────────────────┘ │
47
+ └─────────────────────────────────────────────────────────────────┘
48
+ ```
49
+
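The hand-off through `instructions.json` can be sketched with the standard library alone. This is a minimal illustration under assumptions, not the plan's implementation: `atomic_write_json` and `InstructionsWatcher` are hypothetical names, the write-then-rename step is a suggested hardening, and the mtime polling is an alternative to the `on_improvement` callback that the plan itself uses.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical helper (not in the plan): write to a temp file, then rename it
# into place so a concurrent reader never observes a half-written file.
def atomic_write_json(path, data)
  tmp = "#{path}.tmp.#{Process.pid}"
  File.write(tmp, JSON.pretty_generate(data))
  File.rename(tmp, path)
end

# Hypothetical reader (not in the plan): re-parse the file only when its
# modification time changes.
class InstructionsWatcher
  attr_reader :current

  def initialize(path)
    @path = path
    @mtime = nil
    @current = nil
  end

  # Returns true when a new version was loaded, false otherwise.
  def refresh
    return false unless File.exist?(@path)

    mtime = File.mtime(@path)
    return false if mtime == @mtime

    @current = JSON.parse(File.read(@path), symbolize_names: true)
    @mtime = mtime
    true
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'instructions.json')
  watcher = InstructionsWatcher.new(path)

  atomic_write_json(path, { generation: 1, score: 0.7 })
  watcher.refresh
  puts "loaded generation #{watcher.current[:generation]}"
end
```

Two writes that land within the same filesystem timestamp tick would look identical to this watcher, so a real implementation might compare the `generation` field as well.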
50
+ ## Gem Structure
51
+
52
+ ```
53
+ looped/
54
+ ├── looped.gemspec
55
+ ├── Gemfile
56
+ ├── README.md
57
+ ├── LICENSE.txt
58
+ ├── bin/
59
+ │   └── looped                    # CLI entry point
60
+ ├── lib/
61
+ │   ├── looped.rb                 # Main entry, Looped.start
62
+ │   └── looped/
63
+ │       ├── version.rb
64
+ │       ├── types.rb              # Sorbet T::Struct types
65
+ │       ├── signatures.rb         # DSPy Signatures
66
+ │       ├── agent.rb              # Looped::Agent (ReAct-based)
67
+ │       ├── optimizer.rb          # Looped::Optimizer (GEPA wrapper)
68
+ │       ├── state.rb              # Looped::State (file persistence)
69
+ │       ├── judge.rb              # Looped::Judge (LLM-as-judge)
70
+ │       ├── memory.rb             # Looped::Memory (context engineering)
71
+ │       └── tools/
72
+ │           ├── base.rb           # Common tool functionality
73
+ │           ├── read_file.rb
74
+ │           ├── write_file.rb
75
+ │           ├── run_command.rb    # Docker-sandboxed
76
+ │           └── search_code.rb
77
+ └── spec/
78
+     ├── spec_helper.rb
79
+     ├── looped/
80
+     │   ├── agent_spec.rb
81
+     │   ├── optimizer_spec.rb
82
+     │   ├── state_spec.rb
83
+     │   └── memory_spec.rb
84
+     └── integration/
85
+         └── full_loop_spec.rb     # VCR-recorded integration test
86
+ ```
87
+
88
+ ## Implementation Plan
89
+
90
+ ### Step 1: Sorbet Types (lib/looped/types.rb)
91
+
92
+ ```ruby
93
+ # typed: strict
94
+ # frozen_string_literal: true
95
+
96
+ module Looped
97
+ module Types
98
+ extend T::Sig
99
+
100
+ # Rich memory entry for storage/analytics
101
+ class MemoryEntry < T::Struct
102
+ const :action_type, String
103
+ const :action_input, T::Hash[String, T.untyped]
104
+ const :action_output, String
105
+ const :timestamp, String
106
+ const :model_id, T.nilable(String)
107
+ const :error, T.nilable(String)
108
+ const :tokens_used, T.nilable(Integer)
109
+ end
110
+
111
+ # Lean context entry for prompts
112
+ class ActionSummary < T::Struct
113
+ const :action, String
114
+ const :result, String
115
+ end
116
+
117
+ # Training result stored to buffer
118
+ class TrainingResult < T::Struct
119
+ const :task, String
120
+ const :solution, String
121
+ const :score, Float
122
+ const :feedback, String
123
+ const :timestamp, String
124
+ end
125
+
126
+ # Persisted instructions with metadata
127
+ class Instructions < T::Struct
128
+ const :thought_generator, T.nilable(String)
129
+ const :observation_processor, T.nilable(String)
130
+ const :score, Float
131
+ const :generation, Integer
132
+ const :updated_at, String
133
+ end
134
+
135
+ # Judgment from LLM-as-judge
136
+ class Judgment < T::Struct
137
+ const :score, Float
138
+ const :passed, T::Boolean
139
+ const :critique, String
140
+ const :suggestions, T::Array[String]
141
+ end
142
+ end
143
+ end
144
+ ```
145
+
146
+ ### Step 2: DSPy Signatures (lib/looped/signatures.rb)
147
+
148
+ ```ruby
149
+ # typed: strict
150
+ # frozen_string_literal: true
151
+
152
+ module Looped
153
+ # Main coding task signature
154
+ class CodingTaskSignature < DSPy::Signature
155
+ description "Complete a coding task in any programming language."
156
+
157
+ input do
158
+ const :task, String
159
+ const :context, String, default: ''
160
+ const :history, T::Array[Types::ActionSummary], default: []
161
+ end
162
+
163
+ output do
164
+ const :solution, String
165
+ const :files_modified, T::Array[String]
166
+ end
167
+ end
168
+
169
+ # LLM-as-Judge signature
170
+ class JudgeSignature < DSPy::Signature
171
+ description "Evaluate code quality and correctness."
172
+
173
+ input do
174
+ const :task, String
175
+ const :solution, String
176
+ const :expected_behavior, String
177
+ end
178
+
179
+ output do
180
+ const :score, Float
181
+ const :passed, T::Boolean
182
+ const :critique, String
183
+ const :suggestions, T::Array[String]
184
+ end
185
+ end
186
+ end
187
+ ```
188
+
189
+ ### Step 3: Memory with Context Engineering (lib/looped/memory.rb)
190
+
191
+ ```ruby
192
+ # typed: strict
193
+ # frozen_string_literal: true
194
+
+ require 'time' # Time#iso8601 lives in the 'time' stdlib before Ruby 3.4
+
195
+ module Looped
196
+ class Memory
197
+ extend T::Sig
198
+
199
+ DEFAULT_MAX_ENTRIES = 10
200
+ DEFAULT_MAX_RESULT_LENGTH = 500
201
+
202
+ sig { params(max_entries: Integer, max_result_length: Integer).void }
203
+ def initialize(max_entries: DEFAULT_MAX_ENTRIES, max_result_length: DEFAULT_MAX_RESULT_LENGTH)
204
+ @entries = T.let([], T::Array[Types::MemoryEntry])
205
+ @max_entries = max_entries
206
+ @max_result_length = max_result_length
207
+ end
208
+
209
+ sig { params(action: String, input: T::Hash[String, T.untyped], output: String, model_id: T.nilable(String)).void }
210
+ def add(action:, input:, output:, model_id: nil)
211
+ @entries << Types::MemoryEntry.new(
212
+ action_type: action,
213
+ action_input: input,
214
+ action_output: output,
215
+ timestamp: Time.now.utc.iso8601,
216
+ model_id: model_id,
217
+ error: nil,
218
+ tokens_used: nil
219
+ )
220
+ end
221
+
222
+ sig { returns(T::Array[Types::ActionSummary]) }
223
+ def to_context
224
+ @entries.last(@max_entries).map do |entry|
225
+ Types::ActionSummary.new(
226
+ action: summarize_action(entry),
227
+ result: truncate(entry.action_output)
228
+ )
229
+ end
230
+ end
231
+
232
+ sig { returns(T::Array[Types::MemoryEntry]) }
233
+ def entries
234
+ @entries.dup
235
+ end
236
+
237
+ sig { void }
238
+ def clear
239
+ @entries.clear
240
+ end
241
+
242
+ private
243
+
244
+ sig { params(entry: Types::MemoryEntry).returns(String) }
245
+ def summarize_action(entry)
246
+ input_summary = entry.action_input.map { |k, v| "#{k}=#{v.to_s[0..50]}" }.join(', ')
247
+ "#{entry.action_type}(#{input_summary})"
248
+ end
249
+
250
+ sig { params(text: String).returns(String) }
251
+ def truncate(text)
252
+ return text if text.length <= @max_result_length
253
+ "#{text[0...@max_result_length]}..."
254
+ end
255
+ end
256
+ end
257
+ ```
258
+
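The "rich storage, lean prompts" idea in Step 3 can be tried without Sorbet or DSPy. The sketch below is a simplified stand-in, not the plan's `Looped::Memory`: `Entry` and `to_context` are hypothetical plain-Ruby substitutes, and the limits are shrunk so the windowing and truncation are visible.

```ruby
# Hypothetical stand-in for Types::MemoryEntry (plain Struct, no Sorbet).
Entry = Struct.new(:action_type, :action_input, :action_output, keyword_init: true)

MAX_ENTRIES = 2          # the plan defaults to 10
MAX_RESULT_LENGTH = 20   # the plan defaults to 500

# Keep only the newest entries and shorten long outputs for the prompt.
def to_context(entries)
  entries.last(MAX_ENTRIES).map do |entry|
    args = entry.action_input.map { |k, v| "#{k}=#{v.to_s[0, 50]}" }.join(', ')
    output = entry.action_output
    output = "#{output[0...MAX_RESULT_LENGTH]}..." if output.length > MAX_RESULT_LENGTH
    { action: "#{entry.action_type}(#{args})", result: output }
  end
end

entries = [
  Entry.new(action_type: 'read_file', action_input: { 'path' => 'a.rb' }, action_output: 'puts 1'),
  Entry.new(action_type: 'search_code', action_input: { 'pattern' => 'foo' }, action_output: 'x' * 30),
  Entry.new(action_type: 'run_command', action_input: { 'command' => 'rspec' }, action_output: 'ok')
]

context = to_context(entries)
context.each { |c| puts "#{c[:action]} -> #{c[:result]}" }
```

All three entries stay in storage; only the last two reach the prompt, and the 30-character output is cut to 20 characters plus an ellipsis.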
259
+ ### Step 4: State Persistence (lib/looped/state.rb)
260
+
261
+ ```ruby
262
+ # typed: strict
263
+ # frozen_string_literal: true
264
+
265
+ require 'json'
266
+ require 'fileutils'
+ require 'time' # Time#iso8601 lives in the 'time' stdlib before Ruby 3.4
267
+
268
+ module Looped
269
+ class State
270
+ extend T::Sig
271
+
272
+ STORAGE_DIR = T.let(File.expand_path('~/.looped'), String)
273
+
274
+ sig { void }
275
+ def initialize
276
+ FileUtils.mkdir_p(STORAGE_DIR)
277
+ FileUtils.mkdir_p(File.join(STORAGE_DIR, 'history'))
278
+ end
279
+
280
+ sig { returns(T.nilable(Types::Instructions)) }
281
+ def load_instructions
282
+ path = instructions_path
283
+ return nil unless File.exist?(path)
284
+
285
+ data = JSON.parse(File.read(path), symbolize_names: true)
286
+ Types::Instructions.new(
287
+ thought_generator: data.dig(:instructions, :thought_generator),
288
+ observation_processor: data.dig(:instructions, :observation_processor),
289
+ score: data[:score] || 0.0,
290
+ generation: data[:generation] || 0,
291
+ updated_at: data[:updated_at] || Time.now.utc.iso8601
292
+ )
293
+ end
294
+
295
+ sig { params(instructions: T::Hash[Symbol, T.nilable(String)], score: Float, generation: Integer).void }
296
+ def save_instructions(instructions:, score:, generation:)
297
+ data = {
298
+ instructions: instructions,
299
+ score: score,
300
+ generation: generation,
301
+ updated_at: Time.now.utc.iso8601
302
+ }
303
+ File.write(instructions_path, JSON.pretty_generate(data))
304
+ end
305
+
306
+ sig { params(result: Types::TrainingResult).void }
307
+ def append_training_result(result)
308
+ buffer = load_training_buffer
309
+ buffer << result.serialize
310
+ File.write(training_buffer_path, JSON.pretty_generate(buffer))
311
+ end
312
+
313
+ sig { returns(T::Array[Types::TrainingResult]) }
314
+ def peek_training_buffer
315
+ load_training_buffer.map { |data| deserialize_training_result(data) }
316
+ end
317
+
318
+ sig { returns(T::Array[Types::TrainingResult]) }
319
+ def consume_training_buffer
320
+ # Read the raw buffer once so the archived and returned entries are identical
+ raw = load_training_buffer
321
+ return [] if raw.empty?
322
+
323
+ # Archive to history
324
+ archive_path = File.join(STORAGE_DIR, 'history', "#{Time.now.to_i}.json")
325
+ File.write(archive_path, JSON.pretty_generate(raw))
326
+
327
+ # Clear buffer
328
+ File.write(training_buffer_path, '[]')
329
+
330
+ raw.map { |data| deserialize_training_result(data) }
331
+ end
332
+
333
+ private
334
+
335
+ sig { returns(String) }
336
+ def instructions_path
337
+ File.join(STORAGE_DIR, 'instructions.json')
338
+ end
339
+
340
+ sig { returns(String) }
341
+ def training_buffer_path
342
+ File.join(STORAGE_DIR, 'training_buffer.json')
343
+ end
344
+
345
+ sig { returns(T::Array[T::Hash[Symbol, T.untyped]]) }
346
+ def load_training_buffer
347
+ return [] unless File.exist?(training_buffer_path)
348
+ JSON.parse(File.read(training_buffer_path), symbolize_names: true)
349
+ end
350
+
351
+ sig { params(data: T::Hash[Symbol, T.untyped]).returns(Types::TrainingResult) }
352
+ def deserialize_training_result(data)
353
+ Types::TrainingResult.new(
354
+ task: data[:task],
355
+ solution: data[:solution],
356
+ score: data[:score],
357
+ feedback: data[:feedback],
358
+ timestamp: data[:timestamp]
359
+ )
360
+ end
361
+ end
362
+ end
363
+ ```
364
+
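The buffer in Step 4 is read, modified, and rewritten as a whole file, so two `looped` processes sharing `~/.looped/` could overwrite each other's results. One possible mitigation, sketched here as an assumption rather than part of the plan, is an advisory lock around each read-modify-write. The helper names (`with_buffer_lock`, `append_result`, `consume_buffer`) and the sidecar `.lock` file are hypothetical.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical helper: hold an exclusive advisory lock on a sidecar file
# while the block runs. The lock is released when the file is closed.
def with_buffer_lock(buffer_path)
  File.open("#{buffer_path}.lock", File::RDWR | File::CREAT, 0o644) do |lock|
    lock.flock(File::LOCK_EX)
    yield
  end
end

def read_buffer(path)
  File.exist?(path) ? JSON.parse(File.read(path), symbolize_names: true) : []
end

# Append one result under the lock.
def append_result(path, result)
  with_buffer_lock(path) do
    buffer = read_buffer(path)
    buffer << result
    File.write(path, JSON.generate(buffer))
  end
end

# Archive and clear the buffer under the lock, returning what was consumed.
def consume_buffer(path, archive_dir)
  with_buffer_lock(path) do
    buffer = read_buffer(path)
    next [] if buffer.empty?

    File.write(File.join(archive_dir, "#{Time.now.to_i}.json"), JSON.generate(buffer))
    File.write(path, '[]')
    buffer
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'training_buffer.json')
  append_result(path, { task: 'fix spec', score: 0.9 })
  puts consume_buffer(path, dir).length
end
```

`flock` coordinates separate processes. Inside a single Async reactor the fibers already take turns, and a blocking lock call there would stall the whole reactor, so this sketch targets the multi-session case only.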
365
+ ### Step 5: Tools (lib/looped/tools/)
366
+
367
+ ```ruby
368
+ # typed: strict
369
+ # frozen_string_literal: true
370
+
+ require 'fileutils'
+ require 'open3'
+ require 'timeout'
+
371
+ # lib/looped/tools/read_file.rb
372
+ module Looped
373
+ module Tools
374
+ class ReadFile < DSPy::Tools::Base
375
+ extend T::Sig
376
+
377
+ tool_name 'read_file'
378
+ tool_description 'Read contents of a file at the given path'
379
+
380
+ sig { params(path: String).returns(String) }
381
+ def call(path:)
382
+ File.read(path)
383
+ rescue Errno::ENOENT
384
+ "Error: File not found: #{path}"
385
+ rescue Errno::EACCES
386
+ "Error: Permission denied: #{path}"
387
+ rescue => e
388
+ "Error: #{e.message}"
389
+ end
390
+ end
391
+
392
+ # lib/looped/tools/write_file.rb
393
+ class WriteFile < DSPy::Tools::Base
394
+ extend T::Sig
395
+
396
+ tool_name 'write_file'
397
+ tool_description 'Write content to a file at the given path'
398
+
399
+ sig { params(path: String, content: String).returns(String) }
400
+ def call(path:, content:)
401
+ FileUtils.mkdir_p(File.dirname(path))
402
+ File.write(path, content)
403
+ "Successfully wrote #{content.length} bytes to #{path}"
404
+ rescue => e
405
+ "Error: #{e.message}"
406
+ end
407
+ end
408
+
409
+ # lib/looped/tools/search_code.rb
410
+ class SearchCode < DSPy::Tools::Base
411
+ extend T::Sig
412
+
413
+ tool_name 'search_code'
414
+ tool_description 'Search for a pattern in code files using ripgrep'
415
+
416
+ sig { params(pattern: String, path: String, file_type: T.nilable(String)).returns(String) }
417
+ def call(pattern:, path: '.', file_type: nil)
418
+ cmd = ['rg', '--line-number', '--no-heading']
419
+ cmd += ['--type', file_type] if file_type
420
+ cmd += ['--', pattern, path] # `--` stops a pattern that starts with '-' from being parsed as a flag
421
+ output, status = Open3.capture2(*cmd)
422
+ status.success? ? output : "No matches found for: #{pattern}"
423
+ rescue => e
424
+ "Error: #{e.message}"
425
+ end
426
+ end
427
+
428
+ # lib/looped/tools/run_command.rb
429
+ class RunCommand < DSPy::Tools::Base
430
+ extend T::Sig
431
+
432
+ DEFAULT_TIMEOUT = 30
433
+
434
+ tool_name 'run_command'
435
+ tool_description 'Execute a shell command in a Docker sandbox and return output'
436
+
437
+ sig { params(command: String, timeout: Integer).returns(String) }
438
+ def call(command:, timeout: DEFAULT_TIMEOUT)
439
+ # TODO: Implement Docker sandbox via trusted-sandbox gem
440
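+ # For now, runs directly on the host with a timeout. Caveat: Timeout.timeout does not
+ # kill the child process, so a long-running command keeps running after the timeout fires.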
441
+ Timeout.timeout(timeout) do
442
+ output, status = Open3.capture2e(command)
443
+ "Exit code: #{status.exitstatus}\n#{output}"
444
+ end
445
+ rescue Timeout::Error
446
+ "Error: Command timed out after #{timeout} seconds"
447
+ rescue => e
448
+ "Error: #{e.message}"
449
+ end
450
+ end
451
+ end
452
+ end
453
+ ```
454
+
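Until the Docker sandbox exists, the timeout in `RunCommand` is the only limit on a command, and `Timeout.timeout` on its own leaves the child process running. The sketch below shows one way to enforce the limit by killing the child's process group. It is an illustration under assumptions: `run_with_timeout` is a hypothetical helper, it relies on POSIX process groups, and it provides no sandboxing.

```ruby
require 'timeout'

# Hypothetical helper (not in the plan): run a command, capture combined
# stdout/stderr, and kill the whole process group if the timeout expires.
def run_with_timeout(command, timeout: 30)
  reader, writer = IO.pipe
  # pgroup: true makes the child a process-group leader, so any
  # grandchildren can be signalled together with it.
  pid = Process.spawn(command, out: writer, err: writer, pgroup: true)
  writer.close

  Timeout.timeout(timeout) do
    output = reader.read
    _, status = Process.wait2(pid)
    "Exit code: #{status.exitstatus}\n#{output}"
  end
rescue Timeout::Error
  begin
    Process.kill('-KILL', pid) # a leading '-' signals the process group
    Process.wait(pid)          # reap the child to avoid a zombie
  rescue Errno::ESRCH, Errno::ECHILD
    nil # the process already exited
  end
  "Error: Command timed out after #{timeout} seconds"
rescue SystemCallError => e
  "Error: #{e.message}"
ensure
  reader&.close
  writer&.close
end

puts run_with_timeout('echo hello')
puts run_with_timeout('sleep 5', timeout: 1)
```

The return strings mirror the format used by the plan's `RunCommand`, so the helper could stand in for the `Open3.capture2e` call if this behaviour is wanted.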
455
+ ### Step 6: Judge (lib/looped/judge.rb)
456
+
457
+ ```ruby
458
+ # typed: strict
459
+ # frozen_string_literal: true
460
+
461
+ module Looped
462
+ class Judge < DSPy::Predict
463
+ extend T::Sig
464
+
465
+ sig { void }
466
+ def initialize
467
+ super(JudgeSignature)
468
+ end
469
+
470
+ sig { params(task: String, solution: String, expected_behavior: String).returns(Types::Judgment) }
471
+ def evaluate(task:, solution:, expected_behavior:)
472
+ result = call(
473
+ task: task,
474
+ solution: solution,
475
+ expected_behavior: expected_behavior
476
+ )
477
+
478
+ Types::Judgment.new(
479
+ score: result.score,
480
+ passed: result.passed,
481
+ critique: result.critique,
482
+ suggestions: result.suggestions
483
+ )
484
+ end
485
+ end
486
+ end
487
+ ```
488
+
489
+ ### Step 7: Agent (lib/looped/agent.rb)
490
+
491
+ ```ruby
492
+ # typed: strict
493
+ # frozen_string_literal: true
494
+
495
+ module Looped
496
+ class Agent < DSPy::Module
497
+ extend T::Sig
498
+
499
+ around :update_memory
500
+ around :record_for_training
501
+
502
+ sig { params(tools: T::Array[DSPy::Tools::Base], state: State, max_context_entries: Integer).void }
503
+ def initialize(tools:, state:, max_context_entries: 10)
504
+ super()
505
+ @state = state
506
+ @memory = T.let(Memory.new(max_entries: max_context_entries), Memory)
507
+ @react = T.let(
508
+ DSPy::ReAct.new(CodingTaskSignature, tools: tools, max_iterations: 15),
509
+ DSPy::ReAct
510
+ )
511
+ @judge = T.let(Judge.new, Judge)
512
+ @current_task = T.let(nil, T.nilable(String))
513
+
514
+ reload_instructions
515
+ end
516
+
517
+ sig { void }
518
+ def reload_instructions
519
+ instructions = @state.load_instructions
520
+ return unless instructions
521
+
522
+ apply_instructions(instructions)
523
+ puts "[agent] Loaded instructions (gen #{instructions.generation}, score #{instructions.score.round(2)})"
524
+ end
525
+
526
+ sig { params(instructions: Types::Instructions).void }
527
+ def apply_instructions(instructions)
528
+ thought = instructions.thought_generator
+ # Assumes with_instruction returns an updated copy rather than mutating in place
+ @react = @react.with_instruction(thought) if thought
529
+ end
530
+
531
+ sig { returns(T::Hash[Symbol, T.nilable(String)]) }
532
+ def extract_instructions
533
+ {
534
+ thought_generator: @react.named_predictors['thought_generator']&.instruction,
535
+ observation_processor: @react.named_predictors['observation_processor']&.instruction
536
+ }
537
+ end
538
+
539
+ sig { params(task: String, context: String).returns(T.untyped) }
540
+ def forward(task:, context: '')
541
+ @current_task = task
542
+ history = @memory.to_context
543
+
544
+ @react.forward(
545
+ task: task,
546
+ context: context,
547
+ history: history
548
+ )
549
+ end
550
+
551
+ private
552
+
553
+ sig { params(_args: T.untyped, kwargs: T.untyped, block: T.proc.returns(T.untyped)).returns(T.untyped) }
554
+ def update_memory(_args, kwargs, &block)
555
+ result = yield
556
+
557
+ result.history&.each do |step|
558
+ @memory.add(
559
+ action: step[:action],
560
+ input: step[:action_input] || {},
561
+ output: step[:observation] || ''
562
+ )
563
+ end
564
+
565
+ result
566
+ end
567
+
568
+ sig { params(_args: T.untyped, kwargs: T.untyped, block: T.proc.returns(T.untyped)).returns(T.untyped) }
569
+ def record_for_training(_args, kwargs, &block)
570
+ result = yield
571
+ task = @current_task
572
+
573
+ return result unless task
574
+
575
+ judgment = @judge.evaluate(
576
+ task: task,
577
+ solution: result.solution || '',
578
+ expected_behavior: "Task completed successfully"
579
+ )
580
+
581
+ training_result = Types::TrainingResult.new(
582
+ task: task,
583
+ solution: result.solution || '',
584
+ score: judgment.score,
585
+ feedback: judgment.critique,
586
+ timestamp: Time.now.utc.iso8601
587
+ )
588
+
589
+ @state.append_training_result(training_result)
590
+
591
+ result
592
+ end
593
+ end
594
+ end
595
+ ```
596
+
597
+ ### Step 8: Optimizer (lib/looped/optimizer.rb)
598
+
599
+ ```ruby
600
+ # typed: strict
601
+ # frozen_string_literal: true
602
+
603
+ module Looped
604
+ class Optimizer
605
+ extend T::Sig
606
+
607
+ MIN_BUFFER_SIZE = 10
608
+ POLL_INTERVAL = 60
609
+
610
+ sig do
611
+ params(
612
+ state: State,
613
+ agent_builder: T.proc.returns(Agent),
614
+ judge_lm: DSPy::LM,
615
+ on_improvement: T.nilable(T.proc.void)
616
+ ).void
617
+ end
618
+ def initialize(state:, agent_builder:, judge_lm:, on_improvement: nil)
619
+ @state = state
620
+ @agent_builder = agent_builder
621
+ @judge_lm = judge_lm
622
+ @reflection_lm = T.let(DSPy::ReflectionLM.new('openai/gpt-4o-mini'), DSPy::ReflectionLM)
623
+ @on_improvement = on_improvement
624
+ end
625
+
626
+ sig { void }
627
+ def run_forever
628
+ loop do
629
+ begin
630
+ check_and_optimize
631
+ rescue => e
632
+ puts "[optimizer] Error: #{e.message}"
633
+ end
634
+
635
+ sleep POLL_INTERVAL
636
+ end
637
+ end
638
+
639
+ private
640
+
641
+ sig { void }
642
+ def check_and_optimize
643
+ buffer = @state.peek_training_buffer
644
+ return if buffer.size < MIN_BUFFER_SIZE
645
+
646
+ puts "[optimizer] Found #{buffer.size} results. Running GEPA..."
647
+
648
+ buffer = @state.consume_training_buffer
649
+
650
+ trainset = buffer.map do |result|
651
+ DSPy::Example.new(
652
+ inputs: { task: result.task },
653
+ expected: { expected_behavior: result.feedback }
654
+ )
655
+ end
656
+
657
+ train, val = trainset.partition.with_index { |_, i| i % 5 != 0 }
658
+
659
+ agent = @agent_builder.call
660
+ current = @state.load_instructions
661
+
662
+ if current
663
+ agent.apply_instructions(current)
664
+ end
665
+
666
+ gepa = DSPy::Teleprompt::GEPA.new(
667
+ metric: create_metric,
668
+ reflection_lm: @reflection_lm,
669
+ config: { max_metric_calls: 50, minibatch_size: 4 }
670
+ )
671
+
672
+ result = gepa.compile(agent, trainset: train, valset: val)
673
+ current_score = current&.score || 0.0
674
+
675
+ if result.best_score_value > current_score
676
+ puts "[optimizer] Improvement! #{current_score.round(2)} → #{result.best_score_value.round(2)}"
677
+
678
+ @state.save_instructions(
679
+ instructions: result.optimized_program.extract_instructions,
680
+ score: result.best_score_value,
681
+ generation: (current&.generation || 0) + 1
682
+ )
683
+
684
+ @on_improvement&.call
685
+ else
686
+ puts "[optimizer] No improvement. Score: #{result.best_score_value.round(2)}"
687
+ end
688
+ end
689
+
690
+ sig { returns(T.proc.params(example: DSPy::Example, prediction: T.untyped).returns(DSPy::Prediction)) }
691
+ def create_metric
692
+ judge = Judge.new
693
+ judge.configure { |c| c.lm = @judge_lm }
694
+
695
+ lambda do |example, prediction|
696
+ judgment = judge.evaluate(
697
+ task: example.input_values[:task],
698
+ solution: prediction.solution || '',
699
+ expected_behavior: example.expected_values[:expected_behavior]
700
+ )
701
+
702
+ DSPy::Prediction.new(score: judgment.score, feedback: judgment.critique)
703
+ end
704
+ end
705
+ end
706
+ end
707
+ ```
708
+
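The gating and splitting logic in `check_and_optimize` is easy to check in isolation. The sketch below reproduces only that arithmetic on plain integers. The helper names are hypothetical, and no GEPA or DSPy call is involved.

```ruby
MIN_BUFFER_SIZE = 10

# Optimization is skipped until the buffer holds enough results.
def ready_to_optimize?(buffer)
  buffer.size >= MIN_BUFFER_SIZE
end

# Every fifth example (indices 0, 5, 10, ...) goes to validation,
# the rest to training: roughly an 80/20 split.
def split_train_val(examples)
  examples.partition.with_index { |_, i| i % 5 != 0 }
end

# A candidate replaces the current instructions only on a strict improvement.
def accept_candidate?(candidate_score, current_score)
  candidate_score > (current_score || 0.0)
end

examples = (0...10).to_a
train, val = split_train_val(examples)

p ready_to_optimize?(examples) # true
p train                        # [1, 2, 3, 4, 6, 7, 8, 9]
p val                          # [0, 5]
p accept_candidate?(0.82, nil) # true
```

With the minimum buffer of 10 results, the validation set has only two examples, so a score comparison between generations rests on very little data. Raising `MIN_BUFFER_SIZE` or the validation share may be worth considering.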
709
+ ### Step 9: Main Entry Point (lib/looped.rb)
710
+
711
+ ```ruby
712
+ # typed: strict
713
+ # frozen_string_literal: true
714
+
715
+ require 'async'
716
+ require 'dspy'
717
+ require 'dspy/gepa'
718
+
719
+ require_relative 'looped/version'
720
+ require_relative 'looped/types'
721
+ require_relative 'looped/signatures'
722
+ require_relative 'looped/memory'
723
+ require_relative 'looped/state'
724
+ require_relative 'looped/tools/read_file'
725
+ require_relative 'looped/tools/write_file'
726
+ require_relative 'looped/tools/search_code'
727
+ require_relative 'looped/tools/run_command'
728
+ require_relative 'looped/judge'
729
+ require_relative 'looped/agent'
730
+ require_relative 'looped/optimizer'
731
+
732
+ module Looped
733
+ extend T::Sig
734
+
735
+ class << self
736
+ extend T::Sig
737
+
738
+ sig { params(judge_model: T.nilable(String), agent_model: T.nilable(String)).void }
739
+ def start(judge_model: nil, agent_model: nil)
740
+ app = Application.new(judge_model: judge_model, agent_model: agent_model)
741
+ app.run
742
+ end
743
+ end
744
+
745
+ class Application
746
+ extend T::Sig
747
+
748
+ sig { params(judge_model: T.nilable(String), agent_model: T.nilable(String)).void }
749
+ def initialize(judge_model: nil, agent_model: nil)
750
+ @state = T.let(State.new, State)
751
+ @judge_lm = T.let(
752
+ DSPy::LM.new(judge_model || ENV['LOOPED_JUDGE_MODEL'] || 'openai/gpt-4o'),
753
+ DSPy::LM
754
+ )
755
+ @agent_lm = T.let(
756
+ DSPy::LM.new(agent_model || ENV['LOOPED_AGENT_MODEL'] || 'openai/gpt-4o-mini'),
757
+ DSPy::LM
758
+ )
759
+
760
+ @agent = T.let(build_agent, Agent)
761
+ @optimizer = T.let(
762
+ Optimizer.new(
763
+ state: @state,
764
+ agent_builder: -> { build_agent },
765
+ judge_lm: @judge_lm,
766
+ on_improvement: -> { @agent.reload_instructions }
767
+ ),
768
+ Optimizer
769
+ )
770
+ end
771
+
772
+ sig { void }
773
+ def run
774
+ Async do |task|
775
+ optimizer_task = task.async { @optimizer.run_forever }
776
+
777
+ puts "[looped] Optimizer started (background)"
778
+ puts "[looped] Agent ready. Type a task or 'quit' to exit."
779
+
780
+ loop do
781
+ print "\n> "
782
+ input = $stdin.gets&.chomp
783
+ break if input.nil? || input == 'quit'
784
+ next if input.empty?
785
+
786
+ begin
787
+ result = @agent.forward(task: input)
788
+ puts "\n#{result.solution}"
789
+ rescue => e
790
+ puts "[error] #{e.message}"
791
+ end
792
+ end
793
+
794
+ optimizer_task.stop
795
+ puts "\n[looped] Goodbye! State saved to ~/.looped/"
796
+ end
797
+ end
798
+
799
+ private
800
+
801
+ sig { returns(Agent) }
802
+ def build_agent
803
+ tools = T.let([
804
+ Tools::ReadFile.new,
805
+ Tools::WriteFile.new,
806
+ Tools::RunCommand.new,
807
+ Tools::SearchCode.new
808
+ ], T::Array[DSPy::Tools::Base])
809
+
810
+ agent = Agent.new(tools: tools, state: @state)
811
+ agent.configure { |c| c.lm = @agent_lm }
812
+ agent
813
+ end
814
+ end
815
+ end
816
+ ```
817
+
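Model selection in `Application#initialize` follows a fixed precedence: explicit argument, then environment variable, then built-in default. A small sketch makes that order testable. `resolve_model` is a hypothetical helper, and the `env:` parameter exists only so the example does not depend on the real environment.

```ruby
# Hypothetical helper mirroring the `a || ENV[...] || default` chain in the plan.
def resolve_model(explicit, env_key, default, env: ENV)
  explicit || env[env_key] || default
end

fake_env = { 'LOOPED_JUDGE_MODEL' => 'openai/gpt-4o-mini' }

puts resolve_model(nil, 'LOOPED_JUDGE_MODEL', 'openai/gpt-4o', env: fake_env)
puts resolve_model(nil, 'LOOPED_AGENT_MODEL', 'openai/gpt-4o-mini', env: fake_env)
puts resolve_model('anthropic/some-model', 'LOOPED_JUDGE_MODEL', 'openai/gpt-4o', env: fake_env)
```

An environment variable set to an empty string is still truthy in Ruby, so it would win over the default. The plan's code behaves the same way.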
818
+ ## Usage
819
+
820
+ ```bash
821
+ # Single command - optimizer runs as async background task
822
+ looped
823
+
824
+ # Illustrative interactive session; GEPA learning runs in the background (exact log lines may differ)
825
+ [looped] Optimizer started (background)
826
+ [looped] Agent ready (gen 3, score 0.82)
827
+
828
+ > Fix the failing test in spec/user_spec.rb
829
+ [agent] Reading spec/user_spec.rb...
830
+ [agent] Applied fix. Judge score: 0.9
831
+
832
+ > quit
833
+ [looped] Goodbye! State saved to ~/.looped/
834
+ ```
835
+
836
+ ## Design Decisions
837
+
838
+ 1. **Standalone gem**: `looped` depends on `dspy-rb` and `dspy-gepa`
839
+ 2. **Sorbet types**: Full `T::Struct` types for all data structures
840
+ 3. **Sandbox**: Docker containers via `trusted-sandbox` gem for RunCommand (planned; the Step 5 sketch still runs commands directly on the host)
841
+ 4. **Training data**: Real usage - agent learns from your actual coding tasks
842
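+ 5. **Judge model**: Configurable via `LOOPED_JUDGE_MODEL` env var (default `openai/gpt-4o`); the agent model uses `LOOPED_AGENT_MODEL` (default `openai/gpt-4o-mini`)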
843
+ 6. **GEPA trigger**: Background async task polls the training buffer every 60 seconds and runs GEPA once at least 10 results have accumulated
844
+ 7. **Persistence**: File-based in `~/.looped/`
845
+ 8. **Concurrency**: Async gem - single command runs both agent and optimizer
846
+
847
+ ## Testing Strategy
848
+
849
+ 1. **Unit tests** for Memory, State, Tools (isolated, fast)
850
+ 2. **Integration tests** with VCR for Judge and Agent
851
+ 3. **Smoke test** for GEPA optimization loop
852
+ 4. **TDD approach**: Write failing tests first, then implement
853
+
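Unit tests for `State` should not touch the real `~/.looped/`. One way to isolate them, sketched here as an assumption rather than part of the plan, is to point `HOME` at a temporary directory for the duration of a test. `with_temp_home` is a hypothetical helper.

```ruby
require 'tmpdir'

# Hypothetical test helper: run the block with HOME pointing at a fresh
# temporary directory, then restore the original value.
def with_temp_home
  Dir.mktmpdir('looped-test') do |dir|
    original = ENV['HOME']
    ENV['HOME'] = dir
    begin
      yield dir
    ensure
      ENV['HOME'] = original
    end
  end
end

with_temp_home do |home|
  storage = File.expand_path('~/.looped')
  puts storage.start_with?(home) # true
end
```

Because `State::STORAGE_DIR` is a constant evaluated when the file is loaded, this helper would have to wrap the `require` itself. Passing the storage directory into `State.new` would be an alternative design that is easier to test.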
854
+ ## Documentation
855
+
856
+ - `docs/self-improving-coding-agent.md` - Tutorial article explaining the architecture step-by-step