console_agent 0.8.0 → 0.10.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: fdcfb3c48b2f8421b2187980a453b80324e6dec40ab80e8e55aa1a938355c79c
4
- data.tar.gz: 12fec02740fde7a87bb81e26dd6857263c96f9ebe5a8402040c72279e882e31f
3
+ metadata.gz: dc46d0592feb84b4d85481d1535dccbe417a4445593828424c12a84d96fcbc9c
4
+ data.tar.gz: 10fe29dc81cc425a498c6e7d6c6b82aaa586ec674c081ba3f7b5b1143b68df18
5
5
  SHA512:
6
- metadata.gz: 7af9a3c4fdbdf71abb7452d8e6747ba2b22c64cd08d123b761c5ca7ebb870c3ba9a01e458defe0308dcbf9435ec71879f7f3562503defdfd54e2026d3069679d
7
- data.tar.gz: c84e401d6b6f5c6c7840b5d701d3a0e653ba3ea46651141ae5101b2c93bdca35487566786a87e284dbba97f902c5dad597a00b8dc9fce6e1c01885b01e99a48d
6
+ metadata.gz: 86760d6c3b7c4920fc2c01741be308fc3d3f133e264c8dc37cab6b1ab90e9b920a410d57c86d8f96e743396d6919735d7fad62ee584667c8ea177c4825a12d05
7
+ data.tar.gz: 6446b9b2af4803ccd860fd109484ef37de87850517ad117eb52892974a65a017c1b03188dfc1eb7f24aad3859749ce4f6d282220c8d5d78238e67cfdb7438def
data/CHANGELOG.md ADDED
@@ -0,0 +1,73 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ ## [0.10.0]
6
+
7
+ - Add `/expand` command to view previous results
8
+ - Exclude previous output from context; add tool for LLM to retrieve it on demand
9
+ - Show summarized info per LLM call in `/debug`
10
+
11
+ ## [0.9.0]
12
+
13
+ - Add `/system` and `/context` commands to inspect what is being sent
14
+ - Omit huge output from tool results
15
+ - Don't cancel code execution on incorrect prompt answers
16
+ - Preserve code blocks when compacting; require manual `/compact`
17
+ - Fix authentication when neither method was applied
18
+ - Remove prompt to upgrade model on excessive tool calls
19
+
20
+ ## [0.8.0]
21
+
22
+ - Add authentication function support so host apps can avoid using basic auth
23
+ - Add `/think` and `/cost` commands with Sonnet vs Opus support
24
+ - Gracefully handle token limit exceeded errors
25
+
26
+ ## [0.7.0]
27
+
28
+ - Include binding variables and their classes in the Rails console context
29
+ - Add `ai_setup` command
30
+ - Add `/compact` mechanism for conversation management
31
+ - Catch errors and attempt to auto-fix them
32
+
33
+ ## [0.6.0]
34
+
35
+ - Add core memory (`console_agent.md`) that persists across sessions in the system prompt
36
+ - Add `ai_init` command to seed core memory
37
+ - Allow reading partial files
38
+ - Fix rspec hanging issues
39
+
40
+ ## [0.5.0]
41
+
42
+ - Auto-accept single-step plans
43
+ - Support `>` shorthand to run code directly
44
+ - Add `script/release` for releases
45
+
46
+ ## [0.4.0]
47
+
48
+ - Fix resuming sessions repeatedly
49
+ - Fix terminal flashing/loading in production (kubectl)
50
+ - Better escaping during thinking output
51
+
52
+ ## [0.3.0]
53
+
54
+ - Add plan mechanism with "auto" execution mode
55
+ - Add session logging to DB with `/console_agent` admin UI
56
+ - List and resume past sessions with pagination
57
+ - Add shift-tab for auto-execute mode
58
+ - Add usage display and debug toggle
59
+ - Store sessions incrementally; improved code segment display
60
+
61
+ ## [0.2.0]
62
+
63
+ - Add memory system with individual file storage
64
+ - Add `ask_user` tool
65
+ - Add registry cache
66
+ - Fix REPL up-key and ctrl-a navigation
67
+ - Show tool usage and model processing info
68
+ - Add token count information and debug ability
69
+ - Use tools-based approach instead of sending everything at once
70
+
71
+ ## [0.1.0]
72
+
73
+ - Initial implementation
data/README.md CHANGED
@@ -79,7 +79,10 @@ end
79
79
  | `/usage` | Show token stats |
80
80
  | `/cost` | Show per-model cost breakdown |
81
81
  | `/think` | Upgrade to thinking model (Opus) for the rest of the session |
82
- | `/debug` | Toggle raw API output |
82
+ | `/debug` | Toggle debug summaries (context stats, cost per call) |
83
+ | `/expand <id>` | Show full omitted output |
84
+ | `/context` | Show conversation history as sent to the LLM |
85
+ | `/system` | Show the system prompt |
83
86
  | `/name <label>` | Name the session for easy resume |
84
87
 
85
88
  Prefix input with `>` to run Ruby directly (no LLM round-trip). The result is added to conversation context.
@@ -96,6 +99,8 @@ Say "think harder" in any query to auto-upgrade to the thinking model for that s
96
99
  - **App guide** — `ai_init` generates a guide injected into every system prompt
97
100
  - **Sessions** — name, list, and resume interactive conversations (`ai_setup` to enable)
98
101
  - **History compaction** — `/compact` summarizes long conversations to reduce cost and latency
102
+ - **Output trimming** — older execution outputs are automatically replaced with references; the LLM can recall them on demand via `recall_output`, and you can `/expand <id>` to see them
103
+ - **Debug mode** — `/debug` shows context breakdown, token counts, and per-call cost estimates before and after each LLM call
99
104
 
100
105
  ## Configuration
101
106
 
@@ -13,7 +13,10 @@ module ConsoleAgent
13
13
  username = ConsoleAgent.configuration.admin_username
14
14
  password = ConsoleAgent.configuration.admin_password
15
15
 
16
- return unless username && password
16
+ unless username && password
17
+ head :unauthorized
18
+ return
19
+ end
17
20
 
18
21
  authenticate_or_request_with_http_basic('ConsoleAgent Admin') do |u, p|
19
22
  ActiveSupport::SecurityUtils.secure_compare(u, username) &
@@ -48,6 +48,10 @@ module ConsoleAgent
48
48
 
49
49
  def initialize(binding_context)
50
50
  @binding_context = binding_context
51
+ @omitted_outputs = {}
52
+ @omitted_counter = 0
53
+ @output_store = {}
54
+ @output_counter = 0
51
55
  end
52
56
 
53
57
  def extract_code(response)
@@ -84,7 +88,7 @@ module ConsoleAgent
84
88
  result = binding_context.eval(code, "(console_agent)", 1)
85
89
 
86
90
  $stdout = old_stdout
87
- $stdout.puts colorize("=> #{result.inspect}", :green)
91
+ display_result(result)
88
92
 
89
93
  @last_output = captured_output.string
90
94
  result
@@ -107,6 +111,20 @@ module ConsoleAgent
107
111
  @last_output
108
112
  end
109
113
 
114
+ def expand_output(id)
115
+ @omitted_outputs[id]
116
+ end
117
+
118
+ def store_output(content)
119
+ @output_counter += 1
120
+ @output_store[@output_counter] = content
121
+ @output_counter
122
+ end
123
+
124
+ def recall_output(id)
125
+ @output_store[id]
126
+ end
127
+
110
128
  def last_answer
111
129
  @last_answer
112
130
  end
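The `store_output` / `recall_output` pair added in this hunk is a counter-keyed in-memory store. A minimal standalone sketch of that pattern follows; the `OutputStore` class name is hypothetical and is not part of the gem, where these methods live on the executor.

```ruby
# Hypothetical standalone class illustrating the counter-keyed store.
class OutputStore
  def initialize
    @output_store = {}
    @output_counter = 0
  end

  # Saves content under the next integer id and returns that id.
  def store_output(content)
    @output_counter += 1
    @output_store[@output_counter] = content
    @output_counter
  end

  # Returns the stored content, or nil when the id is unknown.
  def recall_output(id)
    @output_store[id]
  end
end

store = OutputStore.new
first_id = store.store_output("User.count => 42")
store.store_output("Order.count => 7")
puts store.recall_output(first_id)
puts store.recall_output(99).inspect
```

Note that in this diff `expand_output` reads `@omitted_outputs`, which has its own counter, so the ids shown by `/expand` and the ids used by `recall_output` are two independent sequences.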
@@ -126,35 +144,72 @@ module ConsoleAgent
126
144
  @last_answer = answer
127
145
  echo_stdin(answer)
128
146
 
129
- case answer
130
- when 'y', 'yes'
131
- execute(code)
132
- when 'e', 'edit'
133
- edited = open_in_editor(code)
134
- if edited && edited != code
135
- $stdout.puts colorize("# Edited code:", :yellow)
136
- $stdout.puts highlight_code(edited)
137
- $stdout.print colorize("Execute edited code? [y/N] ", :yellow)
138
- edit_answer = $stdin.gets.to_s.strip.downcase
139
- echo_stdin(edit_answer)
140
- if edit_answer == 'y'
141
- execute(edited)
147
+ loop do
148
+ case answer
149
+ when 'y', 'yes', 'a'
150
+ return execute(code)
151
+ when 'e', 'edit'
152
+ edited = open_in_editor(code)
153
+ if edited && edited != code
154
+ $stdout.puts colorize("# Edited code:", :yellow)
155
+ $stdout.puts highlight_code(edited)
156
+ $stdout.print colorize("Execute edited code? [y/N] ", :yellow)
157
+ edit_answer = $stdin.gets.to_s.strip.downcase
158
+ echo_stdin(edit_answer)
159
+ if edit_answer == 'y'
160
+ return execute(edited)
161
+ else
162
+ $stdout.puts colorize("Cancelled.", :yellow)
163
+ return nil
164
+ end
142
165
  else
143
- $stdout.puts colorize("Cancelled.", :yellow)
144
- nil
166
+ return execute(code)
145
167
  end
168
+ when 'n', 'no', ''
169
+ $stdout.puts colorize("Cancelled.", :yellow)
170
+ @last_cancelled = true
171
+ return nil
146
172
  else
147
- execute(code)
173
+ $stdout.print colorize("Execute? [y/N/edit] ", :yellow)
174
+ @on_prompt&.call
175
+ answer = $stdin.gets.to_s.strip.downcase
176
+ @last_answer = answer
177
+ echo_stdin(answer)
148
178
  end
149
- else
150
- $stdout.puts colorize("Cancelled.", :yellow)
151
- @last_cancelled = true
152
- nil
153
179
  end
154
180
  end
155
181
 
156
182
  private
157
183
 
184
+ MAX_DISPLAY_LINES = 10
185
+ MAX_DISPLAY_CHARS = 2000
186
+
187
+ def display_result(result)
188
+ full = "=> #{result.inspect}"
189
+ lines = full.lines
190
+ total_lines = lines.length
191
+ total_chars = full.length
192
+
193
+ if total_lines <= MAX_DISPLAY_LINES && total_chars <= MAX_DISPLAY_CHARS
194
+ $stdout.puts colorize(full, :green)
195
+ else
196
+ # Truncate by lines first, then by chars
197
+ truncated = lines.first(MAX_DISPLAY_LINES).join
198
+ truncated = truncated[0, MAX_DISPLAY_CHARS] if truncated.length > MAX_DISPLAY_CHARS
199
+ $stdout.puts colorize(truncated, :green)
200
+
201
+ omitted_lines = [total_lines - MAX_DISPLAY_LINES, 0].max
202
+ omitted_chars = [total_chars - truncated.length, 0].max
203
+ parts = []
204
+ parts << "#{omitted_lines} lines" if omitted_lines > 0
205
+ parts << "#{omitted_chars} chars" if omitted_chars > 0
206
+
207
+ @omitted_counter += 1
208
+ @omitted_outputs[@omitted_counter] = full
209
+ $stdout.puts colorize(" (omitting #{parts.join(', ')}) /expand #{@omitted_counter} to see all", :yellow)
210
+ end
211
+ end
212
+
158
213
  # Write stdin input to the capture IO only (avoids double-echo on terminal)
159
214
  def echo_stdin(text)
160
215
  $stdout.secondary.write("#{text}\n") if $stdout.respond_to?(:secondary)
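The truncation rule in `display_result` (lines first, then characters) can be tried in isolation. The sketch below is a simplified stand-in, not the gem's method: `truncate_for_display` is a hypothetical name, the limits are passed as keyword arguments instead of constants, and it returns the note rather than printing it.

```ruby
# Returns [shown_text, omission_note]; omission_note is nil when nothing was cut.
def truncate_for_display(full, max_lines: 10, max_chars: 2000)
  lines = full.lines
  return [full, nil] if lines.length <= max_lines && full.length <= max_chars

  # Cut by lines first, then by characters, mirroring the order used in the diff.
  shown = lines.first(max_lines).join
  shown = shown[0, max_chars] if shown.length > max_chars

  omitted_lines = [lines.length - max_lines, 0].max
  omitted_chars = [full.length - shown.length, 0].max
  parts = []
  parts << "#{omitted_lines} lines" if omitted_lines > 0
  parts << "#{omitted_chars} chars" if omitted_chars > 0
  [shown, "(omitting #{parts.join(', ')})"]
end

# One long line: only the character limit applies.
_, note = truncate_for_display("=> " + ("x" * 2500))
puts note   # (omitting 503 chars)

# Twelve short lines: the line limit applies, and the cut lines also count as chars.
_, note = truncate_for_display((1..12).map { |i| "row #{i}\n" }.join)
puts note   # (omitting 2 lines, 14 chars)
```

Because the character count is taken against the already line-truncated text, output that exceeds only the line limit still reports omitted characters.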
@@ -41,24 +41,27 @@ module ConsoleAgent
41
41
  def debug_request(url, body)
42
42
  return unless config.debug
43
43
 
44
- $stderr.puts "\e[33m--- ConsoleAgent DEBUG: REQUEST ---\e[0m"
45
- $stderr.puts "\e[33mURL: #{url}\e[0m"
46
- parsed = body.is_a?(String) ? JSON.parse(body) : body
47
- $stderr.puts "\e[33m#{JSON.pretty_generate(parsed)}\e[0m"
48
- $stderr.puts "\e[33m--- END REQUEST ---\e[0m"
49
- rescue => e
50
- $stderr.puts "\e[33m[debug] #{body}\e[0m"
44
+ parsed = body.is_a?(String) ? (JSON.parse(body) rescue nil) : body
45
+ if parsed
46
+ # Support both symbol and string keys
47
+ model = parsed[:model] || parsed['model']
48
+ msgs = parsed[:messages] || parsed['messages']
49
+ sys = parsed[:system] || parsed['system']
50
+ tools = parsed[:tools] || parsed['tools']
51
+ $stderr.puts "\e[33m[debug] POST #{url} | model: #{model} | #{msgs&.length || 0} msgs | system: #{sys.to_s.length} chars | #{tools&.length || 0} tools\e[0m"
52
+ else
53
+ $stderr.puts "\e[33m[debug] POST #{url}\e[0m"
54
+ end
51
55
  end
52
56
 
53
57
  def debug_response(body)
54
58
  return unless config.debug
55
59
 
56
- $stderr.puts "\e[36m--- ConsoleAgent DEBUG: RESPONSE ---\e[0m"
57
- parsed = body.is_a?(String) ? JSON.parse(body) : body
58
- $stderr.puts "\e[36m#{JSON.pretty_generate(parsed)}\e[0m"
59
- $stderr.puts "\e[36m--- END RESPONSE ---\e[0m"
60
- rescue => e
61
- $stderr.puts "\e[36m[debug] #{body}\e[0m"
60
+ parsed = body.is_a?(String) ? (JSON.parse(body) rescue nil) : body
61
+ if parsed && parsed['usage']
62
+ u = parsed['usage']
63
+ $stderr.puts "\e[36m[debug] response: #{parsed['stop_reason']} | in: #{u['input_tokens']} out: #{u['output_tokens']}\e[0m"
64
+ end
62
65
  end
63
66
 
64
67
  def parse_response(response)
@@ -241,6 +241,11 @@ module ConsoleAgent
241
241
  break if input.downcase == 'exit' || input.downcase == 'quit'
242
242
  next if input.empty?
243
243
 
244
+ if input == '?' || input == '/'
245
+ display_help
246
+ next
247
+ end
248
+
244
249
  if input == '/auto'
245
250
  ConsoleAgent.configuration.auto_execute = !ConsoleAgent.configuration.auto_execute
246
251
  mode = ConsoleAgent.configuration.auto_execute ? 'ON' : 'OFF'
@@ -265,11 +270,32 @@ module ConsoleAgent
265
270
  next
266
271
  end
267
272
 
273
+ if input == '/system'
274
+ @interactive_old_stdout.puts "\e[2m#{context}\e[0m"
275
+ next
276
+ end
277
+
278
+ if input == '/context'
279
+ display_conversation
280
+ next
281
+ end
282
+
268
283
  if input == '/cost'
269
284
  display_cost_summary
270
285
  next
271
286
  end
272
287
 
288
+ if input.start_with?('/expand')
289
+ expand_id = input.sub('/expand', '').strip.to_i
290
+ full_output = @executor.expand_output(expand_id)
291
+ if full_output
292
+ @interactive_old_stdout.puts full_output
293
+ else
294
+ @interactive_old_stdout.puts "\e[33mNo omitted output with id #{expand_id}\e[0m"
295
+ end
296
+ next
297
+ end
298
+
273
299
  if input == '/think'
274
300
  upgrade_to_thinking_model
275
301
  next
@@ -311,7 +337,8 @@ module ConsoleAgent
311
337
 
312
338
  context_msg = "User directly executed code: `#{raw_code}`"
313
339
  context_msg += "\n#{result_str}" unless output_parts.empty?
314
- @history << { role: :user, content: context_msg }
340
+ output_id = output_parts.empty? ? nil : @executor.store_output(result_str)
341
+ @history << { role: :user, content: context_msg, output_id: output_id }
315
342
 
316
343
  @interactive_query ||= input
317
344
  @last_interactive_code = raw_code
@@ -384,18 +411,11 @@ module ConsoleAgent
384
411
  result, tool_messages = send_query(nil, conversation: @history)
385
412
  rescue Providers::ProviderError => e
386
413
  if e.message.include?("prompt is too long") && @history.length >= 6
387
- $stdout.puts "\e[33m Context limit reached. Auto-compacting history...\e[0m"
388
- compact_history
389
- begin
390
- result, tool_messages = send_query(nil, conversation: @history)
391
- rescue Providers::ProviderError => e2
392
- $stderr.puts "\e[31m Still too large after compaction: #{e2.message}\e[0m"
393
- return :error
394
- end
414
+ $stdout.puts "\e[33m Context limit reached. Run /compact to reduce context size, then try again.\e[0m"
395
415
  else
396
416
  $stderr.puts "\e[31mConsoleAgent Error: #{e.class}: #{e.message}\e[0m"
397
- return :error
398
417
  end
418
+ return :error
399
419
  rescue Interrupt
400
420
  $stdout.puts "\n\e[33m Aborted.\e[0m"
401
421
  return :interrupted
@@ -451,7 +471,8 @@ module ConsoleAgent
451
471
  unless output_parts.empty?
452
472
  result_str = output_parts.join("\n\n")
453
473
  result_str = result_str[0..1000] + '...' if result_str.length > 1000
454
- @history << { role: :user, content: "Code was executed. #{result_str}" }
474
+ output_id = @executor.store_output(result_str)
475
+ @history << { role: :user, content: "Code was executed. #{result_str}", output_id: output_id }
455
476
  end
456
477
 
457
478
  :success
@@ -539,6 +560,10 @@ module ConsoleAgent
539
560
  prompt.strip
540
561
  end
541
562
 
563
+ # Number of most recent execution outputs to keep in full in the conversation.
564
+ # Older outputs are replaced with a short reference the LLM can recall via tool.
565
+ RECENT_OUTPUTS_TO_KEEP = 2
566
+
542
567
  def send_query(query, conversation: nil)
543
568
  ConsoleAgent.configuration.validate!
544
569
 
@@ -548,6 +573,8 @@ module ConsoleAgent
548
573
  [{ role: :user, content: query }]
549
574
  end
550
575
 
576
+ messages = trim_old_outputs(messages) if conversation
577
+
551
578
  send_query_with_tools(messages)
552
579
  end
553
580
 
@@ -564,18 +591,8 @@ module ConsoleAgent
564
591
  last_tool_names = []
565
592
 
566
593
  exhausted = false
567
- thinking_suggested = false
568
594
 
569
595
  max_rounds.times do |round|
570
- if round == 5 && !thinking_suggested && !on_thinking_model?
571
- thinking_suggested = true
572
- thinking_name = ConsoleAgent.configuration.resolved_thinking_model
573
- $stdout.puts "\e[33m This query is using many tool rounds. Switch to thinking model (#{thinking_name})? [y/N]\e[0m"
574
- answer = Readline.readline(" ", false).to_s.strip.downcase
575
- if answer == 'y'
576
- upgrade_to_thinking_model
577
- end
578
- end
579
596
  if round == 0
580
597
  $stdout.puts "\e[2m Thinking...\e[0m"
581
598
  else
@@ -588,26 +605,24 @@ module ConsoleAgent
588
605
  $stdout.puts "\e[2m #{llm_status(round, messages, total_input, last_thinking, last_tool_names)}\e[0m"
589
606
  end
590
607
 
608
+ if ConsoleAgent.configuration.debug
609
+ debug_pre_call(round, messages, active_system_prompt, tools, total_input, total_output)
610
+ end
611
+
591
612
  begin
592
613
  result = with_escape_monitoring do
593
614
  provider.chat_with_tools(messages, tools: tools, system_prompt: active_system_prompt)
594
615
  end
595
616
  rescue Providers::ProviderError => e
596
- if e.message.include?("prompt is too long") && messages.length >= 6
597
- $stdout.puts "\e[33m Context limit hit mid-session. Compacting messages...\e[0m"
598
- messages = compact_messages(messages)
599
- unless @_retried_compact
600
- @_retried_compact = true
601
- retry
602
- end
603
- end
604
617
  raise
605
- ensure
606
- @_retried_compact = nil
607
618
  end
608
619
  total_input += result.input_tokens || 0
609
620
  total_output += result.output_tokens || 0
610
621
 
622
+ if ConsoleAgent.configuration.debug
623
+ debug_post_call(round, result, @total_input_tokens + total_input, @total_output_tokens + total_output)
624
+ end
625
+
611
626
  break unless result.tool_use?
612
627
 
613
628
  # Buffer thinking text for display before next LLM call
@@ -636,10 +651,14 @@ module ConsoleAgent
636
651
  end
637
652
 
638
653
  if ConsoleAgent.configuration.debug
639
- $stderr.puts "\e[35m[debug tool result] #{tool_result}\e[0m"
654
+ $stderr.puts "\e[35m[debug] tool result (#{tool_result.to_s.length} chars)\e[0m"
640
655
  end
641
656
 
642
657
  tool_msg = provider.format_tool_result(tc[:id], tool_result)
658
+ # Store large tool results so they can be trimmed from older conversation turns
659
+ if tool_result.to_s.length > 200
660
+ tool_msg[:output_id] = @executor.store_output(tool_result.to_s)
661
+ end
643
662
  messages << tool_msg
644
663
  new_messages << tool_msg
645
664
  end
@@ -724,6 +743,89 @@ module ConsoleAgent
724
743
  status
725
744
  end
726
745
 
746
+ def debug_pre_call(round, messages, system_prompt, tools, total_input, total_output)
747
+ d = "\e[35m"
748
+ r = "\e[0m"
749
+
750
+ # Count message types
751
+ user_msgs = 0
752
+ assistant_msgs = 0
753
+ tool_result_msgs = 0
754
+ tool_use_msgs = 0
755
+ output_msgs = 0
756
+ omitted_msgs = 0
757
+ total_content_chars = system_prompt.to_s.length
758
+
759
+ messages.each do |msg|
760
+ content_str = msg[:content].is_a?(Array) ? msg[:content].to_s : msg[:content].to_s
761
+ total_content_chars += content_str.length
762
+
763
+ role = msg[:role].to_s
764
+ if role == 'tool'
765
+ tool_result_msgs += 1
766
+ elsif msg[:content].is_a?(Array)
767
+ # Anthropic format — check for tool_result or tool_use blocks
768
+ msg[:content].each do |block|
769
+ next unless block.is_a?(Hash)
770
+ if block['type'] == 'tool_result'
771
+ tool_result_msgs += 1
772
+ omitted_msgs += 1 if block['content'].to_s.include?('Output omitted')
773
+ elsif block['type'] == 'tool_use'
774
+ tool_use_msgs += 1
775
+ end
776
+ end
777
+ elsif role == 'user'
778
+ user_msgs += 1
779
+ if content_str.include?('Code was executed') || content_str.include?('directly executed code')
780
+ output_msgs += 1
781
+ omitted_msgs += 1 if content_str.include?('Output omitted')
782
+ end
783
+ elsif role == 'assistant'
784
+ assistant_msgs += 1
785
+ end
786
+ end
787
+
788
+ tool_count = tools.respond_to?(:definitions) ? tools.definitions.length : 0
789
+
790
+ $stderr.puts "#{d}[debug] ── LLM call ##{round + 1} ──#{r}"
791
+ $stderr.puts "#{d}[debug] system prompt: #{format_tokens(system_prompt.to_s.length)} chars#{r}"
792
+ $stderr.puts "#{d}[debug] messages: #{messages.length} (#{user_msgs} user, #{assistant_msgs} assistant, #{tool_result_msgs} tool results, #{tool_use_msgs} tool calls)#{r}"
793
+ $stderr.puts "#{d}[debug] execution outputs: #{output_msgs} (#{omitted_msgs} omitted)#{r}" if output_msgs > 0 || omitted_msgs > 0
794
+ $stderr.puts "#{d}[debug] tools provided: #{tool_count}#{r}"
795
+ $stderr.puts "#{d}[debug] est. content size: #{format_tokens(total_content_chars)} chars#{r}"
796
+ if total_input > 0 || total_output > 0
797
+ $stderr.puts "#{d}[debug] tokens so far: in: #{format_tokens(total_input)} | out: #{format_tokens(total_output)}#{r}"
798
+ end
799
+ end
800
+
801
+ def debug_post_call(round, result, total_input, total_output)
802
+ d = "\e[35m"
803
+ r = "\e[0m"
804
+
805
+ input_t = result.input_tokens || 0
806
+ output_t = result.output_tokens || 0
807
+ model = ConsoleAgent.configuration.resolved_model
808
+ pricing = Configuration::PRICING[model]
809
+
810
+ parts = ["in: #{format_tokens(input_t)}", "out: #{format_tokens(output_t)}"]
811
+
812
+ if pricing
813
+ cost = (input_t * pricing[:input]) + (output_t * pricing[:output])
814
+ session_cost = (total_input * pricing[:input]) + (total_output * pricing[:output])
815
+ parts << "~$#{'%.4f' % cost}"
816
+ $stderr.puts "#{d}[debug] ← response: #{parts.join(' | ')} (session: ~$#{'%.4f' % session_cost})#{r}"
817
+ else
818
+ $stderr.puts "#{d}[debug] ← response: #{parts.join(' | ')}#{r}"
819
+ end
820
+
821
+ if result.tool_use?
822
+ tool_names = result.tool_calls.map { |tc| tc[:name] }
823
+ $stderr.puts "#{d}[debug] tool calls: #{tool_names.join(', ')}#{r}"
824
+ else
825
+ $stderr.puts "#{d}[debug] stop reason: #{result.stop_reason}#{r}"
826
+ end
827
+ end
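The per-call estimate in `debug_post_call` is tokens multiplied by a per-token price, summed over input and output. A small worked sketch follows; the prices are hypothetical placeholders, since the values in `Configuration::PRICING` are not shown in this diff, and `call_cost` is a name introduced here for illustration.

```ruby
# Cost of one call: tokens times per-token price, summed over input and output.
def call_cost(input_tokens, output_tokens, pricing)
  (input_tokens * pricing[:input]) + (output_tokens * pricing[:output])
end

# Hypothetical per-token USD prices (3 and 15 USD per million tokens).
pricing = { input: 3.0 / 1_000_000, output: 15.0 / 1_000_000 }

cost = call_cost(12_000, 800, pricing)
puts "~$#{'%.4f' % cost}"   # ~$0.0480
```

The session figure in the method above is computed with the pricing of the currently resolved model applied to all session tokens, so after a mid-session model switch it is presumably only an approximation; `/cost` is described in the README as the per-model breakdown.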
828
+
727
829
  def format_tokens(count)
728
830
  if count >= 1_000_000
729
831
  "#{(count / 1_000_000.0).round(1)}M"
@@ -987,13 +1089,58 @@ module ConsoleAgent
987
1089
  config.resolved_model == config.resolved_thinking_model
988
1090
  end
989
1091
 
1092
+ # Replace older execution outputs with short references.
1093
+ # Keeps the last RECENT_OUTPUTS_TO_KEEP outputs in full.
1094
+ def trim_old_outputs(messages)
1095
+ # Find indices of messages with output_id (execution outputs and tool results)
1096
+ output_indices = messages.each_with_index
1097
+ .select { |m, _| m[:output_id] }
1098
+ .map { |_, i| i }
1099
+
1100
+ if output_indices.length <= RECENT_OUTPUTS_TO_KEEP
1101
+ return messages.map { |m| m.except(:output_id) }
1102
+ end
1103
+
1104
+ # Indices to trim (all except the most recent N)
1105
+ trim_indices = output_indices[0..-(RECENT_OUTPUTS_TO_KEEP + 1)]
1106
+ messages.each_with_index.map do |msg, i|
1107
+ if trim_indices.include?(i)
1108
+ trim_message(msg)
1109
+ else
1110
+ msg.except(:output_id)
1111
+ end
1112
+ end
1113
+ end
1114
+
1115
+ # Replace the content of a message with a short reference to the stored output.
1116
+ # Handles both regular messages and tool result messages (Anthropic/OpenAI formats).
1117
+ def trim_message(msg)
1118
+ ref = "[Output omitted — use recall_output tool with id #{msg[:output_id]} to retrieve]"
1119
+
1120
+ if msg[:content].is_a?(Array)
1121
+ # Anthropic tool_result format: [{ 'type' => 'tool_result', 'tool_use_id' => '...', 'content' => '...' }]
1122
+ trimmed_content = msg[:content].map do |block|
1123
+ if block.is_a?(Hash) && block['type'] == 'tool_result'
1124
+ block.merge('content' => ref)
1125
+ else
1126
+ block
1127
+ end
1128
+ end
1129
+ { role: msg[:role], content: trimmed_content }
1130
+ elsif msg[:role].to_s == 'tool'
1131
+ # OpenAI tool result format
1132
+ msg.except(:output_id).merge(content: ref)
1133
+ else
1134
+ # Regular user message (code execution result)
1135
+ first_line = msg[:content].to_s.lines.first&.strip || msg[:content]
1136
+ { role: msg[:role], content: "#{first_line}\n#{ref}" }
1137
+ end
1138
+ end
1139
+
990
1140
  def warn_if_history_large
991
1141
  chars = @history.sum { |m| m[:content].to_s.length }
992
1142
 
993
- if chars > 120_000 && @history.length >= 6
994
- $stdout.puts "\e[33m Context growing large (~#{format_tokens(chars)} chars). Auto-compacting...\e[0m"
995
- compact_history
996
- elsif chars > 50_000 && !@compact_warned
1143
+ if chars > 50_000 && !@compact_warned
997
1144
  @compact_warned = true
998
1145
  $stdout.puts "\e[33m Conversation is getting large (~#{format_tokens(chars)} chars). Consider running /compact to reduce context size.\e[0m"
999
1146
  end
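The trimming step above can be exercised on plain hashes. This is a reduced sketch that covers only the regular user-message branch of `trim_message`, under a few stated assumptions: the placeholder text is abbreviated, `keep` stands in for `RECENT_OUTPUTS_TO_KEEP`, and `Hash#reject` is used in place of `Hash#except` so the snippet does not depend on ActiveSupport or a recent Ruby.

```ruby
# Replaces all but the last `keep` output-bearing messages with a short reference.
def trim_old_outputs(messages, keep: 2)
  output_indices = messages.each_index.select { |i| messages[i][:output_id] }
  trim = output_indices.length > keep ? output_indices[0..-(keep + 1)] : []

  messages.each_with_index.map do |msg, i|
    # Internal bookkeeping key is dropped from every message that is returned.
    stripped = msg.reject { |key, _| key == :output_id }
    next stripped unless trim.include?(i)

    first_line = msg[:content].to_s.lines.first.to_s.strip
    ref = "[Output omitted: recall_output id #{msg[:output_id]}]" # abbreviated placeholder
    stripped.merge(content: "#{first_line}\n#{ref}")
  end
end

history = [
  { role: :user, content: "Code was executed. => 1\nmore output", output_id: 1 },
  { role: :assistant, content: "ok" },
  { role: :user, content: "Code was executed. => 2", output_id: 2 },
  { role: :user, content: "Code was executed. => 3", output_id: 3 }
]

trim_old_outputs(history).each { |m| puts m.inspect }
```

The input array is left untouched, which matters because the stored `@history` has to keep its `output_id` keys for the next call.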
@@ -1008,6 +1155,9 @@ module ConsoleAgent
1008
1155
  before_chars = @history.sum { |m| m[:content].to_s.length }
1009
1156
  before_count = @history.length
1010
1157
 
1158
+ # Extract successfully executed code before summarizing
1159
+ executed_code = extract_executed_code(@history)
1160
+
1011
1161
  $stdout.puts "\e[2m Compacting #{before_count} messages (~#{format_tokens(before_chars)} chars)...\e[0m"
1012
1162
 
1013
1163
  system_prompt = <<~PROMPT
@@ -1018,8 +1168,8 @@ module ConsoleAgent
1018
1168
  - Key findings and data discovered (include specific values, IDs, record counts)
1019
1169
  - Current state: what worked, what failed, where things stand
1020
1170
  - Important variable names, model names, or table names referenced
1021
- - Any code that was executed and its results
1022
1171
 
1172
+ Do NOT include code that was executed — that will be preserved separately.
1023
1173
  Be concise but preserve all information that would be needed to continue the conversation naturally.
1024
1174
  Do NOT include any preamble — just output the summary directly.
1025
1175
  PROMPT
@@ -1037,32 +1187,130 @@ module ConsoleAgent
1037
1187
  return
1038
1188
  end
1039
1189
 
1040
- @history = [{ role: :user, content: "CONVERSATION SUMMARY (compacted):\n#{summary}" }]
1190
+ content = "CONVERSATION SUMMARY (compacted):\n#{summary}"
1191
+ unless executed_code.empty?
1192
+ content += "\n\nCODE EXECUTED THIS SESSION (preserved for continuation):\n#{executed_code}"
1193
+ end
1194
+
1195
+ @history = [{ role: :user, content: content }]
1041
1196
  @compact_warned = false
1042
1197
 
1043
1198
  after_chars = @history.first[:content].length
1044
1199
  $stdout.puts "\e[36m Compacted: #{before_count} messages -> 1 summary (~#{format_tokens(before_chars)} -> ~#{format_tokens(after_chars)} chars)\e[0m"
1045
1200
  summary.each_line { |line| $stdout.puts "\e[2m #{line.rstrip}\e[0m" }
1201
+ if !executed_code.empty?
1202
+ $stdout.puts "\e[2m (preserved #{executed_code.scan(/```ruby/).length} executed code block(s))\e[0m"
1203
+ end
1046
1204
  display_usage(result)
1047
1205
  rescue => e
1048
1206
  $stdout.puts "\e[31m Compaction failed: #{e.message}\e[0m"
1049
1207
  end
1050
1208
  end
1051
1209
 
1052
- def compact_messages(messages)
1053
- return messages if messages.length < 6
1210
+ # Extracts code blocks that were successfully executed from conversation history.
1211
+ # Looks for:
1212
+ # 1. Assistant messages with ```ruby blocks followed by "Code was executed." user messages
1213
+ # 2. execute_plan tool calls followed by results without ERROR
1214
+ # Skips code that failed or was declined.
1215
+ def extract_executed_code(history)
1216
+ code_blocks = []
1217
+ history.each_cons(2) do |msg, next_msg|
1218
+ # Pattern 1: Assistant ```ruby blocks with successful execution
1219
+ if msg[:role].to_s == 'assistant' && next_msg[:role].to_s == 'user'
1220
+ content = msg[:content].to_s
1221
+ next_content = next_msg[:content].to_s
1222
+
1223
+ if next_content.start_with?('Code was executed.')
1224
+ content.scan(/```ruby\s*\n(.*?)```/m).each do |match|
1225
+ code = match[0].strip
1226
+ next if code.empty?
1227
+ result_summary = next_content[0..200].gsub("\n", "\n# ")
1228
+ code_blocks << "```ruby\n#{code}\n```\n# #{result_summary}"
1229
+ end
1230
+ end
1231
+ end
1232
+
1233
+ # Pattern 2: execute_plan tool calls in provider-formatted messages
1234
+ if msg[:role].to_s == 'assistant' && msg[:content].is_a?(Array)
1235
+ msg[:content].each do |block|
1236
+ next unless block.is_a?(Hash) && block['type'] == 'tool_use' && block['name'] == 'execute_plan'
1237
+ input = block['input'] || {}
1238
+ steps = input['steps'] || []
1239
+
1240
+ # Find the matching tool_result in subsequent messages
1241
+ tool_id = block['id']
1242
+ result_msg = find_tool_result(history, tool_id)
1243
+ next unless result_msg
1244
+
1245
+ result_text = result_msg.to_s
1246
+ # Extract only steps that succeeded (no ERROR in their result)
1247
+ steps.each_with_index do |step, i|
1248
+ step_num = i + 1
1249
+ # Check if this specific step had an error
1250
+ step_section = result_text[/Step #{step_num}\b.*?(?=Step #{step_num + 1}\b|\z)/m] || ''
1251
+ next if step_section.include?('ERROR:')
1252
+ next if step_section.include?('User declined')
1253
+
1254
+ code = step['code'].to_s.strip
1255
+ next if code.empty?
1256
+ desc = step['description'] || "Step #{step_num}"
1257
+ code_blocks << "```ruby\n# #{desc}\n#{code}\n```"
1258
+ end
1259
+ end
1260
+ end
1261
+ end
1262
+ code_blocks.join("\n\n")
1263
+ end
1054
1264
 
1055
- to_summarize = messages[0...-4]
1056
- to_keep = messages[-4..]
1265
+ def find_tool_result(history, tool_id)
1266
+ history.each do |msg|
1267
+ next unless msg[:content].is_a?(Array)
1268
+ msg[:content].each do |block|
1269
+ next unless block.is_a?(Hash)
1270
+ if block['type'] == 'tool_result' && block['tool_use_id'] == tool_id
1271
+ return block['content']
1272
+ end
1273
+ # OpenAI format
1274
+ if msg[:role].to_s == 'tool' && msg[:tool_call_id] == tool_id
1275
+ return msg[:content]
1276
+ end
1277
+ end
1278
+ end
1279
+ nil
1280
+ end
1057
1281
 
1058
- history_text = to_summarize.map { |m| "#{m[:role]}: #{m[:content].to_s[0..500]}" }.join("\n\n")
1282
+ def display_conversation
1283
+ if @history.empty?
1284
+ @interactive_old_stdout.puts "\e[2m (no conversation history yet)\e[0m"
1285
+ return
1286
+ end
1059
1287
 
1060
- summary_result = provider.chat(
1061
- [{ role: :user, content: "Summarize this conversation context concisely, preserving key facts, IDs, and findings:\n\n#{history_text}" }],
1062
- system_prompt: "You are a conversation summarizer. Be concise but preserve all actionable information."
1063
- )
1288
+ trimmed = trim_old_outputs(@history)
1289
+ @interactive_old_stdout.puts "\e[36m Conversation (#{trimmed.length} messages, as sent to LLM):\e[0m"
1290
+ trimmed.each_with_index do |msg, i|
1291
+ role = msg[:role].to_s
1292
+ content = msg[:content].to_s
1293
+ label = role == 'user' ? "\e[33m[user]\e[0m" : "\e[36m[assistant]\e[0m"
1294
+ @interactive_old_stdout.puts "#{label} #{content}"
1295
+ @interactive_old_stdout.puts if i < trimmed.length - 1
1296
+ end
1297
+ end
1064
1298
 
1065
- [{ role: :user, content: "CONTEXT SUMMARY:\n#{summary_result.text}" }] + to_keep
1299
+ def display_help
1300
+ auto = ConsoleAgent.configuration.auto_execute ? 'ON' : 'OFF'
1301
+ @interactive_old_stdout.puts "\e[36m Commands:\e[0m"
1302
+ @interactive_old_stdout.puts "\e[2m /auto Toggle auto-execute (currently #{auto}) (Shift-Tab)\e[0m"
1303
+ @interactive_old_stdout.puts "\e[2m /think Switch to thinking model\e[0m"
1304
+ @interactive_old_stdout.puts "\e[2m /compact Summarize conversation to reduce context\e[0m"
1305
+ @interactive_old_stdout.puts "\e[2m /usage Show session token totals\e[0m"
1306
+ @interactive_old_stdout.puts "\e[2m /cost Show cost estimate by model\e[0m"
1307
+ @interactive_old_stdout.puts "\e[2m /name <lbl> Name this session for easy resume\e[0m"
1308
+ @interactive_old_stdout.puts "\e[2m /context Show conversation history sent to the LLM\e[0m"
1309
+ @interactive_old_stdout.puts "\e[2m /system Show the system prompt\e[0m"
1310
+ @interactive_old_stdout.puts "\e[2m /expand <id> Show full omitted output\e[0m"
1311
+ @interactive_old_stdout.puts "\e[2m /debug Toggle debug summaries (context stats, cost per call)\e[0m"
1312
+ @interactive_old_stdout.puts "\e[2m > code Execute Ruby directly (skip LLM)\e[0m"
1313
+ @interactive_old_stdout.puts "\e[2m exit/quit Leave interactive mode\e[0m"
1066
1314
  end
1067
1315
 
1068
1316
  def display_exit_info
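One observation on `find_tool_result` in this hunk: the OpenAI-format check sits inside the loop over Array content blocks, so it runs only for messages whose `:content` is an Array. If OpenAI-style tool messages carry a String in `:content` (an assumption; the provider's message shape is not shown in this diff), that branch would not be reached. A sketch of a lookup that checks both shapes independently, using a hypothetical method name:

```ruby
# Hypothetical variant; message shapes are assumed from the comments in the diff.
def find_tool_result_any_format(history, tool_id)
  history.each do |msg|
    # OpenAI-style: one message per tool result, with the id on the message itself.
    return msg[:content] if msg[:role].to_s == 'tool' && msg[:tool_call_id] == tool_id

    # Anthropic-style: tool_result blocks inside an Array content.
    next unless msg[:content].is_a?(Array)
    msg[:content].each do |block|
      next unless block.is_a?(Hash)
      return block['content'] if block['type'] == 'tool_result' && block['tool_use_id'] == tool_id
    end
  end
  nil
end

history = [
  { role: :user, content: [{ 'type' => 'tool_result', 'tool_use_id' => 'a1', 'content' => 'Step 1: ok' }] },
  { role: :tool, tool_call_id: 'b2', content: 'Step 1: ERROR: boom' }
]

puts find_tool_result_any_format(history, 'a1')
puts find_tool_result_any_format(history, 'b2')
```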
@@ -170,6 +170,24 @@ module ConsoleAgent
170
170
  handler: ->(args) { code.search_code(args['query'], args['directory']) }
171
171
  )
172
172
 
173
+ if @executor
174
+ register(
175
+ name: 'recall_output',
176
+ description: 'Retrieve a previous code execution output that was omitted from the conversation to save context. Use the output id shown in the "[Output omitted]" placeholder.',
177
+ parameters: {
178
+ 'type' => 'object',
179
+ 'properties' => {
180
+ 'id' => { 'type' => 'integer', 'description' => 'The output id to retrieve' }
181
+ },
182
+ 'required' => ['id']
183
+ },
184
+ handler: ->(args) {
185
+ result = @executor.recall_output(args['id'].to_i)
186
+ result || "No output found with id #{args['id']}"
187
+ }
188
+ )
189
+ end
190
+
173
191
  unless @mode == :init
174
192
  register(
175
193
  name: 'ask_user',
@@ -1,3 +1,3 @@
1
1
  module ConsoleAgent
2
- VERSION = '0.8.0'.freeze
2
+ VERSION = '0.10.0'.freeze
3
3
  end
@@ -35,7 +35,7 @@ ConsoleAgent.configure do |config|
35
35
  # config.connection_class = Sharding::CentralizedModel
36
36
 
37
37
  # Admin UI credentials (mount ConsoleAgent::Engine => '/console_agent' in routes.rb)
38
- # When nil, no authentication is required (convenient for development)
38
+ # When nil, all requests are denied. Set credentials or use config.authenticate.
39
39
  # config.admin_username = 'admin'
40
40
  # config.admin_password = 'changeme'
41
41
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: console_agent
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.8.0
4
+ version: 0.10.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Cortfr
@@ -86,6 +86,7 @@ executables: []
86
86
  extensions: []
87
87
  extra_rdoc_files: []
88
88
  files:
89
+ - CHANGELOG.md
89
90
  - LICENSE
90
91
  - README.md
91
92
  - app/controllers/console_agent/application_controller.rb