rails_console_ai 0.27.0 → 0.29.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: db0c0b3b95cc349906845d4058e1eadc6be4ab67e25f32a4b8e4262e61340a87
-  data.tar.gz: 0ba79ae87d8975193b2b6033279dc41a8841cf71e7f015ce8f36dafa6d62d463
+  metadata.gz: 30c7955113bc3e4ce45989fcc0e93034185fbabc0e95acfa01df56ec397900b5
+  data.tar.gz: 87b0399d973e448404b234820067ed2f20d4148f78613963904c9e288a17d87a
 SHA512:
-  metadata.gz: 615da46e5aa24149783309d69a3b739db1170459c1bf6d929c8ecfc875fbc0654117a7c5d15057aaf7d421660afa9959192f4112b25e722e5e224ce3de768fb0
-  data.tar.gz: cf807c3dfc0fae08a79013606c5e00383bc26994461c02e4135f8efb495540a0f677bff62eae01c2f396e9ce2de8ede7c82a95415b69831425447aa93361dbdb
+  metadata.gz: ad1091c711239e286408f0133378c3597c83075b3f9435a40f43bf2df4fa279cf459cf3af84c58027842d2da4c8bccc5d11bfbec616c6e3ec1212784ad59b764
+  data.tar.gz: 61ea9cc3afd50b35ec3d0740b723a076175731d1289c4a98ee5727aaec78c993abb0b78637cb6ac7ef4b550b81b473ff287d83c32405c8fbd345dbf05b0b3b72
data/CHANGELOG.md CHANGED
@@ -2,6 +2,19 @@
 
 All notable changes to this project will be documented in this file.
 
+## [0.29.0]
+
+- Allow steering Slack conversations mid-run by sending follow-up messages that are folded in as user guidance at the next tool-loop boundary
+- Propagate steering guidance into sub-agent runs so interruptions are seen by both the main engine and any active sub-agent
+
+## [0.28.0]
+
+- Add `bin/smoke_model.rb` to smoke-test new models (plain, tool, parallel, cache checks)
+- Support Claude Opus 4.7 by omitting the `temperature` parameter for models that reject it
+- Show both estimated request tokens and total billed tokens in LLM round status
+- Auto-upgrade to thinking model on "think harder/deeper/carefully" phrases in Slack as well as console
+- Fix cancelled code execution state persisting into the next user turn
+
 ## [0.26.0]
 
 - Add sub-agent support
data/README.md CHANGED
@@ -352,6 +352,39 @@ end
 
 Timeout is automatically raised to a 300s minimum for local models to account for slower inference.
 
+### Testing a new model
+
+Before adopting a new Claude model, smoke-test it against the Anthropic or Bedrock provider with `bin/smoke_model.rb`. The script runs four checks and exits non-zero on any failure:
+
+| check    | what it verifies                                                                |
+| -------- | ------------------------------------------------------------------------------- |
+| plain    | the model returns text for a basic prompt                                        |
+| tool     | a single tool call → tool result → final answer round-trip works                 |
+| parallel | the model issues multiple tool calls in one response when asked                  |
+| cache    | a long system prompt is written to and read from the prompt cache (with retry)   |
+
+```bash
+# Anthropic — provider inferred from the `claude-` prefix
+ANTHROPIC_API_KEY=sk-ant-... bin/smoke_model.rb --model claude-opus-4-7
+
+# Bedrock — provider inferred from the regional `us.anthropic.` prefix.
+# Requires the aws-sdk-bedrockruntime gem and AWS credentials in the environment.
+bin/smoke_model.rb --model us.anthropic.claude-opus-4-7
+
+# Bedrock in another region
+bin/smoke_model.rb --model eu.anthropic.claude-opus-4-7 --region eu-west-1
+
+# Subset of checks, e.g. when iterating on cache behavior
+bin/smoke_model.rb --model claude-sonnet-4-6 --checks cache
+
+# Force a provider when the model ID is ambiguous
+bin/smoke_model.rb --provider anthropic --model claude-opus-4-7
+```
+
+`DEBUG=1` enables the providers' raw request/response logging.
+
+If the model rejects a parameter the gem sends by default (e.g. opus-4-7 rejects `temperature`), add the model ID to `Configuration::MODELS_WITHOUT_TEMPERATURE` in `lib/rails_console_ai/configuration.rb` so the providers omit the field.
+
 ## Configuration
 
 ```ruby
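
For reference, a minimal sketch of the kind of check the `plain` row describes, written directly against the public Anthropic Messages API rather than the gem's internals. The endpoint, headers, and response shape are the standard Anthropic API; the prompt, default model ID, and failure handling are illustrative, not `bin/smoke_model.rb`'s actual code:

```ruby
require 'net/http'
require 'json'
require 'uri'

# Stand-alone "plain" check: does the model return non-empty text for a
# basic prompt? Exits non-zero on failure, mirroring the script's contract.
model = ARGV.fetch(0, 'claude-opus-4-7')
uri = URI('https://api.anthropic.com/v1/messages')
req = Net::HTTP::Post.new(uri)
req['x-api-key'] = ENV.fetch('ANTHROPIC_API_KEY')
req['anthropic-version'] = '2023-06-01'
req['content-type'] = 'application/json'
req.body = JSON.generate(
  model: model,
  max_tokens: 64,
  messages: [{ role: 'user', content: 'Reply with the single word: pong' }]
)
res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
text = res.is_a?(Net::HTTPSuccess) ? JSON.parse(res.body).dig('content', 0, 'text').to_s : ''
abort "plain check failed: HTTP #{res.code}" if text.strip.empty?
puts "plain check ok: #{text.strip}"
```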
@@ -223,8 +223,7 @@ module RailsConsoleAi
       # Add to Readline history
       Readline::HISTORY.push(input) unless input == Readline::HISTORY.to_a.last
 
-      # Auto-upgrade to thinking model on "think harder" phrases
-      @engine.upgrade_to_thinking_model if input =~ /think\s*harder/i
+      @engine.maybe_auto_upgrade_thinking(input)
 
       @engine.set_interactive_query(input)
       @engine.add_user_message(input)
@@ -11,6 +11,9 @@ module RailsConsoleAi
       @thread_ts = thread_ts
       @user_name = user_name
       @reply_queue = Queue.new
+      @guidance_main = []
+      @guidance_sub = []
+      @guidance_mutex = Mutex.new
       @cancelled = false
       @log_prefix = "[#{@channel_id}/#{@thread_ts}] @#{@user_name}"
       @output_log = StringIO.new
@@ -18,6 +21,36 @@ module RailsConsoleAi
 
     def cancel!
       @cancelled = true
+      @guidance_mutex.synchronize do
+        @guidance_main.clear
+        @guidance_sub.clear
+      end
+    end
+
+    # Guidance is broadcast to both the main-engine queue and the sub-agent queue
+    # so a steering message arriving during a sub-agent run is seen by both layers
+    # (sub-agent reacts immediately; main engine reacts after delegate_task returns).
+    def add_guidance(text)
+      @guidance_mutex.synchronize do
+        @guidance_main << text
+        @guidance_sub << text
+      end
+    end
+
+    def drain_guidance(scope: :main)
+      @guidance_mutex.synchronize do
+        arr = scope == :sub ? @guidance_sub : @guidance_main
+        pending = arr.dup
+        arr.clear
+        pending
+      end
+    end
+
+    def pending_guidance?(scope: :main)
+      @guidance_mutex.synchronize do
+        arr = scope == :sub ? @guidance_sub : @guidance_main
+        !arr.empty?
+      end
     end
 
     def cancelled?
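
The comment above describes a broadcast: one steering message, two consumers. A self-contained sketch of that two-queue pattern (the `GuidanceBuffer` class and the example strings are invented for illustration; in the gem these methods live on the Slack channel object):

```ruby
# Every steering message is pushed to both queues; each consumer (main engine
# vs. sub-agent) drains only its own copy, so one interruption is delivered
# exactly once to each layer.
class GuidanceBuffer
  def initialize
    @queues = { main: [], sub: [] }
    @mutex  = Mutex.new
  end

  def add(text)
    @mutex.synchronize { @queues.each_value { |q| q << text } }
  end

  def drain(scope)
    @mutex.synchronize { @queues.fetch(scope).slice!(0..-1) }
  end

  def pending?(scope)
    @mutex.synchronize { !@queues.fetch(scope).empty? }
  end
end

buf = GuidanceBuffer.new
buf.add('actually, only check the last 7 days')
buf.drain(:sub)      # => ["actually, only check the last 7 days"] (sub-agent reacts first)
buf.pending?(:main)  # => true; the main engine still sees it after delegate_task returns
buf.drain(:main)     # => the same message, delivered once to the main layer
```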
@@ -64,6 +64,18 @@ module RailsConsoleAi
       @parent.cancelled?
     end
 
+    def pending_guidance?
+      @parent.respond_to?(:pending_guidance?) && @parent.pending_guidance?(scope: :sub)
+    end
+
+    def drain_guidance
+      @parent.respond_to?(:drain_guidance) ? @parent.drain_guidance(scope: :sub) : []
+    end
+
+    def add_guidance(text)
+      @parent.add_guidance(text) if @parent.respond_to?(:add_guidance)
+    end
+
     def supports_danger?
       false # Sub-agents must never silently bypass safety guards
     end
@@ -1,3 +1,5 @@
+require 'set'
+
 module RailsConsoleAi
   class Configuration
     PROVIDERS = %i[anthropic openai local bedrock].freeze
@@ -18,6 +20,17 @@ module RailsConsoleAi
       'claude-opus-4-6' => 4_096,
     }.freeze
 
+    # Models that reject the `temperature` parameter. Configuration#resolved_temperature
+    # returns nil for these so providers can omit the field from the request.
+    MODELS_WITHOUT_TEMPERATURE = Set.new(%w[
+      claude-opus-4-7
+      anthropic.claude-opus-4-7
+      us.anthropic.claude-opus-4-7
+      eu.anthropic.claude-opus-4-7
+      jp.anthropic.claude-opus-4-7
+      global.anthropic.claude-opus-4-7
+    ]).freeze
+
     attr_accessor :provider, :api_key, :model, :thinking_model, :max_tokens,
                   :auto_execute, :temperature,
                   :timeout, :debug, :max_tool_rounds,
@@ -179,6 +192,13 @@ module RailsConsoleAi
       DEFAULT_MAX_TOKENS.fetch(resolved_model, 4096)
     end
 
+    # Returns nil for models that reject the `temperature` parameter (e.g. opus-4-7).
+    # Providers should use this in place of @temperature.
+    def resolved_temperature
+      return nil if MODELS_WITHOUT_TEMPERATURE.include?(resolved_model)
+      @temperature
+    end
+
     def resolved_thinking_model
       return @thinking_model if @thinking_model && !@thinking_model.empty?
 
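The intended call pattern, as a sketch. This assumes `Configuration.new` takes no required arguments and that `resolved_model` falls through to the configured `model` when no override applies; neither is shown in this diff:

```ruby
config = RailsConsoleAi::Configuration.new
config.temperature = 0.2

config.model = 'claude-sonnet-4-6'
config.resolved_temperature  # => 0.2, providers send the field as before

config.model = 'claude-opus-4-7'
config.resolved_temperature  # => nil, providers drop the field entirely
```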
@@ -110,6 +110,7 @@ module RailsConsoleAi
       init_interactive unless @interactive_start
       @channel.log_input(text) if @channel.respond_to?(:log_input)
       @interactive_query ||= text
+      maybe_auto_upgrade_thinking(text)
       @history << { role: :user, content: text }
 
       status = send_and_execute
@@ -450,6 +451,13 @@ module RailsConsoleAi
       parts.compact.join("\n\n")
     end
 
+    AUTO_THINK_PATTERN = /\bthink\s+(harder|deeper|hard|carefully|more\s+carefully)\b/i
+
+    def maybe_auto_upgrade_thinking(text)
+      return unless text.is_a?(String) && text =~ AUTO_THINK_PATTERN
+      upgrade_to_thinking_model
+    end
+
     def upgrade_to_thinking_model
       config = RailsConsoleAi.configuration
       current = effective_model
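
The new pattern is both broader and stricter than the old `/think\s*harder/i`: more trigger phrases, but word boundaries prevent false positives. A quick check of what it does and does not match:

```ruby
pattern = /\bthink\s+(harder|deeper|hard|carefully|more\s+carefully)\b/i
'Please think harder about this'  =~ pattern  # => 7 (match)
'can you THINK  more carefully'   =~ pattern  # => 8 (case-insensitive, extra spaces ok)
'I think hardly anyone does this' =~ pattern  # => nil ("hard" must end at a word boundary)
```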
@@ -777,6 +785,7 @@ module RailsConsoleAi
       require 'rails_console_ai/tools/registry'
       tools = tools_override || Tools::Registry.new(executor: @executor, channel: @channel)
       active_system_prompt = system_prompt || context
+      @executor.reset_cancelled! if @executor
      max_rounds = RailsConsoleAi.configuration.max_tool_rounds
       total_input = 0
       total_output = 0
@@ -794,21 +803,32 @@ module RailsConsoleAi
           break
         end
 
+        if round > 0 && @channel.respond_to?(:pending_guidance?) && @channel.pending_guidance?
+          pending = @channel.drain_guidance
+          guidance_text = format_user_interruption(pending)
+          guidance_msg = { role: :user, content: guidance_text }
+          messages << guidance_msg
+          new_messages << guidance_msg
+          @channel.display_status(" Steering: incorporating user guidance.")
+        end
+
         if round == 0
           @channel.display_status(" Thinking...")
-        else
-          if last_thinking
-            last_thinking.split("\n").each do |line|
-              @channel.display_thinking(" #{line}")
-            end
+        elsif last_thinking
+          last_thinking.split("\n").each do |line|
+            @channel.display_thinking(" #{line}")
           end
-          @channel.display_status(" #{llm_status(round, messages, total_input, last_thinking, last_tool_names)}")
         end
 
         # Trim large tool outputs between rounds to prevent context explosion.
         # The LLM can still retrieve omitted outputs via recall_output.
        messages = trim_large_outputs(messages) if round > 0
 
+        if round > 0
+          req_tokens = estimate_request_tokens(messages)
+          @channel.display_status(" #{llm_status(round, messages, req_tokens, total_input, last_thinking, last_tool_names)}")
+        end
+
         if RailsConsoleAi.configuration.debug
           debug_pre_call(round, messages, active_system_prompt, tools, total_input, total_output)
         end
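
The `req_tokens` figure comes from `estimate_request_tokens` (added further down): roughly four characters per token, counting message content only, which is presumably why the status marks it with `~`. A worked example of the heuristic:

```ruby
messages = [
  { role: :user,      content: 'a' * 1_000 },
  { 'role' => 'tool', 'content' => 'b' * 3_000 }  # string keys fall through to m['content']
]
messages.sum { |m| (m[:content] || m['content']).to_s.length } / 4
# => 1000  (4,000 characters / 4 = ~1,000 tokens)
```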
@@ -1012,6 +1032,11 @@ module RailsConsoleAi
 
     # --- Formatting helpers ---
 
+    def estimate_request_tokens(messages)
+      chars = messages.sum { |m| (m[:content] || m['content']).to_s.length }
+      chars / 4
+    end
+
     def format_tokens(count)
       if count >= 1_000_000
         "#{(count / 1_000_000.0).round(1)}M"
@@ -1136,9 +1161,32 @@ module RailsConsoleAi
       str.length > max ? str[0..max] + '...' : str
     end
 
-    def llm_status(round, messages, tokens_so_far, last_thinking = nil, last_tool_names = [])
+    # Wraps mid-task user messages with explicit framing so the model treats them
+    # as a real-time interruption that supersedes the prior task, rather than as
+    # a reply to the most recent tool result.
+    def format_user_interruption(messages)
+      joined = messages.map { |t| t.to_s.strip }.reject(&:empty?).join("\n\n")
+      <<~MSG.strip
+        [INTERRUPTION FROM USER — REAL-TIME MESSAGE]
+
+        The user sent the following message while you were working on the previous step.
+        They sent it before seeing the result of your last tool call, so it is NOT a
+        reply to that result. It is your most recent direction from the user and
+        supersedes the prior task.
+
+        If they are telling you to stop, halt completely and acknowledge — do not
+        autonomously switch to a different method to accomplish the original task.
+        If their instruction is unclear, ask them what they want before continuing.
+
+        User message:
+        "#{joined}"
+      MSG
+    end
+
+    def llm_status(round, messages, req_tokens, total_billed, last_thinking = nil, last_tool_names = [])
       status = "Calling LLM (round #{round + 1}, #{messages.length} msgs"
-      status += ", ~#{format_tokens(tokens_so_far)} ctx" if tokens_so_far > 0
+      status += ", ~#{format_tokens(req_tokens)} ctx" if req_tokens > 0
+      status += ", ~#{format_tokens(total_billed)} total" if total_billed > 0
       status += ")"
       if !last_thinking && last_tool_names.any?
         counts = last_tool_names.tally
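
With the new signature the status line distinguishes the size of the outgoing request from the cumulative billed input. An illustrative rendering (the figures are invented, and the `K` abbreviation is assumed by analogy with the `M` branch of `format_tokens` shown earlier):

```ruby
messages = Array.new(14) { { role: :user, content: '...' } }
llm_status(2, messages, 52_000, 130_000)
# => "Calling LLM (round 3, 14 msgs, ~52.0K ctx, ~130.0K total)"
# "ctx" estimates the request about to be sent; "total" is the running
# billed-token count (total_input in the loop above).
```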
@@ -206,6 +206,10 @@ module RailsConsoleAi
       @last_cancelled
     end
 
+    def reset_cancelled!
+      @last_cancelled = false
+    end
+
     def confirm_and_execute(code)
       return nil if code.nil? || code.strip.empty?
 
@@ -51,9 +51,10 @@ module RailsConsoleAi
       body = {
         model: config.resolved_model,
         max_tokens: config.resolved_max_tokens,
-        temperature: config.temperature,
         messages: format_messages(messages)
       }
+      temp = config.resolved_temperature
+      body[:temperature] = temp unless temp.nil?
       if system_prompt
         body[:system] = [
           { 'type' => 'text', 'text' => system_prompt, 'cache_control' => { 'type' => 'ephemeral' } }
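
The `unless temp.nil?` guard matters because assigning `nil` would still serialize as an explicit JSON `null`, which a strict API can reject just as it rejects an unsupported number. A minimal demonstration:

```ruby
require 'json'

JSON.generate({ model: 'claude-opus-4-7', temperature: nil })
# => {"model":"claude-opus-4-7","temperature":null}   (field still present)

body = { model: 'claude-opus-4-7' }
temp = nil  # what resolved_temperature returns for opus-4-7
body[:temperature] = temp unless temp.nil?
JSON.generate(body)
# => {"model":"claude-opus-4-7"}                      (field genuinely omitted)
```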
@@ -41,13 +41,13 @@ module RailsConsoleAi
     private
 
     def call_api(messages, system_prompt: nil, tools: nil)
+      inference = { max_tokens: config.resolved_max_tokens }
+      temp = config.resolved_temperature
+      inference[:temperature] = temp unless temp.nil?
       params = {
         model_id: config.resolved_model,
         messages: format_messages(messages),
-        inference_config: {
-          max_tokens: config.resolved_max_tokens,
-          temperature: config.temperature
-        }
+        inference_config: inference
       }
       if system_prompt
         sys_blocks = [{ text: system_prompt }]
@@ -21,9 +21,10 @@ module RailsConsoleAi
       body = {
         model: config.resolved_model,
         max_tokens: config.resolved_max_tokens,
-        temperature: config.temperature,
         messages: formatted
       }
+      temp = config.resolved_temperature
+      body[:temperature] = temp unless temp.nil?
       body[:tools] = tools.to_openai_format if tools
 
       estimated_input_tokens = estimate_tokens(formatted, system_prompt, tools)
@@ -51,9 +51,10 @@ module RailsConsoleAi
       body = {
         model: config.resolved_model,
         max_tokens: config.resolved_max_tokens,
-        temperature: config.temperature,
         messages: formatted
       }
+      temp = config.resolved_temperature
+      body[:temperature] = temp unless temp.nil?
       body[:tools] = tools.to_openai_format if tools
 
       json_body = JSON.generate(body)
@@ -491,6 +491,14 @@ module RailsConsoleAi
         return
       end
 
+      # If the engine is mid-run, treat the message as steering guidance to be
+      # folded in at the next tool-loop boundary instead of restarting.
+      if session[:thread]&.alive? && channel.respond_to?(:add_guidance)
+        channel.add_guidance(text)
+        channel.display("Got it. One moment.")
+        return
+      end
+
       # Otherwise treat as a new message in the conversation
       replace_session_thread(session) do
         Thread.current.report_on_exception = false
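
Whether a follow-up becomes steering guidance or a fresh turn hinges entirely on `session[:thread]&.alive?`. A tiny runnable model of that routing decision (all names here are illustrative stand-ins, not the gem's actual session plumbing):

```ruby
# Mid-run messages become guidance; post-run messages start a new turn.
guidance = Queue.new
session  = { thread: Thread.new { sleep 0.2 } }  # stands in for a running engine

route = lambda do |text|
  if session[:thread]&.alive?
    guidance << text          # folded in at the next tool-loop boundary
    'Got it. One moment.'
  else
    "new turn: #{text}"       # normal conversation path
  end
end

puts route.call('also check the replica lag')  # => Got it. One moment.
session[:thread].join
puts route.call('thanks, all done')            # => new turn: thanks, all done
```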
@@ -66,6 +66,12 @@ module RailsConsoleAi
      max_rounds.times do |round|
        break if channel.cancelled?
 
+        if round > 0 && channel.respond_to?(:pending_guidance?) && channel.pending_guidance?
+          pending = channel.drain_guidance
+          messages << { role: :user, content: format_user_interruption(pending) }
+          channel.display_status(" Steering: incorporating user guidance.")
+        end
+
        if round == 0
          channel.display_status("Thinking...")
        end
@@ -147,6 +153,25 @@ module RailsConsoleAi
       result&.text || '(sub-agent returned no result)'
     end
 
+    def format_user_interruption(messages)
+      joined = messages.map { |t| t.to_s.strip }.reject(&:empty?).join("\n\n")
+      <<~MSG.strip
+        [INTERRUPTION FROM USER — REAL-TIME MESSAGE]
+
+        The user sent the following message while you were working. They sent it
+        before seeing your latest tool result, so it is NOT a reply to that result.
+        It is your most recent direction from the user and supersedes the prior task.
+
+        If they are telling you to stop, halt immediately and finish with a brief
+        acknowledgement — do not switch to a different method to accomplish the
+        original task on your own. If unclear, return what you have so far and let
+        the parent agent ask the user.
+
+        User message:
+        "#{joined}"
+      MSG
+    end
+
     def build_provider
       config = RailsConsoleAi.configuration
       model_override = @agent_config['model'] || config.sub_agent_model
@@ -1,3 +1,3 @@
 module RailsConsoleAi
-  VERSION = '0.27.0'.freeze
+  VERSION = '0.29.0'.freeze
 end
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: rails_console_ai
 version: !ruby/object:Gem::Version
-  version: 0.27.0
+  version: 0.29.0
 platform: ruby
 authors:
 - Cortfr