rails_console_ai 0.27.0 → 0.28.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: db0c0b3b95cc349906845d4058e1eadc6be4ab67e25f32a4b8e4262e61340a87
- data.tar.gz: 0ba79ae87d8975193b2b6033279dc41a8841cf71e7f015ce8f36dafa6d62d463
+ metadata.gz: 50a2c9dce686cffa315e1fb5004f0ad0a9cbc67459d3a546b28ad1b95d0d1798
+ data.tar.gz: 7eea0529e3a3e4f9a4d60cf8e6b882e5dc825052c3793817013cf5cc6e617d22
  SHA512:
- metadata.gz: 615da46e5aa24149783309d69a3b739db1170459c1bf6d929c8ecfc875fbc0654117a7c5d15057aaf7d421660afa9959192f4112b25e722e5e224ce3de768fb0
- data.tar.gz: cf807c3dfc0fae08a79013606c5e00383bc26994461c02e4135f8efb495540a0f677bff62eae01c2f396e9ce2de8ede7c82a95415b69831425447aa93361dbdb
+ metadata.gz: 4ed2a4ad47456f7e28682e0302d346fd04e3778433015855fc2199d4b1dc3077e8ec96efec5eb1b9ae049f6ed5f8c5ac6944e354488313b1f20e8f4354a0bdd1
+ data.tar.gz: c3a9d0faaada962952b9995ac0e07b79384536dc03970c0da17fe0e2eef843edea799d2dfa8698363a869d74e2aa7d86f2dbe80088234206693d86dd29ef45b4
data/CHANGELOG.md CHANGED
@@ -2,6 +2,14 @@
 
  All notable changes to this project will be documented in this file.
 
+ ## [0.28.0]
+
+ - Add `bin/smoke_model.rb` to smoke-test new models (plain, tool, parallel, cache checks)
+ - Support Claude Opus 4.7 by omitting the `temperature` parameter for models that reject it
+ - Show both estimated request tokens and total billed tokens in LLM round status
+ - Auto-upgrade to thinking model on "think harder/deeper/carefully" phrases in Slack as well as console
+ - Fix cancelled code execution state persisting into the next user turn
+
  ## [0.26.0]
 
  - Add sub-agent support
data/README.md CHANGED
@@ -352,6 +352,39 @@ end
 
  Timeout is automatically raised to a 300s minimum for local models to account for slower inference.
 
+ ### Testing a new model
+
+ Before adopting a new Claude model, smoke-test it against the Anthropic or Bedrock provider with `bin/smoke_model.rb`. The script runs four checks and exits non-zero on any failure:
+
+ | check    | what it verifies                                                               |
+ | -------- | ------------------------------------------------------------------------------ |
+ | plain    | the model returns text for a basic prompt                                       |
+ | tool     | a single tool call → tool result → final answer round-trip works                |
+ | parallel | the model issues multiple tool calls in one response when asked                 |
+ | cache    | a long system prompt is written to and read from the prompt cache (with retry)  |
+
+ ```bash
+ # Anthropic — provider inferred from the `claude-` prefix
+ ANTHROPIC_API_KEY=sk-ant-... bin/smoke_model.rb --model claude-opus-4-7
+
+ # Bedrock — provider inferred from the regional `us.anthropic.` prefix.
+ # Requires the aws-sdk-bedrockruntime gem and AWS credentials in the environment.
+ bin/smoke_model.rb --model us.anthropic.claude-opus-4-7
+
+ # Bedrock in another region
+ bin/smoke_model.rb --model eu.anthropic.claude-opus-4-7 --region eu-west-1
+
+ # Subset of checks, e.g. when iterating on cache behavior
+ bin/smoke_model.rb --model claude-sonnet-4-6 --checks cache
+
+ # Force a provider when the model ID is ambiguous
+ bin/smoke_model.rb --provider anthropic --model claude-opus-4-7
+ ```
+
+ `DEBUG=1` enables the providers' raw request/response logging.
+
+ If the model rejects a parameter the gem sends by default (e.g. opus-4-7 rejects `temperature`), add the model ID to `Configuration::MODELS_WITHOUT_TEMPERATURE` in `lib/rails_console_ai/configuration.rb` so the providers omit the field.
+
  ## Configuration
 
  ```ruby
@@ -223,8 +223,7 @@ module RailsConsoleAi
      # Add to Readline history
      Readline::HISTORY.push(input) unless input == Readline::HISTORY.to_a.last
 
-     # Auto-upgrade to thinking model on "think harder" phrases
-     @engine.upgrade_to_thinking_model if input =~ /think\s*harder/i
+     @engine.maybe_auto_upgrade_thinking(input)
 
      @engine.set_interactive_query(input)
      @engine.add_user_message(input)
@@ -1,3 +1,5 @@
+ require 'set'
+
  module RailsConsoleAi
    class Configuration
      PROVIDERS = %i[anthropic openai local bedrock].freeze
@@ -18,6 +20,17 @@ module RailsConsoleAi
        'claude-opus-4-6' => 4_096,
      }.freeze
 
+     # Models that reject the `temperature` parameter. Configuration#resolved_temperature
+     # returns nil for these so providers can omit the field from the request.
+     MODELS_WITHOUT_TEMPERATURE = Set.new(%w[
+       claude-opus-4-7
+       anthropic.claude-opus-4-7
+       us.anthropic.claude-opus-4-7
+       eu.anthropic.claude-opus-4-7
+       jp.anthropic.claude-opus-4-7
+       global.anthropic.claude-opus-4-7
+     ]).freeze
+
      attr_accessor :provider, :api_key, :model, :thinking_model, :max_tokens,
                    :auto_execute, :temperature,
                    :timeout, :debug, :max_tool_rounds,
@@ -179,6 +192,13 @@ module RailsConsoleAi
        DEFAULT_MAX_TOKENS.fetch(resolved_model, 4096)
      end
 
+     # Returns nil for models that reject the `temperature` parameter (e.g. opus-4-7).
+     # Providers should use this in place of @temperature.
+     def resolved_temperature
+       return nil if MODELS_WITHOUT_TEMPERATURE.include?(resolved_model)
+       @temperature
+     end
+
      def resolved_thinking_model
        return @thinking_model if @thinking_model && !@thinking_model.empty?
 
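The two Configuration hunks above can be exercised in isolation. A minimal sketch of the `resolved_temperature` behavior, using a stand-in `Struct` rather than the gem's real `Configuration` class (the `Config` name and the two-entry model set are illustrative only):

```ruby
require 'set'

# Illustrative subset of Configuration::MODELS_WITHOUT_TEMPERATURE.
MODELS_WITHOUT_TEMPERATURE = Set.new(%w[
  claude-opus-4-7
  us.anthropic.claude-opus-4-7
]).freeze

# Stand-in for the gem's Configuration: a model ID plus a configured temperature.
Config = Struct.new(:model, :temperature) do
  # Mirrors the diff: nil for listed models, the configured value otherwise.
  def resolved_temperature
    return nil if MODELS_WITHOUT_TEMPERATURE.include?(model)
    temperature
  end
end

Config.new('claude-opus-4-6', 0.7).resolved_temperature # => 0.7
Config.new('claude-opus-4-7', 0.7).resolved_temperature # => nil
```

Returning `nil` rather than a sentinel value lets every provider use the same one-line guard (`body[:temperature] = temp unless temp.nil?`) when building its request.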
@@ -110,6 +110,7 @@ module RailsConsoleAi
      init_interactive unless @interactive_start
      @channel.log_input(text) if @channel.respond_to?(:log_input)
      @interactive_query ||= text
+     maybe_auto_upgrade_thinking(text)
      @history << { role: :user, content: text }
 
      status = send_and_execute
@@ -450,6 +451,13 @@ module RailsConsoleAi
      parts.compact.join("\n\n")
    end
 
+   AUTO_THINK_PATTERN = /\bthink\s+(harder|deeper|hard|carefully|more\s+carefully)\b/i
+
+   def maybe_auto_upgrade_thinking(text)
+     return unless text.is_a?(String) && text =~ AUTO_THINK_PATTERN
+     upgrade_to_thinking_model
+   end
+
    def upgrade_to_thinking_model
      config = RailsConsoleAi.configuration
      current = effective_model
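The pattern added above can be checked directly. This sketch copies `AUTO_THINK_PATTERN` verbatim from the diff; note the `\b` anchors and mandatory whitespace mean bare "think" or words merely containing "think" do not trigger the upgrade:

```ruby
# Copied from the diff: "think" followed by whitespace and an intensifier.
AUTO_THINK_PATTERN = /\bthink\s+(harder|deeper|hard|carefully|more\s+carefully)\b/i

phrases = [
  'please think harder about this',  # matches "harder"
  'Think  more carefully',           # case-insensitive, multiple spaces OK
  'I think so'                       # "so" is not a listed intensifier
]
phrases.map { |s| !!(s =~ AUTO_THINK_PATTERN) }
# => [true, true, false]
```

Because the alternation lists `harder` before `hard`, "think harder" consumes the full word instead of stopping at "hard" and failing the trailing `\b`.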
@@ -777,6 +785,7 @@ module RailsConsoleAi
      require 'rails_console_ai/tools/registry'
      tools = tools_override || Tools::Registry.new(executor: @executor, channel: @channel)
      active_system_prompt = system_prompt || context
+     @executor.reset_cancelled! if @executor
      max_rounds = RailsConsoleAi.configuration.max_tool_rounds
      total_input = 0
      total_output = 0
@@ -796,19 +805,21 @@ module RailsConsoleAi
 
      if round == 0
        @channel.display_status(" Thinking...")
-     else
-       if last_thinking
-         last_thinking.split("\n").each do |line|
-           @channel.display_thinking(" #{line}")
-         end
+     elsif last_thinking
+       last_thinking.split("\n").each do |line|
+         @channel.display_thinking(" #{line}")
        end
-       @channel.display_status(" #{llm_status(round, messages, total_input, last_thinking, last_tool_names)}")
      end
 
      # Trim large tool outputs between rounds to prevent context explosion.
      # The LLM can still retrieve omitted outputs via recall_output.
      messages = trim_large_outputs(messages) if round > 0
 
+     if round > 0
+       req_tokens = estimate_request_tokens(messages)
+       @channel.display_status(" #{llm_status(round, messages, req_tokens, total_input, last_thinking, last_tool_names)}")
+     end
+
      if RailsConsoleAi.configuration.debug
        debug_pre_call(round, messages, active_system_prompt, tools, total_input, total_output)
      end
@@ -1012,6 +1023,11 @@ module RailsConsoleAi
 
    # --- Formatting helpers ---
 
+   def estimate_request_tokens(messages)
+     chars = messages.sum { |m| (m[:content] || m['content']).to_s.length }
+     chars / 4
+   end
+
    def format_tokens(count)
      if count >= 1_000_000
        "#{(count / 1_000_000.0).round(1)}M"
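The estimator added above uses the common rough heuristic of about four characters per token. A standalone sketch showing how it handles both symbol- and string-keyed messages (the sample `messages` array is illustrative):

```ruby
# Mirrors the diff: sum content lengths across messages, assume ~4 chars/token.
# Non-string content (e.g. tool-result arrays) is coerced via to_s, so this is
# a rough estimate for status display, not a billed-token figure.
def estimate_request_tokens(messages)
  chars = messages.sum { |m| (m[:content] || m['content']).to_s.length }
  chars / 4
end

messages = [
  { role: :user, content: 'a' * 400 },               # symbol keys
  { 'role' => 'assistant', 'content' => 'b' * 100 }  # string keys
]
estimate_request_tokens(messages) # => 125  (500 chars / 4)
```

Keeping the estimate separate from `total_input` (the provider-reported billed tokens) is what lets `llm_status` show both "~N ctx" and "~N total".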
@@ -1136,9 +1152,10 @@ module RailsConsoleAi
      str.length > max ? str[0..max] + '...' : str
    end
 
-   def llm_status(round, messages, tokens_so_far, last_thinking = nil, last_tool_names = [])
+   def llm_status(round, messages, req_tokens, total_billed, last_thinking = nil, last_tool_names = [])
      status = "Calling LLM (round #{round + 1}, #{messages.length} msgs"
-     status += ", ~#{format_tokens(tokens_so_far)} ctx" if tokens_so_far > 0
+     status += ", ~#{format_tokens(req_tokens)} ctx" if req_tokens > 0
+     status += ", ~#{format_tokens(total_billed)} total" if total_billed > 0
      status += ")"
      if !last_thinking && last_tool_names.any?
        counts = last_tool_names.tally
@@ -206,6 +206,10 @@ module RailsConsoleAi
      @last_cancelled
    end
 
+   def reset_cancelled!
+     @last_cancelled = false
+   end
+
    def confirm_and_execute(code)
      return nil if code.nil? || code.strip.empty?
 
@@ -51,9 +51,10 @@ module RailsConsoleAi
      body = {
        model: config.resolved_model,
        max_tokens: config.resolved_max_tokens,
-       temperature: config.temperature,
        messages: format_messages(messages)
      }
+     temp = config.resolved_temperature
+     body[:temperature] = temp unless temp.nil?
      if system_prompt
        body[:system] = [
          { 'type' => 'text', 'text' => system_prompt, 'cache_control' => { 'type' => 'ephemeral' } }
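The provider hunks all apply the same pattern: build the request body without `temperature`, then add the key only when a value is present, so the field is absent (not null) for models that reject it. A minimal standalone sketch; the `request_body` helper and the plain-hash `config` are stand-ins for the providers' real request builders and the gem's `Configuration`:

```ruby
# Hypothetical helper mirroring the shared provider pattern from the diff.
def request_body(config)
  body = {
    model: config[:model],
    max_tokens: config[:max_tokens]
  }
  # Stand-in for config.resolved_temperature: nil means "omit the field".
  temp = config[:temperature]
  body[:temperature] = temp unless temp.nil?
  body
end

request_body(model: 'claude-opus-4-7', max_tokens: 4_096, temperature: nil)
# => {model: "claude-opus-4-7", max_tokens: 4096}
request_body(model: 'claude-opus-4-6', max_tokens: 4_096, temperature: 0.7)
# => {model: "claude-opus-4-6", max_tokens: 4096, temperature: 0.7}
```

Omitting the key entirely matters because serializing `temperature: nil` would still send `"temperature": null` in the JSON body, which strict APIs can also reject.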
@@ -41,13 +41,13 @@ module RailsConsoleAi
    private
 
    def call_api(messages, system_prompt: nil, tools: nil)
+     inference = { max_tokens: config.resolved_max_tokens }
+     temp = config.resolved_temperature
+     inference[:temperature] = temp unless temp.nil?
      params = {
        model_id: config.resolved_model,
        messages: format_messages(messages),
-       inference_config: {
-         max_tokens: config.resolved_max_tokens,
-         temperature: config.temperature
-       }
+       inference_config: inference
      }
      if system_prompt
        sys_blocks = [{ text: system_prompt }]
@@ -21,9 +21,10 @@ module RailsConsoleAi
      body = {
        model: config.resolved_model,
        max_tokens: config.resolved_max_tokens,
-       temperature: config.temperature,
        messages: formatted
      }
+     temp = config.resolved_temperature
+     body[:temperature] = temp unless temp.nil?
      body[:tools] = tools.to_openai_format if tools
 
      estimated_input_tokens = estimate_tokens(formatted, system_prompt, tools)
@@ -51,9 +51,10 @@ module RailsConsoleAi
      body = {
        model: config.resolved_model,
        max_tokens: config.resolved_max_tokens,
-       temperature: config.temperature,
        messages: formatted
      }
+     temp = config.resolved_temperature
+     body[:temperature] = temp unless temp.nil?
      body[:tools] = tools.to_openai_format if tools
 
      json_body = JSON.generate(body)
@@ -1,3 +1,3 @@
  module RailsConsoleAi
-   VERSION = '0.27.0'.freeze
+   VERSION = '0.28.0'.freeze
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: rails_console_ai
  version: !ruby/object:Gem::Version
-   version: 0.27.0
+   version: 0.28.0
  platform: ruby
  authors:
  - Cortfr