smart_prompt 0.5.0 → 0.5.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 6417bfd0f16178f4e5b2d26b9c14f3ce8a22e8d9ab77f3cacf1845fa6531d6d0
-   data.tar.gz: 9eae765158a312e1605aad80917d59180109bb0a8c2603a2e2b7b95dd7284b00
+   metadata.gz: 73eed476ea088ca5d249ae100e774c4bbe4235242773c49b478ac60304f2250e
+   data.tar.gz: 7e28652297f66ab0829ea9f24927bfa0d8f8618915dc4e2d969f7733c29ca8bc
  SHA512:
-   metadata.gz: 4109ce9c8131961870d2c9b83d3d2ed534ebdf2ba42b34674516b61058034bab9af6c53079354d39a154759ca02bb39751eeb2c055465f2dfb7db8f2a3a12682
-   data.tar.gz: e12c080c4d95a7196e91497b02ef493a11b45e3da7ba6cdb5e342c00c40abb686fa1aac3e6c0db9e3905f5f18adab0395559188fe0fae5837c1128dad70f236b
+   metadata.gz: cfed4b173e3382fd59c8be7dd7712dd3e2da56948cf605e1e402e0c998bd20e6f0867915c9d8f749e6f794a63d796a013ea4c5aabad0177df17b2649e931c181
+   data.tar.gz: 21a684256011301008700ce3b66bf2220778ced62d1ecb4195fc610d492123a14717420f671af3499662f4df4fb9004948a7aeb5b9788756e715be2fdc9c344a
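The new digests can be checked locally once the `.gem` archive has been unpacked (a `.gem` file is a tar archive that contains `metadata.gz` and `data.tar.gz`). The following is a minimal sketch; the helper name `checksum_matches?` and the local file path are assumptions for illustration and are not part of smart_prompt or RubyGems.

```ruby
require "digest"

# Hypothetical helper: returns true when the SHA256 digest of the file at
# `path` equals `expected_hex` (case-insensitive hex comparison).
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end
```

Assuming `data.tar.gz` from the 0.5.1 archive sits in the current directory, `checksum_matches?("data.tar.gz", "7e28652297f66ab0829ea9f24927bfa0d8f8618915dc4e2d969f7733c29ca8bc")` should return `true` for an untampered download.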
data/CHANGELOG.md CHANGED
@@ -5,7 +5,7 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
- ## [0.4.0] - 2026-06-21
+ ## [0.5.1] - 2026-06-21
  ### Added
  - **SenseNova (商汤日日新) support** — unified `SenseNovaAdapter` covering chat (商量), multimodal vision, Cupido embeddings, and 秒画 text-to-image, with SSE streaming and reasoning-field handling
  - **智谱 AI (BigModel / GLM) support** — unified `ZhipuAIAdapter` covering all REST categories: chat (GLM-4), vision (GLM-4V), embeddings (embedding-3), text-to-image (CogView), text-to-video (CogVideoX async), TTS (GLM-TTS), ASR (GLM-ASR), and rerank
@@ -13,9 +13,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  - Intelligent conversation history management (sliding-window, relevance-based, summary-based, hybrid strategies) with session isolation, compression, persistence, and LRU caching
  - Token counter, message/session models, and persistence layer
  - Example configs, workers, and self-contained examples for every provider
+ - Integrated upstream gemma4 multimodal support (`use_model`, `thinking`, `image`/`audio`/`video`, `multimodal_prompt`) and `request_options` plumbing
 
  ### Fixed
  - Expose `engine` on `WorkerContext` so workers can reach a configured adapter directly (fixes the `engine.llms[...]` pattern used by media workers)
+ - `Worker#execute` default session_id was hard-coded to `"default"`, leaving the per-worker session branch as dead code and collapsing all history-using workers onto one shared session; now generates `worker_<name>_<ts>`
+ - `AnthropicAdapter`: add `extract_content_from_response` and stop double-wrapping multimodal (array) content
+ - file_upload multimodal fix: base64-encode local image/audio/video files instead of passing raw paths
+
+ ## [0.4.1] - 2026-04-22
+ ### Fixed
+ - Re-release package with `lib/smart_prompt/anthropic_adapter.rb`, which is required by the gem entrypoint.
+
+ ## [0.4.0] - 2026-04-22
+ ### Added
+ - Anthropic adapter support.
+
+ ## [0.3.6] - 2026-04-08
+ ### Changed
+ - Bumped `ruby-openai` dependency from `8.1.0` to `8.3.0`
 
  ## [0.3.2] - 2025-05-18
  ### Added
@@ -46,4 +62,4 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  - Initial gem release
  - Llama.cpp adapter
  - Basic configuration parameters
- - Environment bug fixes
+ - Environment bug fixes
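The file_upload entry in the changelog says local media files are now base64-encoded instead of being passed as raw paths. The diff does not show how the gem does this, so the following is only a sketch of the general technique. The helper name `media_data_uri`, the `MIME_TYPES` table, and the choice of a `data:` URI as output are all assumptions for illustration.

```ruby
# Hypothetical extension-to-MIME table; the gem's real mapping is not shown in the diff.
MIME_TYPES = {
  ".png"  => "image/png",
  ".jpg"  => "image/jpeg",
  ".jpeg" => "image/jpeg",
  ".wav"  => "audio/wav",
  ".mp3"  => "audio/mpeg",
  ".mp4"  => "video/mp4"
}.freeze

# Hypothetical helper: reads a local media file as raw bytes and returns a
# base64 data URI, so the request carries the content rather than a path.
def media_data_uri(path)
  mime = MIME_TYPES.fetch(File.extname(path).downcase) do
    raise ArgumentError, "unsupported media type: #{path}"
  end
  # pack("m0") produces base64 without line breaks (same output as strict encoding).
  encoded = [File.binread(path)].pack("m0")
  "data:#{mime};base64,#{encoded}"
end
```

Reading with `File.binread` matters here: text-mode reads could alter binary content on some platforms before it is encoded.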
data/README.cn.md CHANGED
@@ -95,12 +95,33 @@ llms:
      adapter: openai
      url: http://localhost:11434/
      default_model: deepseek-r1
+   gemma4_local:
+     adapter: openai
+     url: http://localhost:8000/v1
+     api_key: dummy
+     default_model: gemma-4-12B-it
+     temperature: 1.0
+     top_p: 0.95
+     top_k: 64
    deepseek:
      adapter: openai
      url: https://api.deepseek.com
      api_key: ENV["DSKEY"]
      default_model: deepseek-reasoner
 
+ # 模型别名配置
+ models:
+   local/qwen3.5:
+     use: local
+     model: qwen3.5
+   deepseekv3.2:
+     use: SiliconFlow
+     model: Pro/deepseek-ai/DeepSeek-V3.2
+   gemma4/12b:
+     use: gemma4_local
+     model: gemma-4-12B-it
+     max_tokens: 1024
+
  # 默认设置
  default_llm: SiliconFlow
  template_path: "./templates"
@@ -128,9 +149,8 @@ logger_file: "./logs/smart_prompt.log"
  **workers/chat_worker.rb**:
  ```ruby
  SmartPrompt.define_worker :chat_assistant do
-   # 使用特定的 LLM
-   use "SiliconFlow"
-   model "deepseek-ai/DeepSeek-V3"
+   # 使用配置好的模型别名
+   use_model "deepseekv3.2"
    # 设置系统消息
    sys_msg("你是一个有用的 AI 助手。", params)
    # 使用模板和参数
@@ -182,6 +202,26 @@ engine.call_worker_by_stream(:streaming_chat, {
  end
  ```
 
+ ### Gemma 4 12B 多模态
+
+ Gemma 4 12B 可以通过 LiteRT-LM、LM Studio、Ollama、llama.cpp 等 OpenAI 兼容本地服务接入。SmartPrompt 会把图片放在文本前、音频放在文本后,以匹配 Gemma 4 的多模态最佳实践。
+
+ ```ruby
+ SmartPrompt.define_worker :gemma_multimodal_assistant do
+   use_model "gemma4/12b"
+   thinking params.fetch(:thinking, true)
+   sys_msg("你是一个严谨的本地多模态助手。", params)
+
+   image(params[:image], token_budget: params[:token_budget] || 280) if params[:image]
+   video(params[:video], fps: 1, max_seconds: 60) if params[:video]
+   audio(params[:audio]) if params[:audio]
+   prompt(params[:message])
+
+   request_options(response_format: { type: "json_object" }) if params[:json]
+   send_msg
+ end
+ ```
+
  ### 工具集成
 
  ```ruby
@@ -561,6 +601,17 @@ llms:
      model: "FunAudioLLM/CosyVoice2-0.5B"
  ```
 
+ ### 模型别名配置
+
+ ```yaml
+ models:
+   model_alias:
+     use: "llm_name"
+     model: "model_identifier"
+ ```
+
+ 在 worker 中,`use_model "model_alias"` 等价于调用 `use "llm_name"` 和 `model "model_identifier"`。
+
  ### 路径配置
 
  ```yaml
@@ -685,4 +736,4 @@ end
 
  ---
 
- **SmartPrompt** - 让 Ruby 应用中的 LLM 集成变得简单、强大且优雅。
+ **SmartPrompt** - 让 Ruby 应用中的 LLM 集成变得简单、强大且优雅。
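The README change above states that `use_model "model_alias"` is equivalent to calling `use "llm_name"` and `model "model_identifier"`. The following sketch illustrates that lookup against the `models:` section shown in the diff. The helper `resolve_model_alias` and its return shape are hypothetical, and how SmartPrompt treats extra keys such as `max_tokens` is not shown in the diff, so the sketch simply collects them as options.

```ruby
require "yaml"

# Alias entries copied from the README diff; only the `models:` section is needed here.
CONFIG = YAML.safe_load(<<~YML)
  models:
    deepseekv3.2:
      use: SiliconFlow
      model: Pro/deepseek-ai/DeepSeek-V3.2
    gemma4/12b:
      use: gemma4_local
      model: gemma-4-12B-it
      max_tokens: 1024
YML

# Hypothetical helper: maps an alias to the LLM name and model identifier,
# returning any remaining keys separately as per-alias options.
def resolve_model_alias(config, name)
  entry = config.fetch("models").fetch(name) do
    raise KeyError, "unknown model alias: #{name}"
  end
  options = entry.reject { |key, _value| %w[use model].include?(key) }
  { llm: entry.fetch("use"), model: entry.fetch("model"), options: options }
end
```

Under these assumptions, resolving `"deepseekv3.2"` yields the same pair that the removed worker lines expressed with separate `use` and `model` calls, apart from the newer model identifier.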
data/README.md CHANGED
@@ -95,12 +95,33 @@ llms:
      adapter: openai
      url: http://localhost:11434/
      default_model: deepseek-r1
+   gemma4_local:
+     adapter: openai
+     url: http://localhost:8000/v1
+     api_key: dummy
+     default_model: gemma-4-12B-it
+     temperature: 1.0
+     top_p: 0.95
+     top_k: 64
    deepseek:
      adapter: openai
      url: https://api.deepseek.com
      api_key: ENV["DSKEY"]
      default_model: deepseek-reasoner
 
+ # Model aliases
+ models:
+   local/qwen3.5:
+     use: local
+     model: qwen3.5
+   deepseekv3.2:
+     use: SiliconFlow
+     model: Pro/deepseek-ai/DeepSeek-V3.2
+   gemma4/12b:
+     use: gemma4_local
+     model: gemma-4-12B-it
+     max_tokens: 1024
+
  # Default settings
  default_llm: SiliconFlow
  template_path: "./templates"
@@ -128,9 +149,8 @@ Create worker files in your `workers/` directory:
  **workers/chat_worker.rb**:
  ```ruby
  SmartPrompt.define_worker :chat_assistant do
-   # Use a specific LLM
-   use "SiliconFlow"
-   model "deepseek-ai/DeepSeek-V3"
+   # Use a configured model alias
+   use_model "deepseekv3.2"
    # Set system message
    sys_msg("You are a helpful AI assistant.", params)
    # Use template with parameters
@@ -182,6 +202,26 @@ engine.call_worker_by_stream(:streaming_chat, {
  end
  ```
 
+ ### Gemma 4 12B Multimodal
+
+ Gemma 4 12B can be connected through OpenAI-compatible local servers such as LiteRT-LM, LM Studio, Ollama, or llama.cpp. SmartPrompt places images before text and audio after text to match Gemma 4 multimodal best practices.
+
+ ```ruby
+ SmartPrompt.define_worker :gemma_multimodal_assistant do
+   use_model "gemma4/12b"
+   thinking params.fetch(:thinking, true)
+   sys_msg("You are a precise local multimodal assistant.", params)
+
+   image(params[:image], token_budget: params[:token_budget] || 280) if params[:image]
+   video(params[:video], fps: 1, max_seconds: 60) if params[:video]
+   audio(params[:audio]) if params[:audio]
+   prompt(params[:message])
+
+   request_options(response_format: { type: "json_object" }) if params[:json]
+   send_msg
+ end
+ ```
+
  ### Tool Integration
 
  ```ruby
@@ -565,6 +605,17 @@ llms:
      model: "FunAudioLLM/CosyVoice2-0.5B"
  ```
 
+ ### Model Alias Configuration
+
+ ```yaml
+ models:
+   model_alias:
+     use: "llm_name"
+     model: "model_identifier"
+ ```
+
+ In a worker, `use_model "model_alias"` is equivalent to calling `use "llm_name"` and `model "model_identifier"`.
+
  ### Path Configuration
 
  ```yaml
@@ -689,4 +740,4 @@ This project is licensed under the MIT License - see the [LICENSE.txt](LICENSE.t
 
  ---
 
- **SmartPrompt** - Making LLM integration in Ruby applications simple, powerful, and elegant.
+ **SmartPrompt** - Making LLM integration in Ruby applications simple, powerful, and elegant.
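The Gemma 4 section added to the README says SmartPrompt places images before text and audio after text. The diff does not include the code that does this, so the following is only a sketch of one way such an ordering could be applied. The `PART_ORDER` table, the `order_parts` helper, and the `{ type:, value: }` part shape are assumptions for illustration. Video placement is not stated in the README, so this sketch ranks any other part type alongside text and keeps its original relative position.

```ruby
# Hypothetical ranking: images first, text in the middle, audio last.
PART_ORDER = { image: 0, text: 1, audio: 2 }.freeze

# Hypothetical helper: reorders message parts by rank. The original index is
# the tie-breaker, so parts of the same rank keep their relative order.
def order_parts(parts)
  parts
    .each_with_index
    .sort_by { |part, index| [PART_ORDER.fetch(part[:type], 1), index] }
    .map(&:first)
end
```

Pairing each part with its index is needed because Ruby's `sort_by` does not guarantee a stable sort on its own.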