smart_prompt 0.5.0 → 0.5.2
- checksums.yaml +4 -4
- data/CHANGELOG.md +25 -2
- data/README.cn.md +56 -4
- data/README.md +56 -4
- data/docs/ANTHROPIC_EXAMPLES.md +559 -0
- data/docs/CONVERSATION_INTEGRATION_SUMMARY.md +155 -0
- data/docs/HISTORY_EXAMPLES_README.md +533 -0
- data/docs/HISTORY_MANAGEMENT_GUIDE.md +797 -0
- data/docs/MONITORING_GUIDE.md +278 -0
- data/docs/MULTIMODAL_README.md +265 -0
- data/docs/RELEVANCE_BASED_STRATEGY_IMPLEMENTATION.md +124 -0
- data/docs/STT_README.md +302 -0
- data/docs/TTS_README.md +303 -0
- data/docs/VIDEO_GENERATION_README.md +246 -0
- data/docs/delete_files_list.md +124 -0
- data/lib/smart_prompt/anthropic_adapter.rb +167 -140
- data/lib/smart_prompt/conversation.rb +195 -42
- data/lib/smart_prompt/engine.rb +20 -10
- data/lib/smart_prompt/openai_adapter.rb +25 -1
- data/lib/smart_prompt/sensenova_adapter.rb +34 -211
- data/lib/smart_prompt/version.rb +1 -1
- data/lib/smart_prompt/worker.rb +5 -2
- data/lib/smart_prompt/zhipu_adapter.rb +51 -575
- data/lib/smart_prompt.rb +3 -1
- metadata +33 -22
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 1477a83b116691863533a9b1726e40be07b03b4faa67b60ebc72fe6a290d60f1
+  data.tar.gz: c9e71e998318d186f296495679573ccb2ad9b539420b3b9c3ee02314db8e2d8b
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a433f3724915b38af6e3a1e66d3f52568ec305f85d3b038643ffeb1ae5522547a8e5c6769e6c53cf1a081ca353587ddd0156987a4d2c705f25f267e388a7f5b9
+  data.tar.gz: 90b3ba6033912705cc096b17765f0ad221d6f8a6e6b3b2a4350345d4b0777ebdee359d30604320918891ee7965cb55528a0aab4c12b9d9eebd67237ff52d7876
data/CHANGELOG.md
CHANGED
@@ -5,7 +5,14 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## [
+## [Unreleased]
+### Added
+- **硅基流动 (SiliconFlow) support** — unified `SiliconFlowAdapter` covering all REST categories: chat, multimodal vision (image/video/audio), embeddings (BAAI/bge-m3), rerank (bge-reranker-v2-m3), text-to-image (Kolors), image-edit (Qwen-Image-Edit), text-to-video (Wan2.2 async submit/poll), TTS (CosyVoice2), ASR (SenseVoiceSmall), and custom-voice management — with SSE streaming, reasoning_content passthrough, and the provider-correct async video flow (`POST /video/submit` → `requestId`, `POST /video/status` → `results.videos[].url`)
+
+### Changed
+- **Refactored the Zhipu / SenseNova / SiliconFlow adapters** — extracted byte-identical cross-provider logic into four shared concerns under `lib/smart_prompt/concerns/` (`HTTPClient`, `MultimodalMessages`, `OpenAIChatShaping`, `ImagePersistence`), and split the Zhipu and SiliconFlow adapters into per-modality capability modules under `lib/smart_prompt/adapters/<provider>/` (`Text` / `Embed` / `Image` / `Video` / `Voice` / `Rerank`). Pure internal refactor — no public-API change (`send_request` stays 5-arg, all DSL-delegated method names preserved), behavior unchanged. ~286 lines removed and the previously triplicated HTTP / multimodal / chat-shaping / image-persistence code now has a single source.
+
+## [0.5.1] - 2026-06-21
 ### Added
 - **SenseNova (商汤日日新) support** — unified `SenseNovaAdapter` covering chat (商量), multimodal vision, Cupido embeddings, and 秒画 text-to-image, with SSE streaming and reasoning-field handling
 - **智谱 AI (BigModel / GLM) support** — unified `ZhipuAIAdapter` covering all REST categories: chat (GLM-4), vision (GLM-4V), embeddings (embedding-3), text-to-image (CogView), text-to-video (CogVideoX async), TTS (GLM-TTS), ASR (GLM-ASR), and rerank
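The asynchronous video flow named in the SiliconFlow changelog entry above (`POST /video/submit` returning a `requestId`, then `POST /video/status` until `results.videos[].url` is present) could be sketched roughly as follows. This is not the gem's implementation: the method name `wait_for_video`, the injected `post` callable, the `status` field with its `"Succeed"` / `"Failed"` values, the `reason` field, and the polling defaults are all assumptions made for illustration.

```ruby
# Hypothetical submit/poll helper; not the gem's actual code.
# `post` is any callable taking (path, body_hash) and returning a parsed Hash,
# so the HTTP layer can be swapped out or stubbed in tests.
def wait_for_video(post, prompt:, model:, interval: 5, max_attempts: 60,
                   sleeper: ->(seconds) { sleep(seconds) })
  # Step 1: submit the job and keep the request id.
  submitted = post.call("/video/submit", { "model" => model, "prompt" => prompt })
  request_id = submitted.fetch("requestId")

  # Step 2: poll the status endpoint until the job finishes or we give up.
  max_attempts.times do
    status = post.call("/video/status", { "requestId" => request_id })

    # Assumed status values; check the provider documentation for the real ones.
    case status["status"]
    when "Succeed"
      return status.dig("results", "videos").map { |video| video["url"] }
    when "Failed"
      raise "video generation failed: #{status["reason"]}"
    end

    sleeper.call(interval)
  end

  raise "video #{request_id} not ready after #{max_attempts} polls"
end
```

Passing the HTTP call in as a lambda keeps the sketch independent of any particular client library; a real adapter would presumably route this through its own HTTP layer.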
@@ -13,9 +20,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Intelligent conversation history management (sliding-window, relevance-based, summary-based, hybrid strategies) with session isolation, compression, persistence, and LRU caching
 - Token counter, message/session models, and persistence layer
 - Example configs, workers, and self-contained examples for every provider
+- Integrated upstream gemma4 multimodal support (`use_model`, `thinking`, `image`/`audio`/`video`, `multimodal_prompt`) and `request_options` plumbing
 
 ### Fixed
 - Expose `engine` on `WorkerContext` so workers can reach a configured adapter directly (fixes the `engine.llms[...]` pattern used by media workers)
+- `Worker#execute` default session_id was hard-coded to `"default"`, leaving the per-worker session branch as dead code and collapsing all history-using workers onto one shared session; now generates `worker_<name>_<ts>`
+- `AnthropicAdapter`: add `extract_content_from_response` and stop double-wrapping multimodal (array) content
+- file_upload multimodal fix: base64-encode local image/audio/video files instead of passing raw paths
+
+## [0.4.1] - 2026-04-22
+### Fixed
+- Re-release package with `lib/smart_prompt/anthropic_adapter.rb`, which is required by the gem entrypoint.
+
+## [0.4.0] - 2026-04-22
+### Added
+- Anthropic adapter support.
+
+## [0.3.6] - 2026-04-08
+### Changed
+- Bumped `ruby-openai` dependency from `8.1.0` to `8.3.0`
 
 ## [0.3.2] - 2025-05-18
 ### Added
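The `Worker#execute` fix listed above replaces a shared `"default"` session with a generated `worker_<name>_<ts>` id. A minimal sketch of that pattern is shown below. The helper names are hypothetical, and the timestamp format (epoch seconds) is an assumption, since the changelog only gives the `<ts>` placeholder.

```ruby
# Hypothetical helpers illustrating the `worker_<name>_<ts>` pattern;
# the gem's real method names and timestamp format may differ.
def default_session_id(worker_name, now: Time.now)
  # Epoch seconds are an assumed choice for <ts>.
  "worker_#{worker_name}_#{now.to_i}"
end

def resolve_session_id(params, worker_name, now: Time.now)
  # An explicit session id wins; otherwise each worker gets its own session
  # instead of every worker sharing a single "default" history.
  params[:session_id] || default_session_id(worker_name, now: now)
end
```

With a fallback like this, two history-using workers that omit `session_id` no longer read and write the same conversation history.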
@@ -46,4 +69,4 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Initial gem release
 - Llama.cpp adapter
 - Basic configuration parameters
-- Environment bug fixes
+- Environment bug fixes
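The file_upload fix in the 0.5.1 changelog entry above says local image, audio, and video files are now base64-encoded instead of being passed as raw paths. One way such an encoding step might look is sketched below. The helper name `media_data_uri`, the MIME table, and the data-URI output shape are assumptions for illustration; the changelog does not say which payload format the gem produces.

```ruby
# Hypothetical helper: turn a local media path into a base64 data URI.
# The extension-to-MIME table is a small illustrative subset.
MEDIA_MIME_TYPES = {
  ".png" => "image/png",
  ".jpg" => "image/jpeg",
  ".jpeg" => "image/jpeg",
  ".wav" => "audio/wav",
  ".mp3" => "audio/mpeg",
  ".mp4" => "video/mp4"
}.freeze

def media_data_uri(path)
  mime = MEDIA_MIME_TYPES.fetch(File.extname(path).downcase) do |ext|
    raise ArgumentError, "unsupported media type: #{ext}"
  end

  # Array#pack("m0") is core Ruby base64 without line breaks,
  # so no extra library needs to be loaded.
  encoded = [File.binread(path)].pack("m0")
  "data:#{mime};base64,#{encoded}"
end
```

Reading with `File.binread` avoids newline or encoding translation, which matters for binary media.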
data/README.cn.md
CHANGED
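@@ -14,6 +14,7 @@ SmartPrompt is a powerful Ruby gem that provides an elegant domain-specific language
 - **Anthropic Claude**: Native support for Claude models and multimodal capabilities
 - **SenseNova (商汤日日新)**: A single adapter covers four API types: 商量 text chat, image-text multimodal, Cupido embeddings, and 秒画 text-to-image; see `examples/sensenova_example.rb`
 - **智谱 AI (BigModel / GLM)**: A single adapter covers every model category: text chat (GLM-4), image-text multimodal (GLM-4V), embeddings (embedding-3), text-to-image (CogView), text-to-video (CogVideoX), speech synthesis (GLM-TTS), and speech recognition (GLM-ASR); see `examples/zhipu_example.rb`
+- **硅基流动 (SiliconFlow)**: A single adapter covers every model category: text chat, image-text multimodal, embeddings (bge-m3), rerank, text-to-image (Kolors), image editing (Qwen-Image-Edit), text-to-video (Wan2.2, asynchronous), speech synthesis (CosyVoice2), speech recognition (SenseVoiceSmall), and custom voice management; see `examples/siliconflow_example.rb`
 - **Llama.cpp Integration**: Direct integration with local Llama.cpp servers
 - **Extensible Adapters**: An easily extended adapter system that supports new LLM providers
 - **Unified Interface**: The same API regardless of the underlying LLM provider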
@@ -95,12 +96,33 @@ llms:
     adapter: openai
     url: http://localhost:11434/
     default_model: deepseek-r1
+  gemma4_local:
+    adapter: openai
+    url: http://localhost:8000/v1
+    api_key: dummy
+    default_model: gemma-4-12B-it
+    temperature: 1.0
+    top_p: 0.95
+    top_k: 64
   deepseek:
     adapter: openai
     url: https://api.deepseek.com
     api_key: ENV["DSKEY"]
     default_model: deepseek-reasoner
 
+# Model alias configuration
+models:
+  local/qwen3.5:
+    use: local
+    model: qwen3.5
+  deepseekv3.2:
+    use: SiliconFlow
+    model: Pro/deepseek-ai/DeepSeek-V3.2
+  gemma4/12b:
+    use: gemma4_local
+    model: gemma-4-12B-it
+    max_tokens: 1024
+
 # Default settings
 default_llm: SiliconFlow
 template_path: "./templates"
@@ -128,9 +150,8 @@ logger_file: "./logs/smart_prompt.log"
 **workers/chat_worker.rb**:
 ```ruby
 SmartPrompt.define_worker :chat_assistant do
-  #
-
-  model "deepseek-ai/DeepSeek-V3"
+  # Use a configured model alias
+  use_model "deepseekv3.2"
   # Set the system message
   sys_msg("You are a helpful AI assistant.", params)
   # Use a template with parameters
@@ -182,6 +203,26 @@ engine.call_worker_by_stream(:streaming_chat, {
 end
 ```
 
+### Gemma 4 12B Multimodal
+
+Gemma 4 12B can be connected through OpenAI-compatible local services such as LiteRT-LM, LM Studio, Ollama, and llama.cpp. SmartPrompt places images before the text and audio after the text to match Gemma 4's multimodal best practices.
+
+```ruby
+SmartPrompt.define_worker :gemma_multimodal_assistant do
+  use_model "gemma4/12b"
+  thinking params.fetch(:thinking, true)
+  sys_msg("You are a rigorous local multimodal assistant.", params)
+
+  image(params[:image], token_budget: params[:token_budget] || 280) if params[:image]
+  video(params[:video], fps: 1, max_seconds: 60) if params[:video]
+  audio(params[:audio]) if params[:audio]
+  prompt(params[:message])
+
+  request_options(response_format: { type: "json_object" }) if params[:json]
+  send_msg
+end
+```
+
 ### Tool Integration
 
 ```ruby
@@ -561,6 +602,17 @@ llms:
     model: "FunAudioLLM/CosyVoice2-0.5B"
 ```
 
+### Model Alias Configuration
+
+```yaml
+models:
+  model_alias:
+    use: "llm_name"
+    model: "model_identifier"
+```
+
+In a worker, `use_model "model_alias"` is equivalent to calling `use "llm_name"` and `model "model_identifier"`.
+
 ### Path Configuration
 
 ```yaml
@@ -685,4 +737,4 @@ end
 
 ---
 
-**SmartPrompt** - Making LLM integration in Ruby applications simple, powerful, and elegant.
+**SmartPrompt** - Making LLM integration in Ruby applications simple, powerful, and elegant.
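The Gemma 4 section in the README diff above states that SmartPrompt places images before the text and audio after it. A minimal sketch of that ordering rule follows. The method name `ordered_content` and the OpenAI-style part hashes (`image_url`, `input_audio`, the fixed `"wav"` format) are assumptions for illustration rather than the gem's internal message format, and video is left out because the README does not say where it is placed.

```ruby
# Hypothetical ordering helper: images first, then the text, then audio.
# The part shapes mimic an OpenAI-style content array and are assumed.
def ordered_content(text:, images: [], audios: [])
  image_parts = images.map do |url|
    { type: "image_url", image_url: { url: url } }
  end
  audio_parts = audios.map do |data|
    { type: "input_audio", input_audio: { data: data, format: "wav" } }
  end

  image_parts + [{ type: "text", text: text }] + audio_parts
end
```

Keeping the ordering in one small function means a worker can call `image`, `audio`, and `prompt` in any order while the request body still follows the stated convention.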
data/README.md
CHANGED
@@ -14,6 +14,7 @@ SmartPrompt is a powerful Ruby gem that provides an elegant domain-specific lang
 - **Anthropic Claude**: Native support for Claude models with multimodal capabilities
 - **SenseNova (商汤日日新)**: One adapter covers chat (商量), multimodal vision (图文多模态), Cupido embeddings (向量), and 秒画 text-to-image — see `examples/sensenova_example.rb`
 - **智谱 AI (BigModel / GLM)**: One adapter covers all categories — chat (GLM-4), vision (GLM-4V), embeddings (embedding-3), text-to-image (CogView), text-to-video (CogVideoX), TTS (GLM-TTS), ASR (GLM-ASR) — see `examples/zhipu_example.rb`
+- **硅基流动 (SiliconFlow)**: One adapter covers all categories — chat, multimodal vision, embeddings, rerank, text-to-image (Kolors), image-edit (Qwen-Image-Edit), text-to-video (Wan2.2 async), TTS (CosyVoice2), ASR (SenseVoiceSmall) — see `examples/siliconflow_example.rb`
 - **Llama.cpp Integration**: Direct integration with local Llama.cpp servers
 - **Extensible Adapters**: Easy-to-extend adapter system for new LLM providers
 - **Unified Interface**: Same API regardless of the underlying LLM provider
@@ -95,12 +96,33 @@ llms:
     adapter: openai
     url: http://localhost:11434/
     default_model: deepseek-r1
+  gemma4_local:
+    adapter: openai
+    url: http://localhost:8000/v1
+    api_key: dummy
+    default_model: gemma-4-12B-it
+    temperature: 1.0
+    top_p: 0.95
+    top_k: 64
   deepseek:
     adapter: openai
     url: https://api.deepseek.com
     api_key: ENV["DSKEY"]
     default_model: deepseek-reasoner
 
+# Model aliases
+models:
+  local/qwen3.5:
+    use: local
+    model: qwen3.5
+  deepseekv3.2:
+    use: SiliconFlow
+    model: Pro/deepseek-ai/DeepSeek-V3.2
+  gemma4/12b:
+    use: gemma4_local
+    model: gemma-4-12B-it
+    max_tokens: 1024
+
 # Default settings
 default_llm: SiliconFlow
 template_path: "./templates"
@@ -128,9 +150,8 @@ Create worker files in your `workers/` directory:
 **workers/chat_worker.rb**:
 ```ruby
 SmartPrompt.define_worker :chat_assistant do
-  # Use a
-
-  model "deepseek-ai/DeepSeek-V3"
+  # Use a configured model alias
+  use_model "deepseekv3.2"
   # Set system message
   sys_msg("You are a helpful AI assistant.", params)
   # Use template with parameters
@@ -182,6 +203,26 @@ engine.call_worker_by_stream(:streaming_chat, {
 end
 ```
 
+### Gemma 4 12B Multimodal
+
+Gemma 4 12B can be connected through OpenAI-compatible local servers such as LiteRT-LM, LM Studio, Ollama, or llama.cpp. SmartPrompt places images before text and audio after text to match Gemma 4 multimodal best practices.
+
+```ruby
+SmartPrompt.define_worker :gemma_multimodal_assistant do
+  use_model "gemma4/12b"
+  thinking params.fetch(:thinking, true)
+  sys_msg("You are a precise local multimodal assistant.", params)
+
+  image(params[:image], token_budget: params[:token_budget] || 280) if params[:image]
+  video(params[:video], fps: 1, max_seconds: 60) if params[:video]
+  audio(params[:audio]) if params[:audio]
+  prompt(params[:message])
+
+  request_options(response_format: { type: "json_object" }) if params[:json]
+  send_msg
+end
+```
+
 ### Tool Integration
 
 ```ruby
@@ -565,6 +606,17 @@ llms:
     model: "FunAudioLLM/CosyVoice2-0.5B"
 ```
 
+### Model Alias Configuration
+
+```yaml
+models:
+  model_alias:
+    use: "llm_name"
+    model: "model_identifier"
+```
+
+In a worker, `use_model "model_alias"` is equivalent to calling `use "llm_name"` and `model "model_identifier"`.
+
 ### Path Configuration
 
 ```yaml
@@ -689,4 +741,4 @@ This project is licensed under the MIT License - see the [LICENSE.txt](LICENSE.t
 
 ---
 
-**SmartPrompt** - Making LLM integration in Ruby applications simple, powerful, and elegant.
+**SmartPrompt** - Making LLM integration in Ruby applications simple, powerful, and elegant.
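The README change above describes `use_model "model_alias"` as equivalent to calling `use "llm_name"` and `model "model_identifier"`. A rough sketch of how such an alias lookup might work against the `models:` section is given below. The resolver name `resolve_model_alias`, the returned hash shape, and the treatment of extra keys such as `max_tokens` as pass-through options are assumptions, not the gem's documented behavior.

```ruby
require "yaml"

# Sample configuration copied from the `models:` block in the README diff.
ALIAS_CONFIG = YAML.safe_load(<<~YML)
  models:
    deepseekv3.2:
      use: SiliconFlow
      model: Pro/deepseek-ai/DeepSeek-V3.2
    gemma4/12b:
      use: gemma4_local
      model: gemma-4-12B-it
      max_tokens: 1024
YML

# Hypothetical resolver: maps an alias to the LLM name and model identifier
# that `use` and `model` would otherwise receive separately.
def resolve_model_alias(config, name)
  entry = config.fetch("models").fetch(name) do
    raise KeyError, "unknown model alias: #{name}"
  end

  {
    llm: entry.fetch("use"),
    model: entry.fetch("model"),
    # Assumed: any remaining keys are forwarded as request options.
    options: entry.reject { |key, _| %w[use model].include?(key) }
  }
end
```

Under these assumptions, `resolve_model_alias(ALIAS_CONFIG, "gemma4/12b")` yields the `gemma4_local` LLM, the `gemma-4-12B-it` model, and `max_tokens` as an extra option.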