smart_prompt 0.5.0 → 0.5.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +18 -2
- data/README.cn.md +55 -4
- data/README.md +55 -4
- data/docs/ANTHROPIC_EXAMPLES.md +559 -0
- data/docs/CONVERSATION_INTEGRATION_SUMMARY.md +155 -0
- data/docs/HISTORY_EXAMPLES_README.md +533 -0
- data/docs/HISTORY_MANAGEMENT_GUIDE.md +797 -0
- data/docs/MONITORING_GUIDE.md +278 -0
- data/docs/MULTIMODAL_README.md +265 -0
- data/docs/RELEVANCE_BASED_STRATEGY_IMPLEMENTATION.md +124 -0
- data/docs/STT_README.md +302 -0
- data/docs/TTS_README.md +303 -0
- data/docs/VIDEO_GENERATION_README.md +246 -0
- data/docs/delete_files_list.md +124 -0
- data/lib/smart_prompt/anthropic_adapter.rb +167 -140
- data/lib/smart_prompt/conversation.rb +195 -42
- data/lib/smart_prompt/engine.rb +20 -10
- data/lib/smart_prompt/openai_adapter.rb +25 -1
- data/lib/smart_prompt/version.rb +1 -1
- data/lib/smart_prompt/worker.rb +5 -2
- data/lib/smart_prompt.rb +2 -1
- metadata +33 -22
data/docs/MONITORING_GUIDE.md
@@ -0,0 +1,278 @@
# History Manager Monitoring Guide

This guide explains the logging and monitoring features available in the SmartPrompt History Manager.

## Overview

The History Manager includes comprehensive logging and monitoring capabilities to help you:
- Track system operations and performance
- Debug context selection decisions
- Monitor resource usage and cache efficiency
- Export metrics for external monitoring systems

## Configuration

Enable monitoring in your configuration:

```ruby
config = {
  monitoring: {
    enabled: true,      # Enable/disable monitoring
    log_level: :info    # Log level: :debug, :info, :warn, :error
  }
}

manager = SmartPrompt::HistoryManager.new(config)
```
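
The `log_level` symbols in the configuration line up with the severities of Ruby's standard `Logger`. As a minimal sketch of how such a symbol could be translated into a `Logger` constant (`severity_for` and `ALLOWED_LEVELS` are hypothetical names used for illustration, not part of SmartPrompt):

```ruby
require 'logger'

# Hypothetical helper (not part of SmartPrompt): translate a :log_level
# symbol from the config hash into a Logger severity constant.
ALLOWED_LEVELS = %i[debug info warn error].freeze

def severity_for(level)
  unless ALLOWED_LEVELS.include?(level)
    raise ArgumentError, "unknown log level: #{level.inspect}"
  end

  Logger.const_get(level.to_s.upcase)
end

logger = Logger.new($stdout)
logger.level = severity_for(:info)
logger.debug('suppressed at :info')  # below the threshold, not printed
logger.info('printed at :info')
```

Rejecting unknown symbols early keeps a typo such as `:warning` from silently selecting an unintended level.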

## Log Levels

### INFO Level
Logs important operations:
- Session creation and deletion
- Message additions
- Context retrievals
- Session clearing and exports
- Cleanup operations

Example:
```
[HistoryManager] Session user_123 created with config: max_messages=100, max_tokens=4000
[HistoryManager] Message added to session user_123: role=user, tokens=15
[HistoryManager] Session user_123 cleared: 10 -> 1 messages (keep_system=true)
```

### DEBUG Level
Logs detailed information for debugging:
- Cache hits and misses
- Context selection decisions
- Strategy selection in hybrid mode
- Token trimming operations
- Importance scores

Example:
```
[HistoryManager] Session user_123 retrieved from cache
[SlidingWindowStrategy] selected 5/10 messages (window_size=5, system=0, recent=5)
[RelevanceBasedStrategy] calculated scores for 30 messages, top 5 scores: [0.85, 0.82, 0.79, 0.75, 0.72]
[HybridStrategy] Adaptive mode: selected RelevanceBasedStrategy for 30 messages
```

### ERROR Level
Logs errors with context:
- Persistence failures
- Compression errors
- Operation failures

Example:
```
[HistoryManager] Persistence failed for session user_123: Errno::ENOSPC - No space left on device
[HistoryManager] Failed to add message to session user_123: ArgumentError - Invalid message format
```

## Statistics and Metrics

### Session-Specific Statistics

Get statistics for a specific session:

```ruby
stats = manager.get_stats("session_id")

# Returns:
{
  session_id: "session_id",
  message_count: 25,
  total_tokens: 1500,
  created_at: <Time>,
  updated_at: <Time>,
  config: { ... }
}
```

### System-Wide Statistics

Get statistics for all sessions:

```ruby
stats = manager.get_stats

# Returns:
{
  # Session metrics
  active_sessions: 10,
  sessions_created: 50,
  sessions_deleted: 40,

  # Message metrics
  total_messages: 250,
  messages_added: 300,
  messages_per_session_avg: 25.0,

  # Token metrics
  total_tokens: 15000,
  tokens_per_session_avg: 1500.0,
  tokens_per_message_avg: 60.0,

  # Cache metrics
  cache_size: 100,
  cache_hits: 850,
  cache_misses: 150,
  cache_hit_rate: 0.85,

  # Operation metrics
  context_retrievals: 200,

  # Compression metrics
  compression_operations: 5,
  tokens_saved_by_compression: 5000,

  # Error metrics
  persistence_errors: 2
}
```
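
The averages and the hit rate in the sample output are consistent with simple ratios of the raw counters (for example 850 hits out of 1000 lookups gives 0.85). The sketch below reproduces that arithmetic. It is inferred from the sample numbers only, and `derived_metrics` is a hypothetical helper, not SmartPrompt's implementation:

```ruby
# Hypothetical helper (not part of SmartPrompt): derive the averages and
# the cache hit rate from raw counters, guarding against division by zero.
def derived_metrics(active_sessions:, total_messages:, total_tokens:,
                    cache_hits:, cache_misses:)
  ratio = ->(num, den) { den.zero? ? 0.0 : num.to_f / den }
  {
    messages_per_session_avg: ratio.call(total_messages, active_sessions),
    tokens_per_session_avg:   ratio.call(total_tokens, active_sessions),
    tokens_per_message_avg:   ratio.call(total_tokens, total_messages),
    cache_hit_rate:           ratio.call(cache_hits, cache_hits + cache_misses)
  }
end

derived = derived_metrics(
  active_sessions: 10, total_messages: 250, total_tokens: 15_000,
  cache_hits: 850, cache_misses: 150
)
puts derived
```

The zero guards matter right after startup, when no sessions or cache lookups exist yet.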

## Metrics Export

Export metrics in various formats for integration with monitoring systems.

### Prometheus Format

```ruby
metrics = manager.export_metrics(format: :prometheus)

# Returns Prometheus-style metrics as a string, for example:
#
#   # HELP smart_prompt_active_sessions Number of active sessions in cache
#   # TYPE smart_prompt_active_sessions gauge
#   smart_prompt_active_sessions 10
#
#   # HELP smart_prompt_cache_hit_rate Cache hit rate (0.0-1.0)
#   # TYPE smart_prompt_cache_hit_rate gauge
#   smart_prompt_cache_hit_rate 0.85
```
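
To make the shape of that output concrete, here is a rough sketch of rendering a flat metrics hash in the Prometheus text exposition format. `to_prometheus` is a hypothetical function written for this example, and it emits every value as a gauge for simplicity. It is not how `export_metrics` is implemented:

```ruby
# Hypothetical renderer (not part of SmartPrompt): turn a flat metrics hash
# into Prometheus text exposition format, one gauge per entry.
def to_prometheus(metrics, prefix: 'smart_prompt', help: {})
  lines = metrics.flat_map do |name, value|
    full_name = "#{prefix}_#{name}"
    description = help.fetch(name, name.to_s.tr('_', ' '))
    [
      "# HELP #{full_name} #{description}",
      "# TYPE #{full_name} gauge",
      "#{full_name} #{value}"
    ]
  end
  lines.join("\n") + "\n"
end

text = to_prometheus(
  { active_sessions: 10, cache_hit_rate: 0.85 },
  help: { active_sessions: 'Number of active sessions in cache' }
)
puts text
```

In a real exporter, monotonically increasing values such as `cache_hits` would more naturally be declared as counters rather than gauges.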

### JSON Format

```ruby
metrics = manager.export_metrics(format: :json)

# Returns JSON string with all metrics
```
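
Because the `:json` format returns a string, a consumer has to parse it before reading individual values. A small sketch using Ruby's standard `json` library, with a hand-written stand-in hash in place of real `export_metrics` output:

```ruby
require 'json'

# Stand-in for the exported metrics; the real key set comes from the manager.
sample = { active_sessions: 10, cache_hit_rate: 0.85 }

json = JSON.generate(sample)                      # what an exporter would hand out
parsed = JSON.parse(json, symbolize_names: true)  # back to symbol keys

puts json
puts parsed[:cache_hit_rate]
```

Passing `symbolize_names: true` keeps the parsed keys in the same symbol form that the hash format uses.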

### Hash Format

```ruby
metrics = manager.export_metrics(format: :hash)

# Returns Ruby hash with all metrics
```

## Context Selection Debugging

When debug logging is enabled, you can see detailed information about context selection:

```ruby
# Enable debug logging
SmartPrompt.logger.level = Logger::DEBUG

# Retrieve context
context = manager.get_context("session_id", max_tokens: 1000)

# Debug output shows:
# - Which strategy was selected (for hybrid mode)
# - How many messages were considered
# - How many messages were selected
# - Token counts before and after trimming
# - Importance scores for relevance-based selection
```
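
The effect of the level switch can be seen with a plain `Logger` from Ruby's standard library. The sketch below assumes only that `SmartPrompt.logger` behaves like a standard `Logger`; the log message is copied from the DEBUG example earlier in this guide, and the `StringIO` sink is just a convenient way to inspect what was written:

```ruby
require 'logger'
require 'stringio'

sink = StringIO.new
logger = Logger.new(sink)
# Compact format so the captured output is easy to read.
logger.formatter = proc { |severity, _time, _progname, msg| "#{severity}: #{msg}\n" }

logger.level = Logger::INFO
logger.debug('[SlidingWindowStrategy] selected 5/10 messages')  # dropped at INFO

logger.level = Logger::DEBUG
logger.debug('[SlidingWindowStrategy] selected 5/10 messages')  # recorded at DEBUG

puts sink.string
```

Only the second call reaches the sink, which is why context selection details stay invisible until the level is lowered to DEBUG.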

## Performance Monitoring

Track key performance indicators:

```ruby
stats = manager.get_stats

# Cache efficiency
cache_hit_rate = stats[:cache_hit_rate]
puts "Cache hit rate: #{(cache_hit_rate * 100).round(2)}%"

# Average resource usage
avg_messages = stats[:messages_per_session_avg]
avg_tokens = stats[:tokens_per_session_avg]
puts "Avg messages per session: #{avg_messages.round(2)}"
puts "Avg tokens per session: #{avg_tokens.round(2)}"

# Compression effectiveness
if stats[:compression_operations] > 0
  tokens_saved = stats[:tokens_saved_by_compression]
  puts "Tokens saved by compression: #{tokens_saved}"
end
```

## Error Tracking

Monitor errors to identify issues:

```ruby
stats = manager.get_stats

if stats[:persistence_errors] > 0
  puts "Warning: #{stats[:persistence_errors]} persistence errors occurred"
  # Check logs for details
end
```

## Best Practices

1. **Use INFO level in production** - Provides good visibility without excessive logging
2. **Use DEBUG level for troubleshooting** - Helps diagnose context selection issues
3. **Monitor cache hit rate** - Low hit rates may indicate cache size is too small
4. **Track compression metrics** - Verify compression is reducing token usage
5. **Export metrics regularly** - Integrate with monitoring systems like Prometheus
6. **Set up alerts** - Alert on high error rates or low cache hit rates
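
As one possible way to act on the alerting advice, the sketch below checks a stats hash against thresholds. `alerts_for` and its default thresholds are assumptions chosen for illustration; only the `:cache_hit_rate` and `:persistence_errors` keys come from the statistics shown earlier:

```ruby
# Hypothetical alert check (not part of SmartPrompt). Thresholds are
# illustrative defaults and should be tuned per deployment.
def alerts_for(stats, min_hit_rate: 0.5, max_persistence_errors: 0)
  alerts = []

  hit_rate = stats.fetch(:cache_hit_rate, 1.0)
  if hit_rate < min_hit_rate
    alerts << "cache hit rate #{(hit_rate * 100).round(1)}% is below the minimum"
  end

  errors = stats.fetch(:persistence_errors, 0)
  if errors > max_persistence_errors
    alerts << "#{errors} persistence errors recorded"
  end

  alerts
end

# Stand-in stats hash using the sample values from this guide.
alerts_for({ cache_hit_rate: 0.85, persistence_errors: 2 }).each do |message|
  puts "ALERT: #{message}"
end
```

Returning a list of messages instead of printing inside the check makes it easy to route alerts to a pager, a chat channel, or a log.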

## Example: Complete Monitoring Setup

```ruby
require 'smart_prompt'
require 'logger'

# Configure logger
SmartPrompt.logger = Logger.new('history_manager.log')
SmartPrompt.logger.level = Logger::INFO

# Configure manager with monitoring
config = {
  cache_size: 100,
  monitoring: {
    enabled: true,
    log_level: :info
  }
}

manager = SmartPrompt::HistoryManager.new(config)

# Use the manager
manager.add_message("user_123", { role: "user", content: "Hello" })

# Periodically export metrics
Thread.new do
  loop do
    sleep 60  # Every minute
    metrics = manager.export_metrics(format: :prometheus)
    File.write('metrics.prom', metrics)
  end
end

# Check statistics
stats = manager.get_stats
puts "Active sessions: #{stats[:active_sessions]}"
puts "Cache hit rate: #{(stats[:cache_hit_rate] * 100).round(2)}%"
```
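
If another process reads `metrics.prom` while the export thread is rewriting it, that reader could see a partially written file. One common way to avoid this, sketched here under the assumption of a POSIX filesystem, is to write to a temporary file and rename it into place. `write_atomically` is a hypothetical helper, not something SmartPrompt provides:

```ruby
require 'tmpdir'

# Hypothetical helper: write to a temporary file next to the target, then
# rename it into place so readers never observe a half-written file.
def write_atomically(path, contents)
  tmp_path = "#{path}.#{Process.pid}.tmp"
  File.write(tmp_path, contents)
  File.rename(tmp_path, path)  # atomic on POSIX when both paths share a filesystem
end

Dir.mktmpdir do |dir|
  target = File.join(dir, 'metrics.prom')
  write_atomically(target, "smart_prompt_active_sessions 10\n")
  puts File.read(target)
end
```

In the export loop, `write_atomically('metrics.prom', metrics)` would take the place of the direct `File.write` call.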

## See Also

- [History Optimization Design Document](.kiro/specs/history-optimization/design.md)
- [Monitoring Example](examples/monitoring_example.rb)
- [History Manager Tests](test/monitoring_test.rb)
data/docs/MULTIMODAL_README.md
@@ -0,0 +1,265 @@
# SmartPrompt Multimodal Extension

This document describes SmartPrompt's new multimodal features, which support image and video analysis.

## New Adapter

### MultimodalAdapter

The new `MultimodalAdapter` extends the existing OpenAI-compatible adapter to support SiliconFlow's multimodal vision models.

## Supported Features

### 1. Image Analysis
- Single-image analysis
- Multi-image comparison
- Text extraction from documents
- Scene description

### 2. Video Analysis
- Video content understanding
- Frame extraction control
- Temporal analysis

### 3. Multimodal Conversation
- Combined image + text input
- Combined video + text input
- Combined multi-image + text input

## Quick Start

### 1. Configuration

Create the configuration file `config/multimodal_config.yml`:

```yaml
adapters:
  multimodal: "MultimodalAdapter"

llms:
  qwen_vl:
    adapter: "multimodal"
    url: "https://api.siliconflow.cn/v1/"
    api_key: ENV["SILICONFLOW_API_KEY"]
    default_model: "Qwen/Qwen2.5-VL-7B-Instruct"

default_llm: "qwen_vl"
```

### 2. Create a Workflow

Create a workflow definition in the `workers/` directory:

```ruby
# workers/multimodal_workers.rb
SmartPrompt.define_worker :image_analyzer do
  use "qwen_vl"
  model "Qwen/Qwen2.5-VL-7B-Instruct"

  messages = [
    {
      role: "user",
      content: [
        { type: "text", text: params[:question] },
        { type: "image_url", image_url: { url: params[:image_url], detail: "auto" } }
      ]
    }
  ]

  sys_msg("You are a professional image analysis assistant.", params)
  # merge! updates params in place; plain merge would discard its result
  params.merge!(messages: messages)
  send_msg
end
```

### 3. Usage Example

```ruby
require 'smart_prompt'

# Initialize the engine
engine = SmartPrompt::Engine.new('config/multimodal_config.yml')

# Image analysis
result = engine.call_worker(:image_analyzer, {
  image_url: "https://example.com/image.jpg",
  question: "Describe the contents of this image"
})

puts result
```

## API Reference

### MultimodalAdapter Methods

#### `analyze_image(image_input, prompt, model = nil, detail: "auto", max_tokens: nil)`

Analyzes a single image.

**Parameters:**
- `image_input`: Image URL or local file path
- `prompt`: Analysis prompt text
- `model`: Optional model name
- `detail`: Image detail level (`"low"`, `"high"`, `"auto"`)
- `max_tokens`: Maximum number of output tokens
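
Since `image_input` may be either a URL or a local file path, some step has to turn a local file into something the API can receive. The source does not show how the adapter does this. One common approach with OpenAI-style APIs is a Base64 data URI, sketched below. `image_part` and `IMAGE_MIME_TYPES` are hypothetical names for this example, not part of `MultimodalAdapter`:

```ruby
# Hypothetical helper, not part of SmartPrompt: build an `image_url`
# content part from either a remote URL or a local file path.
IMAGE_MIME_TYPES = {
  '.jpg' => 'image/jpeg', '.jpeg' => 'image/jpeg',
  '.png' => 'image/png', '.gif' => 'image/gif', '.webp' => 'image/webp'
}.freeze

def image_part(image_input, detail: 'auto')
  url =
    if image_input.match?(%r{\Ahttps?://}i)
      image_input # remote images are passed through unchanged
    else
      ext = File.extname(image_input).downcase
      mime = IMAGE_MIME_TYPES.fetch(ext) do
        raise ArgumentError, "unsupported image type: #{ext.inspect}"
      end
      # pack('m0') produces strict Base64 without line breaks
      "data:#{mime};base64,#{[File.binread(image_input)].pack('m0')}"
    end

  { type: 'image_url', image_url: { url: url, detail: detail } }
end

part = image_part('https://example.com/image.jpg', detail: 'high')
puts part[:image_url][:url]
```

Checking the extension before reading the file means an unsupported type fails fast, without loading a large file into memory first.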

#### `analyze_video(video_input, prompt, model = nil, max_frames: 10, fps: 1, detail: "auto")`

Analyzes video content.

**Parameters:**
- `video_input`: Video URL
- `prompt`: Analysis prompt text
- `model`: Optional model name
- `max_frames`: Maximum number of frames to extract
- `fps`: Frame rate
- `detail`: Detail level

#### `analyze_multiple_images(images, prompt, model = nil, detail: "auto")`

Analyzes multiple images.

**Parameters:**
- `images`: Array of image URLs
- `prompt`: Analysis prompt text
- `model`: Optional model name
- `detail`: Image detail level

### Message Format

Multimodal messages use the standard OpenAI message format and support the `image_url` and `video_url` content types:

```ruby
messages = [
  {
    role: "user",
    content: [
      { type: "text", text: "Analyze this image" },
      {
        type: "image_url",
        image_url: {
          url: "https://example.com/image.jpg",
          detail: "auto"
        }
      }
    ]
  }
]
```
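
The same structure extends to several images by appending one `image_url` part per image. A minimal sketch, where `user_message` is a hypothetical builder written for this example rather than an existing SmartPrompt method:

```ruby
# Hypothetical builder (not part of SmartPrompt): one text part followed
# by one image_url part per image, in the message format shown above.
def user_message(text, image_urls, detail: 'auto')
  parts = [{ type: 'text', text: text }]
  image_urls.each do |url|
    parts << { type: 'image_url', image_url: { url: url, detail: detail } }
  end
  { role: 'user', content: parts }
end

message = user_message(
  'Compare these two images',
  ['https://example.com/a.jpg', 'https://example.com/b.jpg']
)
puts message[:content].size
```

Keeping the text part first and the images in input order makes it easier to refer to "the first image" or "the second image" in the prompt.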

## Supported Multimodal Models

### Multimodal Models Available on SiliconFlow

- **Qwen2.5-VL series**: Vision-language models
- **Qwen3-Omni series**: Omni-modal models (vision/audio/video)
- **DeepSeek-VL2**: Vision-language model
- **GLM series**: Vision-language models

## Configuration Parameters

### Image Parameters

- `detail`: Controls the level of detail used when processing images
  - `"low"`: Low resolution, faster processing
  - `"high"`: High resolution, more accurate
  - `"auto"`: Automatic selection (recommended)

### Video Parameters

- `max_frames`: Maximum number of frames to extract from the video
- `fps`: Frame rate; controls how often frames are extracted
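
These two parameters interact: at a fixed `fps`, a long video can reach `max_frames` before the end of the clip. Assuming frames are sampled uniformly from the start of the video (an assumption, since the source does not describe the sampling), the whole clip is covered only if `fps` is at most `max_frames / duration`. The hypothetical helper `fps_for` below applies that arithmetic:

```ruby
# Hypothetical helper (not part of SmartPrompt): choose a frame rate so
# that at most max_frames frames span the whole clip, capped at max_fps.
def fps_for(duration_seconds, max_frames: 10, max_fps: 1.0)
  raise ArgumentError, 'duration must be positive' unless duration_seconds.positive?

  [max_frames.to_f / duration_seconds, max_fps.to_f].min
end

puts fps_for(5)                    # short clip: capped at max_fps
puts fps_for(40)                   # 10 frames over 40 seconds
puts fps_for(600, max_frames: 20)  # long video: much lower rate
```

This matches the advice elsewhere in this document to lower the frame rate for long videos.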

## Error Handling

The adapter includes a complete error-handling mechanism covering:

- Network connection errors
- API authentication errors
- File format errors
- Response parsing errors
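
On the caller's side, transient failures such as network errors are often worth retrying, while authentication or file format errors are not. The sketch below shows a generic retry wrapper with exponential backoff. `with_retries` is a hypothetical helper, and which exception classes the adapter actually raises is not stated in the source, so `IOError` here is only a placeholder:

```ruby
# Hypothetical wrapper (not part of SmartPrompt): retry a block on the
# listed error classes, doubling the delay after each failed attempt.
def with_retries(attempts: 3, base_delay: 0.5, retry_on: [StandardError])
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *retry_on
    raise if tries >= attempts  # give up and re-raise the last error

    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end

calls = 0
result = with_retries(attempts: 3, base_delay: 0, retry_on: [IOError]) do
  calls += 1
  raise IOError, 'simulated network failure' if calls < 3

  :ok
end
puts "#{result} after #{calls} calls"
```

In practice the block would wrap a call such as `engine.call_worker(...)`, with `retry_on` limited to the errors that are genuinely transient.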

## Example Workflows

### 1. Image Analysis Workflow

```ruby
SmartPrompt.define_worker :product_analyzer do
  use "qwen_vl"
  model "Qwen/Qwen2.5-VL-7B-Instruct"

  messages = [
    {
      role: "user",
      content: [
        { type: "text", text: "Analyze this product image, including the product type, color, features, and possible uses" },
        { type: "image_url", image_url: { url: params[:product_image], detail: "high" } }
      ]
    }
  ]

  sys_msg("You are a professional product analyst.", params)
  # merge! updates params in place; plain merge would discard its result
  params.merge!(messages: messages)
  send_msg
end
```

### 2. Video Summary Workflow

```ruby
SmartPrompt.define_worker :video_summarizer do
  use "qwen_vl"
  model "Qwen/Qwen2.5-VL-7B-Instruct"

  messages = [
    {
      role: "user",
      content: [
        { type: "text", text: "Please summarize the main content of this video" },
        {
          type: "video_url",
          video_url: {
            url: params[:video_url],
            detail: "auto",
            max_frames: params[:max_frames] || 20,
            fps: params[:fps] || 2
          }
        }
      ]
    }
  ]

  sys_msg("You are a professional video summarization assistant.", params)
  # merge! updates params in place; plain merge would discard its result
  params.merge!(messages: messages)
  send_msg
end
```

## Best Practices

1. **Image detail level**: Use `"high"` for text extraction and `"auto"` for general analysis
2. **Video frame rate**: Adjust to the video length; use a lower frame rate for long videos
3. **Error handling**: Always include appropriate error-handling logic
4. **API limits**: Be aware of SiliconFlow's API call limits

## Troubleshooting

### Common Issues

1. **Image fails to load**: Check that the URL is reachable or that the file path is correct
2. **Video processing times out**: Reduce `max_frames` or lower `fps`
3. **API authentication fails**: Check the API key and environment variables
4. **Out of memory**: Reduce the number of images processed at once

### Debug Mode

Enable detailed logging:

```yaml
logger_file: "./logs/smart_prompt.log"
```

## Extending

To add further multimodal capabilities, follow the existing adapter architecture: inherit from the `LLMAdapter` class and implement the corresponding methods.
data/docs/RELEVANCE_BASED_STRATEGY_IMPLEMENTATION.md
@@ -0,0 +1,124 @@
# RelevanceBasedStrategy Implementation Summary

## Overview
The `RelevanceBasedStrategy` class has been implemented for the SmartPrompt history optimization feature. This strategy selects messages based on a combination of recency and semantic relevance to the current message.

## Implementation Details

### Core Features Implemented
1. **RelevanceBasedStrategy Class** (`lib/smart_prompt/relevance_based_strategy.rb`)
   - Implements the ContextStrategy interface
   - Configurable top-k message selection
   - Weighted scoring combining recency and relevance
   - Keyword-based similarity using Jaccard index
   - Optional embedding-based similarity with fallback
   - Token limit enforcement
   - Temporal ordering preservation

2. **Key Methods**
   - `select_messages`: Main selection logic with relevance scoring
   - `calculate_score`: Combines recency and relevance weights
   - `calculate_keyword_similarity`: Jaccard similarity for text comparison
   - `calculate_semantic_similarity`: Embedding-based similarity with error handling
   - `cosine_similarity`: Vector similarity calculation
   - `trim_to_token_limit`: Ensures token constraints are met
   - `should_compress?`: Recommends compression at 3x top_k threshold

3. **Configuration Options**
   - `top_k`: Number of messages to select (default: 10)
   - `recency_weight`: Weight for recency in scoring (default: 0.3)
   - `relevance_weight`: Weight for relevance in scoring (default: 0.7)
   - `embedding_service`: Optional service for semantic embeddings
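
To illustrate the scoring ingredients named above, here is a standalone sketch of Jaccard keyword similarity, cosine similarity, and a weighted combination using the documented default weights. The function names, the word tokenization, and the assumption that recency and relevance are both normalized to the range 0 to 1 are illustrative choices. The actual `RelevanceBasedStrategy` code is not shown in this diff and may differ:

```ruby
require 'set'

# Jaccard index over lowercase word sets: |A & B| / |A | B|.
def jaccard(text_a, text_b)
  words_a = text_a.downcase.scan(/\w+/).to_set
  words_b = text_b.downcase.scan(/\w+/).to_set
  union = words_a | words_b
  union.empty? ? 0.0 : (words_a & words_b).size.to_f / union.size
end

# Cosine similarity of two equal-length numeric vectors.
def cosine(vec_a, vec_b)
  dot = vec_a.zip(vec_b).sum { |x, y| x * y }
  norms = Math.sqrt(vec_a.sum { |x| x * x }) * Math.sqrt(vec_b.sum { |y| y * y })
  norms.zero? ? 0.0 : dot / norms
end

# Weighted sum of recency and relevance, both assumed to lie in 0.0..1.0.
def combined_score(recency, relevance, recency_weight: 0.3, relevance_weight: 0.7)
  recency_weight * recency + relevance_weight * relevance
end

relevance = jaccard('neural networks learn', 'Tell me about neural networks')
puts relevance                       # 2 shared words out of 6 distinct words
puts combined_score(1.0, relevance)
puts cosine([1, 0], [0, 1])          # orthogonal vectors
```

With the default weights, a highly relevant older message can outrank a recent but unrelated one, which is the point of weighting relevance more heavily than recency.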

## Testing

### Unit Tests (`test/relevance_based_strategy_test.rb`)
- 17 test cases covering:
  - Empty input handling
  - Fallback to recency when no current message
  - Relevance-based selection with current message
  - Keyword similarity calculation
  - Cosine similarity calculation
  - Token limit enforcement
  - Configuration options
  - Error handling and edge cases

### Integration Tests (`test/relevance_based_strategy_integration_test.rb`)
- 5 test cases covering:
  - Integration with Session class
  - Token limit respect in real scenarios
  - Empty session handling
  - System message handling
  - Compression threshold detection

### Test Results
- All 22 tests pass successfully
- No diagnostic errors
- Proper error handling verified

## Example Usage

```ruby
# Create strategy with custom configuration
strategy = SmartPrompt::RelevanceBasedStrategy.new(
  top_k: 5,
  recency_weight: 0.3,
  relevance_weight: 0.7
)

# Select relevant messages
current_message = SmartPrompt::Message.new(
  role: "user",
  content: "Tell me about neural networks"
)

selected = strategy.select_messages(
  session.get_messages,
  max_tokens: 100,
  current_message: current_message
)
```

## Requirements Validation

All task requirements have been met:

- ✅ **Requirement 6.1**: Context strategy parameter support
- ✅ **Requirement 8.1**: Semantic importance-based prioritization
- ✅ **Requirement 8.2**: Multiple strategy support
- ✅ **Requirement 8.3**: Semantic importance scoring
- ✅ **Requirement 10.2**: Semantically related message inclusion
- ✅ **Requirement 10.5**: Vector similarity support (when embeddings available)

## Files Created/Modified

### Created
- `lib/smart_prompt/relevance_based_strategy.rb` - Main implementation
- `test/relevance_based_strategy_test.rb` - Unit tests
- `test/relevance_based_strategy_integration_test.rb` - Integration tests
- `examples/relevance_based_strategy_example.rb` - Usage example

### Modified
- `lib/smart_prompt.rb` - Added require statement for new strategy

## Key Design Decisions

1. **Keyword Similarity**: Uses Jaccard index for simple, effective text comparison
2. **Fallback Mechanism**: Gracefully falls back to keyword similarity if embeddings fail
3. **Temporal Ordering**: Maintains conversation flow by re-ordering selected messages by timestamp
4. **Token Trimming**: Removes oldest messages first when enforcing token limits
5. **Compression Threshold**: Recommends compression at 3x top_k to balance memory and quality
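
Design decisions 3 and 4 can be sketched together: pick the top-k messages by score, restore chronological order, then drop the oldest messages until the token budget is met. `ScoredMessage` and `select_context` are hypothetical names for this illustration, scores are assumed to be precomputed, and special handling of system messages is left out:

```ruby
# Hypothetical stand-in for a scored history entry (not SmartPrompt::Message).
ScoredMessage = Struct.new(:content, :timestamp, :tokens, :score, keyword_init: true)

def select_context(messages, top_k:, max_tokens:)
  chosen = messages.max_by(top_k, &:score)  # best-scoring messages
  chosen = chosen.sort_by(&:timestamp)      # restore conversation order
  # Drop the oldest selected message until the token budget is respected.
  chosen.shift until chosen.empty? || chosen.sum(&:tokens) <= max_tokens
  chosen
end

history = [
  ScoredMessage.new(content: 'a', timestamp: 1, tokens: 40, score: 0.9),
  ScoredMessage.new(content: 'b', timestamp: 2, tokens: 30, score: 0.2),
  ScoredMessage.new(content: 'c', timestamp: 3, tokens: 50, score: 0.8),
  ScoredMessage.new(content: 'd', timestamp: 4, tokens: 20, score: 0.7)
]

selected = select_context(history, top_k: 3, max_tokens: 80)
puts selected.map(&:content).inspect  # prints ["c", "d"]
```

Here `a`, `c`, and `d` are selected by score (110 tokens in total), and trimming the oldest first removes `a` even though it has the highest score. That is the trade-off implied by removing the oldest messages rather than the lowest-scoring ones.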

## Performance Characteristics

- **Time Complexity**: O(n log n) where n is message count (due to sorting)
- **Space Complexity**: O(n) for scoring all messages
- **Token Calculation**: Cached in Message objects for efficiency

## Future Enhancements

The implementation supports optional embedding services for more sophisticated semantic similarity. When an embedding service is provided, the strategy will use vector-based cosine similarity instead of keyword matching.

## Conclusion

The RelevanceBasedStrategy is fully implemented, tested, and integrated into the SmartPrompt framework. It provides intelligent message selection based on semantic relevance, making it ideal for complex discussions where context matters more than simple recency.