smart_prompt 0.5.0 → 0.5.1

@@ -0,0 +1,278 @@
1
+ # History Manager Monitoring Guide
2
+
3
+ This guide explains the logging and monitoring features available in the SmartPrompt History Manager.
4
+
5
+ ## Overview
6
+
7
+ The History Manager includes comprehensive logging and monitoring capabilities to help you:
8
+ - Track system operations and performance
9
+ - Debug context selection decisions
10
+ - Monitor resource usage and cache efficiency
11
+ - Export metrics for external monitoring systems
12
+
13
+ ## Configuration
14
+
15
+ Enable monitoring in your configuration:
16
+
17
+ ```ruby
18
+ config = {
19
+   monitoring: {
20
+     enabled: true,    # Enable/disable monitoring
21
+     log_level: :info  # Log level: :debug, :info, :warn, :error
22
+   }
23
+ }
24
+
25
+ manager = SmartPrompt::HistoryManager.new(config)
26
+ ```
27
+
28
+ ## Log Levels
29
+
30
+ ### INFO Level
31
+ Logs important operations:
32
+ - Session creation and deletion
33
+ - Message additions
34
+ - Context retrievals
35
+ - Session clearing and exports
36
+ - Cleanup operations
37
+
38
+ Example:
39
+ ```
40
+ [HistoryManager] Session user_123 created with config: max_messages=100, max_tokens=4000
41
+ [HistoryManager] Message added to session user_123: role=user, tokens=15
42
+ [HistoryManager] Session user_123 cleared: 10 -> 1 messages (keep_system=true)
43
+ ```
44
+
45
+ ### DEBUG Level
46
+ Logs detailed information for debugging:
47
+ - Cache hits and misses
48
+ - Context selection decisions
49
+ - Strategy selection in hybrid mode
50
+ - Token trimming operations
51
+ - Importance scores
52
+
53
+ Example:
54
+ ```
55
+ [HistoryManager] Session user_123 retrieved from cache
56
+ [SlidingWindowStrategy] selected 5/10 messages (window_size=5, system=0, recent=5)
57
+ [RelevanceBasedStrategy] calculated scores for 30 messages, top 5 scores: [0.85, 0.82, 0.79, 0.75, 0.72]
58
+ [HybridStrategy] Adaptive mode: selected RelevanceBasedStrategy for 30 messages
59
+ ```
60
+
61
+ ### ERROR Level
62
+ Logs errors with context:
63
+ - Persistence failures
64
+ - Compression errors
65
+ - Operation failures
66
+
67
+ Example:
68
+ ```
69
+ [HistoryManager] Persistence failed for session user_123: Errno::ENOSPC - No space left on device
70
+ [HistoryManager] Failed to add message to session user_123: ArgumentError - Invalid message format
71
+ ```
72
+
73
+ ## Statistics and Metrics
74
+
75
+ ### Session-Specific Statistics
76
+
77
+ Get statistics for a specific session:
78
+
79
+ ```ruby
80
+ stats = manager.get_stats("session_id")
81
+
82
+ # Returns:
83
+ {
84
+   session_id: "session_id",
85
+   message_count: 25,
86
+   total_tokens: 1500,
87
+   created_at: <Time>,
88
+   updated_at: <Time>,
89
+   config: { ... }
90
+ }
91
+ ```
92
+
93
+ ### System-Wide Statistics
94
+
95
+ Get statistics for all sessions:
96
+
97
+ ```ruby
98
+ stats = manager.get_stats
99
+
100
+ # Returns:
101
+ {
102
+   # Session metrics
103
+   active_sessions: 10,
104
+   sessions_created: 50,
105
+   sessions_deleted: 40,
106
+
107
+   # Message metrics
108
+   total_messages: 250,
109
+   messages_added: 300,
110
+   messages_per_session_avg: 25.0,
111
+
112
+   # Token metrics
113
+   total_tokens: 15000,
114
+   tokens_per_session_avg: 1500.0,
115
+   tokens_per_message_avg: 60.0,
116
+
117
+   # Cache metrics
118
+   cache_size: 100,
119
+   cache_hits: 850,
120
+   cache_misses: 150,
121
+   cache_hit_rate: 0.85,
122
+
123
+   # Operation metrics
124
+   context_retrievals: 200,
125
+
126
+   # Compression metrics
127
+   compression_operations: 5,
128
+   tokens_saved_by_compression: 5000,
129
+
130
+   # Error metrics
131
+   persistence_errors: 2
132
+ }
133
+ ```
134
+
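The averaged and rate fields in the sample output are consistent with simple ratios of the raw counters. The following is a minimal sketch of that arithmetic, assuming the averages are taken over active sessions and the hit rate is hits divided by total lookups. The `derive_metrics` helper is hypothetical and is not part of SmartPrompt.

```ruby
# Hypothetical helper: recompute derived metrics from raw counters.
# Assumes averages are per active session and hit rate = hits / (hits + misses).
def derive_metrics(raw)
  lookups  = raw[:cache_hits] + raw[:cache_misses]
  sessions = raw[:active_sessions]
  messages = raw[:total_messages]
  {
    messages_per_session_avg: sessions.zero? ? 0.0 : messages.to_f / sessions,
    tokens_per_session_avg:   sessions.zero? ? 0.0 : raw[:total_tokens].to_f / sessions,
    tokens_per_message_avg:   messages.zero? ? 0.0 : raw[:total_tokens].to_f / messages,
    cache_hit_rate:           lookups.zero?  ? 0.0 : raw[:cache_hits].to_f / lookups
  }
end

# Raw counters copied from the sample output above.
raw = { active_sessions: 10, total_messages: 250, total_tokens: 15_000,
        cache_hits: 850, cache_misses: 150 }
derived = derive_metrics(raw)
puts derived
```

Guarding each division against a zero denominator keeps the helper usable on a freshly started manager with no sessions or cache traffic.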
135
+ ## Metrics Export
136
+
137
+ Export metrics in various formats for integration with monitoring systems.
138
+
139
+ ### Prometheus Format
140
+
141
+ ```ruby
142
+ metrics = manager.export_metrics(format: :prometheus)
143
+
144
+ # Returns Prometheus-style metrics text, for example:
145
+ #   # HELP smart_prompt_active_sessions Number of active sessions in cache
146
+ #   # TYPE smart_prompt_active_sessions gauge
147
+ #   smart_prompt_active_sessions 10
148
+ #
149
+ #   # HELP smart_prompt_cache_hit_rate Cache hit rate (0.0-1.0)
150
+ #   # TYPE smart_prompt_cache_hit_rate gauge
151
+ #   smart_prompt_cache_hit_rate 0.85
152
+ ```
153
+
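For readers unfamiliar with the Prometheus text layout, here is a minimal sketch of how a flat metrics hash could be rendered into it. This is not the SmartPrompt implementation: `to_prometheus` is a hypothetical helper, the `smart_prompt` prefix is copied from the sample output, and every metric is emitted as a gauge for simplicity even though monotonically increasing counters would normally use the `counter` type.

```ruby
# Hypothetical renderer: turn a flat metrics hash into Prometheus text format.
# Help strings are supplied by the caller; all metrics are typed as gauges here.
def to_prometheus(metrics, help: {}, prefix: "smart_prompt")
  blocks = metrics.map do |name, value|
    full = "#{prefix}_#{name}"
    lines = []
    lines << "# HELP #{full} #{help[name]}" if help[name]
    lines << "# TYPE #{full} gauge"
    lines << "#{full} #{value}"
    lines.join("\n")
  end
  blocks.join("\n\n") + "\n"
end

text = to_prometheus(
  { active_sessions: 10, cache_hit_rate: 0.85 },
  help: { active_sessions: "Number of active sessions in cache",
          cache_hit_rate: "Cache hit rate (0.0-1.0)" }
)
puts text
```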
154
+ ### JSON Format
155
+
156
+ ```ruby
157
+ metrics = manager.export_metrics(format: :json)
158
+
159
+ # Returns JSON string with all metrics
160
+ ```
161
+
162
+ ### Hash Format
163
+
164
+ ```ruby
165
+ metrics = manager.export_metrics(format: :hash)
166
+
167
+ # Returns Ruby hash with all metrics
168
+ ```
169
+
170
+ ## Context Selection Debugging
171
+
172
+ When debug logging is enabled, you can see detailed information about context selection:
173
+
174
+ ```ruby
175
+ # Enable debug logging
176
+ SmartPrompt.logger.level = Logger::DEBUG
177
+
178
+ # Retrieve context
179
+ context = manager.get_context("session_id", max_tokens: 1000)
180
+
181
+ # Debug output shows:
182
+ # - Which strategy was selected (for hybrid mode)
183
+ # - How many messages were considered
184
+ # - How many messages were selected
185
+ # - Token counts before and after trimming
186
+ # - Importance scores for relevance-based selection
187
+ ```
188
+
189
+ ## Performance Monitoring
190
+
191
+ Track key performance indicators:
192
+
193
+ ```ruby
194
+ stats = manager.get_stats
195
+
196
+ # Cache efficiency
197
+ cache_hit_rate = stats[:cache_hit_rate]
198
+ puts "Cache hit rate: #{(cache_hit_rate * 100).round(2)}%"
199
+
200
+ # Average resource usage
201
+ avg_messages = stats[:messages_per_session_avg]
202
+ avg_tokens = stats[:tokens_per_session_avg]
203
+ puts "Avg messages per session: #{avg_messages.round(2)}"
204
+ puts "Avg tokens per session: #{avg_tokens.round(2)}"
205
+
206
+ # Compression effectiveness
207
+ if stats[:compression_operations] > 0
208
+   tokens_saved = stats[:tokens_saved_by_compression]
209
+   puts "Tokens saved by compression: #{tokens_saved}"
210
+ end
211
+ ```
212
+
213
+ ## Error Tracking
214
+
215
+ Monitor errors to identify issues:
216
+
217
+ ```ruby
218
+ stats = manager.get_stats
219
+
220
+ if stats[:persistence_errors] > 0
221
+   puts "Warning: #{stats[:persistence_errors]} persistence errors occurred"
222
+   # Check logs for details
223
+ end
224
+ ```
225
+
226
+ ## Best Practices
227
+
228
+ 1. **Use INFO level in production** - Provides good visibility without excessive logging
229
+ 2. **Use DEBUG level for troubleshooting** - Helps diagnose context selection issues
230
+ 3. **Monitor cache hit rate** - A low hit rate may indicate that the cache size is too small
231
+ 4. **Track compression metrics** - Verify compression is reducing token usage
232
+ 5. **Export metrics regularly** - Integrate with monitoring systems like Prometheus
233
+ 6. **Set up alerts** - Alert on high error rates or low cache hit rates
234
+
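The alerting practice above can be sketched as a small threshold check over a stats hash shaped like the output of `get_stats`. The `check_alerts` helper and its thresholds are illustrative assumptions, not part of SmartPrompt or values it recommends.

```ruby
# Hypothetical alert check over a stats hash shaped like the get_stats output.
# Thresholds are illustrative defaults only.
def check_alerts(stats, min_hit_rate: 0.5, max_persistence_errors: 0)
  alerts = []
  lookups = stats.fetch(:cache_hits, 0) + stats.fetch(:cache_misses, 0)
  hit_rate = stats.fetch(:cache_hit_rate, 1.0)
  # Skip the hit-rate check until the cache has seen some traffic.
  if lookups > 0 && hit_rate < min_hit_rate
    alerts << "cache hit rate #{(hit_rate * 100).round(1)}% is below #{(min_hit_rate * 100).round(1)}%"
  end
  errors = stats.fetch(:persistence_errors, 0)
  alerts << "#{errors} persistence errors recorded" if errors > max_persistence_errors
  alerts
end

sample = { cache_hits: 30, cache_misses: 70, cache_hit_rate: 0.3, persistence_errors: 2 }
alerts = check_alerts(sample)
alerts.each { |message| warn "ALERT: #{message}" }
```

In a real deployment the returned messages would be forwarded to whatever paging or notification system is already in use rather than written to standard error.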
235
+ ## Example: Complete Monitoring Setup
236
+
237
+ ```ruby
238
+ require 'smart_prompt'
239
+ require 'logger'
240
+
241
+ # Configure logger
242
+ SmartPrompt.logger = Logger.new('history_manager.log')
243
+ SmartPrompt.logger.level = Logger::INFO
244
+
245
+ # Configure manager with monitoring
246
+ config = {
247
+   cache_size: 100,
248
+   monitoring: {
249
+     enabled: true,
250
+     log_level: :info
251
+   }
252
+ }
253
+
254
+ manager = SmartPrompt::HistoryManager.new(config)
255
+
256
+ # Use the manager
257
+ manager.add_message("user_123", { role: "user", content: "Hello" })
258
+
259
+ # Periodically export metrics
260
+ Thread.new do
261
+   loop do
262
+     sleep 60 # Every minute
263
+     metrics = manager.export_metrics(format: :prometheus)
264
+     File.write('metrics.prom', metrics)
265
+   end
266
+ end
267
+
268
+ # Check statistics
269
+ stats = manager.get_stats
270
+ puts "Active sessions: #{stats[:active_sessions]}"
271
+ puts "Cache hit rate: #{(stats[:cache_hit_rate] * 100).round(2)}%"
272
+ ```
273
+
274
+ ## See Also
275
+
276
+ - [History Optimization Design Document](.kiro/specs/history-optimization/design.md)
277
+ - [Monitoring Example](examples/monitoring_example.rb)
278
+ - [History Manager Tests](test/monitoring_test.rb)
@@ -0,0 +1,265 @@
1
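+ # SmartPrompt Multimodal Extension
2
+
3
+ This document describes the multimodal features newly added to SmartPrompt, which support image and video analysis.
4
+
5
+ ## New Adapter
6
+
7
+ ### MultimodalAdapter
8
+
9
+ The new `MultimodalAdapter` extends the existing OpenAI-compatible adapter to support SiliconFlow's multimodal vision models.
10
+
11
+ ## Supported Features
12
+
13
+ ### 1. Image Analysis
14
+ - Single-image analysis
15
+ - Multi-image comparison
16
+ - Text extraction from documents
17
+ - Scene description
18
+
19
+ ### 2. Video Analysis
20
+ - Video content understanding
21
+ - Frame extraction control
22
+ - Temporal analysis
23
+
24
+ ### 3. Multimodal Conversation
25
+ - Combined image + text input
26
+ - Combined video + text input
27
+ - Combined multi-image + text input
28
+
29
+ ## Quick Start
30
+
31
+ ### 1. Configuration
32
+
33
+ Create the configuration file `config/multimodal_config.yml`: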
34
+
35
+ ```yaml
36
+ adapters:
37
+   multimodal: "MultimodalAdapter"
38
+
39
+ llms:
40
+   qwen_vl:
41
+     adapter: "multimodal"
42
+     url: "https://api.siliconflow.cn/v1/"
43
+     api_key: ENV["SILICONFLOW_API_KEY"]
44
+     default_model: "Qwen/Qwen2.5-VL-7B-Instruct"
45
+
46
+ default_llm: "qwen_vl"
47
+ ```
48
+
49
+ ### 2. Create a Workflow
50
+
51
+ Create a workflow definition in the `workers/` directory:
52
+
53
+ ```ruby
54
+ # workers/multimodal_workers.rb
55
+ SmartPrompt.define_worker :image_analyzer do
56
+   use "qwen_vl"
57
+   model "Qwen/Qwen2.5-VL-7B-Instruct"
58
+
59
+   messages = [
60
+     {
61
+       role: "user",
62
+       content: [
63
+         { type: "text", text: params[:question] },
64
+         { type: "image_url", image_url: { url: params[:image_url], detail: "auto" } }
65
+       ]
66
+     }
67
+   ]
68
+
69
+   sys_msg("You are a professional image analysis assistant.", params)
70
+   params.merge(messages: messages)
71
+   send_msg
72
+ end
73
+ ```
74
+
75
+ ### 3. Usage Example
76
+
77
+ ```ruby
78
+ require 'smart_prompt'
79
+
80
+ # Initialize the engine
81
+ engine = SmartPrompt::Engine.new('config/multimodal_config.yml')
82
+
83
+ # Image analysis
84
+ result = engine.call_worker(:image_analyzer, {
85
+   image_url: "https://example.com/image.jpg",
86
+   question: "Describe the contents of this image"
87
+ })
88
+
89
+ puts result
90
+ ```
91
+
92
+ ## API Reference
93
+
94
+ ### MultimodalAdapter Methods
95
+
96
+ #### `analyze_image(image_input, prompt, model = nil, detail: "auto", max_tokens: nil)`
97
+
98
+ Analyzes a single image.
99
+
100
+ **Parameters:**
101
+ - `image_input`: Image URL or local file path
102
+ - `prompt`: Prompt text for the analysis
103
+ - `model`: Optional model name
104
+ - `detail`: Image detail level (`"low"`, `"high"`, or `"auto"`)
105
+ - `max_tokens`: Maximum number of output tokens
106
+
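Since `image_input` may be either a URL or a local file path, something has to turn a local file into a value that fits an `image_url` field. One common approach with OpenAI-style APIs is to inline the file as a base64 `data:` URL. The sketch below assumes that approach; this document does not confirm that `MultimodalAdapter` does the same, and `image_url_for` and `MIME_TYPES` are hypothetical names.

```ruby
# Hypothetical helper: pass URLs through unchanged and inline local files as data URLs.
MIME_TYPES = { ".jpg" => "image/jpeg", ".jpeg" => "image/jpeg",
               ".png" => "image/png", ".gif" => "image/gif",
               ".webp" => "image/webp" }.freeze

def image_url_for(image_input)
  return image_input if image_input.start_with?("http://", "https://", "data:")

  mime = MIME_TYPES.fetch(File.extname(image_input).downcase) do
    raise ArgumentError, "unsupported image type: #{image_input}"
  end
  encoded = [File.binread(image_input)].pack("m0") # strict base64, no line breaks
  "data:#{mime};base64,#{encoded}"
end

puts image_url_for("https://example.com/image.jpg")
```

Inlining increases the request size by roughly a third, so for large images a hosted URL is usually the lighter option.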
107
+ #### `analyze_video(video_input, prompt, model = nil, max_frames: 10, fps: 1, detail: "auto")`
108
+
109
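+ Analyzes video content.
110
+
111
+ **Parameters:**
112
+ - `video_input`: Video URL
113
+ - `prompt`: Prompt text for the analysis
114
+ - `model`: Optional model name
115
+ - `max_frames`: Maximum number of frames to extract
116
+ - `fps`: Frame rate
117
+ - `detail`: Detail level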
118
+
119
+ #### `analyze_multiple_images(images, prompt, model = nil, detail: "auto")`
120
+
121
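+ Analyzes multiple images.
122
+
123
+ **Parameters:**
124
+ - `images`: Array of image URLs
125
+ - `prompt`: Prompt text for the analysis
126
+ - `model`: Optional model name
127
+ - `detail`: Image detail level
128
+
129
+ ### Message Format
130
+
131
+ Multimodal messages use the standard OpenAI message format and support the `image_url` and `video_url` content types: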
132
+
133
+ ```ruby
134
+ messages = [
135
+   {
136
+     role: "user",
137
+     content: [
138
+       { type: "text", text: "Analyze this image" },
139
+       {
140
+         type: "image_url",
141
+         image_url: {
142
+           url: "https://example.com/image.jpg",
143
+           detail: "auto"
144
+         }
145
+       }
146
+     ]
147
+   }
148
+ ]
149
+ ```
150
+
151
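+ ## Supported Multimodal Models
152
+
153
+ ### Multimodal Models Available on SiliconFlow
154
+
155
+ - **Qwen2.5-VL series**: Vision-language models
156
+ - **Qwen3-Omni series**: Omni-modal models (vision/audio/video)
157
+ - **DeepSeek-VL2**: Vision-language model
158
+ - **GLM series**: Vision-language models
159
+
160
+ ## Configuration Parameters
161
+
162
+ ### Image Parameters
163
+
164
+ - `detail`: Controls the level of detail used when processing images
165
+   - `"low"`: Low resolution, faster processing
166
+   - `"high"`: High resolution, more accurate
167
+   - `"auto"`: Chosen automatically (recommended)
168
+
169
+ ### Video Parameters
170
+
171
+ - `max_frames`: Maximum number of frames to extract from the video
172
+ - `fps`: Frame rate, which controls how often frames are extracted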
173
+
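To get a feel for how `max_frames` and `fps` interact, the sketch below estimates how many frames a request would sample. It assumes frames are taken at `fps` per second of video and capped at `max_frames`; the exact sampling behavior is determined by the adapter and the model provider and is not specified in this document. Both helpers are hypothetical.

```ruby
# Hypothetical estimate of how many frames a request would sample.
# Assumes sampling at `fps` frames per second, capped at `max_frames`.
def sampled_frames(duration_seconds, fps: 1, max_frames: 10)
  [(duration_seconds * fps).floor, max_frames].min
end

# Hypothetical helper: the highest sampling rate at which max_frames
# frames still span the whole video.
def fps_to_cover(duration_seconds, max_frames: 10)
  max_frames.to_f / duration_seconds
end

puts sampled_frames(30, fps: 2, max_frames: 20)
puts fps_to_cover(120, max_frames: 20)
```

Under these assumptions, a long video with a high `fps` hits the `max_frames` cap early, which is one reason to lower the frame rate for long inputs.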
174
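+ ## Error Handling
175
+
176
+ The adapter includes comprehensive error handling, covering:
177
+
178
+ - Network connection errors
179
+ - API authentication errors
180
+ - File format errors
181
+ - Response parsing errors
182
+
183
+ ## Example Workflows
184
+
185
+ ### 1. Image Analysis Workflow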
186
+
187
+ ```ruby
188
+ SmartPrompt.define_worker :product_analyzer do
189
+   use "qwen_vl"
190
+   model "Qwen/Qwen2.5-VL-7B-Instruct"
191
+
192
+   messages = [
193
+     {
194
+       role: "user",
195
+       content: [
196
+         { type: "text", text: "Analyze this product image, including the product type, color, features, and possible uses" },
197
+         { type: "image_url", image_url: { url: params[:product_image], detail: "high" } }
198
+       ]
199
+     }
200
+   ]
201
+
202
+   sys_msg("You are a professional product analyst.", params)
203
+   params.merge(messages: messages)
204
+   send_msg
205
+ end
206
+ ```
207
+
208
+ ### 2. Video Summarization Workflow
209
+
210
+ ```ruby
211
+ SmartPrompt.define_worker :video_summarizer do
212
+   use "qwen_vl"
213
+   model "Qwen/Qwen2.5-VL-7B-Instruct"
214
+
215
+   messages = [
216
+     {
217
+       role: "user",
218
+       content: [
219
+         { type: "text", text: "Please summarize the main content of this video" },
220
+         {
221
+           type: "video_url",
222
+           video_url: {
223
+             url: params[:video_url],
224
+             detail: "auto",
225
+             max_frames: params[:max_frames] || 20,
226
+             fps: params[:fps] || 2
227
+           }
228
+         }
229
+       ]
230
+     }
231
+   ]
232
+
233
+   sys_msg("You are a professional video summarization assistant.", params)
234
+   params.merge(messages: messages)
235
+   send_msg
236
+ end
237
+ ```
238
+
239
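+ ## Best Practices
240
+
241
+ 1. **Image detail level**: Use `"high"` for text extraction and `"auto"` for general analysis
242
+ 2. **Video frame rate**: Adjust to the video length; use a lower frame rate for long videos
243
+ 3. **Error handling**: Always include appropriate error-handling logic
244
+ 4. **API limits**: Keep SiliconFlow's API call limits in mind
245
+
246
+ ## Troubleshooting
247
+
248
+ ### Common Issues
249
+
250
+ 1. **Image fails to load**: Check that the URL is reachable or that the file path is correct
251
+ 2. **Video processing times out**: Reduce `max_frames` or lower `fps`
252
+ 3. **API authentication fails**: Check the API key and environment variables
253
+ 4. **Out of memory**: Reduce the number of images processed at once
254
+
255
+ ### Debug Mode
256
+
257
+ Enable detailed logging: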
258
+
259
+ ```yaml
260
+ logger_file: "./logs/smart_prompt.log"
261
+ ```
262
+
263
+ ## Extending
264
+
265
+ To add further multimodal capabilities, follow the existing adapter architecture: subclass `LLMAdapter` and implement the corresponding methods.
@@ -0,0 +1,124 @@
1
+ # RelevanceBasedStrategy Implementation Summary
2
+
3
+ ## Overview
4
+ This change implements the `RelevanceBasedStrategy` class for the SmartPrompt history optimization feature. The strategy selects messages based on a combination of recency and semantic relevance to the current message.
5
+
6
+ ## Implementation Details
7
+
8
+ ### Core Features Implemented
9
+ 1. **RelevanceBasedStrategy Class** (`lib/smart_prompt/relevance_based_strategy.rb`)
10
+    - Implements the ContextStrategy interface
11
+    - Configurable top-k message selection
12
+    - Weighted scoring combining recency and relevance
13
+    - Keyword-based similarity using Jaccard index
14
+    - Optional embedding-based similarity with fallback
15
+    - Token limit enforcement
16
+    - Temporal ordering preservation
17
+
18
+ 2. **Key Methods**
19
+    - `select_messages`: Main selection logic with relevance scoring
20
+    - `calculate_score`: Combines recency and relevance weights
21
+    - `calculate_keyword_similarity`: Jaccard similarity for text comparison
22
+    - `calculate_semantic_similarity`: Embedding-based similarity with error handling
23
+    - `cosine_similarity`: Vector similarity calculation
24
+    - `trim_to_token_limit`: Ensures token constraints are met
25
+    - `should_compress?`: Recommends compression at 3x top_k threshold
26
+
27
+ 3. **Configuration Options**
28
+    - `top_k`: Number of messages to select (default: 10)
29
+    - `recency_weight`: Weight for recency in scoring (default: 0.3)
30
+    - `relevance_weight`: Weight for relevance in scoring (default: 0.7)
31
+    - `embedding_service`: Optional service for semantic embeddings
32
+
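The scoring ingredients listed above can be sketched in a few lines. This is an illustration under stated assumptions, not the code in `relevance_based_strategy.rb`: the tokenization (lowercased `\w+` words), the recency formula (position divided by count minus one), and the function names are all hypothetical, while the default weights of 0.3 and 0.7 are taken from the configuration options.

```ruby
require "set"

# Jaccard index over lowercase word sets: |A intersect B| / |A union B|.
def keyword_similarity(a, b)
  words_a = a.downcase.scan(/\w+/).to_set
  words_b = b.downcase.scan(/\w+/).to_set
  union = words_a | words_b
  return 0.0 if union.empty?

  (words_a & words_b).size.to_f / union.size
end

# Cosine similarity between two equal-length numeric vectors.
def cosine_similarity(u, v)
  dot = u.zip(v).sum { |x, y| x * y }
  norm = Math.sqrt(u.sum { |x| x * x }) * Math.sqrt(v.sum { |y| y * y })
  norm.zero? ? 0.0 : dot / norm
end

# Weighted blend. Recency is assumed to be index / (count - 1),
# so the newest message in the history scores 1.0.
def combined_score(index, count, relevance, recency_weight: 0.3, relevance_weight: 0.7)
  recency = count > 1 ? index.to_f / (count - 1) : 1.0
  recency_weight * recency + relevance_weight * relevance
end

puts keyword_similarity("neural networks learn", "tell me about neural networks")
puts cosine_similarity([1.0, 0.0], [1.0, 0.0])
puts combined_score(9, 10, 0.5)
```

With the default weights, relevance dominates: an old but highly relevant message can outrank a recent message that shares no vocabulary with the current one.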
33
+ ## Testing
34
+
35
+ ### Unit Tests (`test/relevance_based_strategy_test.rb`)
36
+ - 17 test cases covering:
37
+   - Empty input handling
38
+   - Fallback to recency when no current message
39
+   - Relevance-based selection with current message
40
+   - Keyword similarity calculation
41
+   - Cosine similarity calculation
42
+   - Token limit enforcement
43
+   - Configuration options
44
+   - Error handling and edge cases
45
+
46
+ ### Integration Tests (`test/relevance_based_strategy_integration_test.rb`)
47
+ - 5 test cases covering:
48
+   - Integration with Session class
49
+   - Token limits respected in realistic scenarios
50
+   - Empty session handling
51
+   - System message handling
52
+   - Compression threshold detection
53
+
54
+ ### Test Results
55
+ - All 22 tests pass successfully
56
+ - No diagnostic errors
57
+ - Proper error handling verified
58
+
59
+ ## Example Usage
60
+
61
+ ```ruby
62
+ # Create strategy with custom configuration
63
+ strategy = SmartPrompt::RelevanceBasedStrategy.new(
64
+   top_k: 5,
65
+   recency_weight: 0.3,
66
+   relevance_weight: 0.7
67
+ )
68
+
69
+ # Select relevant messages
70
+ current_message = SmartPrompt::Message.new(
71
+   role: "user",
72
+   content: "Tell me about neural networks"
73
+ )
74
+
75
+ selected = strategy.select_messages(
76
+   session.get_messages,
77
+   max_tokens: 100,
78
+   current_message: current_message
79
+ )
80
+ ```
81
+
82
+ ## Requirements Validation
83
+
84
+ All task requirements have been met:
85
+
86
+ ✅ **Requirement 6.1**: Context strategy parameter support
87
+ ✅ **Requirement 8.1**: Semantic importance-based prioritization
88
+ ✅ **Requirement 8.2**: Multiple strategy support
89
+ ✅ **Requirement 8.3**: Semantic importance scoring
90
+ ✅ **Requirement 10.2**: Semantically related message inclusion
91
+ ✅ **Requirement 10.5**: Vector similarity support (when embeddings available)
92
+
93
+ ## Files Created/Modified
94
+
95
+ ### Created
96
+ - `lib/smart_prompt/relevance_based_strategy.rb` - Main implementation
97
+ - `test/relevance_based_strategy_test.rb` - Unit tests
98
+ - `test/relevance_based_strategy_integration_test.rb` - Integration tests
99
+ - `examples/relevance_based_strategy_example.rb` - Usage example
100
+
101
+ ### Modified
102
+ - `lib/smart_prompt.rb` - Added require statement for new strategy
103
+
104
+ ## Key Design Decisions
105
+
106
+ 1. **Keyword Similarity**: Uses Jaccard index for simple, effective text comparison
107
+ 2. **Fallback Mechanism**: Gracefully falls back to keyword similarity if embeddings fail
108
+ 3. **Temporal Ordering**: Maintains conversation flow by re-ordering selected messages by timestamp
109
+ 4. **Token Trimming**: Removes oldest messages first when enforcing token limits
110
+ 5. **Compression Threshold**: Recommends compression at 3x top_k to balance memory and quality
111
+
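Design decisions 3 and 4 describe a pipeline: pick the highest-scoring messages, put them back in chronological order, then drop the oldest until the token budget is met. The sketch below shows one way that could look. It is not the actual implementation; the `Msg` struct and `select_top_k` are hypothetical stand-ins, and scores are assumed to be precomputed.

```ruby
# Hypothetical message shape for illustration; SmartPrompt::Message may differ.
Msg = Struct.new(:content, :timestamp, :tokens, :score)

# Pick the top_k highest-scoring messages, restore chronological order,
# then drop the oldest until the total token count fits max_tokens.
def select_top_k(messages, top_k:, max_tokens:)
  chosen = messages.max_by(top_k, &:score).sort_by(&:timestamp)
  chosen.shift while !chosen.empty? && chosen.sum(&:tokens) > max_tokens
  chosen
end

history = [
  Msg.new("greeting",        1, 10, 0.20),
  Msg.new("neural nets 101", 2, 40, 0.90),
  Msg.new("weather chat",    3, 15, 0.10),
  Msg.new("backprop detail", 4, 50, 0.85),
  Msg.new("latest question", 5, 30, 0.70)
]

selected = select_top_k(history, top_k: 3, max_tokens: 100)
puts selected.map(&:content).inspect
```

Here the three best-scoring messages total 120 tokens, so the oldest of them is dropped to fit the 100-token budget. Trimming by age rather than by score is a trade-off: it keeps the retained context contiguous with the most recent turns, at the cost of sometimes discarding the single most relevant message.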
112
+ ## Performance Characteristics
113
+
114
+ - **Time Complexity**: O(n log n) where n is message count (due to sorting)
115
+ - **Space Complexity**: O(n) for scoring all messages
116
+ - **Token Calculation**: Cached in Message objects for efficiency
117
+
118
+ ## Future Enhancements
119
+
120
+ The implementation supports optional embedding services for more sophisticated semantic similarity. When an embedding service is provided, the strategy will use vector-based cosine similarity instead of keyword matching.
121
+
122
+ ## Conclusion
123
+
124
+ The RelevanceBasedStrategy is fully implemented, tested, and integrated into the SmartPrompt framework. It provides intelligent message selection based on semantic relevance, making it ideal for complex discussions where context matters more than simple recency.