smart_prompt 0.5.0 → 0.5.1

@@ -0,0 +1,797 @@
1
+ # History Management Usage Guide
2
+
3
+ ## Table of Contents
4
+
5
+ 1. [Overview](#overview)
6
+ 2. [Quick Start](#quick-start)
7
+ 3. [Configuration](#configuration)
8
+ 4. [Basic Usage](#basic-usage)
9
+ 5. [Context Strategies](#context-strategies)
10
+ 6. [Session Management](#session-management)
11
+ 7. [Advanced Features](#advanced-features)
12
+ 8. [Migration Guide](#migration-guide)
13
+ 9. [Best Practices](#best-practices)
14
+ 10. [Troubleshooting](#troubleshooting)
15
+
16
+ ## Overview
17
+
18
+ SmartPrompt's History Management system tracks conversation history per session and provides:
19
+
20
+ - **Session Isolation**: Each conversation has its own independent history
21
+ - **Automatic Compression**: Reduce token usage while preserving context
22
+ - **Multiple Strategies**: Choose how messages are selected for context
23
+ - **Persistence**: Save and restore conversations across restarts
24
+ - **Performance**: LRU caching and async I/O to keep session access fast
25
+ - **Monitoring**: Built-in metrics and logging for debugging
26
+
27
+ ## Quick Start
28
+
29
+ ### 1. Enable History Management
30
+
31
+ Add to your `config/anthropic_config.yml`:
32
+
33
+ ```yaml
34
+ history:
35
+ cache_size: 100
36
+ session_defaults:
37
+ max_messages: 100
38
+ max_tokens: 4000
39
+ context_strategy: sliding_window
40
+ preserve_system_messages: true
41
+ persistence:
42
+ enabled: true
43
+ storage_path: "./history_data"
44
+ async: true
45
+ ```
46
+
47
+ ### 2. Use in Your Workers
48
+
49
+ ```ruby
50
+ SmartPrompt.define_worker :chat do
51
+ use "deepseek_anthropic"
52
+ model "deepseek-chat"
53
+
54
+ sys_msg("You are a helpful assistant.", params)
55
+ prompt(params[:message], with_history: true)
56
+ send_msg
57
+ end
58
+ ```
59
+
60
+ ### 3. Call the Worker
61
+
62
+ ```ruby
63
+ engine = SmartPrompt::Engine.new('config/anthropic_config.yml')
64
+
65
+ # First message
66
+ response1 = engine.call_worker(:chat, {
67
+ message: "What is Ruby?"
68
+ })
69
+
70
+ # Second message (with context from first)
71
+ response2 = engine.call_worker(:chat, {
72
+ message: "Can you show me an example?"
73
+ })
74
+ ```
75
+
76
+ ## Configuration
77
+
78
+ ### Complete Configuration Reference
79
+
80
+ ```yaml
81
+ history:
82
+ # Cache Configuration
83
+ cache_size: 100 # Max sessions in memory (LRU eviction)
84
+
85
+ # Session Defaults
86
+ session_defaults:
87
+ max_messages: 100 # Max messages per session
88
+ max_tokens: 4000 # Max tokens per session
89
+ context_strategy: sliding_window # Default strategy
90
+ preserve_system_messages: true # Keep system messages
91
+
92
+ # Strategy Configurations
93
+ strategies:
94
+ sliding_window:
95
+ window_size: 10 # Recent messages to keep
96
+ preserve_system: true
97
+
98
+ relevance_based:
99
+ top_k: 10 # Most relevant messages
100
+ recency_weight: 0.3 # Recency importance (0-1)
101
+ relevance_weight: 0.7 # Relevance importance (0-1)
102
+ embedding_service: null # Optional embedding service
103
+
104
+ summary_based:
105
+ summary_threshold: 20 # Trigger summarization
106
+ keep_recent: 5 # Recent messages to keep
107
+ compression_ratio: 0.5 # Target compression
108
+
109
+ hybrid:
110
+ mode: adaptive # 'adaptive' or 'combined'
111
+ sliding_window: {}
112
+ relevance_based: {}
113
+ summary_based: {}
114
+
115
+ # Compression
116
+ compression:
117
+ enabled: true
118
+ auto_compress_threshold: 50
119
+ compression_ratio: 0.5
120
+ llm_adapter: null
121
+
122
+ # Persistence
123
+ persistence:
124
+ enabled: true
125
+ backend: filesystem
126
+ storage_path: "./history_data"
127
+ async: true
128
+
129
+ # Cleanup
130
+ cleanup:
131
+ auto_cleanup: false
132
+ cleanup_interval: 3600 # 1 hour
133
+ session_ttl: 86400 # 24 hours
134
+ cleanup_callback: null
135
+
136
+ # Monitoring
137
+ monitoring:
138
+ enabled: true
139
+ log_level: info # debug, info, warn, error
140
+ metrics_format: prometheus
141
+ ```
142
+
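
The `cache_size` comment above mentions LRU eviction. As a rough illustration of that policy, least-recently-used eviction can be modeled with an insertion-ordered Ruby hash. This is a standalone sketch, not SmartPrompt's actual cache; the `LruSessionCache` class and its methods are hypothetical names.

```ruby
# Hypothetical sketch of an LRU session cache; not SmartPrompt's implementation.
class LruSessionCache
  def initialize(capacity)
    @capacity = capacity
    @store = {} # Ruby hashes preserve insertion order
  end

  # Return the cached value and mark the key as most recently used.
  def get(key)
    return nil unless @store.key?(key)

    value = @store.delete(key)
    @store[key] = value
  end

  # Insert or update a key, evicting the least recently used entry when full.
  def put(key, value)
    @store.delete(key)
    @store[key] = value
    @store.delete(@store.keys.first) if @store.size > @capacity
    value
  end

  def keys
    @store.keys
  end
end

cache = LruSessionCache.new(2)
cache.put("user_1", ["hi"])
cache.put("user_2", ["hello"])
cache.get("user_1")          # touch user_1 so user_2 becomes the oldest entry
cache.put("user_3", ["hey"]) # cache is full, so user_2 is evicted
p cache.keys
```

With persistence enabled, an evicted session would presumably be reloaded from disk on its next access, so a larger `cache_size` trades memory for fewer reloads.
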
143
+ ### Configuration Presets
144
+
145
+ #### High-Volume Chat Application
146
+
147
+ ```yaml
148
+ history:
149
+ cache_size: 1000
150
+ session_defaults:
151
+ max_messages: 50
152
+ max_tokens: 2000
153
+ context_strategy: sliding_window
154
+ cleanup:
155
+ auto_cleanup: true
156
+ session_ttl: 3600 # 1 hour
157
+ ```
158
+
159
+ #### Long-Running Conversations
160
+
161
+ ```yaml
162
+ history:
163
+ session_defaults:
164
+ max_messages: 500
165
+ max_tokens: 16000
166
+ context_strategy: summary_based
167
+ compression:
168
+ enabled: true
169
+ auto_compress_threshold: 100
170
+ ```
171
+
172
+ #### Semantic Search Application
173
+
174
+ ```yaml
175
+ history:
176
+ session_defaults:
177
+ context_strategy: relevance_based
178
+ strategies:
179
+ relevance_based:
180
+ top_k: 20
181
+ recency_weight: 0.2
182
+ relevance_weight: 0.8
183
+ ```
184
+
185
+ ## Basic Usage
186
+
187
+ ### Simple Chat with History
188
+
189
+ ```ruby
190
+ SmartPrompt.define_worker :simple_chat do
191
+ use "deepseek_anthropic"
192
+ model "deepseek-chat"
193
+
194
+ sys_msg("You are a helpful assistant.", params)
195
+ prompt(params[:message], with_history: true)
196
+ send_msg
197
+ end
198
+
199
+ # Usage
200
+ engine = SmartPrompt::Engine.new('config.yml')
201
+ response = engine.call_worker(:simple_chat, {
202
+ message: "Hello!"
203
+ })
204
+ ```
205
+
206
+ ### Chat with Explicit Session ID
207
+
208
+ ```ruby
209
+ SmartPrompt.define_worker :session_chat do
210
+ use "deepseek_anthropic"
211
+ model "deepseek-chat"
212
+
213
+ params[:session_id] ||= "default" # fall back to a shared session when none is given
214
+
215
+ sys_msg("You are a helpful assistant.", params)
216
+ prompt(params[:message], with_history: true)
217
+ send_msg
218
+ end
219
+
220
+ # Usage - separate conversations
221
+ response1 = engine.call_worker(:session_chat, {
222
+ session_id: "user_123",
223
+ message: "What's the weather?"
224
+ })
225
+
226
+ response2 = engine.call_worker(:session_chat, {
227
+ session_id: "user_456",
228
+ message: "Tell me a joke"
229
+ })
230
+ ```
231
+
232
+ ### Streaming with History
233
+
234
+ ```ruby
235
+ SmartPrompt.define_worker :streaming_chat do
236
+ use "deepseek_anthropic"
237
+ model "deepseek-chat"
238
+
239
+ sys_msg("You are a helpful assistant.", params)
240
+ prompt(params[:message], with_history: true)
241
+ send_msg_by_stream(params)
242
+ end
243
+
244
+ # Usage
245
+ engine.call_worker_by_stream(:streaming_chat, {
246
+ message: "Tell me a story"
247
+ }) do |chunk, bytesize|
248
+ print chunk.dig("choices", 0, "delta", "content")
249
+ end
250
+ ```
251
+
252
+ ## Context Strategies
253
+
254
+ ### 1. Sliding Window Strategy
255
+
256
+ Keeps the most recent N messages. Best for real-time chat and short conversations.
257
+
258
+ ```ruby
259
+ SmartPrompt.define_worker :sliding_chat do
260
+ use "deepseek_anthropic"
261
+ model "deepseek-chat"
262
+
263
+ session_config = {
264
+ max_messages: 20,
265
+ max_tokens: 2000,
266
+ context_strategy: :sliding_window
267
+ }
268
+
269
+ sys_msg("You are a customer support assistant.", params)
270
+ prompt(params[:message], with_history: true)
271
+ params.merge!(session_config: session_config) # merge! updates params in place; merge would discard the result
272
+ send_msg
273
+ end
274
+ ```
275
+
276
+ **When to use:**
277
+ - Real-time chat applications
278
+ - Customer support conversations
279
+ - Short, focused interactions
280
+ - When recent context is most important
281
+
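
To make the selection rule concrete, here is a minimal sketch of a sliding window over plain message hashes. It assumes that `window_size` counts only non-system messages when `preserve_system` is true. The `sliding_window` helper is a hypothetical name and does not reflect how SmartPrompt implements the strategy internally.

```ruby
# Hypothetical helper: keep system messages plus the last `window_size` others.
# Note: system messages are moved to the front, which changes the order if they
# were interleaved with other messages.
def sliding_window(messages, window_size:, preserve_system: true)
  return messages.last(window_size) unless preserve_system

  system, rest = messages.partition { |m| m[:role] == "system" }
  system + rest.last(window_size)
end

history = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "What is Ruby?" },
  { role: "assistant", content: "A programming language." },
  { role: "user", content: "Can you show me an example?" }
]

context = sliding_window(history, window_size: 2)
context.each { |m| puts "#{m[:role]}: #{m[:content]}" }
```
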
282
+ ### 2. Relevance-Based Strategy
283
+
284
+ Selects messages based on semantic similarity to the current message. Best for Q&A and knowledge bases.
285
+
286
+ ```ruby
287
+ SmartPrompt.define_worker :relevance_chat do
288
+ use "deepseek_anthropic"
289
+ model "deepseek-chat"
290
+
291
+ session_config = {
292
+ max_messages: 100,
293
+ max_tokens: 4000,
294
+ context_strategy: :relevance_based
295
+ }
296
+
297
+ sys_msg("You are a knowledgeable assistant.", params)
298
+ prompt(params[:message], with_history: true)
299
+ params.merge!(session_config: session_config) # merge! updates params in place; merge would discard the result
300
+ send_msg
301
+ end
302
+ ```
303
+
304
+ **When to use:**
305
+ - Q&A systems
306
+ - Knowledge base assistants
307
+ - Context-aware applications
308
+ - When semantic relevance matters more than recency
309
+
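
One way to read the `recency_weight` and `relevance_weight` settings is as a weighted sum per message. The sketch below assumes that interpretation and uses word-overlap (Jaccard) similarity as a stand-in for embedding similarity. The helpers `jaccard` and `select_relevant` are hypothetical, and the scoring actually used by SmartPrompt may differ.

```ruby
require "set"

# Word-overlap (Jaccard) similarity; a stand-in for real embedding similarity.
def jaccard(a, b)
  wa = a.downcase.scan(/\w+/).to_set
  wb = b.downcase.scan(/\w+/).to_set
  union = wa | wb
  union.empty? ? 0.0 : (wa & wb).size.to_f / union.size
end

# Score each message as a weighted sum of relevance and recency, keep the
# top_k highest scores, and return those messages in their original order.
def select_relevant(messages, query, top_k:, recency_weight: 0.3, relevance_weight: 0.7)
  last_index = [messages.size - 1, 1].max
  scored = messages.each_with_index.map do |msg, i|
    recency = i.to_f / last_index # 0.0 for the oldest, 1.0 for the newest
    [relevance_weight * jaccard(msg, query) + recency_weight * recency, i]
  end
  scored.max_by(top_k) { |score, _| score }.map { |_, i| i }.sort.map { |i| messages[i] }
end

history = [
  "How do I install Ruby on Linux?",
  "What is the weather today?",
  "Tell me a joke",
  "Which Ruby version manager should I use?"
]

picked = select_relevant(history, "Ruby install steps", top_k: 2,
                         recency_weight: 0.2, relevance_weight: 0.8)
puts picked
```

Raising `recency_weight` makes the selection behave more like a sliding window; raising `relevance_weight` favors older messages that match the current question.
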
310
+ ### 3. Summary-Based Strategy
311
+
312
+ Automatically compresses old messages into summaries. Best for long conversations.
313
+
314
+ ```ruby
315
+ SmartPrompt.define_worker :summary_chat do
316
+ use "deepseek_anthropic"
317
+ model "deepseek-chat"
318
+
319
+ session_config = {
320
+ max_messages: 200,
321
+ max_tokens: 8000,
322
+ context_strategy: :summary_based
323
+ }
324
+
325
+ sys_msg("You are a thoughtful assistant.", params)
326
+ prompt(params[:message], with_history: true)
327
+ params.merge!(session_config: session_config) # merge! updates params in place; merge would discard the result
328
+ send_msg
329
+ end
330
+ ```
331
+
332
+ **When to use:**
333
+ - Extended conversations
334
+ - Documentation generation
335
+ - Long-running dialogues
336
+ - When token efficiency is critical
337
+
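
The `summary_threshold` and `keep_recent` settings suggest a simple compaction rule, sketched below under that assumption. `compact_history` and `naive_summarizer` are hypothetical names. A real summarizer would call an LLM; here it is a stub so the example runs on its own, and `keep_recent` is assumed to be positive.

```ruby
# Hypothetical sketch: once history exceeds summary_threshold, fold everything
# except the last keep_recent messages into a single summary message.
def compact_history(messages, summary_threshold:, keep_recent:, summarizer:)
  return messages if messages.size <= summary_threshold

  older = messages[0...-keep_recent]
  recent = messages.last(keep_recent)
  summary = {
    role: "system",
    content: "Summary of earlier conversation: #{summarizer.call(older)}"
  }
  [summary] + recent
end

# Stand-in summarizer that joins and truncates; a real one would call an LLM.
naive_summarizer = ->(msgs) { msgs.map { |m| m[:content] }.join(" | ")[0, 80] }

history = (1..6).map do |i|
  { role: i.odd? ? "user" : "assistant", content: "message #{i}" }
end

compacted = compact_history(history, summary_threshold: 4, keep_recent: 2,
                            summarizer: naive_summarizer)
compacted.each { |m| puts "#{m[:role]}: #{m[:content]}" }
```
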
338
+ ### 4. Hybrid Strategy
339
+
340
+ Adaptively combines multiple strategies. Best for general-purpose applications.
341
+
342
+ ```ruby
343
+ SmartPrompt.define_worker :hybrid_chat do
344
+ use "deepseek_anthropic"
345
+ model "deepseek-chat"
346
+
347
+ session_config = {
348
+ max_messages: 150,
349
+ max_tokens: 6000,
350
+ context_strategy: :hybrid
351
+ }
352
+
353
+ sys_msg("You are an intelligent assistant.", params)
354
+ prompt(params[:message], with_history: true)
355
+ params.merge!(session_config: session_config) # merge! updates params in place; merge would discard the result
356
+ send_msg
357
+ end
358
+ ```
359
+
360
+ **When to use:**
361
+ - General-purpose applications
362
+ - Varied conversation types
363
+ - When you want automatic optimization
364
+ - Production applications with diverse use cases
365
+
366
+ ## Session Management
367
+
368
+ ### Clear Session History
369
+
370
+ ```ruby
371
+ # Clear all messages except system messages
372
+ engine.history_manager.clear_session("user_123", keep_system_messages: true)
373
+
374
+ # Clear all messages including system messages
375
+ engine.history_manager.clear_session("user_123", keep_system_messages: false)
376
+ ```
377
+
378
+ ### Export Session Data
379
+
380
+ ```ruby
381
+ # Export as JSON string
382
+ json_data = engine.history_manager.export_session("user_123", format: :json)
383
+
384
+ # Export as Hash
385
+ hash_data = engine.history_manager.export_session("user_123", format: :hash)
386
+
387
+ # Save to file
388
+ File.write("session_backup.json", json_data)
389
+ ```
390
+
391
+ ### Search Messages
392
+
393
+ ```ruby
394
+ # Search for messages containing specific text
395
+ results = engine.history_manager.search_messages("user_123", "Ruby programming")
396
+
397
+ results.each do |message|
398
+ puts "#{message.role}: #{message.content}"
399
+ end
400
+ ```
401
+
402
+ ### Get Session Statistics
403
+
404
+ ```ruby
405
+ # Session-specific stats
406
+ stats = engine.history_manager.get_stats("user_123")
407
+ puts "Messages: #{stats[:message_count]}"
408
+ puts "Tokens: #{stats[:total_tokens]}"
409
+
410
+ # System-wide stats
411
+ system_stats = engine.history_manager.get_stats
412
+ puts "Active sessions: #{system_stats[:active_sessions]}"
413
+ puts "Cache hit rate: #{system_stats[:cache_hit_rate]}"
414
+ ```
415
+
416
+ ### Delete Session
417
+
418
+ ```ruby
419
+ # Completely remove session from memory and disk
420
+ engine.history_manager.delete_session("user_123")
421
+ ```
422
+
423
+ ### Check Session Existence
424
+
425
+ ```ruby
426
+ if engine.history_manager.session_exists?("user_123")
427
+ puts "Session exists"
428
+ end
429
+ ```
430
+
431
+ ### List All Sessions
432
+
433
+ ```ruby
434
+ session_ids = engine.history_manager.session_ids
435
+ puts "Active sessions: #{session_ids.join(', ')}"
436
+ ```
437
+
438
+ ## Advanced Features
439
+
440
+ ### Custom Session Configuration
441
+
442
+ ```ruby
443
+ SmartPrompt.define_worker :custom_chat do
444
+ use "deepseek_anthropic"
445
+ model "deepseek-chat"
446
+
447
+ # Fine-tuned configuration
448
+ session_config = {
449
+ max_messages: 100,
450
+ max_tokens: 8000,
451
+ context_strategy: :relevance_based,
452
+ preserve_system_messages: true,
453
+ strategy_config: {
454
+ top_k: 15,
455
+ recency_weight: 0.4,
456
+ relevance_weight: 0.6
457
+ }
458
+ }
459
+
460
+ sys_msg("You are a code reviewer.", params)
461
+ prompt(params[:message], with_history: true)
462
+ params.merge!(session_config: session_config) # merge! updates params in place; merge would discard the result
463
+ send_msg
464
+ end
465
+ ```
466
+
467
+ ### Multi-User Applications
468
+
469
+ ```ruby
470
+ SmartPrompt.define_worker :multi_user_chat do
471
+ use "deepseek_anthropic"
472
+ model "deepseek-chat"
473
+
474
+ # Isolate sessions by user ID
475
+ user_id = params[:user_id] || raise("user_id required")
476
+ session_id = "user_#{user_id}"
477
+
478
+ sys_msg("You are #{params[:user_name]}'s assistant.", params)
479
+ prompt(params[:message], with_history: true)
480
+ params.merge!(session_id: session_id) # merge! updates params in place; merge would discard the result
481
+ send_msg
482
+ end
483
+
484
+ # Usage
485
+ response = engine.call_worker(:multi_user_chat, {
486
+ user_id: "123",
487
+ user_name: "Alice",
488
+ message: "Hello!"
489
+ })
490
+ ```
491
+
492
+ ### Monitoring and Metrics
493
+
494
+ ```ruby
495
+ # Get Prometheus-formatted metrics
496
+ metrics = engine.history_manager.export_metrics(format: :prometheus)
497
+ puts metrics
498
+
499
+ # Get JSON metrics
500
+ json_metrics = engine.history_manager.export_metrics(format: :json)
501
+
502
+ # Get raw hash
503
+ hash_metrics = engine.history_manager.export_metrics(format: :hash)
504
+ ```
505
+
506
+ ### Manual Cleanup
507
+
508
+ ```ruby
509
+ # Manually trigger cleanup of expired sessions
510
+ expired = engine.history_manager.cleanup_expired_sessions
511
+ puts "Cleaned up #{expired.count} sessions"
512
+ ```
513
+
514
+ ### Custom Cleanup Logic
515
+
516
+ ```yaml
517
+ # In config file
518
+ history:
519
+ cleanup:
520
+ auto_cleanup: true
521
+ cleanup_interval: 3600
522
+ cleanup_callback: !ruby/object:Proc |
523
+ lambda do |session, age|
524
+ # Custom logic: cleanup if inactive for 2 hours
525
+ age > 7200
526
+ end
527
+ ```
528
+
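
The interaction between `session_ttl` and a cleanup callback can be pictured as follows. This sketch assumes the callback, when present, replaces the default TTL check and receives the session and its age in seconds. The `Session` struct and `expired_sessions` helper are hypothetical and exist only for this illustration.

```ruby
# Hypothetical sketch of TTL-based cleanup with an optional override predicate.
Session = Struct.new(:id, :last_active_at)

def expired_sessions(sessions, now:, session_ttl:, callback: nil)
  sessions.select do |session|
    age = now - session.last_active_at # seconds since last activity
    callback ? callback.call(session, age) : age > session_ttl
  end
end

now = Time.at(1_000_000)
sessions = [
  Session.new("user_1", now - 600),    # active 10 minutes ago
  Session.new("user_2", now - 10_000), # inactive for almost 3 hours
  Session.new("user_3", now - 90_000)  # inactive for 25 hours
]

by_ttl = expired_sessions(sessions, now: now, session_ttl: 86_400)
by_callback = expired_sessions(sessions, now: now, session_ttl: 86_400,
                               callback: ->(_session, age) { age > 7200 })

puts "TTL only: #{by_ttl.map(&:id).join(', ')}"
puts "With callback: #{by_callback.map(&:id).join(', ')}"
```
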
529
+ ## Migration Guide
530
+
531
+ ### From Old History Implementation
532
+
533
+ The new history management system is **backward compatible**. Existing code continues to work without changes.
534
+
535
+ #### Old Code (Still Works)
536
+
537
+ ```ruby
538
+ SmartPrompt.define_worker :old_chat do
539
+ use "deepseek_anthropic"
540
+ model "deepseek-chat"
541
+
542
+ sys_msg("You are a helpful assistant.", params)
543
+ prompt(params[:message], with_history: true)
544
+ send_msg(with_history: true)
545
+ end
546
+ ```
547
+
548
+ #### New Code (Recommended)
549
+
550
+ ```ruby
551
+ SmartPrompt.define_worker :new_chat do
552
+ use "deepseek_anthropic"
553
+ model "deepseek-chat"
554
+
555
+ params[:session_id] ||= "default" # fall back to a shared session when none is given
556
+
557
+ sys_msg("You are a helpful assistant.", params)
558
+ prompt(params[:message], with_history: true)
559
+ send_msg
560
+ end
561
+ ```
562
+
563
+ ### Migration Steps
564
+
565
+ 1. **Enable History Management** in your config file:
566
+
567
+ ```yaml
568
+ history:
569
+ cache_size: 100
570
+ session_defaults:
571
+ max_messages: 100
572
+ max_tokens: 4000
573
+ context_strategy: sliding_window
574
+ persistence:
575
+ enabled: true
576
+ storage_path: "./history_data"
577
+ ```
578
+
579
+ 2. **Update Workers Gradually**:
580
+ - Old workers continue to work
581
+ - Add `session_id` parameter for session isolation
582
+ - Configure strategies as needed
583
+
584
+ 3. **Test in Development**:
585
+ - Enable debug logging: `log_level: debug`
586
+ - Monitor statistics: `engine.history_manager.get_stats`
587
+ - Verify session isolation
588
+
589
+ 4. **Deploy to Production**:
590
+ - Start with conservative limits
591
+ - Monitor performance metrics
592
+ - Adjust configuration based on usage
593
+
594
+ ### Breaking Changes
595
+
596
+ **None!** The new system is fully backward compatible.
597
+
598
+ ### Deprecation Warnings
599
+
600
+ If you see deprecation warnings, update your code:
601
+
602
+ ```ruby
603
+ # Deprecated (but still works)
604
+ @engine.history_messages
605
+
606
+ # Recommended
607
+ @engine.history_manager.get_context(session_id)
608
+ ```
609
+
610
+ ## Best Practices
611
+
612
+ ### 1. Choose the Right Strategy
613
+
614
+ - **Sliding Window**: Real-time chat, customer support
615
+ - **Relevance-Based**: Q&A, knowledge bases
616
+ - **Summary-Based**: Long conversations, documentation
617
+ - **Hybrid**: General-purpose, production apps
618
+
619
+ ### 2. Set Appropriate Limits
620
+
621
+ ```yaml
622
+ # For short conversations
623
+ max_messages: 20
624
+ max_tokens: 2000
625
+
626
+ # For long conversations
627
+ max_messages: 200
628
+ max_tokens: 8000
629
+
630
+ # For extended dialogues
631
+ max_messages: 500
632
+ max_tokens: 16000
633
+ ```
634
+
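
To get a feel for how `max_messages` and `max_tokens` interact, the following sketch trims a history to both limits. It assumes a crude estimate of about four characters per token, which real tokenizers will not match exactly. `estimate_tokens` and `trim_to_budget` are hypothetical helpers, not SmartPrompt functions.

```ruby
# Rough heuristic: about 4 characters per token. Real tokenizers differ.
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# Keep at most max_messages recent messages, then drop the oldest ones until
# the estimated token total fits within max_tokens (always keeping at least one).
def trim_to_budget(messages, max_messages:, max_tokens:)
  kept = messages.last(max_messages)
  kept = kept.drop(1) while kept.size > 1 && kept.sum { |m| estimate_tokens(m) } > max_tokens
  kept
end

messages = ["a" * 40, "b" * 40, "c" * 40] # about 10 estimated tokens each
trimmed = trim_to_budget(messages, max_messages: 10, max_tokens: 25)
puts "Kept #{trimmed.size} of #{messages.size} messages"
```
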
635
+ ### 3. Use Session IDs Effectively
636
+
637
+ ```ruby
638
+ # User-based sessions
639
+ session_id = "user_#{user_id}"
640
+
641
+ # Conversation-based sessions
642
+ session_id = "conv_#{conversation_id}"
643
+
644
+ # Thread-based sessions
645
+ session_id = "thread_#{thread_id}"
646
+
647
+ # Temporary sessions
648
+ session_id = "temp_#{SecureRandom.uuid}"
649
+ ```
650
+
651
+ ### 4. Enable Persistence for Production
652
+
653
+ ```yaml
654
+ persistence:
655
+ enabled: true
656
+ storage_path: "./history_data"
657
+ async: true # Better performance
658
+ ```
659
+
660
+ ### 5. Configure Cleanup
661
+
662
+ ```yaml
663
+ cleanup:
664
+ auto_cleanup: true
665
+ cleanup_interval: 3600 # 1 hour
666
+ session_ttl: 86400 # 24 hours
667
+ ```
668
+
669
+ ### 6. Monitor Performance
670
+
671
+ ```ruby
672
+ # Regular monitoring
673
+ stats = engine.history_manager.get_stats
674
+ puts "Cache hit rate: #{stats[:cache_hit_rate]}"
675
+ puts "Active sessions: #{stats[:active_sessions]}"
676
+
677
+ # Export metrics for monitoring tools
678
+ metrics = engine.history_manager.export_metrics(format: :prometheus)
679
+ ```
680
+
681
+ ### 7. Handle Errors Gracefully
682
+
683
+ ```ruby
684
+ begin
685
+ response = engine.call_worker(:chat, params)
686
+ rescue SmartPrompt::HistoryManagerError => e
687
+ logger.error "History error: #{e.message}"
688
+ # Fallback to stateless conversation
689
+ end
690
+ ```
691
+
692
+ ### 8. Test Session Isolation
693
+
694
+ ```ruby
695
+ # Ensure sessions don't leak
696
+ response1 = engine.call_worker(:chat, {
697
+ session_id: "session_1",
698
+ message: "Remember: my name is Alice"
699
+ })
700
+
701
+ response2 = engine.call_worker(:chat, {
702
+ session_id: "session_2",
703
+ message: "What's my name?"
704
+ })
705
+
706
+ # response2 should not know the name
707
+ ```
708
+
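
The same isolation property can be checked without calling an LLM. The sketch below uses a hypothetical in-memory `SessionStore`, which is not part of SmartPrompt, to show what "sessions don't leak" means at the data level: each session ID maps to its own message list.

```ruby
# Hypothetical in-memory store: one message array per session ID.
class SessionStore
  def initialize
    @sessions = Hash.new { |hash, id| hash[id] = [] }
  end

  def add(session_id, role, content)
    @sessions[session_id] << { role: role, content: content }
  end

  # Return a copy so callers cannot modify the stored history.
  def context(session_id)
    @sessions[session_id].dup
  end
end

store = SessionStore.new
store.add("session_1", "user", "Remember: my name is Alice")
store.add("session_2", "user", "What's my name?")

leaked = store.context("session_2").any? { |m| m[:content].include?("Alice") }
puts "Leak detected: #{leaked}"
```
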
709
+ ## Troubleshooting
710
+
711
+ ### Issue: Sessions Not Persisting
712
+
713
+ **Solution**: Check persistence configuration
714
+
715
+ ```yaml
716
+ persistence:
717
+ enabled: true
718
+ storage_path: "./history_data" # Ensure directory exists and is writable
719
+ ```
720
+
721
+ ### Issue: High Memory Usage
722
+
723
+ **Solution**: Reduce cache size and enable cleanup
724
+
725
+ ```yaml
726
+ cache_size: 50 # Reduce from 100
727
+ cleanup:
728
+ auto_cleanup: true
729
+ session_ttl: 3600 # 1 hour instead of 24
730
+ ```
731
+
732
+ ### Issue: Context Too Large
733
+
734
+ **Solution**: Reduce token limits or use compression
735
+
736
+ ```yaml
737
+ session_defaults:
738
+ max_tokens: 2000 # Reduce from 4000
739
+ compression:
740
+ enabled: true
741
+ auto_compress_threshold: 30
742
+ ```
743
+
744
+ ### Issue: Slow Performance
745
+
746
+ **Solution**: Enable async persistence and increase cache
747
+
748
+ ```yaml
749
+ cache_size: 200 # Increase cache
750
+ persistence:
751
+ async: true # Enable async writes
752
+ ```
753
+
754
+ ### Issue: Sessions Not Isolated
755
+
756
+ **Solution**: Ensure unique session IDs
757
+
758
+ ```ruby
759
+ # Wrong - same session for all users
760
+ session_id = "default"
761
+
762
+ # Correct - unique per user
763
+ session_id = "user_#{params[:user_id]}"
764
+ ```
765
+
766
+ ### Issue: Logs Lack Detail
767
+
768
+ **Solution**: Enable debug mode
769
+
770
+ ```yaml
771
+ monitoring:
772
+ enabled: true
773
+ log_level: debug # See detailed logs
774
+ ```
775
+
776
+ ### Issue: Metrics Not Available
777
+
778
+ **Solution**: Ensure monitoring is enabled
779
+
780
+ ```yaml
781
+ monitoring:
782
+ enabled: true
783
+ metrics_format: prometheus
784
+ ```
785
+
786
+ ## Support
787
+
788
+ For more help:
789
+
790
+ - 📖 [Main Documentation](README.md)
791
+ - 🐛 [Issue Tracker](https://github.com/zhuangbiaowei/smart_prompt/issues)
792
+ - 💬 [Discussions](https://github.com/zhuangbiaowei/smart_prompt/discussions)
793
+ - 📧 Email: zbw@kaiyuanshe.org
794
+
795
+ ---
796
+
797
+ **SmartPrompt History Management** - Intelligent conversation history for production applications.