mistral_translator 0.1.0 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (52)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +44 -0
  3. data/CONTRIBUTING.md +70 -0
  4. data/LICENSE.txt +6 -6
  5. data/README.md +212 -119
  6. data/README_TESTING.md +33 -0
  7. data/SECURITY.md +157 -0
  8. data/docs/.nojekyll +2 -0
  9. data/docs/404.html +30 -0
  10. data/docs/README.md +153 -0
  11. data/docs/advanced-usage/batch-processing.md +158 -0
  12. data/docs/advanced-usage/concurrent-async.md +270 -0
  13. data/docs/advanced-usage/error-handling.md +106 -0
  14. data/docs/advanced-usage/monitoring.md +133 -0
  15. data/docs/advanced-usage/summarization.md +86 -0
  16. data/docs/advanced-usage/translations.md +141 -0
  17. data/docs/api-reference/callbacks.md +231 -0
  18. data/docs/api-reference/configuration.md +74 -0
  19. data/docs/api-reference/errors.md +673 -0
  20. data/docs/api-reference/methods.md +539 -0
  21. data/docs/getting-started.md +179 -0
  22. data/docs/index.html +27 -0
  23. data/docs/installation.md +142 -0
  24. data/docs/migration-0.1.0-to-0.2.0.md +61 -0
  25. data/docs/rails-integration/adapters.md +84 -0
  26. data/docs/rails-integration/controllers.md +107 -0
  27. data/docs/rails-integration/jobs.md +97 -0
  28. data/docs/rails-integration/setup.md +339 -0
  29. data/examples/basic_usage.rb +129 -102
  30. data/examples/batch-job.rb +511 -0
  31. data/examples/monitoring-setup.rb +499 -0
  32. data/examples/rails-model.rb +399 -0
  33. data/lib/mistral_translator/adapters.rb +261 -0
  34. data/lib/mistral_translator/client.rb +103 -100
  35. data/lib/mistral_translator/client_helpers.rb +191 -0
  36. data/lib/mistral_translator/configuration.rb +191 -1
  37. data/lib/mistral_translator/errors.rb +16 -0
  38. data/lib/mistral_translator/helpers.rb +292 -0
  39. data/lib/mistral_translator/helpers_extensions.rb +150 -0
  40. data/lib/mistral_translator/levenshtein_helpers.rb +40 -0
  41. data/lib/mistral_translator/logger.rb +39 -8
  42. data/lib/mistral_translator/prompt_builder.rb +93 -41
  43. data/lib/mistral_translator/prompt_helpers.rb +83 -0
  44. data/lib/mistral_translator/prompt_metadata_helpers.rb +42 -0
  45. data/lib/mistral_translator/response_parser.rb +194 -23
  46. data/lib/mistral_translator/security.rb +72 -0
  47. data/lib/mistral_translator/summarizer.rb +41 -2
  48. data/lib/mistral_translator/translator.rb +174 -98
  49. data/lib/mistral_translator/translator_helpers.rb +268 -0
  50. data/lib/mistral_translator/version.rb +1 -1
  51. data/lib/mistral_translator.rb +51 -25
  52. metadata +55 -3
@@ -0,0 +1,270 @@
1
+ # Concurrent & Asynchronous Processing
2
+
3
+ MistralTranslator is thread-safe and optimized for concurrent usage with protected metrics, connection pooling, and built-in rate limiting.
4
+
5
+ ## Thread Safety
6
+
7
+ **Protected components:**
8
+ - Metrics tracking with Mutex
9
+ - Logger cache with concurrent access control
10
+ - Rate limiter with thread synchronization
11
+ - HTTP connection pooling (Net::HTTP::Persistent)
12
+
13
+ ## Ruby Threads
14
+
15
+ ### Basic Usage
16
+
17
+ ```ruby
18
+ languages = ['fr', 'es', 'de', 'it']
19
+ threads = languages.map do |lang|
20
+   Thread.new do
21
+     MistralTranslator.translate("Hello", from: 'en', to: lang)
22
+   rescue MistralTranslator::Error => e
23
+     { error: e.message }
24
+   end
25
+ end
26
+
27
+ results = threads.map(&:value)
28
+ ```
29
+
30
+ ### Thread Pool Pattern
31
+
32
+ ```ruby
33
+ # Safe: Limited threads
34
+ texts.each_slice(5) do |batch|
35
+   threads = batch.map { |t| Thread.new { translate(t) } }
36
+   threads.each(&:join)
37
+ end
38
+
39
+ # Dangerous: Unlimited threads can exhaust memory
40
+ texts.map { |t| Thread.new { translate(t) } } # Avoid this
41
+ ```
42
+
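The `each_slice` pattern above waits for the slowest thread in each batch before starting the next one. A minimal sketch of an alternative that keeps a fixed number of workers busy, using only the standard library's `Queue`: `bounded_map` and `FAKE_TRANSLATE` are hypothetical names introduced for illustration, and the lambda stands in for `MistralTranslator.translate` so the example runs offline.

```ruby
# Hypothetical stand-in for MistralTranslator.translate so the sketch runs offline.
FAKE_TRANSLATE = ->(text) { text.upcase }

# Maps `items` through the block using at most `size` worker threads and
# returns the results in input order.
def bounded_map(items, size: 5, &work)
  queue = Queue.new
  items.each_with_index { |item, index| queue << [item, index] }
  queue.close # once drained, Queue#pop returns nil and the workers exit

  results = Array.new(items.size)
  workers = Array.new(size) do
    Thread.new do
      while (job = queue.pop)
        item, index = job
        # Each worker writes to its own slot, so no extra locking is used here.
        results[index] = begin
          work.call(item)
        rescue StandardError => e
          { error: e.message } # same error shape as the basic example
        end
      end
    end
  end
  workers.each(&:join)
  results
end

texts = %w[hello world foo bar baz qux]
translated = bounded_map(texts, size: 3) { |t| FAKE_TRANSLATE.call(t) }
```

In real use the block would call `MistralTranslator.translate(t, from: 'en', to: lang)` instead of the stand-in.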
43
+ ## Concurrent Ruby (Recommended)
44
+
45
+ ### Installation
46
+
47
+ ```ruby
48
+ # Gemfile
49
+ gem 'concurrent-ruby', '~> 1.2'
50
+ ```
51
+
52
+ ### Fixed Thread Pool
53
+
54
+ ```ruby
55
+ require 'concurrent'
56
+
57
+ pool = Concurrent::FixedThreadPool.new(5)
58
+
59
+ futures = languages.map do |lang|
60
+   Concurrent::Future.execute(executor: pool) do
61
+     MistralTranslator.translate(text, from: 'en', to: lang)
62
+   end
63
+ end
64
+
65
+ results = futures.map(&:value)
66
+
67
+ pool.shutdown
68
+ pool.wait_for_termination
69
+ ```
70
+
71
+ ### Promise Chains
72
+
73
+ ```ruby
74
+ promise = Concurrent::Promise.execute do
75
+   MistralTranslator.translate_auto(text, to: 'en')
76
+ end.then do |english|
77
+   target_langs.map do |lang|
78
+     MistralTranslator.translate(english, from: 'en', to: lang)
79
+   end
80
+ end.rescue do |error|
81
+   Rails.logger.error "Pipeline failed: #{error}"
82
+   []
83
+ end
84
+
85
+ results = promise.value
86
+ ```
87
+
88
+ ## Background Jobs
89
+
90
+ ### SolidQueue (Rails 8+, Recommended)
91
+
92
+ SolidQueue is the default ActiveJob backend for new Rails 8 applications. It is database-backed and does not require Redis.
93
+
94
+ ```ruby
95
+ # config/database.yml (YAML, shown as comments) - SolidQueue uses your existing database
96
+ #   production:
97
+ #     primary:
98
+ #       <<: *default
99
+ #     queue:                # Separate DB for jobs (optional)
100
+ #       <<: *default
101
+ #       database: app_queue
102
+ #       migrations_paths: db/queue_migrate
103
+
104
+ # app/jobs/translation_job.rb
105
+ class TranslationJob < ApplicationJob
106
+   queue_as :translations
107
+
108
+   retry_on MistralTranslator::RateLimitError, wait: :polynomially_longer
109
+   retry_on MistralTranslator::ApiError, wait: 5.seconds, attempts: 3
110
+
111
+   def perform(text, from_locale, to_locale)
112
+     MistralTranslator.translate(text, from: from_locale, to: to_locale)
113
+   end
114
+ end
115
+
116
+ # Usage
117
+ languages.each { |lang| TranslationJob.perform_later(text, 'en', lang) }
118
+ ```
119
+
120
+ **Configuration:**
121
+ ```yaml
122
+ # config/recurring.yml - Schedule periodic translations
123
+ production:
124
+   sync_translations:
125
+     class: TranslationSyncJob
126
+     schedule: every day at 3am
127
+     queue: translations
128
+ ```
129
+
130
+ ### Sidekiq / Other Backends
131
+
132
+ Compatible with any ActiveJob backend:
133
+
134
+ ```ruby
135
+ class TranslationJob < ApplicationJob
136
+   queue_as :translations
137
+   retry_on MistralTranslator::RateLimitError, wait: :polynomially_longer # :exponentially_longer is deprecated since Rails 7.1
138
+
139
+   def perform(text, from, to)
140
+     MistralTranslator.translate(text, from: from, to: to)
141
+   end
142
+ end
143
+ ```
144
+
145
+ ## Performance Best Practices
146
+
147
+ ### Configuration for Concurrent Use
148
+
149
+ ```ruby
150
+ MistralTranslator.configure do |config|
151
+   config.enable_metrics = true
152
+   config.retry_delays = [1, 2, 4, 8] # Shorter for concurrent use
153
+
154
+   config.on_rate_limit = ->(from, to, wait, attempt, ts) {
155
+     Rails.logger.warn "Rate limit: #{wait}s wait"
156
+   }
157
+ end
158
+ ```
159
+
160
+ ### Recommendations
161
+
162
+ **Do:**
163
+ - Use fixed thread pools (5-10 threads)
164
+ - Enable metrics for monitoring
165
+ - Use background jobs for non-critical tasks
166
+ - Batch similar requests
167
+
168
+ **Avoid:**
169
+ - Unlimited thread creation
170
+ - Blocking user requests with translations
171
+ - Sharing API keys across environments
172
+
173
+ ## Rate Limiting
174
+
175
+ The built-in rate limiter (50 requests per 60 seconds) is thread-safe, but it is scoped to a single process.
176
+
177
+ **Multiple processes:** Each process has its own limiter, so the limits add up. With 4 Puma workers the combined ceiling is 4 × 50 = 200 requests/min.
178
+
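To make "thread-safe but per-process" concrete, here is a minimal sketch of a sliding-window limiter guarded by a `Mutex`. It is illustrative only and is not the gem's internal implementation: `SlidingWindowLimiter`, `try_acquire`, and the injectable `clock` are assumptions made for this example. Only the 50 requests per 60 seconds figure and the `wait_and_record!` method name come from this page.

```ruby
# Illustrative only: not the gem's internal limiter.
class SlidingWindowLimiter
  def initialize(max_requests: 50, window: 60,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @max_requests = max_requests
    @window = window
    @clock = clock          # injectable so the behavior can be checked without sleeping
    @timestamps = []        # lives in this process only, hence "per-process"
    @mutex = Mutex.new
  end

  # Records the request and returns 0 if a slot is free; otherwise returns the
  # number of seconds until the oldest recorded request leaves the window.
  def try_acquire
    @mutex.synchronize do
      now = @clock.call
      @timestamps.reject! { |t| now - t >= @window }
      if @timestamps.size < @max_requests
        @timestamps << now
        0
      else
        @window - (now - @timestamps.first)
      end
    end
  end

  # Blocks the calling thread until a slot is free, then records the request.
  def wait_and_record!
    while (delay = try_acquire) > 0
      sleep(delay)
    end
  end
end
```

Because the timestamps live in one process's memory, N processes allow N times the configured rate. To stay under a shared quota without Redis, one option is to give each process `max_requests: quota / N`.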
179
+ ### Custom Rate Limiter
180
+
181
+ ```ruby
182
+ class RedisRateLimiter
183
+   def wait_and_record!
184
+     key = "mistral:#{Time.now.to_i / 60}" # one shared counter per clock minute
185
+     count = REDIS.incr(key) # REDIS: your application's Redis client
186
+     REDIS.expire(key, 60) if count == 1
187
+
188
+     sleep(60 - Time.now.to_i % 60) if count > 50 # over quota: wait for the next minute
189
+   end
190
+ end
191
+
192
+ client = MistralTranslator::Client.new(
193
+   rate_limiter: RedisRateLimiter.new
194
+ )
195
+ ```
196
+
197
+ ## Monitoring
198
+
199
+ ```ruby
200
+ MistralTranslator.configure do |config|
201
+   config.enable_metrics = true
202
+
203
+   config.on_translation_complete = ->(from, to, orig, trans, duration) {
204
+     Rails.logger.info "[#{Thread.current.object_id}] #{from}→#{to}: #{duration.round(2)}s"
205
+   }
206
+ end
207
+
208
+ # Thread-safe metrics
209
+ metrics = MistralTranslator.metrics
210
+ puts "Total: #{metrics[:total_translations]}"
211
+ puts "Avg time: #{metrics[:average_translation_time]}s"
212
+ puts "Error rate: #{metrics[:error_rate]}%"
213
+ ```
214
+
215
+ ## Example: High-Performance Service
216
+
217
+ ```ruby
218
+ class TranslationService
219
+   def initialize(max_threads: 5)
220
+     @pool = Concurrent::FixedThreadPool.new(max_threads)
221
+   end
222
+
223
+   def translate_all(texts, from:, to_languages:)
224
+     futures = texts.flat_map do |text|
225
+       to_languages.map do |lang|
226
+         Concurrent::Future.execute(executor: @pool) do
227
+           MistralTranslator.translate(text, from: from, to: lang)
228
+         end
229
+       end
230
+     end
231
+     futures.map(&:value)
232
+   ensure
233
+     @pool.shutdown # the pool cannot be reused afterwards: create one service per batch
234
+     @pool.wait_for_termination(30)
235
+   end
236
+ end
237
+ ```
238
+
239
+ ## Troubleshooting
240
+
241
+ **"Too many open files"**
242
+ ```bash
243
+ ulimit -n 4096
244
+ ```
245
+
246
+ **Rate limits despite low volume**
247
+ Check if multiple processes share the API key. Consider Redis-based rate limiting.
248
+
249
+ **Memory growth**
250
+ ```ruby
251
+ # Join threads periodically
252
+ threads = []
253
+ texts.each do |text|
254
+   threads << Thread.new { translate(text) }
255
+   next if threads.size < 10
256
+
257
+   threads.each(&:join)
258
+   threads.clear
259
+ end
260
+ threads.each(&:join) # join the final partial batch
261
+ ```
262
+
263
+ ## Summary
264
+
265
+ - Thread-safe with protected metrics and configuration
266
+ - Connection pooling is automatic via net-http-persistent
267
+ - Use fixed-size thread pools (5-10 recommended)
268
+ - Built-in rate limiting works across threads, but not across processes
269
+ - Background jobs recommended for non-critical tasks
270
+ - Separate API keys per environment
@@ -0,0 +1,106 @@
1
+ > **Navigation:** [🏠 Home](README.md) • [📖 API Reference](api-reference/methods.md) • [⚡ Advanced Usage](advanced-usage/translations.md) • [🛤️ Rails Integration](rails-integration/setup.md)
2
+
3
+ ---
4
+
5
+ # Error Handling
6
+
7
+ Patterns and strategies for handling translation errors robustly.
8
+
9
+ ## 🚨 Error Types
10
+
11
+ ### Configuration Errors
12
+
13
+ - **`ConfigurationError`**: Missing API key
14
+ - **`AuthenticationError`**: Invalid API key
15
+
16
+ ### API Errors
17
+
18
+ - **`RateLimitError`**: Quota exceeded
19
+ - **`ApiError`**: Server errors (500, 502, 503)
20
+ - **`InvalidResponseError`**: Malformed response
21
+
22
+ ### Content Errors
23
+
24
+ - **`EmptyTranslationError`**: Empty translation
25
+ - **`UnsupportedLanguageError`**: Unrecognized language
26
+
27
+ ## ⚡ Retry Strategies
28
+
29
+ ### Configuration
30
+
31
+ ```ruby
32
+ MistralTranslator.configure do |config|
33
+   config.retry_delays = [1, 3, 6, 12] # Roughly exponential delays
34
+ end
35
+ ```
36
+
37
+ ### Retry by Error Type
38
+
39
+ - **Rate limit** → Long delay (30s, 60s)
40
+ - **Server error** → Short delay (2s, 4s)
41
+ - **Auth/config** → No retry
42
+ - **Invalid response** → 1-2 attempts at most
43
+
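The per-type policy above can be sketched as a lookup table from error class to retry delays. This is a hedged illustration, not the gem's retry logic: the `Sketch` module holds local stand-ins that mirror the error names listed earlier so the example runs without the gem, and `RETRY_POLICY`, `with_retries`, and the zero-second delay for invalid responses are assumptions.

```ruby
# Local stand-ins mirroring the documented error names (not the gem's classes).
module Sketch
  class Error < StandardError; end
  class RateLimitError < Error; end
  class ApiError < Error; end
  class InvalidResponseError < Error; end
  class AuthenticationError < Error; end
  class ConfigurationError < Error; end
end

# Seconds to wait before each retry; an empty list means "never retry".
RETRY_POLICY = {
  Sketch::RateLimitError       => [30, 60],
  Sketch::ApiError             => [2, 4],
  Sketch::InvalidResponseError => [0],   # one immediate retry (assumed delay)
  Sketch::AuthenticationError  => [],
  Sketch::ConfigurationError   => []
}.freeze

# Runs the block, retrying according to the policy for the raised error class.
# `sleeper` is injectable so the policy can be exercised without real waiting.
def with_retries(policy: RETRY_POLICY, sleeper: ->(seconds) { sleep(seconds) })
  attempts = Hash.new(0)
  begin
    yield
  rescue Sketch::Error => e
    delay = policy.fetch(e.class, [])[attempts[e.class]]
    raise if delay.nil? # not retryable, or retries for this type are exhausted

    attempts[e.class] += 1
    sleeper.call(delay)
    retry
  end
end
```

With the real gem, the keys would be the `MistralTranslator::*Error` classes and the block would wrap a `MistralTranslator.translate` call.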
44
+ ## 🔄 Circuit Breaker
45
+
46
+ ### Principle
47
+
48
+ ```
49
+ CLOSED → OPEN → HALF-OPEN → CLOSED
50
+ ```
51
+
52
+ - **CLOSED**: Normal operation
53
+ - **OPEN**: Repeated failures, so calls are short-circuited
54
+ - **HALF-OPEN**: Trial calls after the timeout
55
+
56
+ ### Parameters
57
+
58
+ - Threshold: 5 consecutive errors
59
+ - Timeout: 5 minutes
60
+ - Reset: 3 consecutive successes
61
+
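A minimal sketch of the three-state breaker described above, using the listed parameters as defaults (5 failures, 300 seconds, 3 successes). The `CircuitBreaker` class, its method names, and the injectable `clock` are hypothetical; nothing here is claimed to be shipped by the gem.

```ruby
class CircuitBreaker
  class OpenError < StandardError; end

  attr_reader :state

  def initialize(failure_threshold: 5, open_timeout: 300, success_threshold: 3,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @failure_threshold = failure_threshold
    @open_timeout = open_timeout
    @success_threshold = success_threshold
    @clock = clock
    @mutex = Mutex.new
    @state = :closed
    @failures = 0
    @successes = 0
    @opened_at = nil
  end

  # Runs the block unless the circuit is open; tracks the outcome.
  def call
    before_call
    result = yield
    record_success
    result
  rescue OpenError
    raise # short-circuited calls are not counted as failures
  rescue StandardError
    record_failure
    raise
  end

  private

  def before_call
    @mutex.synchronize do
      return unless @state == :open
      raise OpenError, "circuit open" if @clock.call - @opened_at < @open_timeout

      @state = :half_open # timeout elapsed: allow trial calls
      @successes = 0
    end
  end

  def record_success
    @mutex.synchronize do
      if @state == :half_open
        @successes += 1
        if @successes >= @success_threshold
          @state = :closed
          @failures = 0
        end
      else
        @failures = 0 # only consecutive failures count
      end
    end
  end

  def record_failure
    @mutex.synchronize do
      @failures += 1
      if @state == :half_open || @failures >= @failure_threshold
        @state = :open
        @opened_at = @clock.call
      end
    end
  end
end
```

A caller would wrap each request, for example `breaker.call { MistralTranslator.translate(text, from: "fr", to: "en") }`, and rescue `CircuitBreaker::OpenError` to fall back.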
62
+ ## 🛡️ Fallback Strategies
63
+
64
+ ### Recommended Cascade
65
+
66
+ 1. Mistral API
67
+ 2. Translation cache
68
+ 3. Alternative service
69
+ 4. Original text
70
+
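The cascade can be sketched as an ordered list of callables tried in turn, with the original text as the last resort. `translate_with_fallback` and the lambdas below are hypothetical stand-ins (a simulated outage, an in-memory cache, an empty secondary service), not part of the gem.

```ruby
# Tries each source in order and returns the first non-empty result,
# falling back to the original text when every level fails.
def translate_with_fallback(text, sources)
  sources.each do |name, source|
    begin
      result = source.call(text)
      return { text: result, source: name } unless result.nil? || result.strip.empty?
    rescue StandardError
      next # this level failed; try the next one
    end
  end
  { text: text, source: :original }
end

cache = { "Bonjour" => "Hello" }
sources = {
  mistral_api: ->(_text) { raise IOError, "API unavailable" }, # simulated outage
  cache:       ->(text) { cache[text] },
  alternative: ->(_text) { nil }                               # hypothetical secondary service
}

cached   = translate_with_fallback("Bonjour", sources)
untouched = translate_with_fallback("Salut", sources)
```

Returning the source name alongside the text makes it possible to measure how often each fallback level is used.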
71
+ ### Quality Validation
72
+
73
+ - Length ratio: 0.3x to 3x the original
74
+ - Not identical to the original
75
+ - Valid encoding
76
+
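The three checks above fit in a small predicate. This is a sketch under assumptions: `plausible_translation?` is a hypothetical helper, and the ratio is computed on character counts, which the list does not specify.

```ruby
# Returns true when `translated` passes the three checks listed above.
def plausible_translation?(original, translated, min_ratio: 0.3, max_ratio: 3.0)
  return false unless translated.valid_encoding?                  # valid encoding
  return false if original.strip.empty? || translated.strip.empty?
  return false if translated.strip == original.strip              # not identical

  ratio = translated.length.to_f / original.length                # length ratio (characters)
  ratio.between?(min_ratio, max_ratio)
end
```

Note that the "not identical" rule will also reject texts that legitimately stay the same, such as proper nouns, so callers may want to treat a failure as a warning rather than an error.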
77
+ ## 📊 Monitoring
78
+
79
+ ### Key Metrics
80
+
81
+ - Overall error rate (< 5%)
82
+ - Error types by frequency
83
+ - Fallback effectiveness
84
+
85
+ ### Alerts
86
+
87
+ ```yaml
88
+ Critical:
89
+   - Authentication failing
90
+   - Errors > 20% over 5 min
91
+
92
+ Important:
93
+   - Rate limit > 10/h
94
+   - Circuit breaker open
95
+ ```
96
+
97
+ ## 🎯 Configuration by Environment
98
+
99
+ - **Development**: Verbose logs, fast retries
100
+ - **Test**: Mocked errors, timeout validation
101
+ - **Production**: Circuit breaker, graceful fallbacks
102
+
103
+ ---
104
+
105
+ **Advanced Usage Navigation:**
106
+ [← Translations](advanced-usage/translations.md) | [Batch Processing](advanced-usage/batch-processing.md) | [Error Handling](advanced-usage/error-handling.md) | [Monitoring](advanced-usage/monitoring.md) | [Summarization](advanced-usage/summarization.md) →
@@ -0,0 +1,133 @@
1
+ > **Navigation:** [🏠 Home](README.md) • [📖 API Reference](api-reference/methods.md) • [⚡ Advanced Usage](advanced-usage/translations.md) • [🛤️ Rails Integration](rails-integration/setup.md)
2
+
3
+ ---
4
+
5
+ # Monitoring and Metrics
6
+
7
+ Monitor and optimize your translations with relevant metrics and actionable alerts.
8
+
9
+ ## 📊 Essential Metrics
10
+
11
+ ### Basic Configuration
12
+
13
+ ```ruby
14
+ MistralTranslator.configure do |config|
15
+   config.enable_metrics = true
16
+
17
+   # Callbacks for your monitoring systems
18
+   config.on_translation_complete = ->(from, to, orig_len, trans_len, duration) {
19
+     # Integrate with your system (StatsD, Datadog, etc.)
20
+   }
21
+ end
22
+ ```
23
+
24
+ ### Key Metrics to Track
25
+
26
+ **Performance:**
27
+
28
+ - Mean/median/P95 response time
29
+ - Throughput (translations/minute)
30
+ - Size of processed texts
31
+
32
+ **Quality:**
33
+
34
+ - Success vs. error rate
35
+ - Error types (auth, rate limit, timeout)
36
+ - Average confidence score
37
+
38
+ **Usage:**
39
+
40
+ - Popular language pairs
41
+ - Volume per hour/day/month
42
+ - Estimated cost
43
+
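As an illustration of the response-time metrics, the durations collected in an `on_translation_complete` callback could be summarized as below. This is a sketch: `percentile` and `latency_summary` are hypothetical helpers, and the nearest-rank method is one of several common percentile definitions (for an even number of samples its median is the lower middle value).

```ruby
# Nearest-rank percentile: the smallest sample such that at least
# `pct` percent of all samples are less than or equal to it.
def percentile(values, pct)
  raise ArgumentError, "values must not be empty" if values.empty?

  sorted = values.sort
  rank = (pct * sorted.size / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Summarizes a list of durations (in seconds).
def latency_summary(durations)
  {
    mean: durations.sum / durations.size.to_f,
    median: percentile(durations, 50),
    p95: percentile(durations, 95)
  }
end

summary = latency_summary((1..20).map(&:to_f))
```

P95 is usually a better alerting signal than the mean, because a few very slow translations barely move the average.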
44
+ ## 🚨 Recommended Alerts
45
+
46
+ ### Critical Alerts
47
+
48
+ - **Error rate > 10%** over 5 minutes
49
+ - **Response time > 30s** repeatedly
50
+ - **Authentication errors** (API key problem)
51
+
52
+ ### Informational Alerts
53
+
54
+ - **Rate limit reached** (adjust throttling)
55
+ - **Unusual usage spike** (investigate the cause)
56
+ - **New language pair** in use
57
+
58
+ ## 📈 Suggested Dashboard
59
+
60
+ ### Overview
61
+
62
+ ```
63
+ ┌─────────────────┬─────────────────┬─────────────────┐
64
+ │ Translations/h  │ Avg time        │ Success rate    │
65
+ │ 250             │ 1.2s            │ 99.2%           │
66
+ ├─────────────────┼─────────────────┼─────────────────┤
67
+ │ Top languages   │ Errors/h        │ Cost today      │
68
+ │ fr→en (45%)     │ 2               │ $12.34          │
69
+ │ en→es (23%)     │                 │                 │
70
+ └─────────────────┴─────────────────┴─────────────────┘
71
+ ```
72
+
73
+ ### Useful Charts
74
+
75
+ - **Timeline**: Translation volume over time
76
+ - **Heatmap**: Language pairs by popularity
77
+ - **Latency**: Response time distribution
78
+ - **Errors**: Error types per period
79
+
80
+ ## 🔍 Effective Logging
81
+
82
+ ### Recommended Log Structure
83
+
84
+ ```
85
+ [TIMESTAMP] [LEVEL] [MistralTranslator] [OPERATION] from=fr to=en chars=150 duration=1.2s status=success
86
+ [TIMESTAMP] [ERROR] [MistralTranslator] [TRANSLATE] from=fr to=en error=rate_limit attempt=2
87
+ ```
88
+
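A small formatter can produce lines in the structure shown above. This is a sketch: `format_log_line` is a hypothetical helper, and the ISO 8601 UTC timestamp is an assumption, since the template only says `TIMESTAMP`.

```ruby
# Builds one log line: [TIMESTAMP] [LEVEL] [MistralTranslator] [OPERATION] key=value ...
def format_log_line(level:, operation:, fields:, time: Time.now.utc)
  pairs = fields.map { |key, value| "#{key}=#{value}" }.join(" ")
  "[#{time.strftime('%Y-%m-%dT%H:%M:%SZ')}] [#{level}] [MistralTranslator] [#{operation}] #{pairs}"
end

line = format_log_line(
  level: "INFO",
  operation: "TRANSLATE",
  fields: { from: "fr", to: "en", chars: 150, duration: "1.2s", status: "success" },
  time: Time.utc(2025, 1, 2, 3, 4, 5)
)
```

Keeping every field as `key=value` lets log aggregators parse the lines without custom regular expressions.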
89
+ ### Log Levels by Environment
90
+
91
+ - **Development**: DEBUG (everything)
92
+ - **Staging**: INFO (successes + errors)
93
+ - **Production**: WARN (errors + rate limits)
94
+
95
+ ## ⚡ Metrics-Driven Optimization
96
+
97
+ ### Patterns to Look For
98
+
99
+ **Rate limiting:**
100
+
101
+ - Frequent rate limits → adjust the delays
102
+ - Spread requests over time
103
+
104
+ **Performance:**
105
+
106
+ - Long texts take longer → split them where possible
107
+ - Some language pairs are slower
108
+
109
+ **Usage:**
110
+
111
+ - Cache popular translations
112
+ - Pre-translate critical content
113
+
114
+ ### Suggested Alert Thresholds
115
+
116
+ ```yaml
117
+ performance:
118
+   response_time_p95: 10s
119
+   error_rate_5min: 5%
120
+
121
+ capacity:
122
+   requests_per_minute: 50
123
+   daily_cost: $100
124
+
125
+ quality:
126
+   confidence_score_avg: 0.7
127
+   empty_translations: 1%
128
+ ```
129
+
130
+ ---
131
+
132
+ **Advanced Usage Navigation:**
133
+ [← Translations](advanced-usage/translations.md) | [Batch Processing](advanced-usage/batch-processing.md) | [Error Handling](advanced-usage/error-handling.md) | [Monitoring](advanced-usage/monitoring.md) | [Summarization](advanced-usage/summarization.md) →
@@ -0,0 +1,86 @@
1
+ > **Navigation:** [🏠 Home](README.md) • [📖 API Reference](api-reference/methods.md) • [⚡ Advanced Usage](advanced-usage/translations.md) • [🛤️ Rails Integration](rails-integration/setup.md)
2
+
3
+ ---
4
+
5
+ # Smart Summaries
6
+
7
+ Automatic, multi-level, multilingual summaries.
8
+
9
+ ## 📝 Simple Summary
10
+
11
+ ```ruby
12
+ summarizer = MistralTranslator::Summarizer.new
13
+
14
+ summary = summarizer.summarize(
15
+   long_text,
16
+   language: "fr",
17
+   max_words: 100
18
+ )
19
+ ```
20
+
21
+ ## 🌍 Summary + Translation
22
+
23
+ ```ruby
24
+ # Summarizes AND translates in a single operation
25
+ french_summary = summarizer.summarize_and_translate(
26
+   english_article,
27
+   from: "en",
28
+   to: "fr",
29
+   max_words: 150
30
+ )
31
+ ```
32
+
33
+ ## 📊 Multi-Level
34
+
35
+ ```ruby
36
+ summaries = summarizer.summarize_tiered(
37
+   article,
38
+   language: "fr",
39
+   short: 50, # Tweet
40
+   medium: 150, # Paragraph
41
+   long: 400 # Short article
42
+ )
43
+
44
+ puts summaries[:short]
45
+ puts summaries[:medium]
46
+ puts summaries[:long]
47
+ ```
48
+
49
+ ## 🗺️ Multi-Language
50
+
51
+ ```ruby
52
+ results = summarizer.summarize_to_multiple(
53
+   document,
54
+   languages: ["fr", "en", "es"],
55
+   max_words: 200
56
+ )
57
+ # => { "fr" => "résumé...", "en" => "summary...", "es" => "resumen..." }
58
+ ```
59
+
60
+ ## 🎯 Summary Styles
61
+
62
+ ```ruby
63
+ # Different styles depending on the use case
64
+ summarizer.summarize(content, context: "Executive summary, key metrics")
65
+ summarizer.summarize(content, context: "Social media, engaging tone")
66
+ summarizer.summarize(content, context: "Technical documentation")
67
+ ```
68
+
69
+ ## 📈 Recommended Lengths
70
+
71
+ **By content size:**
72
+
73
+ - 0-200 words → 30-50 word summary
74
+ - 200-800 words → 50-100 word summary
75
+ - 800+ words → 100-200 word summary
76
+
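The size-based guideline above can be turned into a helper that picks a target range from the input length. `recommended_summary_words` is a hypothetical function, word counting by whitespace is a simplification, and assigning the shared boundaries (200 and 800 words) to the lower bracket is an assumption.

```ruby
# Returns the recommended summary length (in words) as a Range.
def recommended_summary_words(text)
  words = text.split.size # whitespace-separated word count
  if words <= 200
    30..50
  elsif words <= 800
    50..100
  else
    100..200
  end
end

short_range  = recommended_summary_words("word " * 150)
medium_range = recommended_summary_words("word " * 500)
long_range   = recommended_summary_words("word " * 1000)
```

The upper bound of the returned range could then be passed as `max_words:` to `summarizer.summarize`.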
77
+ **By use case:**
78
+
79
+ - Tweet → 50 words max
80
+ - Meta description → 150 words max
81
+ - Executive summary → 200-400 words
82
+
83
+ ---
84
+
85
+ **Advanced Usage Navigation:**
86
+ [← Translations](advanced-usage/translations.md) | [Batch Processing](advanced-usage/batch-processing.md) | [Error Handling](advanced-usage/error-handling.md) | [Monitoring](advanced-usage/monitoring.md) | [Summarization](advanced-usage/summarization.md) →