gptrans 1.8.8 → 1.9.0

package/README.md CHANGED
@@ -10,6 +10,8 @@ Whether you're building a multilingual website, a mobile app, or a localization
10
10
 
11
11
  - **AI-Powered Translations:** Harness advanced models like OpenAI's GPT and Anthropic's Sonnet for high-quality translations
12
12
  - **Smart Batching & Debouncing:** Translations are processed in batches, not only for efficiency but also to provide better context. By sending multiple related texts together, the AI model can better understand the overall context and produce more accurate and consistent translations across related terms and phrases.
13
+ - **Reference Translations:** Use existing translations in other languages as context to improve accuracy and consistency across your multilingual content
14
+ - **Flexible Base Language:** Translate from an intermediate language instead of the original, useful for avoiding gender-specific terms or leveraging more neutral language versions
13
15
  - **Caching with JSON:** Quickly retrieves cached translations to boost performance
14
16
  - **Parameter Substitution:** Dynamically replace placeholders in your translations
15
17
  - **Smart Context Handling:** Add contextual information to improve translation accuracy. Perfect for gender-aware translations, domain-specific content, or any scenario where additional context helps produce better results. The context is automatically cleared after each translation to prevent unintended effects.
@@ -117,12 +119,85 @@ If you're looking to streamline your translation workflow and bring your applica
117
119
 
118
120
  ## 🔄 Preloading Translations
119
121
 
120
- You can preload translations for specific languages using the `preload` method. This is particularly useful when you want to initialize translations based on dynamically generated keys:
122
+ The `preload()` method allows you to pre-translate all texts in your database. It now supports advanced options for improved translation accuracy through reference translations and alternate base languages.
123
+
124
+ ### Basic Usage
125
+
126
+ ```javascript
127
+ // Simple preload - translates all pending texts
128
+ await gptrans.preload();
129
+ ```
130
+
131
+ ### Advanced Options
132
+
133
+ #### Using Reference Translations
134
+
135
+ Include translations from other languages as context to improve accuracy and consistency:
136
+
137
+ ```javascript
138
+ // Use English and Portuguese translations as reference
139
+ const gptrans = new GPTrans({ from: 'es', target: 'fr' });
140
+ await gptrans.preload({
141
+ references: ['en', 'pt']
142
+ });
143
+ ```
144
+
145
+ The AI model will see existing translations in the reference languages, helping it produce more consistent and accurate translations.
146
+
147
+ #### Using an Alternate Base Language
148
+
149
+ Translate from an intermediate language instead of the original:
121
150
 
122
151
  ```javascript
123
- await gptrans.preload({ target:'ar' });
152
+ // Original is Spanish, but translate FROM English TO Portuguese
153
+ const gptrans = new GPTrans({ from: 'es', target: 'pt' });
154
+ await gptrans.preload({
155
+ baseLanguage: 'en'
156
+ });
124
157
  ```
125
158
 
159
+ This is useful when:
160
+ - The intermediate language has characteristics that better suit the target (e.g., English "he/she" can be omitted in some languages)
161
+ - You want to avoid gender-specific terms present in the source language
162
+ - The intermediate translation is cleaner or more universal
163
+
164
+ #### Combined Usage
165
+
166
+ You can use both options together:
167
+
168
+ ```javascript
169
+ // Translate from English to German, showing Spanish and Portuguese as reference
170
+ const gptrans = new GPTrans({ from: 'es', target: 'de' });
171
+ await gptrans.preload({
172
+ baseLanguage: 'en',
173
+ references: ['es', 'pt']
174
+ });
175
+ ```
176
+
177
+ ### Real-World Example
178
+
179
+ **Problem:** Gender-specific translations
180
+
181
+ ```javascript
182
+ // Spanish: "El estudiante es muy bueno" (masculine)
183
+ // English: "The student is very good" (neutral)
184
+
185
+ // Solution: Translate to Portuguese using English as base
186
+ const ptTranslator = new GPTrans({ from: 'es', target: 'pt' });
187
+ await ptTranslator.preload({
188
+ baseLanguage: 'en', // Use neutral English version
189
+ references: ['es'] // Show original Spanish for context
190
+ });
191
+ ```
192
+
193
+ ### Parameters
194
+
195
+ | Parameter | Type | Description |
196
+ |-----------|------|-------------|
197
+ | `references` | `string[]` | Array of language codes (e.g., `['en', 'pt']`) to use as translation references |
198
+ | `baseLanguage` | `string` | Language code to use as the base for translation instead of the original |
199
+
200
+
126
201
  ### Model Fallback System
127
202
 
128
203
  GPTrans supports a fallback mechanism for translation models. Instead of providing a single model, you can pass an array of models:
@@ -20,7 +20,7 @@ async function testReferences() {
20
20
  target: 'en',
21
21
  model: 'sonnet45',
22
22
  name: 'ref_test',
23
- debug: true
23
+ debug: 3
24
24
  });
25
25
 
26
26
  const ptTranslator = new GPTrans({
@@ -28,7 +28,7 @@ async function testReferences() {
28
28
  target: 'pt',
29
29
  model: 'sonnet45',
30
30
  name: 'ref_test',
31
- debug: true
31
+ debug: 3
32
32
  });
33
33
 
34
34
  // Sample Spanish texts with gendered language
@@ -44,14 +44,16 @@ async function testReferences() {
44
44
  console.log(` EN: ${enTranslator.t(text)}`);
45
45
  });
46
46
 
47
- await new Promise(resolve => setTimeout(resolve, 5000));
47
+ // Wait for EN translations to complete if needed
48
+ await enTranslator.preload();
48
49
 
49
50
  console.log('\n📝 Creating Portuguese translations from Spanish...');
50
51
  spanishTexts.forEach(text => {
51
52
  console.log(` PT: ${ptTranslator.t(text)}`);
52
53
  });
53
54
 
54
- await new Promise(resolve => setTimeout(resolve, 5000));
55
+ // Wait for PT translations to complete if needed
56
+ await ptTranslator.preload();
55
57
 
56
58
  console.log('\n' + '='.repeat(70));
57
59
  console.log('\n📋 Test 2: Translation with English as reference\n');
@@ -62,7 +64,7 @@ async function testReferences() {
62
64
  target: 'fr',
63
65
  model: 'sonnet45',
64
66
  name: 'ref_test',
65
- debug: true
67
+ debug: 3
66
68
  });
67
69
 
68
70
  console.log('🔄 Preloading French translations with English as reference...');
@@ -85,7 +87,7 @@ async function testReferences() {
85
87
  target: 'it',
86
88
  model: 'sonnet45',
87
89
  name: 'ref_test',
88
- debug: true
90
+ debug: 3
89
91
  });
90
92
 
91
93
  console.log('🔄 Preloading Italian translations with English as base language...');
@@ -109,7 +111,7 @@ async function testReferences() {
109
111
  target: 'de',
110
112
  model: 'sonnet45',
111
113
  name: 'ref_test',
112
- debug: true
114
+ debug: 3
113
115
  });
114
116
 
115
117
  console.log('🔄 Preloading German translations with multiple references...');
package/index.js CHANGED
@@ -247,29 +247,50 @@ class GPTrans {
247
247
 
248
248
  model.setSystem("You are an expert translator specialized in literary translation between FROM_LANG and TARGET_DENONYM TARGET_LANG.");
249
249
 
250
- model.addTextFromFile(this.promptFile);
250
+ // Read and process prompt file
251
+ let promptContent = fs.readFileSync(this.promptFile, 'utf-8');
251
252
 
252
253
  // Format references if available
253
254
  let referencesText = '';
254
255
  if (Object.keys(batchReferences).length > 0 && batch.length > 0) {
255
- const textsArray = text.split(`\n${this.divider}\n`);
256
-
257
- referencesText = textsArray.map((txt, index) => {
258
- const key = batch[index] ? batch[index][0] : null;
259
- if (key && batchReferences[key]) {
260
- const refs = batchReferences[key];
261
- const refLines = Object.entries(refs).map(([lang, translation]) => {
262
- try {
263
- const langInfo = isoAssoc(lang);
264
- return `${langInfo.DENONYM} ${langInfo.LANG} (${lang}): ${translation}`;
265
- } catch (e) {
266
- return `${lang}: ${translation}`;
256
+ // Group all references by language first
257
+ const refsByLang = {};
258
+
259
+ batch.forEach(([key], index) => {
260
+ if (batchReferences[key]) {
261
+ Object.entries(batchReferences[key]).forEach(([lang, translation]) => {
262
+ if (!refsByLang[lang]) {
263
+ refsByLang[lang] = [];
267
264
  }
268
- }).join(`\n${this.divider}\n`);
269
- return refLines;
265
+ refsByLang[lang].push(translation);
266
+ });
267
+ }
268
+ });
269
+
270
+ // Format: one language header, then all its translations with bullets
271
+ const refBlocks = Object.entries(refsByLang).map(([lang, translations]) => {
272
+ try {
273
+ const langInfo = isoAssoc(lang);
274
+ const header = `### ${langInfo.DENONYM} ${langInfo.LANG} (${lang}):`;
275
+ const content = translations.map(t => `- ${t}`).join('\n');
276
+ return `${header}\n${content}`;
277
+ } catch (e) {
278
+ const header = `### ${lang}:`;
279
+ const content = translations.map(t => `- ${t}`).join('\n');
280
+ return `${header}\n${content}`;
270
281
  }
271
- return '';
272
- }).filter(r => r).join(`\n\n`);
282
+ });
283
+
284
+ referencesText = refBlocks.join(`\n\n`);
285
+ }
286
+
287
+ // Remove reference section if no references
288
+ if (!referencesText) {
289
+ // Remove the entire "Reference Translations" section
290
+ promptContent = promptContent.replace(
291
+ /## Reference Translations \(for context\)[\s\S]*?(?=\n#|$)/,
292
+ ''
293
+ );
273
294
  }
274
295
 
275
296
  // Determine which FROM_ values to use
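Review note: the reference-grouping logic added in this hunk can be exercised in isolation. The following is a minimal sketch, not the package's code. `formatReferences` and `describeLang` are hypothetical names, and `describeLang` is only a stand-in for the package's `isoAssoc` lookup, whose return value is not fully shown in this diff.

```javascript
// Hypothetical stand-in for isoAssoc(): returns a display label, or throws
// for unknown codes so that the fallback branch of the hunk is exercised.
function describeLang(lang) {
  const names = { en: 'English', pt: 'Portuguese', es: 'Spanish' };
  if (!names[lang]) throw new Error(`Unknown language: ${lang}`);
  return `${names[lang]} (${lang})`;
}

// batch: array of [key, text] pairs
// batchReferences: { [key]: { [lang]: translation } }
function formatReferences(batch, batchReferences) {
  // Group all reference translations by language first.
  const refsByLang = {};
  for (const [key] of batch) {
    const refs = batchReferences[key];
    if (!refs) continue;
    for (const [lang, translation] of Object.entries(refs)) {
      if (!refsByLang[lang]) refsByLang[lang] = [];
      refsByLang[lang].push(translation);
    }
  }

  // One header per language, followed by its translations as bullets.
  return Object.entries(refsByLang)
    .map(([lang, translations]) => {
      let label;
      try {
        label = describeLang(lang);
      } catch (e) {
        label = lang; // fall back to the raw code, as the hunk does
      }
      const bullets = translations.map(t => `- ${t}`).join('\n');
      return `### ${label}:\n${bullets}`;
    })
    .join('\n\n');
}

const batch = [['hello', 'Hola'], ['bye', 'Adiós']];
const references = { hello: { en: 'Hello', xx: 'Hullo' }, bye: { en: 'Goodbye' } };
console.log(formatReferences(batch, references));
```

One thing the sketch makes visible: a bullet is appended only for keys that have a reference in that language, so a bullet's position inside a language block does not necessarily match the position of the text in the batch.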
@@ -282,21 +303,24 @@ class GPTrans {
282
303
  }
283
304
  }
284
305
 
285
- model.replace({
286
- INPUT: text,
287
- CONTEXT: this.context,
288
- REFERENCES: referencesText || 'None'
306
+ // Apply replacements to prompt
307
+ promptContent = promptContent
308
+ .replace(/INPUT/g, text)
309
+ .replace(/CONTEXT/g, this.context)
310
+ .replace(/REFERENCES/g, referencesText);
311
+
312
+ // Apply language-specific replacements
313
+ Object.entries(this.replaceTarget).forEach(([key, value]) => {
314
+ promptContent = promptContent.replace(new RegExp(key, 'g'), value);
315
+ });
316
+ Object.entries(fromReplace).forEach(([key, value]) => {
317
+ promptContent = promptContent.replace(new RegExp(key, 'g'), value);
289
318
  });
290
- model.replace(this.replaceTarget);
291
- model.replace(fromReplace);
292
319
 
293
- const response = await model.message();
320
+ model.addText(promptContent);
294
321
 
295
- const codeBlockRegex = /```(?:\w*\n)?([\s\S]*?)```/;
296
- const match = response.match(codeBlockRegex);
297
- const translatedText = match ? match[1].trim() : response;
322
+ return model.block({ addSystemExtra: false });
298
323
 
299
- return translatedText;
300
324
  } finally {
301
325
  // Always release the lock
302
326
  releaseLock();
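Review note: the hunk above switches to chained `String.prototype.replace` calls with string replacements. With that API, sequences such as `$&` or `$1` inside the replacement string are interpreted as special patterns, and each later call rescans text inserted by an earlier one. A minimal sketch of a single-pass alternative follows. `fillTemplate` is a hypothetical helper, not part of the package, and whether the earlier `model.replace` behaved differently is not visible in this diff.

```javascript
// Hypothetical helper: substitute every placeholder in ONE pass.
function fillTemplate(template, values) {
  const keys = Object.keys(values);
  if (keys.length === 0) return template;

  // Escape regex metacharacters in placeholder names. Longest names go first
  // so a longer placeholder wins over a shorter one that is its prefix.
  const pattern = new RegExp(
    keys
      .sort((a, b) => b.length - a.length)
      .map(k => k.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
      .join('|'),
    'g'
  );

  // A function replacer's return value is inserted literally: "$&" in a value
  // is not expanded, and inserted text is never rescanned for placeholders.
  return template.replace(pattern, match => String(values[match]));
}

const template = 'Context: CONTEXT\nText: INPUT';
const userText = 'Price: $& CONTEXT';

// Chained string replacements, in the same order as the hunk.
const chained = template
  .replace(/INPUT/g, userText)
  .replace(/CONTEXT/g, 'shop');

const singlePass = fillTemplate(template, { INPUT: userText, CONTEXT: 'shop' });

console.log(chained);
console.log(singlePass);
```

In the chained version the user text is altered twice: `$&` expands to the matched placeholder name, and the literal word `CONTEXT` inside the text is replaced by the later call. The single-pass version leaves the user text untouched.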
@@ -347,8 +371,24 @@ class GPTrans {
347
371
  return this;
348
372
  }
349
373
 
374
+ // Filter out invalid references
375
+ const validReferences = references.filter(lang => {
376
+ const normalizedLang = this.normalizeBCP47(lang);
377
+ // Don't include target language as reference (we're translating TO it)
378
+ if (normalizedLang === this.replaceTarget.TARGET_ISO) {
379
+ console.warn(`Ignoring reference language '${lang}': cannot use target language as reference`);
380
+ return false;
381
+ }
382
+ // Don't include baseLanguage as reference (redundant)
383
+ if (baseLanguage && normalizedLang === this.normalizeBCP47(baseLanguage)) {
384
+ console.warn(`Ignoring reference language '${lang}': same as baseLanguage`);
385
+ return false;
386
+ }
387
+ return true;
388
+ });
389
+
350
390
  // Store preload options for use in translation
351
- this.preloadReferences = references;
391
+ this.preloadReferences = validReferences;
352
392
  this.preloadBaseLanguage = baseLanguage;
353
393
 
354
394
  // Track which keys need translation
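Review note: the reference filter added in this hunk is easy to check on its own. The sketch below assumes a simplified `normalizeTag` in place of the package's `normalizeBCP47`, whose behavior is not shown in this diff; both function names here are hypothetical. It returns the dropped entries instead of calling `console.warn`, purely to make the result inspectable.

```javascript
// Hypothetical simplification of normalizeBCP47(): trims, lowercases,
// and uses "-" as the subtag separator.
function normalizeTag(tag) {
  return String(tag).trim().replace(/_/g, '-').toLowerCase();
}

function filterReferences(references, targetLang, baseLanguage) {
  const target = normalizeTag(targetLang);
  const base = baseLanguage ? normalizeTag(baseLanguage) : null;
  const kept = [];
  const dropped = [];

  for (const lang of references) {
    const tag = normalizeTag(lang);
    if (tag === target) {
      // The target language cannot serve as a reference for itself.
      dropped.push({ lang, reason: 'target language' });
    } else if (tag === base) {
      // The base language is already the source text, so it is redundant.
      dropped.push({ lang, reason: 'same as baseLanguage' });
    } else {
      kept.push(lang);
    }
  }
  return { kept, dropped };
}

const result = filterReferences(['en', 'PT', 'es'], 'pt', 'en');
console.log(result.kept);
console.log(result.dropped);
```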
@@ -365,8 +405,9 @@ class GPTrans {
365
405
  // Check if translation already exists
366
406
  if (!this.dbTarget.get(contextHash, key)) {
367
407
  keysNeedingTranslation.push({ context, contextHash, key });
408
+ // Only call get() if translation doesn't exist
409
+ this.get(key, text);
368
410
  }
369
- this.get(key, text);
370
411
  }
371
412
  }
372
413
 
@@ -378,10 +419,8 @@ class GPTrans {
378
419
  }
379
420
 
380
421
  // Wait for any pending translations to complete
381
- const maxWaitTime = 120000; // 120 seconds timeout
382
- const startTime = Date.now();
383
-
384
- await new Promise((resolve, reject) => {
422
+ // No global timeout - each translation request has its own timeout
423
+ await new Promise((resolve) => {
385
424
  const checkInterval = setInterval(() => {
386
425
  // Check if there are still pending translations or batch being processed
387
426
  const hasPending = this.pendingTranslations.size > 0 || this.isProcessingBatch;
@@ -399,13 +438,6 @@ class GPTrans {
399
438
  clearInterval(checkInterval);
400
439
  resolve();
401
440
  }
402
-
403
- // Timeout check
404
- if (Date.now() - startTime > maxWaitTime) {
405
- clearInterval(checkInterval);
406
- console.warn(`Preload timeout: ${keysNeedingTranslation.length} translations pending`);
407
- resolve(); // Resolve instead of reject to allow partial completion
408
- }
409
441
  }, 100);
410
442
  });
411
443
 
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "gptrans",
3
3
  "type": "module",
4
- "version": "1.8.8",
4
+ "version": "1.9.0",
5
5
  "description": "🚆 GPTrans - The smarter AI-powered way to translate.",
6
6
  "keywords": [
7
7
  "translate",
@@ -8,9 +8,8 @@ INPUT
8
8
 
9
9
  ## Reference Translations (for context)
10
10
  These are existing translations in other languages that may help you provide a more accurate translation. Use them as reference but do not simply copy them:
11
- ```
11
+
12
12
  REFERENCES
13
- ```
14
13
 
15
14
  # Return Format
16
15
  - Provide the final translation within a code block using ```.
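Review note: `index.js` now removes this "Reference Translations" section from the prompt when there are no references. The sketch below applies the same regular expression to an abbreviated stand-in for the template. The stand-in text and the `stripReferenceSection` name are assumptions, since the real prompt file is only partly visible in this diff.

```javascript
// Abbreviated stand-in for the prompt template (assumed layout).
const promptTemplate = [
  '# Text',
  'INPUT',
  '',
  '## Reference Translations (for context)',
  'Use them as reference but do not simply copy them:',
  '',
  'REFERENCES',
  '',
  '# Return Format',
  '- Provide the final translation within a code block.',
].join('\n');

// Same pattern as in the index.js hunk: match lazily from the section heading
// up to (not including) the next line that starts with "#", or end of input.
const sectionPattern = /## Reference Translations \(for context\)[\s\S]*?(?=\n#|$)/;

function stripReferenceSection(prompt) {
  return prompt.replace(sectionPattern, '');
}

const stripped = stripReferenceSection(promptTemplate);
console.log(stripped);
```

Because the lookahead stops at any line beginning with `#`, a nested `###` subheading inside the section would end the match early and leave the rest of the section in place.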
@@ -1,101 +0,0 @@
1
- # GPTrans - Multiple References and Alternate Base Language
2
-
3
- ## New Features
4
-
5
- The `preload()` method now supports two optional parameters that improve translation accuracy:
6
-
7
- ### 1. `references` - Multiple References
8
-
9
- Lets you include existing translations in other languages as additional context for the AI model.
10
-
11
- ```javascript
12
- await gptrans.preload({
13
- references: ['en', 'pt'] // Use English and Portuguese as reference
14
- });
15
- ```
16
-
17
- **Use case**: When you have translations in several languages and want the new translation to be consistent with the existing ones.
18
-
19
- ### 2. `baseLanguage` - Alternate Base Language
20
-
21
- Lets you use a language other than the original as the base for the translation.
22
-
23
- ```javascript
24
- await gptrans.preload({
25
- baseLanguage: 'en' // Translate from English instead of the original language
26
- });
27
- ```
28
-
29
- **Use case**: When the original text has language-specific features (such as he/she in English) that can be omitted in the final translation.
30
-
31
- ## Usage Examples
32
-
33
- ### Example 1: Basic Translation (unchanged)
34
-
35
- ```javascript
36
- const gptrans = new GPTrans({ from: 'en', target: 'es' });
37
- await gptrans.preload(); // Original behavior
38
- ```
39
-
40
- ### Example 2: With References
41
-
42
- ```javascript
43
- // Original in Spanish, translate to French using English as reference
44
- const gptrans = new GPTrans({ from: 'es', target: 'fr' });
45
- await gptrans.preload({
46
- references: ['en'] // The model will see the English translation as context
47
- });
48
- ```
49
-
50
- ### Example 3: With an Alternate Base Language
51
-
52
- ```javascript
53
- // Original in Spanish, but translate FROM English TO Portuguese
54
- const gptrans = new GPTrans({ from: 'es', target: 'pt' });
55
- await gptrans.preload({
56
- baseLanguage: 'en' // Use the English translation as the base
57
- });
58
- ```
59
-
60
- ### Example 4: Combined
61
-
62
- ```javascript
63
- // Original in Spanish, translate from English to German, showing Spanish and Portuguese as reference
64
- const gptrans = new GPTrans({ from: 'es', target: 'de' });
65
- await gptrans.preload({
66
- baseLanguage: 'en', // Translate from English
67
- references: ['es', 'pt'] // Show Spanish and Portuguese as additional context
68
- });
69
- ```
70
-
71
- ## Real-World Use Case: Avoiding Gender Issues
72
-
73
- ### Problem
74
- In English: "The student is very good" (neutral)
75
- In Spanish: "El estudiante es muy bueno" / "La estudiante es muy buena" (gendered)
76
- In Portuguese: Gender can be omitted in some contexts
77
-
78
- ### Solution
79
- ```javascript
80
- // If the original is in Spanish but we want the Portuguese translation
81
- // based on the English version (which is neutral):
82
-
83
- const ptTranslator = new GPTrans({ from: 'es', target: 'pt' });
84
- await ptTranslator.preload({
85
- baseLanguage: 'en', // Use the English version as the base
86
- references: ['es'] // Show the original Spanish as reference
87
- });
88
- ```
89
-
90
- ## Compatibility
91
-
92
- ✅ **Fully backward compatible**: If you don't specify any parameters, `preload()` works exactly as before.
93
-
94
- ```javascript
95
- // This still works unchanged:
96
- await gptrans.preload();
97
- ```
98
-
99
- ## Test File
100
-
101
- Run `demo/case_references.js` to see complete examples of all the new features.