gptrans 1.8.8 → 1.9.2

package/README.md CHANGED
@@ -10,6 +10,8 @@ Whether you're building a multilingual website, a mobile app, or a localization
10
10
 
11
11
  - **AI-Powered Translations:** Harness advanced models like OpenAI's GPT and Anthropic's Sonnet for high-quality translations
12
12
  - **Smart Batching & Debouncing:** Translations are processed in batches, not only for efficiency but also to provide better context. By sending multiple related texts together, the AI model can better understand the overall context and produce more accurate and consistent translations across related terms and phrases.
13
+ - **Reference Translations:** Use existing translations in other languages as context to improve accuracy and consistency across your multilingual content
14
+ - **Flexible Base Language:** Translate from an intermediate language instead of the original, useful for avoiding gender-specific terms or leveraging more neutral language versions
13
15
  - **Caching with JSON:** Quickly retrieves cached translations to boost performance
14
16
  - **Parameter Substitution:** Dynamically replace placeholders in your translations
15
17
  - **Smart Context Handling:** Add contextual information to improve translation accuracy. Perfect for gender-aware translations, domain-specific content, or any scenario where additional context helps produce better results. The context is automatically cleared after each translation to prevent unintended effects.
@@ -117,12 +119,85 @@ If you're looking to streamline your translation workflow and bring your applica
117
119
 
118
120
  ## 🔄 Preloading Translations
119
121
 
120
- You can preload translations for specific languages using the `preload` method. This is particularly useful when you want to initialize translations based on dynamically generated keys:
122
+ The `preload()` method allows you to pre-translate all texts in your database. It now supports advanced options for improved translation accuracy through reference translations and alternate base languages.
123
+
124
+ ### Basic Usage
125
+
126
+ ```javascript
127
+ // Simple preload - translates all pending texts
128
+ await gptrans.preload();
129
+ ```
130
+
131
+ ### Advanced Options
132
+
133
+ #### Using Reference Translations
134
+
135
+ Include translations from other languages as context to improve accuracy and consistency:
136
+
137
+ ```javascript
138
+ // Use English and Portuguese translations as reference
139
+ const gptrans = new GPTrans({ from: 'es', target: 'fr' });
140
+ await gptrans.preload({
141
+ references: ['en', 'pt']
142
+ });
143
+ ```
144
+
145
+ The AI model will see existing translations in the reference languages, helping it produce more consistent and accurate translations.
146
+
147
+ #### Using an Alternate Base Language
148
+
149
+ Translate from an intermediate language instead of the original:
121
150
 
122
151
  ```javascript
123
- await gptrans.preload({ target:'ar' });
152
+ // Original is Spanish, but translate FROM English TO Portuguese
153
+ const gptrans = new GPTrans({ from: 'es', target: 'pt' });
154
+ await gptrans.preload({
155
+ baseLanguage: 'en'
156
+ });
124
157
  ```
125
158
 
159
+ This is useful when:
160
+ - The intermediate language has characteristics that better suit the target (e.g., English pronouns such as "he/she" can be omitted in some target languages)
161
+ - You want to avoid gender-specific terms present in the source language
162
+ - The intermediate translation is cleaner or more universal
163
+
164
+ #### Combined Usage
165
+
166
+ You can use both options together:
167
+
168
+ ```javascript
169
+ // Translate from English to German, showing Spanish and Portuguese as reference
170
+ const gptrans = new GPTrans({ from: 'es', target: 'de' });
171
+ await gptrans.preload({
172
+ baseLanguage: 'en',
173
+ references: ['es', 'pt']
174
+ });
175
+ ```
176
+
177
+ ### Real-World Example
178
+
179
+ **Problem:** Gender-specific translations
180
+
181
+ ```javascript
182
+ // Spanish: "El estudiante es muy bueno" (masculine)
183
+ // English: "The student is very good" (neutral)
184
+
185
+ // Solution: Translate to Portuguese using English as base
186
+ const ptTranslator = new GPTrans({ from: 'es', target: 'pt' });
187
+ await ptTranslator.preload({
188
+ baseLanguage: 'en', // Use neutral English version
189
+ references: ['es'] // Show original Spanish for context
190
+ });
191
+ ```
192
+
193
+ ### Parameters
194
+
195
+ | Parameter | Type | Description |
196
+ |-----------|------|-------------|
197
+ | `references` | `string[]` | Array of language codes (e.g., `['en', 'pt']`) to use as translation references |
198
+ | `baseLanguage` | `string` | Language code to use as the base for translation instead of the original |
199
+
200
+
126
201
  ### Model Fallback System
127
202
 
128
203
  GPTrans supports a fallback mechanism for translation models. Instead of providing a single model, you can pass an array of models:
@@ -20,7 +20,7 @@ async function testReferences() {
20
20
  target: 'en',
21
21
  model: 'sonnet45',
22
22
  name: 'ref_test',
23
- debug: true
23
+ debug: 3
24
24
  });
25
25
 
26
26
  const ptTranslator = new GPTrans({
@@ -28,7 +28,7 @@ async function testReferences() {
28
28
  target: 'pt',
29
29
  model: 'sonnet45',
30
30
  name: 'ref_test',
31
- debug: true
31
+ debug: 3
32
32
  });
33
33
 
34
34
  // Sample Spanish texts with gendered language
@@ -44,14 +44,16 @@ async function testReferences() {
44
44
  console.log(` EN: ${enTranslator.t(text)}`);
45
45
  });
46
46
 
47
- await new Promise(resolve => setTimeout(resolve, 5000));
47
+ // Wait for EN translations to complete if needed
48
+ await enTranslator.preload();
48
49
 
49
50
  console.log('\n📝 Creating Portuguese translations from Spanish...');
50
51
  spanishTexts.forEach(text => {
51
52
  console.log(` PT: ${ptTranslator.t(text)}`);
52
53
  });
53
54
 
54
- await new Promise(resolve => setTimeout(resolve, 5000));
55
+ // Wait for PT translations to complete if needed
56
+ await ptTranslator.preload();
55
57
 
56
58
  console.log('\n' + '='.repeat(70));
57
59
  console.log('\n📋 Test 2: Translation with English as reference\n');
@@ -62,7 +64,7 @@ async function testReferences() {
62
64
  target: 'fr',
63
65
  model: 'sonnet45',
64
66
  name: 'ref_test',
65
- debug: true
67
+ debug: 3
66
68
  });
67
69
 
68
70
  console.log('🔄 Preloading French translations with English as reference...');
@@ -85,7 +87,7 @@ async function testReferences() {
85
87
  target: 'it',
86
88
  model: 'sonnet45',
87
89
  name: 'ref_test',
88
- debug: true
90
+ debug: 3
89
91
  });
90
92
 
91
93
  console.log('🔄 Preloading Italian translations with English as base language...');
@@ -109,7 +111,7 @@ async function testReferences() {
109
111
  target: 'de',
110
112
  model: 'sonnet45',
111
113
  name: 'ref_test',
112
- debug: true
114
+ debug: 3
113
115
  });
114
116
 
115
117
  console.log('🔄 Preloading German translations with multiple references...');
package/index.js CHANGED
@@ -218,15 +218,45 @@ class GPTrans {
218
218
  const textsToTranslate = batch.map(([_, text]) => text).join(`\n${this.divider}\n`);
219
219
  try {
220
220
  const translations = await this._translate(textsToTranslate, batch, batchReferences, this.preloadBaseLanguage);
221
- const translatedTexts = translations.split(`\n${this.divider}\n`);
221
+
222
+ // Try different split strategies to be more robust
223
+ let translatedTexts = translations.split(`\n${this.divider}\n`);
224
+
225
+ // If split doesn't match batch size, try alternative separators
226
+ if (translatedTexts.length !== batch.length) {
227
+ // Try without newlines around divider
228
+ translatedTexts = translations.split(this.divider);
229
+
230
+ // If still doesn't match, try with just newline
231
+ if (translatedTexts.length !== batch.length) {
232
+ translatedTexts = translations.split(/\n{2,}/); // Split by multiple newlines
233
+ }
234
+ }
222
235
 
223
236
  const contextHash = this._hash(context);
224
- batch.forEach(([key], index) => {
225
-
226
- if (!translatedTexts[index]) {
227
- console.log(translations);
228
- console.error(`No translation found for ${key}`);
237
+
238
+ // Validate we have the right number of translations
239
+ if (translatedTexts.length !== batch.length) {
240
+ console.error(`❌ Translation count mismatch:`);
241
+ console.error(` Expected: ${batch.length} translations`);
242
+ console.error(` Received: ${translatedTexts.length} translations`);
243
+ console.error(` Batch keys: ${batch.map(([key]) => key).join(', ')}`);
244
+ console.error(`\n📝 Full response from model:\n${translations}\n`);
245
+
246
+ // Try to save what we can
247
+ const minLength = Math.min(translatedTexts.length, batch.length);
248
+ for (let i = 0; i < minLength; i++) {
249
+ if (translatedTexts[i] && translatedTexts[i].trim()) {
250
+ this.dbTarget.set(contextHash, batch[i][0], translatedTexts[i].trim());
251
+ }
252
+ }
253
+ return;
254
+ }
229
255
 
256
+ batch.forEach(([key], index) => {
257
+ if (!translatedTexts[index] || !translatedTexts[index].trim()) {
258
+ console.error(`❌ No translation found for ${key} at index ${index}`);
259
+ console.error(` Original text: ${batch[index][1]}`);
230
260
  return;
231
261
  }
232
262
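Review note: the fallback logic in this hunk can be read as an ordered list of split strategies, tried until one yields exactly one part per batch entry. Below is a minimal sketch of that idea. The helper name `splitTranslations` is hypothetical and not part of gptrans, and the sketch trims each part up front rather than at save time as the diff does.

```javascript
// Hypothetical helper (not in gptrans): try progressively looser separators
// until the number of parts equals the number of texts that were sent.
function splitTranslations(response, divider, expectedCount) {
  const strategies = [
    (s) => s.split(`\n${divider}\n`), // exact separator, as sent to the model
    (s) => s.split(divider),          // divider without surrounding newlines
    (s) => s.split(/\n{2,}/),         // last resort: blank lines between texts
  ];

  let parts = [];
  for (const split of strategies) {
    parts = split(response).map((p) => p.trim());
    if (parts.length === expectedCount) {
      return { parts, matched: true };
    }
  }
  // No strategy matched; like the diff, keep the result of the last attempt.
  return { parts, matched: false };
}
```

The `matched` flag lets a caller distinguish a clean split from a best-effort one, which corresponds to the count-mismatch branch above that saves only the first `minLength` entries. Note that the blank-line fallback can mis-split a translation that itself contains paragraph breaks.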
 
@@ -234,7 +264,8 @@ class GPTrans {
234
264
  });
235
265
 
236
266
  } catch (e) {
237
- console.error(e);
267
+ console.error('❌ Error in _processBatch:', e.message);
268
+ console.error(e.stack);
238
269
  }
239
270
  }
240
271
 
@@ -247,29 +278,50 @@ class GPTrans {
247
278
 
248
279
  model.setSystem("You are an expert translator specialized in literary translation between FROM_LANG and TARGET_DENONYM TARGET_LANG.");
249
280
 
250
- model.addTextFromFile(this.promptFile);
281
+ // Read and process prompt file
282
+ let promptContent = fs.readFileSync(this.promptFile, 'utf-8');
251
283
 
252
284
  // Format references if available
253
285
  let referencesText = '';
254
286
  if (Object.keys(batchReferences).length > 0 && batch.length > 0) {
255
- const textsArray = text.split(`\n${this.divider}\n`);
256
-
257
- referencesText = textsArray.map((txt, index) => {
258
- const key = batch[index] ? batch[index][0] : null;
259
- if (key && batchReferences[key]) {
260
- const refs = batchReferences[key];
261
- const refLines = Object.entries(refs).map(([lang, translation]) => {
262
- try {
263
- const langInfo = isoAssoc(lang);
264
- return `${langInfo.DENONYM} ${langInfo.LANG} (${lang}): ${translation}`;
265
- } catch (e) {
266
- return `${lang}: ${translation}`;
287
+ // Group all references by language first
288
+ const refsByLang = {};
289
+
290
+ batch.forEach(([key], index) => {
291
+ if (batchReferences[key]) {
292
+ Object.entries(batchReferences[key]).forEach(([lang, translation]) => {
293
+ if (!refsByLang[lang]) {
294
+ refsByLang[lang] = [];
267
295
  }
268
- }).join(`\n${this.divider}\n`);
269
- return refLines;
296
+ refsByLang[lang].push(translation);
297
+ });
298
+ }
299
+ });
300
+
301
+ // Format: one language header, then all its translations with bullets
302
+ const refBlocks = Object.entries(refsByLang).map(([lang, translations]) => {
303
+ try {
304
+ const langInfo = isoAssoc(lang);
305
+ const header = `### ${langInfo.DENONYM} ${langInfo.LANG} (${lang}):`;
306
+ const content = translations.map(t => `- ${t}`).join('\n');
307
+ return `${header}\n${content}`;
308
+ } catch (e) {
309
+ const header = `### ${lang}:`;
310
+ const content = translations.map(t => `- ${t}`).join('\n');
311
+ return `${header}\n${content}`;
270
312
  }
271
- return '';
272
- }).filter(r => r).join(`\n\n`);
313
+ });
314
+
315
+ referencesText = refBlocks.join(`\n\n`);
316
+ }
317
+
318
+ // Remove reference section if no references
319
+ if (!referencesText) {
320
+ // Remove the entire "Reference Translations" section
321
+ promptContent = promptContent.replace(
322
+ /## Reference Translations \(for context\)[\s\S]*?(?=\n#|$)/,
323
+ ''
324
+ );
273
325
  }
274
326
 
275
327
  // Determine which FROM_ values to use
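Review note: the new reference formatting groups translations by language and emits one header per language followed by a bullet list. A standalone sketch of that grouping is shown below. `formatReferences` and `describeLang` are hypothetical names; `describeLang` stands in for the `isoAssoc` lookup used in the diff.

```javascript
// Hypothetical helper (not in gptrans): build the reference block for a batch.
// batch: array of [key, text] pairs
// batchReferences: { [key]: { [lang]: translation } }
function formatReferences(batch, batchReferences, describeLang = (lang) => lang) {
  const refsByLang = {};

  // Collect every available reference, grouped by language code.
  for (const [key] of batch) {
    const refs = batchReferences[key];
    if (!refs) continue;
    for (const [lang, translation] of Object.entries(refs)) {
      if (!refsByLang[lang]) refsByLang[lang] = [];
      refsByLang[lang].push(translation);
    }
  }

  // One "### <language>:" header per language, then one bullet per translation.
  return Object.entries(refsByLang)
    .map(([lang, translations]) => {
      const bullets = translations.map((t) => `- ${t}`).join('\n');
      return `### ${describeLang(lang)}:\n${bullets}`;
    })
    .join('\n\n');
}
```

One property worth keeping in mind with this layout: if a key has no reference in a given language, that language's bullet list is shorter than the batch, so bullets no longer line up positionally with the input texts.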
@@ -282,21 +334,31 @@ class GPTrans {
282
334
  }
283
335
  }
284
336
 
285
- model.replace({
286
- INPUT: text,
287
- CONTEXT: this.context,
288
- REFERENCES: referencesText || 'None'
337
+ // Apply replacements to prompt
338
+ promptContent = promptContent
339
+ .replace(/INPUT/g, text)
340
+ .replace(/CONTEXT/g, this.context)
341
+ .replace(/REFERENCES/g, referencesText);
342
+
343
+ // Apply language-specific replacements
344
+ Object.entries(this.replaceTarget).forEach(([key, value]) => {
345
+ promptContent = promptContent.replace(new RegExp(key, 'g'), value);
289
346
  });
290
- model.replace(this.replaceTarget);
291
- model.replace(fromReplace);
347
+ Object.entries(fromReplace).forEach(([key, value]) => {
348
+ promptContent = promptContent.replace(new RegExp(key, 'g'), value);
349
+ });
350
+
351
+ model.addText(promptContent);
292
352
 
293
353
  const response = await model.message();
294
354
 
355
+ // Extract content from code block if present
295
356
  const codeBlockRegex = /```(?:\w*\n)?([\s\S]*?)```/;
296
357
  const match = response.match(codeBlockRegex);
297
- const translatedText = match ? match[1].trim() : response;
358
+ const translatedText = match ? match[1].trim() : response.trim();
298
359
 
299
360
  return translatedText;
361
+
300
362
  } finally {
301
363
  // Always release the lock
302
364
  releaseLock();
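Review note: the hunk above fills the prompt with chained `String.prototype.replace` calls. With a string replacement, JavaScript interprets sequences such as `$&` in the inserted text, and each later pass also rescans text inserted by an earlier one (for example, user text containing the word `CONTEXT`). One possible alternative, sketched here under the assumption that all placeholders can be collected into a single object, is a single pass with a replacer function. `fillPrompt` is a hypothetical name, not a gptrans API.

```javascript
// Hypothetical helper (not in gptrans): replace all placeholders in one pass.
// values: { PLACEHOLDER: replacementText, ... }
function fillPrompt(template, values) {
  // Longest keys first so that a longer placeholder wins over a shorter prefix.
  const keys = Object.keys(values).sort((a, b) => b.length - a.length);
  if (keys.length === 0) return template;

  // Escape regex metacharacters in the placeholder names.
  const escaped = keys.map((k) => k.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  const pattern = new RegExp(escaped.join('|'), 'g');

  // A function replacer inserts its return value literally ("$&" is not
  // expanded), and already-inserted text is never scanned again.
  return template.replace(pattern, (match) => String(values[match]));
}
```

For example, `fillPrompt('Translate INPUT with CONTEXT', { INPUT: 'Costs $& CONTEXT', CONTEXT: 'formal' })` leaves the `$&` and the word `CONTEXT` inside the input text untouched.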
@@ -347,8 +409,24 @@ class GPTrans {
347
409
  return this;
348
410
  }
349
411
 
412
+ // Filter out invalid references
413
+ const validReferences = references.filter(lang => {
414
+ const normalizedLang = this.normalizeBCP47(lang);
415
+ // Don't include target language as reference (we're translating TO it)
416
+ if (normalizedLang === this.replaceTarget.TARGET_ISO) {
417
+ console.warn(`Ignoring reference language '${lang}': cannot use target language as reference`);
418
+ return false;
419
+ }
420
+ // Don't include baseLanguage as reference (redundant)
421
+ if (baseLanguage && normalizedLang === this.normalizeBCP47(baseLanguage)) {
422
+ console.warn(`Ignoring reference language '${lang}': same as baseLanguage`);
423
+ return false;
424
+ }
425
+ return true;
426
+ });
427
+
350
428
  // Store preload options for use in translation
351
- this.preloadReferences = references;
429
+ this.preloadReferences = validReferences;
352
430
  this.preloadBaseLanguage = baseLanguage;
353
431
 
354
432
  // Track which keys need translation
@@ -365,8 +443,9 @@ class GPTrans {
365
443
  // Check if translation already exists
366
444
  if (!this.dbTarget.get(contextHash, key)) {
367
445
  keysNeedingTranslation.push({ context, contextHash, key });
446
+ // Only call get() if translation doesn't exist
447
+ this.get(key, text);
368
448
  }
369
- this.get(key, text);
370
449
  }
371
450
  }
372
451
 
@@ -378,10 +457,8 @@ class GPTrans {
378
457
  }
379
458
 
380
459
  // Wait for any pending translations to complete
381
- const maxWaitTime = 120000; // 120 seconds timeout
382
- const startTime = Date.now();
383
-
384
- await new Promise((resolve, reject) => {
460
+ // No global timeout - each translation request has its own timeout
461
+ await new Promise((resolve) => {
385
462
  const checkInterval = setInterval(() => {
386
463
  // Check if there are still pending translations or batch being processed
387
464
  const hasPending = this.pendingTranslations.size > 0 || this.isProcessingBatch;
@@ -399,13 +476,6 @@ class GPTrans {
399
476
  clearInterval(checkInterval);
400
477
  resolve();
401
478
  }
402
-
403
- // Timeout check
404
- if (Date.now() - startTime > maxWaitTime) {
405
- clearInterval(checkInterval);
406
- console.warn(`Preload timeout: ${keysNeedingTranslation.length} translations pending`);
407
- resolve(); // Resolve instead of reject to allow partial completion
408
- }
409
479
  }, 100);
410
480
  });
411
481
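Review note: with the global timeout removed, the wait in `preload()` reduces to polling until nothing is pending. The core of such a loop can be sketched as below. `waitUntilIdle` and `isBusy` are hypothetical names, and the part of the real loop that this diff elides between hunks is not reproduced.

```javascript
// Hypothetical helper (not in gptrans): resolve once isBusy() returns false.
// There is deliberately no timeout here; as in the diff, each request is
// assumed to enforce its own.
function waitUntilIdle(isBusy, intervalMs = 100) {
  return new Promise((resolve) => {
    const timer = setInterval(() => {
      if (!isBusy()) {
        clearInterval(timer); // stop polling before resolving
        resolve();
      }
    }, intervalMs);
  });
}

// Usage sketch, mirroring the condition checked in the diff:
// await waitUntilIdle(() => pendingTranslations.size > 0 || isProcessingBatch);
```

The trade-off is that a request that never settles would keep this promise pending indefinitely, so the per-request timeout mentioned in the new comment carries the whole burden.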
 
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "gptrans",
3
3
  "type": "module",
4
- "version": "1.8.8",
4
+ "version": "1.9.2",
5
5
  "description": "🚆 GPTrans - The smarter AI-powered way to translate.",
6
6
  "keywords": [
7
7
  "translate",
@@ -8,13 +8,22 @@ INPUT
8
8
 
9
9
  ## Reference Translations (for context)
10
10
  These are existing translations in other languages that may help you provide a more accurate translation. Use them as reference but do not simply copy them:
11
- ```
11
+
12
12
  REFERENCES
13
- ```
14
13
 
15
14
  # Return Format
16
15
  - Provide the final translation within a code block using ```.
17
16
  - Do not include alternative translations, only provide the best translation.
17
+ - **IMPORTANT:** If the input contains multiple texts separated by `------`, you MUST return the translations in the SAME ORDER, separated by the EXACT SAME separator:
18
+ ```
19
+ Translation 1
20
+ ------
21
+ Translation 2
22
+ ------
23
+ Translation 3
24
+ ```
25
+ - Do not add extra newlines or modify the separator format.
26
+ - Each translation should correspond to each input text in the same order.
18
27
 
19
28
  # Warnings
20
29
  - **Context:** I will provide you with a text in FROM_DENONYM FROM_LANG. The goal is to translate it to TARGET_ISO (TARGET_DENONYM TARGET_LANG) while maintaining the essence, style, intention, and tone of the original.
@@ -1,101 +0,0 @@
1
- # GPTrans - Multiple References and Alternate Base Language
2
-
3
- ## New Features
4
-
5
- The `preload()` method now supports two optional parameters that improve translation accuracy:
6
-
7
- ### 1. `references` - Multiple References
8
-
9
- Lets you include existing translations in other languages as additional context for the AI model.
10
-
11
- ```javascript
12
- await gptrans.preload({
13
- references: ['en', 'pt'] // Use English and Portuguese as references
14
- });
15
- ```
16
-
17
- **Use case**: When you have translations in several languages and want the new translation to be consistent with the existing ones.
18
-
19
- ### 2. `baseLanguage` - Alternate Base Language
20
-
21
- Lets you use a language other than the original as the base for the translation.
22
-
23
- ```javascript
24
- await gptrans.preload({
25
- baseLanguage: 'en' // Translate from English instead of the original language
26
- });
27
- ```
28
-
29
- **Use case**: When the original text has language-specific features (such as he/she in English) that can be omitted in the final translation.
30
-
31
- ## Usage Examples
32
-
33
- ### Example 1: Basic Translation (unchanged)
34
-
35
- ```javascript
36
- const gptrans = new GPTrans({ from: 'en', target: 'es' });
37
- await gptrans.preload(); // Original behavior
38
- ```
39
-
40
- ### Example 2: With References
41
-
42
- ```javascript
43
- // Original in Spanish, translate to French using English as reference
44
- const gptrans = new GPTrans({ from: 'es', target: 'fr' });
45
- await gptrans.preload({
46
- references: ['en'] // The model will see the English translation as context
47
- });
48
- ```
49
-
50
- ### Example 3: With an Alternate Base Language
51
-
52
- ```javascript
53
- // Original in Spanish, but translate FROM English TO Portuguese
54
- const gptrans = new GPTrans({ from: 'es', target: 'pt' });
55
- await gptrans.preload({
56
- baseLanguage: 'en' // Use the English translation as the base
57
- });
58
- ```
59
-
60
- ### Example 4: Combined
61
-
62
- ```javascript
63
- // Original in Spanish, translate from English to German, showing Spanish and Portuguese as reference
64
- const gptrans = new GPTrans({ from: 'es', target: 'de' });
65
- await gptrans.preload({
66
- baseLanguage: 'en', // Translate from English
67
- references: ['es', 'pt'] // Show Spanish and Portuguese as additional context
68
- });
69
- ```
70
-
71
- ## Real-World Use Case: Avoiding Gender Issues
72
-
73
- ### Problem
74
- In English: "The student is very good" (neutral)
75
- In Spanish: "El estudiante es muy bueno" / "La estudiante es muy buena" (gendered)
76
- In Portuguese: gender can be omitted in some contexts
77
-
78
- ### Solution
79
- ```javascript
80
- // If the original is in Spanish but we want the Portuguese translation
81
- // based on the English version (which is neutral):
82
-
83
- const ptTranslator = new GPTrans({ from: 'es', target: 'pt' });
84
- await ptTranslator.preload({
85
- baseLanguage: 'en', // Use the English version as the base
86
- references: ['es'] // Show the original Spanish as reference
87
- });
88
- ```
89
-
90
- ## Compatibility
91
-
92
- ✅ **Fully backward compatible**: If you do not specify any parameters, `preload()` works exactly as before.
93
-
94
- ```javascript
95
- // This still works unchanged:
96
- await gptrans.preload();
97
- ```
98
-
99
- ## Test File
100
-
101
- Run `demo/case_references.js` to see complete examples of all the new features.