gptrans 1.4.6 → 1.5.0 (npm)

Files changed (16)

package/README.md +19 -8
package/db/gptrans_en.json +9 -0
package/db/gptrans_es.json +5 -0
package/db/gptrans_from_es.json +7 -1
package/demo/case_4.js +17 -0
package/index.js +71 -49
package/isoAssoc.js +6 -2
package/package.json +1 -1
package/prompt/translate.md +2 -0
package/db/gptrans_ar.json +0 -3
package/db/gptrans_es-AR.json +0 -13
package/db/gptrans_es-ES.json +0 -3
package/db/gptrans_from_en-US.json +0 -11
package/db/gptrans_from_es-AR.json +0 -3
package/db/gptrans_from_es-ES.json +0 -5
package/db/gptrans_it.json +0 -11

package/README.md CHANGED

@@ -9,10 +9,10 @@ Whether you're building a multilingual website, a mobile app, or a localization
 ## ✨ Features
 - **AI-Powered Translations:** Harness advanced models like OpenAI's GPT and Anthropic's Sonnet for high-quality translations
-- **Smart Batching & Debouncing:** Automatically groups translation requests to optimize API usage
-- **Caching with DeepBase:** Quickly retrieves cached translations to boost performance
+- **Smart Batching & Debouncing:** Translations are processed in batches, not only for efficiency but also to provide better context. By sending multiple related texts together, the AI model can better understand the overall context and produce more accurate and consistent translations across related terms and phrases.
+- **Caching with JSON:** Quickly retrieves cached translations to boost performance
 - **Parameter Substitution:** Dynamically replace placeholders in your translations
-- **Flexible Configuration:** Customize source and target locales, model keys, and batching settings to fit your needs
+- **Smart Context Handling:** Add contextual information to improve translation accuracy. Perfect for gender-aware translations, domain-specific content, or any scenario where additional context helps produce better results. The context is automatically cleared after each translation to prevent unintended effects.
 ## 📦 Installation
@@ -65,20 +65,31 @@ When creating a new instance of GPTrans, you can customize:
 | Option | Description | Default |
 |--------|-------------|---------|
-| `from` | Source language locale | `es-AR` |
-| `target` | Target language locale | `en-US` |
+| `from` | Source language locale (BCP 47) | `es-AR` |
+| `target` | Target language locale (BCP 47) | `en-US` |
 | `model` | Translation model key | `claude-3-7-sonnet` |
 | `batchThreshold` | Maximum number of characters to accumulate before triggering batch processing | `1000` |
 | `debounceTimeout` | Time in milliseconds to wait before processing translations | `500` |
+### BCP 47 Language Tags
+GPTrans uses BCP 47 language tags for language specification. BCP 47 is the standard for language tags that combines language, script, and region codes. Here are some common examples:
+- `en-US` - English (United States)
+- `es-AR` - Spanish (Argentina)
+- `pt-BR` - Portuguese (Brazil)
+For simplified or universal language codes, you can omit the region specification:
+- `es` - Spanish (Universal)
 ## 🔍 How It Works
 1. **First-Time Translation Behavior:** On the first request, Gptrans will return the original text while processing the translation in the background. This ensures your application remains responsive without waiting for API calls.
-2. **Translation Caching:** Once processed, translations are stored in `db/gptrans_<iso>.json`. Subsequent requests for the same text will be served instantly from the cache.
-3. **Smart Batch Processing:** Translations are processed in batches, providing better context for more accurate results.
+2. **Translation Caching:** Once processed, translations are stored in `db/gptrans_<tag>.json`. Subsequent requests for the same text will be served instantly from the cache.
+3. **Smart Batch Processing:** Automatically groups translation requests to optimize API usage and provide better context.
 4. **Dynamic Model Integration:** Easily plug in multiple AI translation providers with the ModelMix library.
 5. **Customizable Prompts:** Load and modify translation prompts (see the `prompt/translate.md` file) to fine-tune the translation output.
-6. **Manual Corrections:** A JSON file stores key-translation pairs, allowing you to override specific translations and make manual corrections when needed. Simply edit the `db/gptrans_<iso>.json` file:
+6. **Manual Corrections:** A JSON file stores key-translation pairs, allowing you to override specific translations and make manual corrections when needed. Simply edit the `db/gptrans_<tag>.json` file:
 ```json
 {

package/db/gptrans_en.json ADDED

@@ -0,0 +1,9 @@
+{
+    "45h": {
+        "eres_muy_bueno_26czme": "You're very good",
+        "tienes_fuego_1i2o3ok": "Do you have a light?"
+    },
+    "1sfvxng": {
+        "eres_muy_bueno_26czme": "You are very good"
+    }
+}
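
The top-level keys in this cache file appear to be base-36 hashes of the context string, matching the `_hash` helper added in `index.js`. The sketch below reproduces that lookup under one assumption: `stringHash` is taken to be the djb2-style hash published as the `string-hash` package, inlined here so the example runs on its own. `lookup` and `cache` are illustrative names, not part of gptrans.

```javascript
// Assumed equivalent of the `string-hash` package (djb2 variant, XOR form).
function stringHash(str) {
    let hash = 5381;
    let i = str.length;
    while (i) {
        hash = (hash * 33) ^ str.charCodeAt(--i);
    }
    return hash >>> 0; // force an unsigned 32-bit result
}

// Mirrors GPTrans#_hash from the diff.
function contextHash(context) {
    return stringHash(context).toString(36);
}

// Subset of db/gptrans_en.json as shipped in 1.5.0 (empty-context bucket only).
const cache = {
    '45h': {
        eres_muy_bueno_26czme: "You're very good",
        tienes_fuego_1i2o3ok: 'Do you have a light?',
    },
};

// Hypothetical helper: look up a key inside the bucket for a given context.
function lookup(db, context, key) {
    const bucket = db[contextHash(context)];
    return bucket ? bucket[key] : undefined;
}

console.log(contextHash('')); // "45h", the bucket used when no context is set
console.log(lookup(cache, '', 'eres_muy_bueno_26czme'));
```

Under this assumption the empty context hashes to `45h`, which is consistent with the first bucket in the file. A non-empty context lands in a different bucket, so the same key can hold a different translation there.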

package/db/gptrans_es.json ADDED

@@ -0,0 +1,5 @@
+{
+    "1sfvxng": {
+        "eres_muy_bueno_26czme": "Eres muy buena"
+    }
+}

package/db/gptrans_from_es.json CHANGED

@@ -1,3 +1,9 @@
 {
-    "cargan_1998owo": "Cargando..."
+    "": {
+        "eres_muy_bueno_26czme": "Eres muy bueno",
+        "tienes_fuego_1i2o3ok": "Tienes fuego?"
+    },
+    "El mensaje es para una mujer": {
+        "eres_muy_bueno_26czme": "Eres muy bueno"
+    }
 }
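
In 1.5.0 the source-text file is nested by raw context string, then by key. A minimal sketch of walking that shape the way the new `preload()` loop does, using a plain object in place of DeepBase (the `collect` helper is hypothetical):

```javascript
// Contents of db/gptrans_from_es.json after the change.
const fromDb = {
    '': {
        eres_muy_bueno_26czme: 'Eres muy bueno',
        tienes_fuego_1i2o3ok: 'Tienes fuego?',
    },
    'El mensaje es para una mujer': {
        eres_muy_bueno_26czme: 'Eres muy bueno',
    },
};

// Hypothetical helper: flatten the file into (context, key, text) jobs.
function collect(db) {
    const jobs = [];
    for (const [context, pairs] of Object.entries(db)) {
        for (const [key, text] of Object.entries(pairs)) {
            jobs.push({ context, key, text });
        }
    }
    return jobs;
}

const jobs = collect(fromDb);
console.log(jobs.length); // 3
```

The same key appears under both contexts because, per the `index.js` change in this release, the key is derived from the text alone and no longer from text plus context.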

package/demo/case_4.js ADDED

@@ -0,0 +1,17 @@
+import GPTrans from '../index.js';
+// Case 2: Translate from Spanish to English
+const model = new GPTrans({
+    from: 'es',
+    target: 'en',
+});
+await model.preload();
+console.log(model.setContext().t('Eres muy bueno'));
+console.log(model.setContext('El mensaje es para una mujer').t('Eres muy bueno'));
+console.log(model.setContext().t('Tienes fuego?'));

package/index.js CHANGED

@@ -9,7 +9,16 @@ class GPTrans {
     static get mmix() {
         if (!this.#mmixInstance) {
-            const mmix = new ModelMix();
+            const mmix = new ModelMix({
+                config: {
+                    max_history: 1,
+                    debug: false,
+                    bottleneck: {
+                        minTime: 15000,
+                        maxConcurrent: 1
+                    }
+                }
+            });
             mmix.attach(new MixOpenAI());
             mmix.attach(new MixAnthropic());
@@ -23,7 +32,7 @@ class GPTrans {
         return isLanguageAvailable(langCode);
     }
-    constructor({ from = 'en-US', target = 'es-AR', model = 'claude-3-7-sonnet-20250219', batchThreshold = 1000, debounceTimeout = 500, promptFile = null, context = '' }) {
+    constructor({ from = 'en-US', target = 'es-AR', model = 'claude-3-7-sonnet-20250219', batchThreshold = 1500, debounceTimeout = 500, promptFile = null, context = '' }) {
         try {
             dotenv.config();
@@ -35,8 +44,8 @@ class GPTrans {
         this.dbFrom = new DeepBase({ name: 'gptrans_from_' + from });
         try {
-            this.replace_target = isoAssoc(target, 'TARGET_');
-            this.replace_from = isoAssoc(from, 'FROM_');
+            this.replaceTarget = isoAssoc(target, 'TARGET_');
+            this.replaceFrom = isoAssoc(from, 'FROM_');
         } catch (e) {
             throw new Error(`Invalid target: ${target}`);
         }
@@ -50,14 +59,7 @@ class GPTrans {
         this.promptFile = promptFile ?? new URL('./prompt/translate.md', import.meta.url).pathname;
         this.context = context;
         this.modelConfig = {
-            config: {
-                max_history: 1,
-                debug: false,
-                bottleneck: {
-                    maxConcurrent: 2,
-                }
-            },
-            options: {
+            options: {
                 max_tokens: batchThreshold,
                 temperature: 0
             }
@@ -68,7 +70,7 @@ class GPTrans {
     setContext(context = '') {
         if (this.context !== context && this.pendingTranslations.size > 0) {
             clearTimeout(this.debounceTimer);
-            this._processBatch();
+            this._processBatch(this.context);
         }
         this.context = context;
         return this;
@@ -85,15 +87,27 @@ class GPTrans {
     }
     get(key, text) {
-        const translation = this.dbTarget.get(key);
+        if (!text || !text.trim()) {
+            return text;
+        }
+        const contextHash = this._hash(this.context);
+        const translation = this.dbTarget.get(contextHash, key);
         if (!translation) {
-            this.pendingTranslations.set(key, text);
-            this.pendingCharCount += text.length; // Update character count
+            if (!this.dbFrom.get(this.context, key)) {
+                this.dbFrom.set(this.context, key, text);
+            }
-            if (!this.dbFrom.get(key)) {
-                this.dbFrom.set(key, text);
+            // Skip translation if context is empty and languages are the same
+            if (!this.context && this.replaceFrom.FROM_ISO === this.replaceTarget.TARGET_ISO) {
+                return text;
             }
+            this.pendingTranslations.set(key, text);
+            this.pendingCharCount += text.length; // Update character count
             // Clear existing timer
             if (this.debounceTimer) {
                 clearTimeout(this.debounceTimer);
@@ -102,27 +116,31 @@ class GPTrans {
             // Set new timer
             this.debounceTimer = setTimeout(() => {
                 if (this.pendingTranslations.size > 0) {
-                    this._processBatch();
+                    this._processBatch(this.context);
                 }
             }, this.debounceTimeout);
             // Process if we hit the character count threshold
             if (this.pendingCharCount >= this.batchThreshold) {
                 clearTimeout(this.debounceTimer);
-                this._processBatch();
+                this._processBatch(this.context);
             }
         }
         return translation;
     }
-    async _processBatch() {
+    async _processBatch(context) {
         this.processing = true;
         const batch = Array.from(this.pendingTranslations.entries());
         // Clear pending translations and character count before awaiting translation
         this.pendingTranslations.clear();
         this.modelConfig.options.max_tokens = this.pendingCharCount + 1000;
+        const minTime = Math.floor((60000 / (8000 / this.pendingCharCount)) * 1.4);
+        GPTrans.mmix.limiter.updateSettings({ minTime });
         this.pendingCharCount = 0;
         const textsToTranslate = batch.map(([_, text]) => text).join('\n---\n');
@@ -130,8 +148,17 @@ class GPTrans {
             const translations = await this._translate(textsToTranslate);
             const translatedTexts = translations.split('\n---\n');
+            const contextHash = this._hash(context);
             batch.forEach(([key], index) => {
-                this.dbTarget.set(key, translatedTexts[index].trim());
+                if (!translatedTexts[index]) {
+                    console.log(translations);
+                    console.error(`No translation found for ${key}`);
+                    return;
+                }
+                this.dbTarget.set(contextHash, key, translatedTexts[index].trim());
             });
         } catch (e) {
@@ -142,6 +169,7 @@ class GPTrans {
     }
     async _translate(text) {
         const model = GPTrans.mmix.create(this.modelKey, this.modelConfig);
         model.setSystem("You are an expert translator specialized in literary translation between FROM_LANG and TARGET_DENONYM TARGET_LANG.");
@@ -149,8 +177,8 @@ class GPTrans {
         model.addTextFromFile(this.promptFile);
         model.replace({ INPUT: text, CONTEXT: this.context });
-        model.replace(this.replace_target);
-        model.replace(this.replace_from);
+        model.replace(this.replaceTarget);
+        model.replace(this.replaceFrom);
         const response = await model.message();
@@ -171,39 +199,33 @@ class GPTrans {
         let key = words.map((x) => x.slice(0, maxlen)).join("_");
         key += key ? '_' : '';
-        key += stringHash(text + this.context).toString(36);
+        key += this._hash(text);
         return key;
     }
-    async preload({ target = this.replace_target.TARGET_ISO, model = this.modelKey, from = this.replace_from.FROM_ISO, batchThreshold = this.batchThreshold, debounceTimeout = this.debounceTimeout } = {}) {
-        // Create new GPTrans instance for the target language
-        const translator = new GPTrans({
-            from,
-            target,
-            model,
-            batchThreshold,
-            debounceTimeout,
-        });
+    _hash(input) {
+        return stringHash(input).toString(36);
+    }
-        // Process all entries in batches
-        for (const [key, text] of this.dbFrom.entries()) {
-            translator.get(key, text);
+    async preload() {
+        for (const [context, pairs] of this.dbFrom.entries()) {
+            this.setContext(context);
+            for (const [key, text] of Object.entries(pairs)) {
+                this.get(key, text);
+            }
         }
         // Wait for any pending translations to complete
-        if (translator.pendingTranslations.size > 0) {
-            await new Promise(resolve => {
-                const checkInterval = setInterval(() => {
-                    if (translator.processing === false && translator.pendingTranslations.size === 0) {
-                        clearInterval(checkInterval);
-                        resolve();
-                    }
-                }, 1000);
-            });
-        }
+        await new Promise(resolve => {
+            const checkInterval = setInterval(() => {
+                if (this.dbFrom.keys().length === this.dbTarget.keys().length) {
+                    clearInterval(checkInterval);
+                    resolve();
+                }
+            }, 100);
+        });
-        return translator;
+        return this;
     }
 }
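
The new throttle in `_processBatch` scales the limiter's `minTime` with batch size. Read literally, the expression allows roughly 8000 characters per minute and then adds a 40% margin; that reading of the constants is an inference from the arithmetic, not something the diff states. A standalone sketch (`minTimeFor` is a hypothetical name):

```javascript
// Same arithmetic as the line added to _processBatch in 1.5.0.
function minTimeFor(pendingCharCount) {
    return Math.floor((60000 / (8000 / pendingCharCount)) * 1.4);
}

// Print the spacing between requests for a few batch sizes.
for (const chars of [500, 1000, 4000]) {
    console.log(`${chars} chars -> ${minTimeFor(chars)} ms`);
}
```

A 1000-character batch works out to about 10.5 seconds between requests, and the delay grows linearly with the number of pending characters.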

package/isoAssoc.js CHANGED

@@ -180,10 +180,14 @@ export function isoAssoc(iso, prefix = '') {
     const parts = iso.toLowerCase().split('-');
     const lang = parts[0];
-    const country = parts.length > 1 ? parts[1] : null;
+    let country = parts.length > 1 ? parts[1] : null;
-    let denonym = country ? countryDenonym[country] : 'Neutral';
+    if (lang === 'en' && !country) {
+        country = 'us';
+    }
+    let denonym = country ? countryDenonym[country] : 'Neutral';
     if (lang === 'zh' && !country) {
         denonym = 'Simplified';
     }
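
The change above makes a bare `en` tag resolve to the US region before the denonym lookup. A simplified sketch of that ordering, with a two-entry stand-in for the package's `countryDenonym` table (the entries and the `parseTag` name are assumptions for illustration):

```javascript
// Stand-in table; the real countryDenonym map in isoAssoc.js is larger
// and its exact values are not shown in the diff.
const countryDenonym = { us: 'American', ar: 'Argentine' };

function parseTag(tag) {
    const parts = tag.toLowerCase().split('-');
    const lang = parts[0];
    let country = parts.length > 1 ? parts[1] : null;

    // New in 1.5.0: plain "en" defaults to the US region.
    if (lang === 'en' && !country) {
        country = 'us';
    }

    // Tags without a region fall back to a neutral variant...
    let denonym = country ? countryDenonym[country] : 'Neutral';
    // ...except bare "zh", which is treated as Simplified Chinese.
    if (lang === 'zh' && !country) {
        denonym = 'Simplified';
    }
    return { lang, country, denonym };
}

console.log(parseTag('en'));
console.log(parseTag('es'));
```

Because the default is applied before the lookup, `en` and `en-US` end up with the same denonym, while `es` stays region-neutral.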

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "gptrans",
   "type": "module",
-  "version": "1.4.6",
+  "version": "1.5.0",
   "description": "🚆 GPTrans - The smarter AI-powered way to translate.",
   "keywords": [
     "translate",

package/prompt/translate.md CHANGED

@@ -2,7 +2,9 @@
 Translation from FROM_ISO to TARGET_ISO (TARGET_DENONYM TARGET_LANG) with cultural adaptations.
 ## Text to translate
+```
 INPUT
+```
 # Return Format
    - Provide the final translation within a code block using ```.

package/db/gptrans_ar.json DELETED

@@ -1,3 +0,0 @@
-{
-    "cargan_1998owo": "جارٍ التحميل..."
-}

package/db/gptrans_es-AR.json DELETED

@@ -1,13 +0,0 @@
-{
-    "eres_muy_bueno_26czme": "Sos muy bueno",
-    "eres_muy_bueno_k3ml5b": "Sos muy buena",
-    "hello_name_1987p1n": "¡Hola, {name}!",
-    "topup_uzdh5y": "Recargar",
-    "transf_176pc1a": "Transferir",
-    "deposi_wg2ec5": "Depositar",
-    "balanc_1rv8if7": "Saldo",
-    "transa_1wtqm5d": "Transacción",
-    "accoun_x1y0v8": "Cuenta",
-    "card_yis1ox": "Tarjeta",
-    "tienes_fuego_1i2o3ok": "¿Tenés fuego?"
-}

package/db/gptrans_es-ES.json DELETED

@@ -1,3 +0,0 @@
-{
-    "tenes_fuego_1fs98im": "¿Tienes fuego?"
-}

package/db/gptrans_from_en-US.json DELETED

@@ -1,11 +0,0 @@
-{
-    "hello_name_1987p1n": "Hello, {name}!",
-    "topup_uzdh5y": "Top-up",
-    "transf_176pc1a": "Transfer",
-    "deposi_wg2ec5": "Deposit",
-    "balanc_1rv8if7": "Balance",
-    "transa_1wtqm5d": "Transaction",
-    "accoun_x1y0v8": "Account",
-    "card_yis1ox": "Card",
-    "loadin_21q3nx": "Loading..."
-}

package/db/gptrans_from_es-AR.json DELETED

@@ -1,3 +0,0 @@
-{
-    "tenes_fuego_1fs98im": "¿Tenés fuego?"
-}

package/db/gptrans_from_es-ES.json DELETED

@@ -1,5 +0,0 @@
-{
-    "eres_muy_bueno_26czme": "Eres muy bueno",
-    "eres_muy_bueno_k3ml5b": "Eres muy bueno",
-    "tienes_fuego_1i2o3ok": "Tienes fuego?"
-}

package/db/gptrans_it.json DELETED

@@ -1,11 +0,0 @@
-{
-    "hello_name_1987p1n": "Ciao, {name}!",
-    "topup_uzdh5y": "Ricarica",
-    "transf_176pc1a": "Trasferimento",
-    "deposi_wg2ec5": "Deposito",
-    "balanc_1rv8if7": "Saldo",
-    "transa_1wtqm5d": "Transazione",
-    "accoun_x1y0v8": "Account",
-    "card_yis1ox": "Carta",
-    "loadin_21q3nx": "Caricamento in corso..."
-}