@huggingface/transformers 3.0.0-alpha.19 → 3.0.0-alpha.20

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -101,7 +101,7 @@ npm i @huggingface/transformers
  Alternatively, you can use it in vanilla JS, without any bundler, by using a CDN or static hosting. For example, using [ES Modules](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules), you can import the library with:
  ```html
  <script type="module">
- import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0-alpha.19';
+ import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0-alpha.20';
  </script>
  ```
 
@@ -134,7 +134,7 @@ Check out the Transformers.js [template](https://huggingface.co/new-space?templa
 
 
 
- By default, Transformers.js uses [hosted pretrained models](https://huggingface.co/models?library=transformers.js) and [precompiled WASM binaries](https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0-alpha.19/dist/), which should work out-of-the-box. You can customize this as follows:
+ By default, Transformers.js uses [hosted pretrained models](https://huggingface.co/models?library=transformers.js) and [precompiled WASM binaries](https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0-alpha.20/dist/), which should work out-of-the-box. You can customize this as follows:
 
  ### Settings
 
@@ -289,6 +289,7 @@ You can refine your search by selecting the task you're interested in (e.g., [te
  1. **[Decision Transformer](https://huggingface.co/docs/transformers/model_doc/decision_transformer)** (from Berkeley/Facebook/Google) released with the paper [Decision Transformer: Reinforcement Learning via Sequence Modeling](https://arxiv.org/abs/2106.01345) by Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch.
  1. **[DeiT](https://huggingface.co/docs/transformers/model_doc/deit)** (from Facebook) released with the paper [Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877) by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
  1. **[Depth Anything](https://huggingface.co/docs/transformers/main/model_doc/depth_anything)** (from University of Hong Kong and TikTok) released with the paper [Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data](https://arxiv.org/abs/2401.10891) by Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao.
+ 1. **Depth Pro** (from Apple) released with the paper [Depth Pro: Sharp Monocular Metric Depth in Less Than a Second](https://arxiv.org/abs/2410.02073) by Aleksei Bochkovskii, Amaël Delaunoy, Hugo Germain, Marcel Santos, Yichao Zhou, Stephan R. Richter, Vladlen Koltun.
  1. **[DETR](https://huggingface.co/docs/transformers/model_doc/detr)** (from Facebook) released with the paper [End-to-End Object Detection with Transformers](https://arxiv.org/abs/2005.12872) by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko.
  1. **[DINOv2](https://huggingface.co/docs/transformers/model_doc/dinov2)** (from Meta AI) released with the paper [DINOv2: Learning Robust Visual Features without Supervision](https://arxiv.org/abs/2304.07193) by Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Labatut, Armand Joulin, Piotr Bojanowski.
  1. **[DistilBERT](https://huggingface.co/docs/transformers/model_doc/distilbert)** (from HuggingFace), released together with the paper [DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/abs/1910.01108) by Victor Sanh, Lysandre Debut and Thomas Wolf. The same method has been applied to compress GPT2 into [DistilGPT2](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation), RoBERTa into [DistilRoBERTa](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation), Multilingual BERT into [DistilmBERT](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation) and a German version of DistilBERT.
@@ -3866,7 +3866,13 @@ const supportedDevices = [];
  /** @type {ONNXExecutionProviders[]} */
  let defaultDevices;
  let ONNX;
- if (_env_js__WEBPACK_IMPORTED_MODULE_0__.apis.IS_NODE_ENV) {
+ const ORT_SYMBOL = Symbol.for('onnxruntime');
+
+ if (ORT_SYMBOL in globalThis) {
+ // If the JS runtime exposes their own ONNX runtime, use it
+ ONNX = globalThis[ORT_SYMBOL];
+
+ } else if (_env_js__WEBPACK_IMPORTED_MODULE_0__.apis.IS_NODE_ENV) {
  ONNX = onnxruntime_node__WEBPACK_IMPORTED_MODULE_1__ ?? /*#__PURE__*/ (onnxruntime_node__WEBPACK_IMPORTED_MODULE_1___namespace_cache || (onnxruntime_node__WEBPACK_IMPORTED_MODULE_1___namespace_cache = __webpack_require__.t(onnxruntime_node__WEBPACK_IMPORTED_MODULE_1__, 2)));
 
  // Updated as of ONNX Runtime 1.18.0
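The new `Symbol.for('onnxruntime')` check means a host JS runtime can inject its own ONNX backend before the library resolves one, taking priority over both `onnxruntime-node` and `onnxruntime-web`. A minimal sketch of how a host might use this hook (the backend object here is a hypothetical stub, not a complete onnxruntime-common implementation):

```javascript
// Register a custom ONNX backend on the key the library probes.
// Symbol.for() uses the global symbol registry, so any code that calls
// Symbol.for('onnxruntime') resolves the exact same symbol.
const ORT_SYMBOL = Symbol.for('onnxruntime');

globalThis[ORT_SYMBOL] = {
  // Hypothetical stub; a real backend must expose the onnxruntime API
  // surface the library calls (InferenceSession.create, Tensor, env, ...).
  InferenceSession: {
    async create(modelBuffer, sessionOptions) {
      throw new Error('custom backend: not implemented in this sketch');
    },
  },
};

// Mirrors the library's check: `if (ORT_SYMBOL in globalThis) { ONNX = globalThis[ORT_SYMBOL]; }`
console.log(ORT_SYMBOL in globalThis); // true
```

Because the registration happens on `globalThis`, it must run before the library module is evaluated.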
@@ -4401,6 +4407,7 @@ class AutoConfig {
  * for more information.
  * @property {import('./utils/devices.js').DeviceType} [device] The default device to use for the model.
  * @property {import('./utils/dtypes.js').DataType} [dtype] The default data type to use for the model.
+ * @property {boolean|Record<string, boolean>} [use_external_data_format=false] Whether to load the model using the external data format (used for models >= 2GB in size).
  */
 
 
@@ -4449,7 +4456,7 @@ __webpack_require__.r(__webpack_exports__);
 
 
 
- const VERSION = '3.0.0-alpha.19';
+ const VERSION = '3.0.0-alpha.20';
 
  // Check if various APIs are available (depends on environment)
  const IS_BROWSER_ENV = typeof self !== 'undefined';
@@ -6484,6 +6491,8 @@ __webpack_require__.r(__webpack_exports__);
  /* harmony export */ DeiTPreTrainedModel: () => (/* binding */ DeiTPreTrainedModel),
  /* harmony export */ DepthAnythingForDepthEstimation: () => (/* binding */ DepthAnythingForDepthEstimation),
  /* harmony export */ DepthAnythingPreTrainedModel: () => (/* binding */ DepthAnythingPreTrainedModel),
+ /* harmony export */ DepthProForDepthEstimation: () => (/* binding */ DepthProForDepthEstimation),
+ /* harmony export */ DepthProPreTrainedModel: () => (/* binding */ DepthProPreTrainedModel),
  /* harmony export */ DetrForObjectDetection: () => (/* binding */ DetrForObjectDetection),
  /* harmony export */ DetrForSegmentation: () => (/* binding */ DetrForSegmentation),
  /* harmony export */ DetrModel: () => (/* binding */ DetrModel),
@@ -6942,7 +6951,7 @@ async function getSession(pretrained_model_name_or_path, fileName, options) {
  const suffix = _utils_dtypes_js__WEBPACK_IMPORTED_MODULE_2__.DEFAULT_DTYPE_SUFFIX_MAPPING[selectedDtype];
  const modelFileName = `${options.subfolder ?? ''}/${fileName}${suffix}.onnx`;
 
- const session_options = { ...options.session_options } ?? {};
+ const session_options = { ...options.session_options };
 
  // Overwrite `executionProviders` if not specified
  session_options.executionProviders ??= executionProviders;
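The removed `?? {}` was dead code: an object spread always produces an object, even when the spread operand is `null` or `undefined`, so the left-hand side of `??` can never be nullish. A quick demonstration:

```javascript
// Spreading null/undefined into an object literal is a no-op, not an error,
// and the result is always a fresh object — never null or undefined.
const fromUndefined = { ...undefined };
const fromNull = { ...null };

console.log(Object.keys(fromUndefined).length); // 0
console.log(Object.keys(fromNull).length);      // 0

// Hence `{ ...options.session_options } ?? {}` and `{ ...options.session_options }`
// behave identically; the `?? {}` branch was unreachable.
const options = {};
const session_options = { ...options.session_options };
console.log(typeof session_options); // 'object'
```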
@@ -6961,14 +6970,15 @@ async function getSession(pretrained_model_name_or_path, fileName, options) {
  const bufferPromise = (0,_utils_hub_js__WEBPACK_IMPORTED_MODULE_5__.getModelFile)(pretrained_model_name_or_path, modelFileName, true, options);
 
  // handle onnx external data files
+ const use_external_data_format = options.use_external_data_format ?? custom_config.use_external_data_format;
  /** @type {Promise<{path: string, data: Uint8Array}>[]} */
  let externalDataPromises = [];
- if (options.use_external_data_format && (
- options.use_external_data_format === true ||
+ if (use_external_data_format && (
+ use_external_data_format === true ||
  (
- typeof options.use_external_data_format === 'object' &&
- options.use_external_data_format.hasOwnProperty(fileName) &&
- options.use_external_data_format[fileName] === true
+ typeof use_external_data_format === 'object' &&
+ use_external_data_format.hasOwnProperty(fileName) &&
+ use_external_data_format[fileName] === true
  )
  )) {
  if (_env_js__WEBPACK_IMPORTED_MODULE_12__.apis.IS_NODE_ENV) {
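The effect of the new fallback line is that a per-model `use_external_data_format` value from the fetched config now applies whenever the caller does not pass one explicitly. Extracted as a standalone helper (hypothetical name, for illustration), the decision logic reads:

```javascript
// Sketch of the external-data decision above as a pure function:
// the caller's option wins; the model's own config is only a fallback.
function shouldLoadExternalData(options, customConfig, fileName) {
  const use_external_data_format =
    options.use_external_data_format ?? customConfig.use_external_data_format;
  return Boolean(
    use_external_data_format &&
    (use_external_data_format === true ||
      (typeof use_external_data_format === 'object' &&
        use_external_data_format.hasOwnProperty(fileName) &&
        use_external_data_format[fileName] === true))
  );
}

console.log(shouldLoadExternalData({}, { use_external_data_format: true }, 'model')); // true
console.log(shouldLoadExternalData(
  { use_external_data_format: false },
  { use_external_data_format: true },
  'model')); // false — explicit caller option overrides the config
console.log(shouldLoadExternalData({}, { use_external_data_format: { model: true } }, 'decoder')); // false
```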
@@ -11416,6 +11426,11 @@ class SapiensForDepthEstimation extends SapiensPreTrainedModel { }
  class SapiensForNormalEstimation extends SapiensPreTrainedModel { }
  //////////////////////////////////////////////////
 
+ //////////////////////////////////////////////////
+ class DepthProPreTrainedModel extends PreTrainedModel { }
+ class DepthProForDepthEstimation extends DepthProPreTrainedModel { }
+ //////////////////////////////////////////////////
+
  //////////////////////////////////////////////////
  class MaskFormerPreTrainedModel extends PreTrainedModel { }
  class MaskFormerModel extends MaskFormerPreTrainedModel { }
@@ -13474,6 +13489,7 @@ const MODEL_FOR_DEPTH_ESTIMATION_MAPPING_NAMES = new Map([
  ['depth_anything', ['DepthAnythingForDepthEstimation', DepthAnythingForDepthEstimation]],
  ['glpn', ['GLPNForDepthEstimation', GLPNForDepthEstimation]],
  ['sapiens', ['SapiensForDepthEstimation', SapiensForDepthEstimation]],
+ ['depth_pro', ['DepthProForDepthEstimation', DepthProForDepthEstimation]],
  ])
 
  const MODEL_FOR_NORMAL_ESTIMATION_MAPPING_NAMES = new Map([
@@ -20714,12 +20730,17 @@ function whitespace_split(text) {
 
  const PUNCTUATION_REGEX = '\\p{P}\\u0021-\\u002F\\u003A-\\u0040\\u005B-\\u0060\\u007B-\\u007E';
  const PUNCTUATION_ONLY_REGEX = new RegExp(`^[${PUNCTUATION_REGEX}]+$`, 'gu');
+ const BLOOM_SPLIT_CHARS = '.,!?\u2026\u3002\uff0c\u3001\u0964\u06d4\u060c';
 
- // A mapping of regex patterns to their equivalent (but longer) JS-compatible versions.
+ // A mapping of regex patterns to their equivalent (but possibly longer) JS-compatible versions.
  const PROBLEMATIC_REGEX_MAP = new Map([
  // This uses the case insensitive group modifier, which is not supported in JavaScript.
  // When parsing the regex, an "Invalid group" error is thrown.
  ["(?i:'s|'t|'re|'ve|'m|'ll|'d)", "(?:'([sS]|[tT]|[rR][eE]|[vV][eE]|[mM]|[lL][lL]|[dD]))"],
+
+ // Used to override the default (invalid) regex of the bloom pretokenizer.
+ // For more information, see https://github.com/xenova/transformers.js/issues/94
+ [` ?[^(\\s|[${BLOOM_SPLIT_CHARS}])]+`, ` ?[^\\s${BLOOM_SPLIT_CHARS}]+`],
  ])
 
 
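The corrected BLOOM pattern registered in the map above splits on whitespace plus the listed Latin, CJK, Devanagari, and Arabic punctuation, instead of treating `(`, `|`, and `[` as literal class members as the broken original did. A small standalone demo of the replacement pattern:

```javascript
// The same split characters the library defines for the BLOOM pretokenizer.
const BLOOM_SPLIT_CHARS = '.,!?\u2026\u3002\uff0c\u3001\u0964\u06d4\u060c';

// The corrected, JS-compatible pattern from the mapping:
// an optional leading space, then one or more characters that are neither
// whitespace nor one of the split characters.
const fixed = new RegExp(` ?[^\\s${BLOOM_SPLIT_CHARS}]+`, 'gu');

console.log('Hello, world\u3002'.match(fixed)); // [ 'Hello', ' world' ]
```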
@@ -20797,14 +20818,21 @@ class TokenizerModel extends _utils_generic_js__WEBPACK_IMPORTED_MODULE_0__.Call
  case 'Unigram':
  // @ts-ignore
  return new Unigram(config, ...args);
-
  case 'BPE':
  return new BPE(config);
 
  default:
+ // Some tokenizers, like for google-t5/t5-small, do not have a `type` field.
+ // In this case, we can infer the tokenizer type based on the structure of the `vocab` field.
  if (config.vocab) {
- // @ts-ignore
- return new LegacyTokenizerModel(config, ...args);
+ if (Array.isArray(config.vocab)) {
+ // config.vocab is of type `[string, number][]`
+ // @ts-ignore
+ return new Unigram(config, ...args);
+ } else {
+ // @ts-ignore
+ return new LegacyTokenizerModel(config, ...args);
+ }
  }
  throw new Error(`Unknown TokenizerModel type: ${config.type}`);
  }
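The inference rule added above can be summarized as a tiny helper (hypothetical name, for illustration): when a tokenizer config lacks a `type`, the shape of `vocab` disambiguates it — Unigram stores `[string, number][]` score pairs, while legacy WordPiece-style tokenizers store a `{ token: id }` record.

```javascript
// Sketch of the vocab-shape dispatch used by the default case above.
function inferTokenizerType(config) {
  if (config.type) return config.type;
  if (config.vocab) {
    // Array of [token, score] pairs → Unigram; plain object → legacy model.
    return Array.isArray(config.vocab) ? 'Unigram' : 'LegacyTokenizerModel';
  }
  throw new Error(`Unknown TokenizerModel type: ${config.type}`);
}

console.log(inferTokenizerType({ vocab: [['\u2581hello', -3.2], ['\u2581world', -4.1]] })); // Unigram
console.log(inferTokenizerType({ vocab: { hello: 0, world: 1 } })); // LegacyTokenizerModel
```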
@@ -23767,19 +23795,7 @@ class MBart50Tokenizer extends MBartTokenizer { } // NOTE: extends MBartTokenize
 
  class RobertaTokenizer extends PreTrainedTokenizer { }
 
- class BloomTokenizer extends PreTrainedTokenizer {
-
- constructor(tokenizerJSON, tokenizerConfig) {
- // Override the default (invalid) regex of the pretokenizer.
- // For more information, see https://github.com/xenova/transformers.js/issues/94
- const splitChars = '.,!?\u2026\u3002\uff0c\u3001\u0964\u06d4\u060c';
- const patternObject = tokenizerJSON.pre_tokenizer?.pretokenizers[0]?.pattern;
- if (patternObject && patternObject.Regex === ` ?[^(\\s|[${splitChars}])]+`) {
- patternObject.Regex = ` ?[^\\s${splitChars}]+`;
- }
- super(tokenizerJSON, tokenizerConfig);
- }
- }
+ class BloomTokenizer extends PreTrainedTokenizer { }
 
  const SPIECE_UNDERLINE = "▁";
 
@@ -24616,85 +24632,6 @@ class WhisperTokenizer extends PreTrainedTokenizer {
  newIndices.filter(x => x.length > 0),
  ]
  }
-
- /**
- * Helper function to build translation inputs for a `WhisperTokenizer`,
- * depending on the language, task, and whether to predict timestamp tokens.
- *
- * Used to override the prefix tokens appended to the start of the label sequence.
- *
- * **Example: Get ids for a language**
- * ```javascript
- * // instantiate the tokenizer and set the prefix token to Spanish
- * const tokenizer = await WhisperTokenizer.from_pretrained('Xenova/whisper-tiny');
- * const forced_decoder_ids = tokenizer.get_decoder_prompt_ids({ language: 'spanish' });
- * // [(1, 50262), (2, 50363)]
- * ```
- *
- * @param {Object} options Options to generate the decoder prompt.
- * @param {string} [options.language] The language of the transcription text.
- * The corresponding language id token is appended to the start of the sequence for multilingual
- * speech recognition and speech translation tasks, e.g. for "Spanish" the token "<|es|>" is appended
- * to the start of sequence.
- * @param {string} [options.task] Task identifier to append at the start of sequence (if any).
- * This should be used for mulitlingual fine-tuning, with "transcribe" for speech recognition and
- * "translate" for speech translation.
- * @param {boolean} [options.no_timestamps] Whether to add the <|notimestamps|> token at the start of the sequence.
- * @returns {number[][]} The decoder prompt ids.
- */
- get_decoder_prompt_ids({
- language = null,
- task = null,
- no_timestamps = true,
- } = {}) {
-
- // <|lang_id|> <|task|> <|notimestamps|>
-
- const forced_decoder_ids = [];
-
- if (language) {
- // User wishes to specify the language
- const language_code = (0,_models_whisper_common_whisper_js__WEBPACK_IMPORTED_MODULE_7__.whisper_language_to_code)(language);
- const language_token_id = this.model.tokens_to_ids.get(`<|${language_code}|>`);
- if (language_token_id === undefined) {
- throw new Error(`Unable to find language "${language_code}" in model vocabulary. Please report this issue at ${_utils_constants_js__WEBPACK_IMPORTED_MODULE_8__.GITHUB_ISSUE_URL}.`)
- }
-
- forced_decoder_ids.push(language_token_id);
- } else {
- // No token will be forced, which leaves the model to predict the language
- forced_decoder_ids.push(null);
- }
-
- if (task) {
- task = task.toLowerCase();
- if (task !== 'transcribe' && task !== 'translate') {
- throw new Error(`Task "${task}" is not supported. Must be one of: ["transcribe", "translate"]`);
- }
-
- const task_token_id = this.model.tokens_to_ids.get(`<|${task}|>`);
- if (task_token_id === undefined) {
- throw new Error(`Unable to find task "${task}" in model vocabulary. Please report this issue at ${_utils_constants_js__WEBPACK_IMPORTED_MODULE_8__.GITHUB_ISSUE_URL}.`)
- }
-
- forced_decoder_ids.push(task_token_id);
- } else {
- // No token will be forced, which leaves the model to predict the task
- forced_decoder_ids.push(null);
- }
-
- if (no_timestamps) {
- const no_timestamps_id = this.model.tokens_to_ids.get(`<|notimestamps|>`);
- if (no_timestamps_id === undefined) {
- throw new Error(`Unable to find "<|notimestamps|>" in model vocabulary. Please report this issue at ${_utils_constants_js__WEBPACK_IMPORTED_MODULE_8__.GITHUB_ISSUE_URL}.`);
- }
-
- forced_decoder_ids.push(no_timestamps_id);
- }
-
- return forced_decoder_ids.map((x, i) => [i + 1, x]).filter(x => x[1] !== null);
-
- }
  }
  class CodeGenTokenizer extends PreTrainedTokenizer { }
  class CLIPTokenizer extends PreTrainedTokenizer { }
@@ -30665,6 +30602,8 @@ __webpack_require__.r(__webpack_exports__);
  /* harmony export */ DepthAnythingForDepthEstimation: () => (/* reexport safe */ _models_js__WEBPACK_IMPORTED_MODULE_2__.DepthAnythingForDepthEstimation),
  /* harmony export */ DepthAnythingPreTrainedModel: () => (/* reexport safe */ _models_js__WEBPACK_IMPORTED_MODULE_2__.DepthAnythingPreTrainedModel),
  /* harmony export */ DepthEstimationPipeline: () => (/* reexport safe */ _pipelines_js__WEBPACK_IMPORTED_MODULE_1__.DepthEstimationPipeline),
+ /* harmony export */ DepthProForDepthEstimation: () => (/* reexport safe */ _models_js__WEBPACK_IMPORTED_MODULE_2__.DepthProForDepthEstimation),
+ /* harmony export */ DepthProPreTrainedModel: () => (/* reexport safe */ _models_js__WEBPACK_IMPORTED_MODULE_2__.DepthProPreTrainedModel),
  /* harmony export */ DetrFeatureExtractor: () => (/* reexport safe */ _processors_js__WEBPACK_IMPORTED_MODULE_4__.DetrFeatureExtractor),
  /* harmony export */ DetrForObjectDetection: () => (/* reexport safe */ _models_js__WEBPACK_IMPORTED_MODULE_2__.DetrForObjectDetection),
  /* harmony export */ DetrForSegmentation: () => (/* reexport safe */ _models_js__WEBPACK_IMPORTED_MODULE_2__.DetrForSegmentation),