@huggingface/transformers 3.0.0-alpha.19 → 3.0.0-alpha.20

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -101,7 +101,7 @@ npm i @huggingface/transformers
  Alternatively, you can use it in vanilla JS, without any bundler, by using a CDN or static hosting. For example, using [ES Modules](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules), you can import the library with:
  ```html
  <script type="module">
- import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0-alpha.19';
+ import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0-alpha.20';
  </script>
  ```
 
@@ -134,7 +134,7 @@ Check out the Transformers.js [template](https://huggingface.co/new-space?templa
 
 
 
- By default, Transformers.js uses [hosted pretrained models](https://huggingface.co/models?library=transformers.js) and [precompiled WASM binaries](https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0-alpha.19/dist/), which should work out-of-the-box. You can customize this as follows:
+ By default, Transformers.js uses [hosted pretrained models](https://huggingface.co/models?library=transformers.js) and [precompiled WASM binaries](https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0-alpha.20/dist/), which should work out-of-the-box. You can customize this as follows:
 
  ### Settings
 
@@ -289,6 +289,7 @@ You can refine your search by selecting the task you're interested in (e.g., [te
  1. **[Decision Transformer](https://huggingface.co/docs/transformers/model_doc/decision_transformer)** (from Berkeley/Facebook/Google) released with the paper [Decision Transformer: Reinforcement Learning via Sequence Modeling](https://arxiv.org/abs/2106.01345) by Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch.
  1. **[DeiT](https://huggingface.co/docs/transformers/model_doc/deit)** (from Facebook) released with the paper [Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877) by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
  1. **[Depth Anything](https://huggingface.co/docs/transformers/main/model_doc/depth_anything)** (from University of Hong Kong and TikTok) released with the paper [Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data](https://arxiv.org/abs/2401.10891) by Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao.
+ 1. **Depth Pro** (from Apple) released with the paper [Depth Pro: Sharp Monocular Metric Depth in Less Than a Second](https://arxiv.org/abs/2410.02073) by Aleksei Bochkovskii, Amaël Delaunoy, Hugo Germain, Marcel Santos, Yichao Zhou, Stephan R. Richter, Vladlen Koltun.
  1. **[DETR](https://huggingface.co/docs/transformers/model_doc/detr)** (from Facebook) released with the paper [End-to-End Object Detection with Transformers](https://arxiv.org/abs/2005.12872) by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko.
  1. **[DINOv2](https://huggingface.co/docs/transformers/model_doc/dinov2)** (from Meta AI) released with the paper [DINOv2: Learning Robust Visual Features without Supervision](https://arxiv.org/abs/2304.07193) by Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Labatut, Armand Joulin, Piotr Bojanowski.
  1. **[DistilBERT](https://huggingface.co/docs/transformers/model_doc/distilbert)** (from HuggingFace), released together with the paper [DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/abs/1910.01108) by Victor Sanh, Lysandre Debut and Thomas Wolf. The same method has been applied to compress GPT2 into [DistilGPT2](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation), RoBERTa into [DistilRoBERTa](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation), Multilingual BERT into [DistilmBERT](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation) and a German version of DistilBERT.
@@ -3866,7 +3866,13 @@ const supportedDevices = [];
  /** @type {ONNXExecutionProviders[]} */
  let defaultDevices;
  let ONNX;
- if (_env_js__WEBPACK_IMPORTED_MODULE_0__.apis.IS_NODE_ENV) {
+ const ORT_SYMBOL = Symbol.for('onnxruntime');
+
+ if (ORT_SYMBOL in globalThis) {
+ // If the JS runtime exposes their own ONNX runtime, use it
+ ONNX = globalThis[ORT_SYMBOL];
+
+ } else if (_env_js__WEBPACK_IMPORTED_MODULE_0__.apis.IS_NODE_ENV) {
  ONNX = onnxruntime_node__WEBPACK_IMPORTED_MODULE_1__ ?? /*#__PURE__*/ (onnxruntime_node__WEBPACK_IMPORTED_MODULE_1___namespace_cache || (onnxruntime_node__WEBPACK_IMPORTED_MODULE_1___namespace_cache = __webpack_require__.t(onnxruntime_node__WEBPACK_IMPORTED_MODULE_1__, 2)));
 
  // Updated as of ONNX Runtime 1.18.0
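The new `Symbol.for('onnxruntime')` check means a host JS runtime can inject its own ONNX backend before the library resolves one, taking priority over both `onnxruntime-node` and `onnxruntime-web`. A minimal sketch of how a host might use this hook (the backend object here is a hypothetical stub, not a complete onnxruntime-common implementation):

```javascript
// Register a custom ONNX backend on the key the library probes.
// Symbol.for() uses the global symbol registry, so any code that calls
// Symbol.for('onnxruntime') resolves the exact same symbol.
const ORT_SYMBOL = Symbol.for('onnxruntime');

globalThis[ORT_SYMBOL] = {
  // Hypothetical stub; a real backend must expose the onnxruntime API
  // surface the library calls (InferenceSession.create, Tensor, env, ...).
  InferenceSession: {
    async create(modelBuffer, sessionOptions) {
      throw new Error('custom backend: not implemented in this sketch');
    },
  },
};

// Mirrors the library's check: `if (ORT_SYMBOL in globalThis) { ONNX = globalThis[ORT_SYMBOL]; }`
console.log(ORT_SYMBOL in globalThis); // true
```

Because the registration happens on `globalThis`, it must run before the library module is evaluated.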
@@ -4401,6 +4407,7 @@ class AutoConfig {
  * for more information.
  * @property {import('./utils/devices.js').DeviceType} [device] The default device to use for the model.
  * @property {import('./utils/dtypes.js').DataType} [dtype] The default data type to use for the model.
+ * @property {boolean|Record<string, boolean>} [use_external_data_format=false] Whether to load the model using the external data format (used for models >= 2GB in size).
  */
 
 
@@ -4449,7 +4456,7 @@ __webpack_require__.r(__webpack_exports__);
 
 
 
- const VERSION = '3.0.0-alpha.19';
+ const VERSION = '3.0.0-alpha.20';
 
  // Check if various APIs are available (depends on environment)
  const IS_BROWSER_ENV = typeof self !== 'undefined';
@@ -6484,6 +6491,8 @@ __webpack_require__.r(__webpack_exports__);
  /* harmony export */ DeiTPreTrainedModel: () => (/* binding */ DeiTPreTrainedModel),
  /* harmony export */ DepthAnythingForDepthEstimation: () => (/* binding */ DepthAnythingForDepthEstimation),
  /* harmony export */ DepthAnythingPreTrainedModel: () => (/* binding */ DepthAnythingPreTrainedModel),
+ /* harmony export */ DepthProForDepthEstimation: () => (/* binding */ DepthProForDepthEstimation),
+ /* harmony export */ DepthProPreTrainedModel: () => (/* binding */ DepthProPreTrainedModel),
  /* harmony export */ DetrForObjectDetection: () => (/* binding */ DetrForObjectDetection),
  /* harmony export */ DetrForSegmentation: () => (/* binding */ DetrForSegmentation),
  /* harmony export */ DetrModel: () => (/* binding */ DetrModel),
@@ -6942,7 +6951,7 @@ async function getSession(pretrained_model_name_or_path, fileName, options) {
  const suffix = _utils_dtypes_js__WEBPACK_IMPORTED_MODULE_2__.DEFAULT_DTYPE_SUFFIX_MAPPING[selectedDtype];
  const modelFileName = `${options.subfolder ?? ''}/${fileName}${suffix}.onnx`;
 
- const session_options = { ...options.session_options } ?? {};
+ const session_options = { ...options.session_options };
 
  // Overwrite `executionProviders` if not specified
  session_options.executionProviders ??= executionProviders;
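The removed `?? {}` was dead code: an object spread always produces an object, even when the spread operand is `null` or `undefined`, so the left-hand side of `??` can never be nullish. A quick demonstration:

```javascript
// Spreading null/undefined into an object literal is a no-op, not an error,
// and the result is always a fresh object — never null or undefined.
const fromUndefined = { ...undefined };
const fromNull = { ...null };

console.log(Object.keys(fromUndefined).length); // 0
console.log(Object.keys(fromNull).length);      // 0

// Hence `{ ...options.session_options } ?? {}` and `{ ...options.session_options }`
// behave identically; the `?? {}` branch was unreachable.
const options = {};
const session_options = { ...options.session_options };
console.log(typeof session_options); // 'object'
```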
@@ -6961,14 +6970,15 @@ async function getSession(pretrained_model_name_or_path, fileName, options) {
  const bufferPromise = (0,_utils_hub_js__WEBPACK_IMPORTED_MODULE_5__.getModelFile)(pretrained_model_name_or_path, modelFileName, true, options);
 
  // handle onnx external data files
+ const use_external_data_format = options.use_external_data_format ?? custom_config.use_external_data_format;
  /** @type {Promise<{path: string, data: Uint8Array}>[]} */
  let externalDataPromises = [];
- if (options.use_external_data_format && (
- options.use_external_data_format === true ||
+ if (use_external_data_format && (
+ use_external_data_format === true ||
  (
- typeof options.use_external_data_format === 'object' &&
- options.use_external_data_format.hasOwnProperty(fileName) &&
- options.use_external_data_format[fileName] === true
+ typeof use_external_data_format === 'object' &&
+ use_external_data_format.hasOwnProperty(fileName) &&
+ use_external_data_format[fileName] === true
  )
  )) {
  if (_env_js__WEBPACK_IMPORTED_MODULE_12__.apis.IS_NODE_ENV) {
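The effect of the new fallback line is that a per-model `use_external_data_format` value from the fetched config now applies whenever the caller does not pass one explicitly. Extracted as a standalone helper (hypothetical name, for illustration), the decision logic reads:

```javascript
// Sketch of the external-data decision above as a pure function:
// the caller's option wins; the model's own config is only a fallback.
function shouldLoadExternalData(options, customConfig, fileName) {
  const use_external_data_format =
    options.use_external_data_format ?? customConfig.use_external_data_format;
  return Boolean(
    use_external_data_format &&
    (use_external_data_format === true ||
      (typeof use_external_data_format === 'object' &&
        use_external_data_format.hasOwnProperty(fileName) &&
        use_external_data_format[fileName] === true))
  );
}

console.log(shouldLoadExternalData({}, { use_external_data_format: true }, 'model')); // true
console.log(shouldLoadExternalData(
  { use_external_data_format: false },
  { use_external_data_format: true },
  'model')); // false — explicit caller option overrides the config
console.log(shouldLoadExternalData({}, { use_external_data_format: { model: true } }, 'decoder')); // false
```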
@@ -11416,6 +11426,11 @@ class SapiensForDepthEstimation extends SapiensPreTrainedModel { }
  class SapiensForNormalEstimation extends SapiensPreTrainedModel { }
  //////////////////////////////////////////////////
 
+ //////////////////////////////////////////////////
+ class DepthProPreTrainedModel extends PreTrainedModel { }
+ class DepthProForDepthEstimation extends DepthProPreTrainedModel { }
+ //////////////////////////////////////////////////
+
  //////////////////////////////////////////////////
  class MaskFormerPreTrainedModel extends PreTrainedModel { }
  class MaskFormerModel extends MaskFormerPreTrainedModel { }
@@ -13474,6 +13489,7 @@ const MODEL_FOR_DEPTH_ESTIMATION_MAPPING_NAMES = new Map([
  ['depth_anything', ['DepthAnythingForDepthEstimation', DepthAnythingForDepthEstimation]],
  ['glpn', ['GLPNForDepthEstimation', GLPNForDepthEstimation]],
  ['sapiens', ['SapiensForDepthEstimation', SapiensForDepthEstimation]],
+ ['depth_pro', ['DepthProForDepthEstimation', DepthProForDepthEstimation]],
  ])
 
  const MODEL_FOR_NORMAL_ESTIMATION_MAPPING_NAMES = new Map([
@@ -20714,12 +20730,17 @@ function whitespace_split(text) {
 
  const PUNCTUATION_REGEX = '\\p{P}\\u0021-\\u002F\\u003A-\\u0040\\u005B-\\u0060\\u007B-\\u007E';
  const PUNCTUATION_ONLY_REGEX = new RegExp(`^[${PUNCTUATION_REGEX}]+$`, 'gu');
+ const BLOOM_SPLIT_CHARS = '.,!?\u2026\u3002\uff0c\u3001\u0964\u06d4\u060c';
 
- // A mapping of regex patterns to their equivalent (but longer) JS-compatible versions.
+ // A mapping of regex patterns to their equivalent (but possibly longer) JS-compatible versions.
  const PROBLEMATIC_REGEX_MAP = new Map([
  // This uses the case insensitive group modifier, which is not supported in JavaScript.
  // When parsing the regex, an "Invalid group" error is thrown.
  ["(?i:'s|'t|'re|'ve|'m|'ll|'d)", "(?:'([sS]|[tT]|[rR][eE]|[vV][eE]|[mM]|[lL][lL]|[dD]))"],
+
+ // Used to override the default (invalid) regex of the bloom pretokenizer.
+ // For more information, see https://github.com/xenova/transformers.js/issues/94
+ [` ?[^(\\s|[${BLOOM_SPLIT_CHARS}])]+`, ` ?[^\\s${BLOOM_SPLIT_CHARS}]+`],
  ])
 
 
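The corrected BLOOM pattern registered in the map above splits on whitespace plus the listed Latin, CJK, Devanagari, and Arabic punctuation, instead of treating `(`, `|`, and `[` as literal class members as the broken original did. A small standalone demo of the replacement pattern:

```javascript
// The same split characters the library defines for the BLOOM pretokenizer.
const BLOOM_SPLIT_CHARS = '.,!?\u2026\u3002\uff0c\u3001\u0964\u06d4\u060c';

// The corrected, JS-compatible pattern from the mapping:
// an optional leading space, then one or more characters that are neither
// whitespace nor one of the split characters.
const fixed = new RegExp(` ?[^\\s${BLOOM_SPLIT_CHARS}]+`, 'gu');

console.log('Hello, world\u3002'.match(fixed)); // [ 'Hello', ' world' ]
```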
@@ -20797,14 +20818,21 @@ class TokenizerModel extends _utils_generic_js__WEBPACK_IMPORTED_MODULE_0__.Call
  case 'Unigram':
  // @ts-ignore
  return new Unigram(config, ...args);
-
  case 'BPE':
  return new BPE(config);
 
  default:
+ // Some tokenizers, like for google-t5/t5-small, do not have a `type` field.
+ // In this case, we can infer the tokenizer type based on the structure of the `vocab` field.
  if (config.vocab) {
- // @ts-ignore
- return new LegacyTokenizerModel(config, ...args);
+ if (Array.isArray(config.vocab)) {
+ // config.vocab is of type `[string, number][]`
+ // @ts-ignore
+ return new Unigram(config, ...args);
+ } else {
+ // @ts-ignore
+ return new LegacyTokenizerModel(config, ...args);
+ }
  }
  throw new Error(`Unknown TokenizerModel type: ${config.type}`);
  }
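The inference rule added above can be summarized as a tiny helper (hypothetical name, for illustration): when a tokenizer config lacks a `type`, the shape of `vocab` disambiguates it — Unigram stores `[string, number][]` score pairs, while legacy WordPiece-style tokenizers store a `{ token: id }` record.

```javascript
// Sketch of the vocab-shape dispatch used by the default case above.
function inferTokenizerType(config) {
  if (config.type) return config.type;
  if (config.vocab) {
    // Array of [token, score] pairs → Unigram; plain object → legacy model.
    return Array.isArray(config.vocab) ? 'Unigram' : 'LegacyTokenizerModel';
  }
  throw new Error(`Unknown TokenizerModel type: ${config.type}`);
}

console.log(inferTokenizerType({ vocab: [['\u2581hello', -3.2], ['\u2581world', -4.1]] })); // Unigram
console.log(inferTokenizerType({ vocab: { hello: 0, world: 1 } })); // LegacyTokenizerModel
```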
@@ -23767,19 +23795,7 @@ class MBart50Tokenizer extends MBartTokenizer { } // NOTE: extends MBartTokenize
 
  class RobertaTokenizer extends PreTrainedTokenizer { }
 
- class BloomTokenizer extends PreTrainedTokenizer {
-
- constructor(tokenizerJSON, tokenizerConfig) {
- // Override the default (invalid) regex of the pretokenizer.
- // For more information, see https://github.com/xenova/transformers.js/issues/94
- const splitChars = '.,!?\u2026\u3002\uff0c\u3001\u0964\u06d4\u060c';
- const patternObject = tokenizerJSON.pre_tokenizer?.pretokenizers[0]?.pattern;
- if (patternObject && patternObject.Regex === ` ?[^(\\s|[${splitChars}])]+`) {
- patternObject.Regex = ` ?[^\\s${splitChars}]+`;
- }
- super(tokenizerJSON, tokenizerConfig);
- }
- }
+ class BloomTokenizer extends PreTrainedTokenizer { }
 
  const SPIECE_UNDERLINE = "▁";
 
@@ -24616,85 +24632,6 @@ class WhisperTokenizer extends PreTrainedTokenizer {
  newIndices.filter(x => x.length > 0),
  ]
  }
-
- /**
- * Helper function to build translation inputs for a `WhisperTokenizer`,
- * depending on the language, task, and whether to predict timestamp tokens.
- *
- * Used to override the prefix tokens appended to the start of the label sequence.
- *
- * **Example: Get ids for a language**
- * ```javascript
- * // instantiate the tokenizer and set the prefix token to Spanish
- * const tokenizer = await WhisperTokenizer.from_pretrained('Xenova/whisper-tiny');
- * const forced_decoder_ids = tokenizer.get_decoder_prompt_ids({ language: 'spanish' });
- * // [(1, 50262), (2, 50363)]
- * ```
- *
- * @param {Object} options Options to generate the decoder prompt.
- * @param {string} [options.language] The language of the transcription text.
- * The corresponding language id token is appended to the start of the sequence for multilingual
- * speech recognition and speech translation tasks, e.g. for "Spanish" the token "<|es|>" is appended
- * to the start of sequence.
- * @param {string} [options.task] Task identifier to append at the start of sequence (if any).
- * This should be used for mulitlingual fine-tuning, with "transcribe" for speech recognition and
- * "translate" for speech translation.
- * @param {boolean} [options.no_timestamps] Whether to add the <|notimestamps|> token at the start of the sequence.
- * @returns {number[][]} The decoder prompt ids.
- */
- get_decoder_prompt_ids({
- language = null,
- task = null,
- no_timestamps = true,
- } = {}) {
-
- // <|lang_id|> <|task|> <|notimestamps|>
-
- const forced_decoder_ids = [];
-
- if (language) {
- // User wishes to specify the language
- const language_code = (0,_models_whisper_common_whisper_js__WEBPACK_IMPORTED_MODULE_7__.whisper_language_to_code)(language);
- const language_token_id = this.model.tokens_to_ids.get(`<|${language_code}|>`);
- if (language_token_id === undefined) {
- throw new Error(`Unable to find language "${language_code}" in model vocabulary. Please report this issue at ${_utils_constants_js__WEBPACK_IMPORTED_MODULE_8__.GITHUB_ISSUE_URL}.`)
- }
-
- forced_decoder_ids.push(language_token_id);
- } else {
- // No token will be forced, which leaves the model to predict the language
- forced_decoder_ids.push(null);
- }
-
- if (task) {
- task = task.toLowerCase();
- if (task !== 'transcribe' && task !== 'translate') {
- throw new Error(`Task "${task}" is not supported. Must be one of: ["transcribe", "translate"]`);
- }
-
- const task_token_id = this.model.tokens_to_ids.get(`<|${task}|>`);
- if (task_token_id === undefined) {
- throw new Error(`Unable to find task "${task}" in model vocabulary. Please report this issue at ${_utils_constants_js__WEBPACK_IMPORTED_MODULE_8__.GITHUB_ISSUE_URL}.`)
- }
-
- forced_decoder_ids.push(task_token_id);
- } else {
- // No token will be forced, which leaves the model to predict the task
- forced_decoder_ids.push(null);
- }
-
- if (no_timestamps) {
- const no_timestamps_id = this.model.tokens_to_ids.get(`<|notimestamps|>`);
- if (no_timestamps_id === undefined) {
- throw new Error(`Unable to find "<|notimestamps|>" in model vocabulary. Please report this issue at ${_utils_constants_js__WEBPACK_IMPORTED_MODULE_8__.GITHUB_ISSUE_URL}.`);
- }
-
- forced_decoder_ids.push(no_timestamps_id);
- }
-
- return forced_decoder_ids.map((x, i) => [i + 1, x]).filter(x => x[1] !== null);
-
- }
  }
  class CodeGenTokenizer extends PreTrainedTokenizer { }
  class CLIPTokenizer extends PreTrainedTokenizer { }
@@ -30665,6 +30602,8 @@ __webpack_require__.r(__webpack_exports__);
  /* harmony export */ DepthAnythingForDepthEstimation: () => (/* reexport safe */ _models_js__WEBPACK_IMPORTED_MODULE_2__.DepthAnythingForDepthEstimation),
  /* harmony export */ DepthAnythingPreTrainedModel: () => (/* reexport safe */ _models_js__WEBPACK_IMPORTED_MODULE_2__.DepthAnythingPreTrainedModel),
  /* harmony export */ DepthEstimationPipeline: () => (/* reexport safe */ _pipelines_js__WEBPACK_IMPORTED_MODULE_1__.DepthEstimationPipeline),
+ /* harmony export */ DepthProForDepthEstimation: () => (/* reexport safe */ _models_js__WEBPACK_IMPORTED_MODULE_2__.DepthProForDepthEstimation),
+ /* harmony export */ DepthProPreTrainedModel: () => (/* reexport safe */ _models_js__WEBPACK_IMPORTED_MODULE_2__.DepthProPreTrainedModel),
  /* harmony export */ DetrFeatureExtractor: () => (/* reexport safe */ _processors_js__WEBPACK_IMPORTED_MODULE_4__.DetrFeatureExtractor),
  /* harmony export */ DetrForObjectDetection: () => (/* reexport safe */ _models_js__WEBPACK_IMPORTED_MODULE_2__.DetrForObjectDetection),
  /* harmony export */ DetrForSegmentation: () => (/* reexport safe */ _models_js__WEBPACK_IMPORTED_MODULE_2__.DetrForSegmentation),