EuroEval 15.5.0.tar.gz → 15.6.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (232)
  1. {euroeval-15.5.0 → euroeval-15.6.0}/.pre-commit-config.yaml +1 -1
  2. {euroeval-15.5.0 → euroeval-15.6.0}/CHANGELOG.md +35 -0
  3. {euroeval-15.5.0 → euroeval-15.6.0}/PKG-INFO +30 -9
  4. {euroeval-15.5.0 → euroeval-15.6.0}/README.md +24 -3
  5. {euroeval-15.5.0 → euroeval-15.6.0}/makefile +3 -4
  6. {euroeval-15.5.0 → euroeval-15.6.0}/pyproject.toml +16 -5
  7. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/benchmark_modules/base.py +3 -2
  8. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/benchmark_modules/fresh.py +8 -6
  9. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/benchmark_modules/hf.py +33 -31
  10. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/benchmark_modules/litellm.py +120 -56
  11. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/benchmark_modules/vllm.py +41 -26
  12. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/benchmarker.py +23 -21
  13. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/callbacks.py +2 -2
  14. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/constants.py +1 -1
  15. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/data_models.py +257 -42
  16. euroeval-15.6.0/src/euroeval/dataset_configs/__init__.py +61 -0
  17. euroeval-15.6.0/src/euroeval/dataset_configs/danish.py +120 -0
  18. euroeval-15.6.0/src/euroeval/dataset_configs/dutch.py +123 -0
  19. euroeval-15.6.0/src/euroeval/dataset_configs/english.py +88 -0
  20. euroeval-15.6.0/src/euroeval/dataset_configs/faroese.py +53 -0
  21. euroeval-15.6.0/src/euroeval/dataset_configs/french.py +83 -0
  22. euroeval-15.6.0/src/euroeval/dataset_configs/german.py +91 -0
  23. euroeval-15.6.0/src/euroeval/dataset_configs/icelandic.py +148 -0
  24. euroeval-15.6.0/src/euroeval/dataset_configs/italian.py +81 -0
  25. euroeval-15.6.0/src/euroeval/dataset_configs/norwegian.py +178 -0
  26. euroeval-15.6.0/src/euroeval/dataset_configs/spanish.py +78 -0
  27. euroeval-15.6.0/src/euroeval/dataset_configs/swedish.py +100 -0
  28. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/exceptions.py +10 -10
  29. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/finetuning.py +6 -10
  30. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/generation.py +1 -0
  31. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/human_evaluation.py +2 -2
  32. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/languages.py +20 -13
  33. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/model_cache.py +1 -1
  34. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/model_loading.py +1 -12
  35. euroeval-15.6.0/src/euroeval/prompt_templates/__init__.py +8 -0
  36. euroeval-15.6.0/src/euroeval/prompt_templates/linguistic_acceptability.py +112 -0
  37. euroeval-15.6.0/src/euroeval/prompt_templates/multiple_choice.py +97 -0
  38. euroeval-15.6.0/src/euroeval/prompt_templates/named_entity_recognition.py +257 -0
  39. euroeval-15.6.0/src/euroeval/prompt_templates/reading_comprehension.py +118 -0
  40. euroeval-15.6.0/src/euroeval/prompt_templates/sentiment_classification.py +137 -0
  41. euroeval-15.6.0/src/euroeval/prompt_templates/summarization.py +97 -0
  42. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/speed_benchmark.py +1 -1
  43. {euroeval-15.5.0/src/euroeval/task_utils → euroeval-15.6.0/src/euroeval/task_group_utils}/multiple_choice_classification.py +19 -11
  44. {euroeval-15.5.0/src/euroeval/task_utils → euroeval-15.6.0/src/euroeval/task_group_utils}/question_answering.py +31 -30
  45. {euroeval-15.5.0/src/euroeval/task_utils → euroeval-15.6.0/src/euroeval/task_group_utils}/sequence_classification.py +1 -1
  46. {euroeval-15.5.0/src/euroeval/task_utils → euroeval-15.6.0/src/euroeval/task_group_utils}/text_to_text.py +1 -1
  47. {euroeval-15.5.0/src/euroeval/task_utils → euroeval-15.6.0/src/euroeval/task_group_utils}/token_classification.py +3 -2
  48. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/tasks.py +54 -0
  49. euroeval-15.5.0/src/euroeval/utils.py → euroeval-15.6.0/src/euroeval/tokenization_utils.py +8 -339
  50. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/types.py +3 -1
  51. euroeval-15.6.0/src/euroeval/utils.py +329 -0
  52. {euroeval-15.5.0 → euroeval-15.6.0}/tests/conftest.py +4 -4
  53. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_benchmarker.py +13 -33
  54. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_callbacks.py +2 -1
  55. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_data_loading.py +2 -2
  56. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_finetuning.py +2 -1
  57. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_model_loading.py +4 -4
  58. euroeval-15.5.0/tests/test_utils.py → euroeval-15.6.0/tests/test_tokenization_utils.py +3 -68
  59. euroeval-15.6.0/tests/test_utils.py +67 -0
  60. {euroeval-15.5.0 → euroeval-15.6.0}/uv.lock +302 -275
  61. euroeval-15.5.0/src/euroeval/dataset_configs.py +0 -2408
  62. {euroeval-15.5.0 → euroeval-15.6.0}/.github/ISSUE_TEMPLATE/benchmark_dataset_request.yaml +0 -0
  63. {euroeval-15.5.0 → euroeval-15.6.0}/.github/ISSUE_TEMPLATE/bug.yaml +0 -0
  64. {euroeval-15.5.0 → euroeval-15.6.0}/.github/ISSUE_TEMPLATE/feature_request.yaml +0 -0
  65. {euroeval-15.5.0 → euroeval-15.6.0}/.github/ISSUE_TEMPLATE/model_evaluation_request.yaml +0 -0
  66. {euroeval-15.5.0 → euroeval-15.6.0}/.github/workflows/ci.yaml +0 -0
  67. {euroeval-15.5.0 → euroeval-15.6.0}/.gitignore +0 -0
  68. {euroeval-15.5.0 → euroeval-15.6.0}/CITATION.cff +0 -0
  69. {euroeval-15.5.0 → euroeval-15.6.0}/CODE_OF_CONDUCT.md +0 -0
  70. {euroeval-15.5.0 → euroeval-15.6.0}/CONTRIBUTING.md +0 -0
  71. {euroeval-15.5.0 → euroeval-15.6.0}/Dockerfile.cuda +0 -0
  72. {euroeval-15.5.0 → euroeval-15.6.0}/LICENSE +0 -0
  73. {euroeval-15.5.0 → euroeval-15.6.0}/docs/CNAME +0 -0
  74. {euroeval-15.5.0 → euroeval-15.6.0}/docs/README.md +0 -0
  75. {euroeval-15.5.0 → euroeval-15.6.0}/docs/datasets/README.md +0 -0
  76. {euroeval-15.5.0 → euroeval-15.6.0}/docs/datasets/danish.md +0 -0
  77. {euroeval-15.5.0 → euroeval-15.6.0}/docs/datasets/dutch.md +0 -0
  78. {euroeval-15.5.0 → euroeval-15.6.0}/docs/datasets/english.md +0 -0
  79. {euroeval-15.5.0 → euroeval-15.6.0}/docs/datasets/faroese.md +0 -0
  80. {euroeval-15.5.0 → euroeval-15.6.0}/docs/datasets/french.md +0 -0
  81. {euroeval-15.5.0 → euroeval-15.6.0}/docs/datasets/german.md +0 -0
  82. {euroeval-15.5.0 → euroeval-15.6.0}/docs/datasets/icelandic.md +0 -0
  83. {euroeval-15.5.0 → euroeval-15.6.0}/docs/datasets/italian.md +0 -0
  84. {euroeval-15.5.0 → euroeval-15.6.0}/docs/datasets/norwegian.md +0 -0
  85. {euroeval-15.5.0 → euroeval-15.6.0}/docs/datasets/spanish.md +0 -0
  86. {euroeval-15.5.0 → euroeval-15.6.0}/docs/datasets/swedish.md +0 -0
  87. {euroeval-15.5.0 → euroeval-15.6.0}/docs/extras/radial_plotter.md +0 -0
  88. {euroeval-15.5.0 → euroeval-15.6.0}/docs/faq.md +0 -0
  89. {euroeval-15.5.0 → euroeval-15.6.0}/docs/gfx/favicon.png +0 -0
  90. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Monolingual/danish.md +0 -0
  91. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Monolingual/dutch.md +0 -0
  92. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Monolingual/english.md +0 -0
  93. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Monolingual/faroese.md +0 -0
  94. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Monolingual/french.md +0 -0
  95. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Monolingual/german.md +0 -0
  96. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Monolingual/icelandic.md +0 -0
  97. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Monolingual/italian.md +0 -0
  98. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Monolingual/norwegian.md +0 -0
  99. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Monolingual/swedish.md +0 -0
  100. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Multilingual/european.md +0 -0
  101. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Multilingual/germanic.md +0 -0
  102. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Multilingual/mainland-scandinavian.md +0 -0
  103. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/Multilingual/romance.md +0 -0
  104. {euroeval-15.5.0 → euroeval-15.6.0}/docs/leaderboards/README.md +0 -0
  105. {euroeval-15.5.0 → euroeval-15.6.0}/docs/methodology.md +0 -0
  106. {euroeval-15.5.0 → euroeval-15.6.0}/docs/python-package.md +0 -0
  107. {euroeval-15.5.0 → euroeval-15.6.0}/docs/tasks/README.md +0 -0
  108. {euroeval-15.5.0 → euroeval-15.6.0}/docs/tasks/common-sense-reasoning.md +0 -0
  109. {euroeval-15.5.0 → euroeval-15.6.0}/docs/tasks/knowledge.md +0 -0
  110. {euroeval-15.5.0 → euroeval-15.6.0}/docs/tasks/linguistic-acceptability.md +0 -0
  111. {euroeval-15.5.0 → euroeval-15.6.0}/docs/tasks/named-entity-recognition.md +0 -0
  112. {euroeval-15.5.0 → euroeval-15.6.0}/docs/tasks/reading-comprehension.md +0 -0
  113. {euroeval-15.5.0 → euroeval-15.6.0}/docs/tasks/sentiment-classification.md +0 -0
  114. {euroeval-15.5.0 → euroeval-15.6.0}/docs/tasks/speed.md +0 -0
  115. {euroeval-15.5.0 → euroeval-15.6.0}/docs/tasks/summarization.md +0 -0
  116. {euroeval-15.5.0 → euroeval-15.6.0}/gfx/euroeval.png +0 -0
  117. {euroeval-15.5.0 → euroeval-15.6.0}/gfx/euroeval.xcf +0 -0
  118. {euroeval-15.5.0 → euroeval-15.6.0}/gfx/scandeval.png +0 -0
  119. {euroeval-15.5.0 → euroeval-15.6.0}/mkdocs.yaml +0 -0
  120. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/__init__.py +0 -0
  121. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/benchmark_config_factory.py +0 -0
  122. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/benchmark_modules/__init__.py +0 -0
  123. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/cli.py +0 -0
  124. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/data_loading.py +0 -0
  125. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/enums.py +0 -0
  126. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/model_config.py +0 -0
  127. {euroeval-15.5.0 → euroeval-15.6.0}/src/euroeval/scores.py +0 -0
  128. {euroeval-15.5.0/src/euroeval/task_utils → euroeval-15.6.0/src/euroeval/task_group_utils}/__init__.py +0 -0
  129. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/constants.py +0 -0
  130. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_allocine.py +0 -0
  131. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_angry_tweets.py +0 -0
  132. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_arc.py +0 -0
  133. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_arc_is.py +0 -0
  134. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_belebele.py +0 -0
  135. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_cnn_dailymail.py +0 -0
  136. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_conll_en.py +0 -0
  137. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_conll_es.py +0 -0
  138. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_conll_nl.py +0 -0
  139. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_dane.py +0 -0
  140. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_danish_citizen_tests.py +0 -0
  141. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_dansk.py +0 -0
  142. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_danske_talemaader.py +0 -0
  143. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_danske_talemaader_old.py +0 -0
  144. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_dbrd.py +0 -0
  145. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_dutch_cola.py +0 -0
  146. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_dutch_social.py +0 -0
  147. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_eltec.py +0 -0
  148. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_fone.py +0 -0
  149. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_foqa.py +0 -0
  150. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_fosent.py +0 -0
  151. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_fquad.py +0 -0
  152. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_germanquad.py +0 -0
  153. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_germeval.py +0 -0
  154. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_hellaswag.py +0 -0
  155. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_hotter_and_colder_sentiment.py +0 -0
  156. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_ice_linguistic.py +0 -0
  157. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_icelandic_error_corpus.py +0 -0
  158. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_icelandic_knowledge.py +0 -0
  159. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_icelandic_qa.py +0 -0
  160. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_icesum.py +0 -0
  161. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_ilpost_sum.py +0 -0
  162. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_jentoft.py +0 -0
  163. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_mim_gold_ner.py +0 -0
  164. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_mlqa_es.py +0 -0
  165. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_mlsum_de.py +0 -0
  166. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_mlsum_es.py +0 -0
  167. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_mmlu.py +0 -0
  168. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_multinerd-it.py +0 -0
  169. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_no_cola.py +0 -0
  170. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_no_sammendrag.py +0 -0
  171. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_nor_common_sense_qa.py +0 -0
  172. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_nordjylland_news.py +0 -0
  173. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_norec.py +0 -0
  174. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_norglm_multiqa.py +0 -0
  175. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_norglm_multisum.py +0 -0
  176. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_norne.py +0 -0
  177. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_norquad.py +0 -0
  178. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_nqii.py +0 -0
  179. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_nrk_quiz_qa.py +0 -0
  180. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_orange_sum.py +0 -0
  181. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_personal_sum.py +0 -0
  182. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_rrn.py +0 -0
  183. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_sb10k.py +0 -0
  184. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_scala.py +0 -0
  185. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_scandiqa.py +0 -0
  186. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_schibsted.py +0 -0
  187. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_sentiment_headlines_es.py +0 -0
  188. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_sentipolc16.py +0 -0
  189. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_squad.py +0 -0
  190. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_squad_it.py +0 -0
  191. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_squad_nl.py +0 -0
  192. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_squad_nl_old.py +0 -0
  193. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_sst5.py +0 -0
  194. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_suc3.py +0 -0
  195. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_swedn.py +0 -0
  196. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_swerec.py +0 -0
  197. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_wiki_lingua_nl.py +0 -0
  198. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_wikiann_fo.py +0 -0
  199. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_wikineural-it.py +0 -0
  200. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_winogrande_is.py +0 -0
  201. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/create_xquad_es.py +0 -0
  202. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/fix_dot_env_file.py +0 -0
  203. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/load_ud_pos.py +0 -0
  204. {euroeval-15.5.0 → euroeval-15.6.0}/src/scripts/versioning.py +0 -0
  205. {euroeval-15.5.0 → euroeval-15.6.0}/tests/__init__.py +0 -0
  206. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_benchmark_config_factory.py +0 -0
  207. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_benchmark_modules/__init__.py +0 -0
  208. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_benchmark_modules/test_base.py +0 -0
  209. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_benchmark_modules/test_fresh.py +0 -0
  210. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_benchmark_modules/test_hf.py +0 -0
  211. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_benchmark_modules/test_litellm.py +0 -0
  212. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_benchmark_modules/test_vllm.py +0 -0
  213. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_cli.py +0 -0
  214. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_constants.py +0 -0
  215. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_data_models.py +0 -0
  216. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_dataset_configs.py +0 -0
  217. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_enums.py +0 -0
  218. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_exceptions.py +0 -0
  219. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_generation.py +0 -0
  220. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_human_evaluation.py +0 -0
  221. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_languages.py +0 -0
  222. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_model_cache.py +0 -0
  223. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_model_config.py +0 -0
  224. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_scores.py +0 -0
  225. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_speed_benchmark.py +0 -0
  226. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_task_utils/__init__.py +0 -0
  227. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_task_utils/test_question_answering.py +0 -0
  228. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_task_utils/test_sequence_classification.py +0 -0
  229. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_task_utils/test_text_to_text.py +0 -0
  230. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_task_utils/test_token_classification.py +0 -0
  231. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_tasks.py +0 -0
  232. {euroeval-15.5.0 → euroeval-15.6.0}/tests/test_types.py +0 -0
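The file list above also documents three structural refactors in 15.6.0: the monolithic `dataset_configs.py` is split into per-language modules under `dataset_configs/`, `task_utils` is renamed to `task_group_utils`, and most of `utils.py` moves into a new `tokenization_utils.py`. A minimal sketch of the resulting import paths, based only on the moves and imports visible in this diff (module contents beyond that are assumptions):

```python
# Import paths implied by the file moves above (euroeval 15.6.0 layout).
from euroeval.task_group_utils import question_answering  # formerly euroeval.task_utils
from euroeval.tokenization_utils import get_bos_token, get_eos_token  # moved out of utils.py
from euroeval.dataset_configs import danish  # dataset configs now split per language
```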
@@ -10,7 +10,7 @@ repos:
       - id: trailing-whitespace
       - id: debug-statements
   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.11.4
+    rev: v0.11.5
     hooks:
       - id: ruff
         args:
@@ -10,6 +10,41 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 
 
 
+## [v15.6.0] - 2025-04-13
+### Added
+- We now support specifying custom inference providers when benchmarking via the Hugging
+  Face inference APIs. This can be done by specifying the model as
+  `huggingface/<inference-provider>/<organisation>/<model>`, as described in [these
+  LiteLLM docs](https://docs.litellm.ai/docs/providers/huggingface).
+
+### Changed
+- Updated `transformers` to `>=4.51.0`, which adds support for Llama-4, Phi-4,
+  Deepseek-v3 and Qwen3. It also includes the `image-text-to-text` pipeline tag
+  properly, so we no longer need a custom fix for it.
+- Updated `vllm` to `>=0.8.3`, which adds support for Llama-4.
+- Set the maximum number of logprobs for generative models to 8, as that is the upper
+  bound for xAI models.
+- When benchmarking Ollama models, if the model is not found, we now also check whether
+  the model exists when prefixed with 'hf.co/'.
+- Unified the prompt templates used for each task, so that they are more consistent
+  across tasks. Evaluation tests across different model types and sizes show no
+  significant performance difference between the new and old templates. This was
+  contributed by [@viggo-gascou](https://github.com/viggo-gascou) ✨
+
+### Fixed
+- Avoid duplicate error messages when a rate limit occurs.
+- ModernBERT models cannot be used on a CPU, which caused an error in our check for
+  maximal context length. In this case we now skip the check and use the reported
+  maximal context length as-is.
+- Fixed an issue with benchmarking multiple generative models in the same evaluation
+  command, caused by vLLM and Ray not releasing GPU memory properly; the memory is now
+  released correctly.
+- We now only log when encoder models are being benchmarked on generative tasks if the
+  `--verbose` flag is set (or `verbose=True` in the `Benchmarker` API).
+- All Spanish NER datasets were mistakenly marked as unofficial. The `conll-es` dataset
+  is now marked as official.
+
+
 ## [v15.5.0] - 2025-04-07
 ### Added
 - Now allows supplying a parameter to API models, which is done by using
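To illustrate the new inference-provider support from the changelog above, here is a minimal sketch using the Python API; the `Benchmarker` call signature follows the package README, while the provider (`together`) and the model name are illustrative placeholders:

```python
from euroeval import Benchmarker

# `verbose=True` mirrors the `--verbose` CLI flag mentioned in the release notes.
benchmarker = Benchmarker(verbose=True)

# Model ids of the form huggingface/<inference-provider>/<organisation>/<model>
# route the evaluation through the given Hugging Face inference provider.
benchmarker(model="huggingface/together/meta-llama/Llama-3.3-70B-Instruct")
```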
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: EuroEval
-Version: 15.5.0
+Version: 15.6.0
 Summary: The robust European language model benchmark.
 Project-URL: Repository, https://github.com/EuroEval/EuroEval
 Project-URL: Issues, https://github.com/EuroEval/EuroEval/issues
@@ -35,7 +35,7 @@ Requires-Dist: click>=8.1.3
 Requires-Dist: datasets>=2.15.0
 Requires-Dist: demjson3>=3.0.6
 Requires-Dist: evaluate>=0.4.1
-Requires-Dist: huggingface-hub>=0.24.0
+Requires-Dist: huggingface-hub>=0.30.1
 Requires-Dist: levenshtein>=0.24.0
 Requires-Dist: litellm>=1.63.0
 Requires-Dist: more-itertools>=10.5.0
@@ -56,18 +56,18 @@ Requires-Dist: setuptools>=75.8.2
 Requires-Dist: tenacity>=9.0.0
 Requires-Dist: termcolor>=2.0.0
 Requires-Dist: torch>=2.6.0
-Requires-Dist: transformers>=4.50.0
+Requires-Dist: transformers>=4.51.0
 Provides-Extra: all
 Requires-Dist: bitsandbytes>=0.43.1; (platform_system == 'Linux') and extra == 'all'
 Requires-Dist: fbgemm-gpu>=1.0.0; (platform_system == 'Linux') and extra == 'all'
 Requires-Dist: gradio>=4.26.0; extra == 'all'
 Requires-Dist: outlines>=0.1.11; extra == 'all'
-Requires-Dist: vllm>=0.8.0; (platform_system == 'Linux') and extra == 'all'
+Requires-Dist: vllm>=0.8.3; (platform_system == 'Linux') and extra == 'all'
 Provides-Extra: generative
 Requires-Dist: bitsandbytes>=0.43.1; (platform_system == 'Linux') and extra == 'generative'
 Requires-Dist: fbgemm-gpu>=1.0.0; (platform_system == 'Linux') and extra == 'generative'
 Requires-Dist: outlines>=0.1.11; extra == 'generative'
-Requires-Dist: vllm>=0.8.0; (platform_system == 'Linux') and extra == 'generative'
+Requires-Dist: vllm>=0.8.3; (platform_system == 'Linux') and extra == 'generative'
 Provides-Extra: human-evaluation
 Requires-Dist: gradio>=4.26.0; extra == 'human-evaluation'
 Provides-Extra: test
@@ -89,7 +89,7 @@ ______________________________________________________________________
 [![Second paper](https://img.shields.io/badge/arXiv-2406.13469-b31b1b.svg)](https://arxiv.org/abs/2406.13469)
 [![License](https://img.shields.io/github/license/EuroEval/EuroEval)](https://github.com/EuroEval/EuroEval/blob/main/LICENSE)
 [![LastCommit](https://img.shields.io/github/last-commit/EuroEval/EuroEval)](https://github.com/EuroEval/EuroEval/commits/main)
-[![Code Coverage](https://img.shields.io/badge/Coverage-65%25-yellow.svg)](https://github.com/EuroEval/EuroEval/tree/main/tests)
+[![Code Coverage](https://img.shields.io/badge/Coverage-67%25-yellow.svg)](https://github.com/EuroEval/EuroEval/tree/main/tests)
 [![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.0-4baaaa.svg)](https://github.com/EuroEval/EuroEval/blob/main/CODE_OF_CONDUCT.md)
 
 
@@ -206,7 +206,9 @@ sentiment-classification`.
 
 
 ### Reproducing the datasets
-All datasets used in this project are generated using the scripts located in the [src/scripts](src/scripts) folder. To reproduce a dataset, run the corresponding script with the following command
+All datasets used in this project are generated using the scripts located in the
+[src/scripts](src/scripts) folder. To reproduce a dataset, run the corresponding script
+with the following command
 
 ```shell
 $ uv run src/scripts/<name-of-script>.py
@@ -218,8 +220,27 @@ Replace <name-of-script> with the specific script you wish to execute, e.g.,
 $ uv run src/scripts/create_allocine.py
 ```
 
-
-## Special Thanks :pray:
+## Contributors :pray:
+
+A huge thank you to all the contributors who have helped make this project a success!
+
+<a href="https://github.com/peter-sk"><img src="https://avatars.githubusercontent.com/u/6168908" width=50 alt="Contributor avatar for peter-sk"/></a>
+<a href="https://github.com/AJDERS"><img src="https://avatars.githubusercontent.com/u/38854604" width=50 alt="Contributor avatar for AJDERS"/></a>
+<a href="https://github.com/oliverkinch"><img src="https://avatars.githubusercontent.com/u/71556498" width=50 alt="Contributor avatar for oliverkinch"/></a>
+<a href="https://github.com/versae"><img src="https://avatars.githubusercontent.com/u/173537" width=50 alt="Contributor avatar for versae"/></a>
+<a href="https://github.com/viggo-gascou"><img src="https://avatars.githubusercontent.com/u/94069687" width=50 alt="Contributor avatar for viggo-gascou"/></a>
+<a href="https://github.com/mathiasesn"><img src="https://avatars.githubusercontent.com/u/27091759" width=50 alt="Contributor avatar for mathiasesn"/></a>
+<a href="https://github.com/Alkarex"><img src="https://avatars.githubusercontent.com/u/1008324" width=50 alt="Contributor avatar for Alkarex"/></a>
+<a href="https://github.com/marksverdhei"><img src="https://avatars.githubusercontent.com/u/46672778" width=50 alt="Contributor avatar for marksverdhei"/></a>
+<a href="https://github.com/Mikeriess"><img src="https://avatars.githubusercontent.com/u/19728563" width=50 alt="Contributor avatar for Mikeriess"/></a>
+<a href="https://github.com/pakagronglb"><img src="https://avatars.githubusercontent.com/u/178713124" width=50 alt="Contributor avatar for pakagronglb"/></a>
+<a href="https://github.com/ThomasKluiters"><img src="https://avatars.githubusercontent.com/u/8137941" width=50 alt="Contributor avatar for ThomasKluiters"/></a>
+<a href="https://github.com/BramVanroy"><img src="https://avatars.githubusercontent.com/u/2779410" width=50 alt="Contributor avatar for BramVanroy"/></a>
+<a href="https://github.com/peregilk"><img src="https://avatars.githubusercontent.com/u/9079808" width=50 alt="Contributor avatar for peregilk"/></a>
+
+### Special Thanks
+- Thanks to [Google](https://google.com/) for sponsoring Gemini credits as part of their
+  [Google Cloud for Researchers Program](https://cloud.google.com/edu/researchers).
 - Thanks [@Mikeriess](https://github.com/Mikeriess) for evaluating many of the larger
   models on the leaderboards.
 - Thanks to [OpenAI](https://openai.com/) for sponsoring OpenAI credits as part of their
@@ -13,7 +13,7 @@ ______________________________________________________________________
 [![Second paper](https://img.shields.io/badge/arXiv-2406.13469-b31b1b.svg)](https://arxiv.org/abs/2406.13469)
 [![License](https://img.shields.io/github/license/EuroEval/EuroEval)](https://github.com/EuroEval/EuroEval/blob/main/LICENSE)
 [![LastCommit](https://img.shields.io/github/last-commit/EuroEval/EuroEval)](https://github.com/EuroEval/EuroEval/commits/main)
-[![Code Coverage](https://img.shields.io/badge/Coverage-65%25-yellow.svg)](https://github.com/EuroEval/EuroEval/tree/main/tests)
+[![Code Coverage](https://img.shields.io/badge/Coverage-67%25-yellow.svg)](https://github.com/EuroEval/EuroEval/tree/main/tests)
 [![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.0-4baaaa.svg)](https://github.com/EuroEval/EuroEval/blob/main/CODE_OF_CONDUCT.md)
 
 
@@ -130,7 +130,9 @@ sentiment-classification`.
 
 
 ### Reproducing the datasets
-All datasets used in this project are generated using the scripts located in the [src/scripts](src/scripts) folder. To reproduce a dataset, run the corresponding script with the following command
+All datasets used in this project are generated using the scripts located in the
+[src/scripts](src/scripts) folder. To reproduce a dataset, run the corresponding script
+with the following command
 
 ```shell
 $ uv run src/scripts/<name-of-script>.py
@@ -142,8 +144,27 @@ Replace <name-of-script> with the specific script you wish to execute, e.g.,
 $ uv run src/scripts/create_allocine.py
 ```
 
+## Contributors :pray:
 
-## Special Thanks :pray:
+A huge thank you to all the contributors who have helped make this project a success!
+
+<a href="https://github.com/peter-sk"><img src="https://avatars.githubusercontent.com/u/6168908" width=50 alt="Contributor avatar for peter-sk"/></a>
+<a href="https://github.com/AJDERS"><img src="https://avatars.githubusercontent.com/u/38854604" width=50 alt="Contributor avatar for AJDERS"/></a>
+<a href="https://github.com/oliverkinch"><img src="https://avatars.githubusercontent.com/u/71556498" width=50 alt="Contributor avatar for oliverkinch"/></a>
+<a href="https://github.com/versae"><img src="https://avatars.githubusercontent.com/u/173537" width=50 alt="Contributor avatar for versae"/></a>
+<a href="https://github.com/viggo-gascou"><img src="https://avatars.githubusercontent.com/u/94069687" width=50 alt="Contributor avatar for viggo-gascou"/></a>
+<a href="https://github.com/mathiasesn"><img src="https://avatars.githubusercontent.com/u/27091759" width=50 alt="Contributor avatar for mathiasesn"/></a>
+<a href="https://github.com/Alkarex"><img src="https://avatars.githubusercontent.com/u/1008324" width=50 alt="Contributor avatar for Alkarex"/></a>
+<a href="https://github.com/marksverdhei"><img src="https://avatars.githubusercontent.com/u/46672778" width=50 alt="Contributor avatar for marksverdhei"/></a>
+<a href="https://github.com/Mikeriess"><img src="https://avatars.githubusercontent.com/u/19728563" width=50 alt="Contributor avatar for Mikeriess"/></a>
+<a href="https://github.com/pakagronglb"><img src="https://avatars.githubusercontent.com/u/178713124" width=50 alt="Contributor avatar for pakagronglb"/></a>
+<a href="https://github.com/ThomasKluiters"><img src="https://avatars.githubusercontent.com/u/8137941" width=50 alt="Contributor avatar for ThomasKluiters"/></a>
+<a href="https://github.com/BramVanroy"><img src="https://avatars.githubusercontent.com/u/2779410" width=50 alt="Contributor avatar for BramVanroy"/></a>
+<a href="https://github.com/peregilk"><img src="https://avatars.githubusercontent.com/u/9079808" width=50 alt="Contributor avatar for peregilk"/></a>
+
+### Special Thanks
+- Thanks to [Google](https://google.com/) for sponsoring Gemini credits as part of their
+  [Google Cloud for Researchers Program](https://cloud.google.com/edu/researchers).
 - Thanks [@Mikeriess](https://github.com/Mikeriess) for evaluating many of the larger
   models on the leaderboards.
 - Thanks to [OpenAI](https://openai.com/) for sponsoring OpenAI credits as part of their
@@ -56,7 +56,6 @@ install-dependencies:
 	@if [ "${NO_FLASH_ATTN}" != "1" ] && [ $$(uname) != "Darwin" ]; then \
 		uv pip install --no-build-isolation flash-attn>=2.7.0.post2; \
 	fi
-	@uv sync -U --only-dev
 
 setup-environment-variables:
 	@uv run python src/scripts/fix_dot_env_file.py
@@ -156,8 +155,8 @@ publish-scandeval:
 	fi
 	@mv src/scandeval src/euroeval
 
-publish-major: bump-major publish ## Publish a major version
+publish-major: install check bump-major publish ## Publish a major version
 
-publish-minor: bump-minor publish ## Publish a minor version
+publish-minor: install check bump-minor publish ## Publish a minor version
 
-publish-patch: bump-patch publish ## Publish a patch version
+publish-patch: install check bump-patch publish ## Publish a patch version
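With this change, a release command such as `make publish-minor` first runs the `install` and `check` targets, so the environment is reinstalled and the `check` target (presumably linting and tests) must pass before the version is bumped and published.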
@@ -1,6 +1,6 @@
 [project]
 name = "EuroEval"
-version = "15.5.0"
+version = "15.6.0"
 description = "The robust European language model benchmark."
 readme = "README.md"
 authors = [
@@ -15,7 +15,7 @@ dependencies = [
     "torch>=2.6.0",
     "pandas>=2.2.0",
     "numpy>=1.23.0,<2.0.0",
-    "transformers>=4.50.0",
+    "transformers>=4.51.0",
     "accelerate>=0.34.2",
     "evaluate>=0.4.1",
     "datasets>=2.15.0",
@@ -24,7 +24,7 @@ dependencies = [
     "termcolor>=2.0.0",
     "seqeval>=1.2.2",
     "python-dotenv>=1.0.1",
-    "huggingface-hub>=0.24.0",
+    "huggingface-hub>=0.30.1",
    "pyinfer>=0.0.3",
     "sentencepiece>=0.1.96",
     "protobuf~=3.20.0",
@@ -46,7 +46,7 @@ dependencies = [
 generative = [
     "outlines>=0.1.11",
     "bitsandbytes>=0.43.1; platform_system == 'Linux'",
-    "vllm>=0.8.0; platform_system == 'Linux'",
+    "vllm>=0.8.3; platform_system == 'Linux'",
     "fbgemm-gpu>=1.0.0; platform_system == 'Linux'",
 ]
 human_evaluation = [
@@ -55,7 +55,7 @@ human_evaluation = [
 all = [
     "outlines>=0.1.11",
     "bitsandbytes>=0.43.1; platform_system == 'Linux'",
-    "vllm>=0.8.0; platform_system == 'Linux'",
+    "vllm>=0.8.3; platform_system == 'Linux'",
     "fbgemm-gpu>=1.0.0; platform_system == 'Linux'",
     "gradio>=4.26.0",
 ]
@@ -107,6 +107,7 @@ dev-dependencies = [
     "types-setuptools>=75.8.0.20250110",
     "types-ujson>=5.10.0.20240515",
     "types-simplejson>=3.2.0.2025032",
+    "debugpy>=1.8.13",
 ]
 
 [tool.ruff]
@@ -144,6 +145,16 @@ select = [
     # Pyflakes
     "F",
 ]
+ignore = [
+    # Type annotations for "self" arguments
+    "ANN101",
+    # Type annotations for "cls" arguments
+    "ANN102",
+    # Type annotations for **kwargs
+    "ANN003",
+    # Docstrings for **kwargs
+    "D417",
+]
 
 [tool.ruff.lint.extend-per-file-ignores]
 "__init__.py" = [
@@ -10,7 +10,8 @@ from functools import cached_property, partial
 from datasets import DatasetDict
 from torch import nn
 from tqdm.auto import tqdm
-from transformers import PreTrainedTokenizer, Trainer
+from transformers.tokenization_utils import PreTrainedTokenizer
+from transformers.trainer import Trainer
 
 from ..data_models import (
     BenchmarkConfig,
@@ -21,7 +22,7 @@ from ..data_models import (
 )
 from ..enums import BatchingPreference, GenerativeType, TaskGroup
 from ..exceptions import NeedsEnvironmentVariable, NeedsExtraInstalled
-from ..task_utils import (
+from ..task_group_utils import (
     question_answering,
     sequence_classification,
     text_to_text,
@@ -4,19 +4,21 @@ import os
 from functools import cached_property
 from json import JSONDecodeError
 
-from transformers import (
-    AutoConfig,
-    AutoTokenizer,
+from transformers.configuration_utils import PretrainedConfig
+from transformers.modeling_utils import PreTrainedModel
+from transformers.models.auto.configuration_auto import AutoConfig
+from transformers.models.auto.tokenization_auto import AutoTokenizer
+from transformers.models.electra import (
     ElectraForQuestionAnswering,
     ElectraForSequenceClassification,
     ElectraForTokenClassification,
-    PretrainedConfig,
-    PreTrainedModel,
-    PreTrainedTokenizer,
+)
+from transformers.models.xlm_roberta import (
     XLMRobertaForQuestionAnswering,
     XLMRobertaForSequenceClassification,
     XLMRobertaForTokenClassification,
 )
+from transformers.tokenization_utils import PreTrainedTokenizer
 
 from ..data_models import BenchmarkConfig, DatasetConfig, ModelConfig
 from ..enums import InferenceBackend, ModelType, TaskGroup
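The import rewrites in this and the surrounding hunks replace `transformers`' top-level re-exports with direct submodule paths. Both routes resolve to the same objects, so runtime behaviour is unchanged; a plausible motivation is that the explicit paths are easier for static type checkers to follow. A quick sanity check, assuming `transformers` is installed:

```python
# The top-level name is a lazy re-export of the class defined in the submodule,
# so both import styles yield the identical object.
from transformers import AutoTokenizer as TopLevelAutoTokenizer
from transformers.models.auto.tokenization_auto import AutoTokenizer as SubmoduleAutoTokenizer

assert TopLevelAutoTokenizer is SubmoduleAutoTokenizer
```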
@@ -13,31 +13,29 @@ import torch
 from datasets import DatasetDict
 from huggingface_hub import HfApi
 from huggingface_hub import whoami as hf_whoami
-from huggingface_hub.hf_api import ModelInfo as HfApiModelInfo
-from huggingface_hub.hf_api import RepositoryNotFoundError, RevisionNotFoundError
-from huggingface_hub.utils import (
+from huggingface_hub.errors import (
     GatedRepoError,
     HFValidationError,
     LocalTokenNotFoundError,
+    RepositoryNotFoundError,
+    RevisionNotFoundError,
 )
+from huggingface_hub.hf_api import ModelInfo as HfApiModelInfo
 from peft import PeftConfig
 from requests.exceptions import RequestException
 from torch import nn
-from transformers import (
-    AutoConfig,
-    AutoTokenizer,
-    BatchEncoding,
+from transformers.configuration_utils import PretrainedConfig
+from transformers.data.data_collator import (
     DataCollatorForTokenClassification,
     DataCollatorWithPadding,
-    PretrainedConfig,
-    PreTrainedModel,
-    PreTrainedTokenizer,
-    Trainer,
 )
 from transformers.modelcard import TASK_MAPPING
-from transformers.models.auto.modeling_auto import (
-    MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES,
-)
+from transformers.modeling_utils import PreTrainedModel
+from transformers.models.auto.configuration_auto import AutoConfig
+from transformers.models.auto.tokenization_auto import AutoTokenizer
+from transformers.tokenization_utils import PreTrainedTokenizer
+from transformers.tokenization_utils_base import BatchEncoding
+from transformers.trainer import Trainer
 from urllib3.exceptions import RequestError
 
 from ..constants import (
@@ -65,18 +63,17 @@ from ..exceptions import (
     NoInternetConnection,
 )
 from ..languages import get_all_languages
-from ..task_utils import (
+from ..task_group_utils import (
     multiple_choice_classification,
     question_answering,
     token_classification,
 )
+from ..tokenization_utils import get_bos_token, get_eos_token
 from ..types import ExtractLabelsFunction
 from ..utils import (
     block_terminal_output,
     create_model_cache_dir,
-    get_bos_token,
     get_class_by_name,
-    get_eos_token,
     internet_connection_available,
     log_once,
 )
@@ -690,7 +687,7 @@ def load_model_and_tokenizer(
     assert model is not None, "The model should not be None."
 
     model.eval()
-    model.to(benchmark_config.device)
+    model.to(benchmark_config.device)  # type: ignore[arg-type]
 
     if (
         isinstance(model, PreTrainedModel)
@@ -797,12 +794,6 @@ def get_model_repo_info(
         tags += base_model_info.tags or list()
     tags = list(set(tags))
 
-    # TEMP: This extends the `TASK_MAPPING` dictionary to include the missing
-    # 'image-text-to-text' pipeline tag. This will be added as part of `TASK_MAPPING`
-    # when this PR has been merged in and published:
-    # https://github.com/huggingface/transformers/pull/37107
-    TASK_MAPPING["image-text-to-text"] = MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES
-
     # Get the pipeline tag for the model. If it is not specified, then we determine it
     # by checking the model's architecture as written in the model's Hugging Face config
     pipeline_tag = model_info.pipeline_tag
@@ -824,7 +815,7 @@
     generative_class_names = [
         class_name
         for tag in GENERATIVE_PIPELINE_TAGS
-        for class_name in TASK_MAPPING.get(tag, dict()).values()
+        for class_name in TASK_MAPPING.get(tag, dict()).values()  # type: ignore[attr-defined]
     ]
     if class_names is not None and any(
         class_name in generative_class_names for class_name in class_names
@@ -1083,17 +1074,20 @@ def setup_model_for_question_answering(model: "PreTrainedModel") -> "PreTrainedModel":
     for attribute in attribute_list:
         token_type_embeddings = getattr(token_type_embeddings, attribute)
 
+    token_type_embedding_tensor = token_type_embeddings.weight.data
+    assert isinstance(token_type_embedding_tensor, torch.Tensor)
+
     # If the token type embeddings has shape (1, ...) then set the shape to
     # (2, ...) by randomly initializing the second token type embedding
-    if token_type_embeddings.weight.data.shape[0] == 1:
+    if token_type_embedding_tensor.shape[0] == 1:
         token_type_embeddings.weight.data = torch.cat(
             (
-                token_type_embeddings.weight.data,
-                torch.rand_like(token_type_embeddings.weight.data),
+                token_type_embedding_tensor,
+                torch.rand_like(token_type_embedding_tensor),
             ),
             dim=0,
         )
-        token_type_embeddings.num_embeddings = 2
+        token_type_embeddings.num_embeddings = 2  # type: ignore[assignment]
 
     # Set the model config to use the new type vocab size
     model.config.type_vocab_size = 2
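To see the tensor surgery in the hunk above in isolation: a model trained with a single token-type (segment) embedding gets a second, randomly initialised row so it can separate question from context in extractive question answering. A self-contained illustration (the 768 dimension is just an example):

```python
import torch
from torch import nn

# An embedding layer with a single token type, as in many single-segment models.
token_type_embeddings = nn.Embedding(num_embeddings=1, embedding_dim=768)

token_type_embedding_tensor = token_type_embeddings.weight.data
if token_type_embedding_tensor.shape[0] == 1:
    # Append a randomly initialised second row, mirroring the diff above.
    token_type_embeddings.weight.data = torch.cat(
        (token_type_embedding_tensor, torch.rand_like(token_type_embedding_tensor)),
        dim=0,
    )
    token_type_embeddings.num_embeddings = 2

print(token_type_embeddings.weight.shape)  # torch.Size([2, 768])
```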
@@ -1160,7 +1154,7 @@ def align_model_and_tokenizer(
     # Move the model to the CPU, since otherwise we can't catch the IndexErrors when
     # finding the maximum sequence length of the model
     model_device = model.device
-    model.to(torch.device("cpu"))
+    model.to(torch.device("cpu"))  # type: ignore[arg-type]
 
     # Manually check that this model max length is valid for the model, and adjust
     # otherwise
@@ -1182,8 +1176,16 @@
         except IndexError:
             continue
 
+        except ValueError as e:
+            # This happens when the model is using Triton, such as with ModernBERT,
+            # which doesn't work with CPU tensors at all
+            if "cpu tensor" in str(e):
+                break
+            else:
+                raise e
+
     # Move the model back to the original device
-    model.to(model_device)
+    model.to(model_device)  # type: ignore[arg-type]
 
     # If there is a mismatch between the vocab size according to the tokenizer and
     # the vocab size according to the model, we raise an error
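For context, the loop this final hunk patches probes candidate maximum sequence lengths on the CPU, treating an `IndexError` (position-embedding overflow) as "too long"; the new `ValueError` branch bails out for Triton-backed models such as ModernBERT, which cannot run on CPU tensors at all. A simplified, hypothetical sketch of that probing pattern (function and variable names are illustrative, not EuroEval's actual helpers):

```python
import torch
from torch import nn


def probe_max_length(model: nn.Module, candidates: list[int]) -> int | None:
    """Return the largest candidate length the model accepts on CPU, or None."""
    for length in sorted(candidates, reverse=True):
        dummy = torch.zeros(1, length, dtype=torch.long)
        try:
            model(input_ids=dummy)
            return length
        except IndexError:
            continue  # position embeddings too short for this length
        except ValueError as e:
            if "cpu tensor" in str(e):
                return None  # Triton-backed model: cannot probe on CPU at all
            raise
    return None
```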