multimetriceval 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,15 @@
1
+ Metadata-Version: 2.4
2
+ Name: multimetriceval
3
+ Version: 0.1.0
4
+ Summary: Multi-metric translation evaluation toolkit
5
+ Requires-Python: >=3.8
6
+ Requires-Dist: torch>=1.9.0
7
+ Requires-Dist: numpy
8
+ Requires-Dist: sacrebleu>=2.0.0
9
+ Provides-Extra: comet
10
+ Requires-Dist: unbabel-comet>=2.0.0; extra == "comet"
11
+ Provides-Extra: whisper
12
+ Requires-Dist: openai-whisper; extra == "whisper"
13
+ Provides-Extra: all
14
+ Requires-Dist: unbabel-comet>=2.0.0; extra == "all"
15
+ Requires-Dist: openai-whisper; extra == "all"
@@ -0,0 +1,432 @@
1
+ # 📊 MultiMetric-Eval
2
+
3
+ A multi-metric translation evaluation toolkit: compute BLEU, chrF++, COMET, and BLEURT with one line of code, for both text and speech input.
4
+
5
+ [![PyPI version](https://badge.fury.io/py/multimetric-eval.svg)](https://badge.fury.io/py/multimetric-eval)
6
+ [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
7
+ <!-- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) -->
8
+
9
+ ---
10
+
11
+ ## 🚀 Installation
12
+
13
+ ```bash
14
+ # Base install (BLEU + chrF++)
15
+ pip install multimetriceval
16
+
17
+ # Optional dependencies
18
+ pip install unbabel-comet # COMET metric
19
+ pip install openai-whisper # speech-to-text (ASR)
20
+ pip install bleurt # BLEURT metric
21
+ # See the "Supported metrics" table below for what each metric additionally requires
22
+ ```
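+
+ The `pyproject.toml` also declares optional extras (`comet`, `whisper`, `all`), so the optional dependencies can be pulled in together with the package. A sketch, assuming the published distribution name matches the package metadata (`multimetriceval`); note that BLEURT is not covered by an extra and still needs the separate install above:
+
+ ```bash
+ # Extras declared in pyproject.toml (assumed to resolve against the released package)
+ pip install "multimetriceval[comet]"    # adds unbabel-comet
+ pip install "multimetriceval[whisper]"  # adds openai-whisper
+ pip install "multimetriceval[all]"      # adds unbabel-comet and openai-whisper
+ ```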
23
+
24
+ ---
25
+
26
+ ## 📖 Quick start
27
+
28
+ ```python
29
+ from multimetric_eval import ModelEvaluator
30
+
31
+ # Initialize (the COMET model is downloaded automatically on first run)
32
+ evaluator = ModelEvaluator()
33
+
34
+ # Evaluate
35
+ results = evaluator.evaluate(
36
+ hypothesis=["The cat sits on the mat."],
37
+ reference=["The cat is sitting on the mat."],
38
+ source=["猫坐在垫子上。"]
39
+ )
40
+
41
+ print(results)
42
+ # {'sacreBLEU': 45.23, 'chrF++': 62.15, 'COMET': 0.8523}
43
+ ```
44
+
45
+ ---
46
+
47
+ ## 📁 Using the built-in datasets
48
+
49
+ ```python
50
+ from multimetric_eval import ModelEvaluator, load_dataset
51
+
52
+ # Load a built-in dataset (auto-downloaded to ./datasets/). If the download fails, fetch https://github.com/sjtuayj/MultiMetric-Eval/releases/download/v0.1.0/zh-en-littleprince.zip manually and unzip it to ./datasets/zh-en-littleprince
53
+ dataset = load_dataset("zh-en-littleprince")
54
+
55
+ # Initialize the evaluator
56
+ evaluator = ModelEvaluator(use_comet=True)
57
+ ```
58
+
59
+ ### Option 1: pass a list
60
+
61
+ ```python
62
+ results = evaluator.evaluate_dataset(
63
+ dataset=dataset,
64
+ hypothesis=["Translation 1", "Translation 2", "Translation 3"],
65
+ )
66
+ ```
67
+
68
+ ### Option 2: pass a JSON file
69
+
70
+ ```python
71
+ results = evaluator.evaluate_dataset(
72
+ dataset=dataset,
73
+ hypothesis="translations.json",
74
+ )
75
+ ```
76
+
77
+ ### Option 3: pass a TXT file
78
+
79
+ ```python
80
+ results = evaluator.evaluate_dataset(
81
+ dataset=dataset,
82
+ hypothesis="translations.txt",
83
+ )
84
+ ```
85
+
86
+ ### Option 4: pass an audio folder
87
+
88
+ ```python
89
+ # Whisper must be enabled
90
+ evaluator = ModelEvaluator(use_comet=True, use_whisper=True)
91
+
92
+ results = evaluator.evaluate_dataset(
93
+ dataset=dataset,
94
+ audio_folder="./my_audio/",
95
+ )
96
+ ```
97
+
98
+ ---
99
+
100
+ ## 📂 Using a custom dataset
101
+
102
+ ```python
103
+ from multimetric_eval import ModelEvaluator
104
+
105
+ evaluator = ModelEvaluator(use_comet=True)
106
+
107
+ # Prepare the reference data
108
+ reference = ["Reference 1", "Reference 2"]
109
+ source = ["源文本1", "源文本2"] # required by COMET
110
+ ```
111
+
112
+ ### Option 1: pass a list
113
+
114
+ ```python
115
+ results = evaluator.evaluate(
116
+ hypothesis=["Translation 1", "Translation 2"],
117
+ reference=reference,
118
+ source=source,
119
+ )
120
+ ```
121
+
122
+ ### Option 2: pass a JSON file
123
+
124
+ ```python
125
+ results = evaluator.evaluate_file(
126
+ hypothesis_file="translations.json",
127
+ reference=reference,
128
+ source=source,
129
+ )
130
+ ```
131
+
132
+ ### Option 3: pass a TXT file
133
+
134
+ ```python
135
+ results = evaluator.evaluate_file(
136
+ hypothesis_file="translations.txt",
137
+ reference=reference,
138
+ source=source,
139
+ )
140
+ ```
141
+
142
+ ### Option 4: pass an audio folder
143
+
144
+ ```python
145
+ evaluator = ModelEvaluator(use_comet=True, use_whisper=True)
146
+
147
+ results = evaluator.evaluate_audio_folder(
148
+ audio_folder="./my_audio/",
149
+ reference=reference,
150
+ source=source,
151
+ )
152
+ ```
153
+
154
+ ---
155
+
156
+ ## 📄 Input file formats
157
+
158
+ ### JSON file (all three formats below are supported)
159
+
160
+ **Format 1: dictionary**
161
+ ```json
162
+ {
163
+ "hypothesis": [
164
+ "Translation sentence 1.",
165
+ "Translation sentence 2.",
166
+ "Translation sentence 3."
167
+ ]
168
+ }
169
+ ```
170
+
171
+ **Format 2: array of objects**
172
+ ```json
173
+ [
174
+ {"id": "001", "hypothesis": "Translation sentence 1."},
175
+ {"id": "002", "hypothesis": "Translation sentence 2."},
176
+ {"id": "003", "hypothesis": "Translation sentence 3."}
177
+ ]
178
+ ```
179
+
180
+ **Format 3: plain string array**
181
+ ```json
182
+ [
183
+ "Translation sentence 1.",
184
+ "Translation sentence 2.",
185
+ "Translation sentence 3."
186
+ ]
187
+ ```
188
+
189
+ ### TXT file
190
+
191
+ One sentence per line; blank lines are ignored:
192
+
193
+ ```text
194
+ Translation sentence 1.
195
+ Translation sentence 2.
196
+ Translation sentence 3.
197
+ ```
198
+
199
+ ### Audio folder
200
+
201
+ ```
202
+ my_audio/
203
+ ├── 001.wav
204
+ ├── 002.wav
205
+ ├── 003.mp3
206
+ └── 004.flac
207
+ ```
208
+
209
+ - **Supported formats**: `.wav`, `.mp3`, `.flac`
210
+ - **Ordering**: files are sorted by filename automatically, so the order must match the reference translations (see the sketch below)
211
+ - **Naming tip**: use zero-padded numeric prefixes such as `001.wav`, `002.wav`
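+
+ The discovery logic mirrors `load_audio_from_folder` in `evaluator.py`: glob the supported extensions, then sort by the file stem. A minimal sketch of that behaviour (the folder path is just an example):
+
+ ```python
+ from pathlib import Path
+
+ folder = Path("./my_audio/")
+ files = []
+ for ext in (".wav", ".mp3", ".flac"):
+     files.extend(folder.glob(f"*{ext}"))
+ files = sorted(files, key=lambda p: p.stem)  # filename order == evaluation order
+ print([f.name for f in files])  # e.g. ['001.wav', '002.wav', '003.mp3', '004.flac']
+ ```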
212
+
213
+ ---
214
+
215
+ ## ⚙️ Configuration
216
+
217
+ ### Evaluator parameters
218
+
219
+ ```python
220
+ evaluator = ModelEvaluator(
221
+ use_comet=True, # enable COMET (requires source)
222
+ use_bleurt=False, # enable BLEURT
223
+ use_whisper=False, # enable speech-to-text
224
+ comet_model="Unbabel/wmt22-comet-da", # COMET model
225
+ whisper_model="medium", # tiny/base/small/medium/large
226
+ bleurt_path=None, # path to a local BLEURT checkpoint
227
+ device=None, # "cuda"/"cpu"; auto-detected by default
228
+ )
229
+ ```
230
+
231
+ | Parameter | Type | Default | Description |
232
+ |------|------|--------|------|
233
+ | `use_comet` | bool | `True` | Enable the COMET metric |
234
+ | `use_bleurt` | bool | `False` | Enable the BLEURT metric |
235
+ | `use_whisper` | bool | `False` | Enable speech-to-text |
236
+ | `comet_model` | str | `"Unbabel/wmt22-comet-da"` | COMET model name |
237
+ | `whisper_model` | str | `"medium"` | Whisper model size |
238
+ | `bleurt_path` | str | `None` | Local path to the BLEURT model |
239
+ | `device` | str | `None` | Compute device; GPU auto-detected |
240
+
241
+ ### Dataset parameters
242
+
243
+ ```python
244
+ dataset = load_dataset(
245
+ name="zh-en-littleprince", # dataset name
246
+ cache_dir="./datasets", # cache directory
247
+ force_download=False, # force a fresh download
248
+ )
249
+ ```
250
+
251
+ ---
252
+
253
+ ## 🎯 Common scenarios
254
+
255
+ ### Scenario 1: quick evaluation (BLEU + chrF++ only)
256
+
257
+ ```python
258
+ evaluator = ModelEvaluator(use_comet=False)
259
+
260
+ results = evaluator.evaluate(
261
+ hypothesis=["My translation"],
262
+ reference=["Reference translation"],
263
+ )
264
+ # {'sacreBLEU': 45.23, 'chrF++': 62.15}
265
+ ```
266
+
267
+ ### Scenario 2: full evaluation (all metrics)
268
+
269
+ ```python
270
+ evaluator = ModelEvaluator(
271
+ use_comet=True,
272
+ use_bleurt=True,
273
+ bleurt_path="./model/BLEURT-20",
274
+ )
275
+
276
+ results = evaluator.evaluate(
277
+ hypothesis=["My translation"],
278
+ reference=["Reference translation"],
279
+ source=["源文本"],
280
+ )
281
+ # {'sacreBLEU': 45.23, 'chrF++': 62.15, 'COMET': 0.85, 'BLEURT': 0.72}
282
+ ```
283
+
284
+ ### Scenario 3: speech evaluation
285
+
286
+ ```python
287
+ evaluator = ModelEvaluator(
288
+ use_comet=True,
289
+ use_whisper=True,
290
+ whisper_model="large", # higher accuracy
291
+ )
292
+
293
+ results = evaluator.evaluate_audio_folder(
294
+ audio_folder="./speech_outputs/",
295
+ reference=["Reference 1", "Reference 2"],
296
+ source=["源文本1", "源文本2"],
297
+ )
298
+
299
+ # Inspect the ASR transcripts
300
+ print(results["hypothesis"])
301
+ ```
302
+
303
+ ### Scenario 4: force CPU
304
+
305
+ ```python
306
+ evaluator = ModelEvaluator(
307
+ use_comet=True,
308
+ device="cpu",
309
+ )
310
+ ```
311
+
312
+ ---
313
+
314
+ ## 📊 Supported metrics
315
+
316
+ | Metric | Description | Needs source | Extra install |
317
+ |------|------|-------------|--------------|
318
+ | sacreBLEU | Standard BLEU score | ❌ | ❌ |
319
+ | chrF++ | Character n-gram F-score | ❌ | ❌ |
320
+ | COMET | Neural evaluation metric | ✅ | `unbabel-comet` |
321
+ | BLEURT | Google's BLEURT | ❌ | `bleurt` + model checkpoint |
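+
+ The two base metrics come straight from `sacrebleu`'s corpus-level API (the same calls the evaluator uses internally), so they can also be reproduced standalone:
+
+ ```python
+ import sacrebleu
+
+ hyp = ["The cat sits on the mat."]
+ ref = ["The cat is sitting on the mat."]
+
+ bleu = sacrebleu.corpus_bleu(hyp, [ref]).score                # sacreBLEU
+ chrf = sacrebleu.corpus_chrf(hyp, [ref], word_order=2).score  # chrF++ (word_order=2)
+ print(round(bleu, 2), round(chrf, 2))
+ ```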
322
+
323
+ ---
324
+
325
+ ## 📤 Output
326
+
327
+ ```python
328
+ results = evaluator.evaluate(...)
329
+
330
+ print(results)
331
+ # {
332
+ # "sacreBLEU": 45.23, # always returned
333
+ # "chrF++": 62.15, # always returned
334
+ # "COMET": 0.8523, # returned when use_comet=True
335
+ # "BLEURT": 0.7234, # returned when use_bleurt=True
336
+ # "hypothesis": [...], # ASR transcripts, returned for audio input
337
+ # }
338
+ ```
339
+
340
+ ---
341
+
342
+ ## 📋 Input format summary
343
+
344
+ | Format | Method | Example |
345
+ |------|------|------|
346
+ | Python list | `evaluate()` | `["sent1", "sent2"]` |
347
+ | JSON file | `evaluate_file()` | `"translations.json"` |
348
+ | TXT file | `evaluate_file()` | `"translations.txt"` |
349
+ | Audio folder | `evaluate_audio_folder()` | `"./audio/"` |
350
+
351
+ ---
352
+
353
+ ## 🔧 Advanced usage
354
+
355
+ ### Using the context manager (GPU memory is freed automatically)
356
+
357
+ ```python
358
+ with ModelEvaluator(use_comet=True) as evaluator:
359
+ results = evaluator.evaluate(
360
+ hypothesis=["Translation"],
361
+ reference=["Reference"],
362
+ source=["源文本"],
363
+ )
364
+ # GPU memory is released automatically on exit
365
+ ```
366
+
367
+ ### Create a custom dataset from a local JSON file
368
+
369
+ ```python
370
+ from multimetric_eval import create_dataset_from_json
371
+
372
+ # my_data.json format:
373
+ # [
374
+ # {"id": "001", "source_text": "源文本1", "reference_text": "Ref 1"},
375
+ # {"id": "002", "source_text": "源文本2", "reference_text": "Ref 2"}
376
+ # ]
377
+
378
+ dataset = create_dataset_from_json("./my_data.json")
379
+
380
+ results = evaluator.evaluate_dataset(
381
+ dataset=dataset,
382
+ hypothesis=["Translation 1", "Translation 2"],
383
+ )
384
+ ```
385
+
386
+ ### Listing the available datasets
387
+
388
+ ```python
389
+ from multimetric_eval import list_datasets, get_dataset_info
390
+
391
+ # List all available datasets
392
+ print(list_datasets())
393
+ # ['zh-en-littleprince']
394
+
395
+ # Inspect a dataset's details
396
+ info = get_dataset_info("zh-en-littleprince")
397
+ print(info)
398
+ # {
399
+ # 'name': 'zh-en-littleprince',
400
+ # 'is_downloaded': True,
401
+ # 'num_samples': 100,
402
+ # 'audio_complete': True
403
+ # }
404
+ ```
405
+
406
+ ---
407
+
408
+ ## ❓ FAQ
409
+
410
+ ### Q: The COMET score shows -1.0?
411
+ A: Make sure you pass the `source` argument; COMET needs the source text.
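+
+ For example, mirroring the quick-start data:
+
+ ```python
+ results = evaluator.evaluate(
+     hypothesis=["The cat sits on the mat."],
+     reference=["The cat is sitting on the mat."],
+     source=["猫坐在垫子上。"],  # omit this and COMET falls back to -1.0
+ )
+ ```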
412
+
413
+ ### Q: CUDA out of memory?
414
+ A: Use the context manager, or call `evaluator.cleanup()` manually to free GPU memory.
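+
+ Both options, sketched (`hyp`/`ref`/`src` stand in for your own lists):
+
+ ```python
+ # Option A: the context manager frees GPU memory on exit
+ with ModelEvaluator(use_comet=True) as evaluator:
+     results = evaluator.evaluate(hypothesis=hyp, reference=ref, source=src)
+
+ # Option B: explicit cleanup
+ evaluator = ModelEvaluator(use_comet=True)
+ results = evaluator.evaluate(hypothesis=hyp, reference=ref, source=src)
+ evaluator.cleanup()  # drops the COMET/Whisper/BLEURT models and empties the CUDA cache
+ ```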
415
+
416
+ ### Q: How do I use only the base metrics?
417
+ A: Set `use_comet=False`; only BLEU and chrF++ will be computed.
418
+
419
+ ### Q: The audio files end up in the wrong order?
420
+ A: Name them with zero-padded numeric prefixes such as `001.wav`, `002.wav`; the folder is sorted by filename, so this keeps the order aligned with the references.
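+
+ A quick illustration of why zero-padding matters with the lexicographic filename sort:
+
+ ```python
+ print(sorted(["1.wav", "2.wav", "10.wav"]))       # ['1.wav', '10.wav', '2.wav']  (wrong order)
+ print(sorted(["001.wav", "002.wav", "010.wav"]))  # ['001.wav', '002.wav', '010.wav']
+ ```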
421
+
422
+ ---
423
+
424
+ ## 📜 License
425
+
426
+ MIT License
427
+
428
+ ---
429
+
430
+ ## 🤝 Contributing
431
+
432
+ Issues and pull requests are welcome!
@@ -0,0 +1,22 @@
1
+ [build-system]
2
+ requires = ["setuptools>=61.0"]
3
+ build-backend = "setuptools.build_meta"
4
+
5
+ [project]
6
+ name = "multimetriceval"
7
+ version = "0.1.0"
8
+ description = "Multi-metric translation evaluation toolkit"
9
+ requires-python = ">=3.8"
10
+ dependencies = [
11
+ "torch>=1.9.0",
12
+ "numpy",
13
+ "sacrebleu>=2.0.0",
14
+ ]
15
+
16
+ [project.optional-dependencies]
17
+ comet = ["unbabel-comet>=2.0.0"]
18
+ whisper = ["openai-whisper"]
19
+ all = ["unbabel-comet>=2.0.0", "openai-whisper"]
20
+
21
+ [tool.setuptools.packages.find]
22
+ where = ["src"]
@@ -0,0 +1,4 @@
1
+ [egg_info]
2
+ tag_build =
3
+ tag_date = 0
4
+
@@ -0,0 +1,24 @@
1
+ from .evaluator import (
2
+ ModelEvaluator,
3
+ load_hypothesis_from_file,
4
+ load_audio_from_folder,
5
+ )
6
+ from .dataset import (
7
+ Dataset,
8
+ load_dataset,
9
+ list_datasets,
10
+ get_dataset_info,
11
+ create_dataset_from_json,
12
+ )
13
+
14
+ __version__ = "0.1.0"
15
+ __all__ = [
16
+ "ModelEvaluator",
17
+ "load_hypothesis_from_file",
18
+ "load_audio_from_folder",
19
+ "Dataset",
20
+ "load_dataset",
21
+ "list_datasets",
22
+ "get_dataset_info",
23
+ "create_dataset_from_json",
24
+ ]
@@ -0,0 +1,199 @@
1
+ """
2
+ Built-in dataset management, with audio file downloads.
3
+ Uses the local cache when available; otherwise downloads from the network.
4
+ """
5
+ import os
6
+ import json
7
+ import urllib.request
8
+ import zipfile
9
+ import shutil
10
+ from typing import Dict, List, Optional, Union
11
+
12
+ # Dataset download URLs
13
+ DATASET_URLS = {
14
+ "zh-en-littleprince": "https://github.com/sjtuayj/MultiMetric-Eval/releases/download/v0.1.0/zh-en-littleprince.zip",
15
+ }
16
+
17
+ # Default cache directory
18
+ # DEFAULT_CACHE_DIR = os.path.expanduser("~/.datasets")
19
+ DEFAULT_CACHE_DIR = "./datasets"
20
+
21
+ class Dataset:
22
+ """Built-in dataset class."""
23
+
24
+ def __init__(self, data: List[Dict], base_dir: str):
25
+ self._data = data
26
+ self._base_dir = base_dir
27
+
28
+ def __len__(self) -> int:
29
+ return len(self._data)
30
+
31
+ def __getitem__(self, idx: int) -> Dict:
32
+ item = self._data[idx].copy()
33
+ if "source_speech_path" in item:
34
+ filename = os.path.basename(item["source_speech_path"])
35
+ item["source_speech_path"] = os.path.join(self._base_dir, "audio", filename)
36
+ return item
37
+
38
+ @property
39
+ def ids(self) -> List[str]:
40
+ return [item.get("id", f"sample_{i}") for i, item in enumerate(self._data)]
41
+
42
+ @property
43
+ def source_texts(self) -> List[str]:
44
+ return [item["source_text"] for item in self._data]
45
+
46
+ @property
47
+ def reference_texts(self) -> List[str]:
48
+ return [item["reference_text"] for item in self._data]
49
+
50
+ @property
51
+ def audio_paths(self) -> List[str]:
52
+ return [self[i].get("source_speech_path", "") for i in range(len(self))]
53
+
54
+ def verify_audio_files(self) -> Dict[str, Union[int, List[str]]]:
55
+ """Verify that the dataset's audio files are all present."""
56
+ missing = [p for p in self.audio_paths if not os.path.exists(p)]
57
+ return {
58
+ "total": len(self),
59
+ "found": len(self) - len(missing),
60
+ "missing": len(missing),
61
+ "missing_files": missing,
62
+ }
63
+
64
+
65
+ def list_datasets() -> List[str]:
66
+ """List all available datasets."""
67
+ return list(DATASET_URLS.keys())
68
+
69
+
70
+ def _is_dataset_cached(name: str, cache_dir: str) -> bool:
71
+ """Check whether a dataset has already been downloaded."""
72
+ dataset_dir = os.path.join(cache_dir, name)
73
+ data_file = os.path.join(dataset_dir, "dataset_paired.json")
74
+ audio_dir = os.path.join(dataset_dir, "audio")
75
+ return os.path.exists(data_file) and os.path.exists(audio_dir)
76
+
77
+
78
+ def get_dataset_info(name: str, cache_dir: Optional[str] = None) -> Dict:
79
+ """Get information about a dataset."""
80
+ if name not in DATASET_URLS:
81
+ raise ValueError(f"Unknown dataset: {name}")
82
+
83
+ cache_dir = cache_dir or DEFAULT_CACHE_DIR
84
+ is_cached = _is_dataset_cached(name, cache_dir)
85
+
86
+ info = {
87
+ "name": name,
88
+ "url": DATASET_URLS[name],
89
+ "cache_dir": os.path.join(cache_dir, name),
90
+ "is_downloaded": is_cached,
91
+ }
92
+
93
+ if is_cached:
94
+ dataset = load_dataset(name, cache_dir=cache_dir)
95
+ info["num_samples"] = len(dataset)
96
+ verify = dataset.verify_audio_files()
97
+ info["audio_complete"] = verify["missing"] == 0
98
+
99
+ return info
100
+
101
+
102
+ def load_dataset(
103
+ name: str,
104
+ cache_dir: Optional[str] = None,
105
+ force_download: bool = False,
106
+ ) -> Dataset:
107
+ """
108
+ Load a dataset (local cache first, download otherwise).
109
+
110
+ Args:
111
+ name: dataset name
112
+ cache_dir: cache directory
113
+ force_download: force a fresh download
114
+
115
+ Returns:
116
+ A Dataset object
117
+ """
118
+ if name not in DATASET_URLS:
119
+ available = ", ".join(DATASET_URLS.keys())
120
+ raise ValueError(f"Unknown dataset: {name}. Available: {available}")
121
+
122
+ cache_dir = cache_dir or DEFAULT_CACHE_DIR
123
+ dataset_dir = os.path.join(cache_dir, name)
124
+ data_file = os.path.join(dataset_dir, "dataset_paired.json")
125
+
126
+ # Check the local cache first
127
+ if _is_dataset_cached(name, cache_dir) and not force_download:
128
+ print(f"✅ [Local] Using cached dataset: {name}")
129
+ print(f" Path: {dataset_dir}")
130
+ else:
131
+ print(f"⏳ [Online] Downloading dataset: {name}")
132
+ _download_dataset(name, cache_dir)
133
+
134
+ # Load the data
135
+ with open(data_file, "r", encoding="utf-8") as f:
136
+ data = json.load(f)
137
+
138
+ dataset = Dataset(data=data, base_dir=dataset_dir)
139
+
140
+ # Verify completeness
141
+ verify = dataset.verify_audio_files()
142
+ if verify["missing"] > 0:
143
+ print(f" ⚠️ {verify['missing']} audio file(s) missing")
144
+ else:
145
+ print(f" ✅ Dataset complete ({verify['total']} samples, all audio present)")
146
+
147
+ return dataset
148
+
149
+
150
+ def _download_dataset(name: str, cache_dir: str):
151
+ """Download a dataset."""
152
+ url = DATASET_URLS[name]
153
+ dataset_dir = os.path.join(cache_dir, name)
154
+ zip_path = os.path.join(cache_dir, f"{name}.zip")
155
+
156
+ os.makedirs(cache_dir, exist_ok=True)
157
+
158
+ if os.path.exists(dataset_dir):
159
+ shutil.rmtree(dataset_dir)
160
+
161
+ print(f" URL: {url}")
162
+
163
+ try:
164
+ urllib.request.urlretrieve(url, zip_path, _download_progress)
165
+ print()
166
+
167
+ print(" 📦 Extracting...")
168
+ os.makedirs(dataset_dir, exist_ok=True)
169
+ with zipfile.ZipFile(zip_path, 'r') as zf:
170
+ zf.extractall(dataset_dir)
171
+
172
+ os.remove(zip_path)
173
+ print(f" ✅ Download complete: {dataset_dir}")
174
+
175
+ except Exception as e:
176
+ if os.path.exists(zip_path):
177
+ os.remove(zip_path)
178
+ if os.path.exists(dataset_dir):
179
+ shutil.rmtree(dataset_dir)
180
+ raise RuntimeError(f"Download failed: {e}")
181
+
182
+
183
+ def _download_progress(block_num, block_size, total_size):
184
+ """Simple download progress bar."""
185
+ downloaded = block_num * block_size
186
+ if total_size > 0:
187
+ percent = min(100, downloaded * 100 // total_size)
188
+ bar_len = 40
189
+ filled = int(bar_len * percent // 100)
190
+ bar = "█" * filled + "░" * (bar_len - filled)
191
+ print(f"\r [{bar}] {percent}%", end="", flush=True)
192
+
193
+
194
+ def create_dataset_from_json(json_path: str) -> Dataset:
195
+ """Create a dataset from a local JSON file."""
196
+ with open(json_path, "r", encoding="utf-8") as f:
197
+ data = json.load(f)
198
+ base_dir = os.path.dirname(os.path.abspath(json_path))
199
+ return Dataset(data=data, base_dir=base_dir)
@@ -0,0 +1,320 @@
1
+ """
2
+ MultiMetric Eval - multi-metric translation evaluation toolkit
3
+ """
4
+ import os
5
+ import gc
6
+ import json
7
+ import numpy as np
8
+ import sacrebleu
9
+ import torch
10
+ from typing import Dict, List, Optional, Union
11
+ from pathlib import Path
12
+
13
+ # ==================== Configuration ====================
14
+
15
+ CACHE_PATHS = {
16
+ "huggingface": os.path.expanduser("~/.cache/huggingface/hub"),
17
+ "whisper": os.path.expanduser("~/.cache/whisper"),
18
+ }
19
+
20
+ for var in ["HF_DATASETS_OFFLINE", "TRANSFORMERS_OFFLINE"]:
21
+ os.environ.pop(var, None)
22
+
23
+ # ==================== Optional dependencies ====================
24
+
25
+ try:
26
+ import whisper
27
+ except ImportError:
28
+ whisper = None
29
+
30
+ try:
31
+ from bleurt import score as bleurt_score
32
+ except ImportError:
33
+ bleurt_score = None
34
+
35
+ try:
36
+ from comet import download_model, load_from_checkpoint
37
+ except ImportError:
38
+ download_model = None
39
+ load_from_checkpoint = None
40
+
41
+
42
+ # ==================== Input loading helpers ====================
43
+
44
+ def load_hypothesis_from_file(file_path: str) -> List[str]:
45
+ """
46
+ Load the user's translations from a file.
47
+
48
+ Supported formats:
49
+ - .json: {"hypothesis": [...]} or [{"id": "x", "hypothesis": "..."}, ...]
50
+ - .txt: one sentence per line
51
+ """
52
+ path = Path(file_path)
53
+
54
+ if not path.exists():
55
+ raise FileNotFoundError(f"File not found: {file_path}")
56
+
57
+ suffix = path.suffix.lower()
58
+
59
+ if suffix == ".json":
60
+ with open(path, "r", encoding="utf-8") as f:
61
+ data = json.load(f)
62
+
63
+ if isinstance(data, dict) and "hypothesis" in data:
64
+ return data["hypothesis"]
65
+
66
+ if isinstance(data, list) and len(data) > 0:
67
+ if isinstance(data[0], dict) and "hypothesis" in data[0]:
68
+ return [item["hypothesis"] for item in data]
69
+ if isinstance(data[0], str):
70
+ return data
71
+
72
+ raise ValueError("Unrecognized JSON structure")
73
+
74
+ elif suffix == ".txt":
75
+ with open(path, "r", encoding="utf-8") as f:
76
+ return [line.strip() for line in f if line.strip()]
77
+
78
+ else:
79
+ raise ValueError(f"Unsupported file format: {suffix}")
80
+
81
+
82
+ def load_audio_from_folder(folder_path: str, extensions: tuple = (".wav", ".mp3", ".flac")) -> List[str]:
83
+ """Collect audio file paths from a folder, sorted by filename."""
84
+ folder = Path(folder_path)
85
+
86
+ if not folder.exists():
87
+ raise FileNotFoundError(f"Folder not found: {folder_path}")
88
+
89
+ audio_files = []
90
+ for ext in extensions:
91
+ audio_files.extend(folder.glob(f"*{ext}"))
92
+
93
+ audio_files = sorted(audio_files, key=lambda x: x.stem)
94
+
95
+ if not audio_files:
96
+ raise ValueError(f"No audio files found in: {folder_path}")
97
+
98
+ return [str(f) for f in audio_files]
99
+
100
+
101
+ # ==================== Evaluator ====================
102
+
103
+ class ModelEvaluator:
104
+ """Multi-metric translation evaluator."""
105
+
106
+ def __init__(
107
+ self,
108
+ use_comet: bool = True,
109
+ use_bleurt: bool = False,
110
+ use_whisper: bool = False,
111
+ comet_model: str = "Unbabel/wmt22-comet-da",
112
+ whisper_model: str = "medium",
113
+ bleurt_path: Optional[str] = None,
114
+ device: Optional[str] = None,
115
+ ):
116
+ self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")
117
+ print(f"🚀 Initializing evaluator (device: {self.device})")
118
+
119
+ self.comet = self._load_comet(comet_model) if use_comet else None
120
+ self.whisper = self._load_whisper(whisper_model) if use_whisper else None
121
+ self.bleurt = self._load_bleurt(bleurt_path) if use_bleurt else None
122
+
123
+ print("✅ Ready!")
124
+
125
+ # -------------------- Context manager --------------------
126
+
127
+ def __enter__(self):
128
+ return self
129
+
130
+ def __exit__(self, exc_type, exc_val, exc_tb):
131
+ self.cleanup()
132
+ return False
133
+
134
+ def cleanup(self):
135
+ """Release model GPU memory."""
136
+ if hasattr(self, 'comet') and self.comet is not None:
137
+ del self.comet
138
+ self.comet = None
139
+ if hasattr(self, 'whisper') and self.whisper is not None:
140
+ del self.whisper
141
+ self.whisper = None
142
+ if hasattr(self, 'bleurt') and self.bleurt is not None:
143
+ del self.bleurt
144
+ self.bleurt = None
145
+
146
+ if torch.cuda.is_available():
147
+ torch.cuda.empty_cache()
148
+
149
+ gc.collect()
150
+ print("🧹 Model memory released")
151
+
152
+ # -------------------- Model loading --------------------
153
+
154
+ def _load_comet(self, model_name: str):
155
+ """Load the COMET model."""
156
+ if not download_model:
157
+ print("⚠️ COMET is not installed: pip install unbabel-comet")
158
+ return None
159
+
160
+ cache = os.path.join(CACHE_PATHS["huggingface"], f"models--{model_name.replace('/', '--')}")
161
+ status = "[Local]" if os.path.exists(cache) else "[Online]"
162
+ print(f"⏳ {status} Loading COMET: {model_name}")
163
+
164
+ model = load_from_checkpoint(download_model(model_name))
165
+ model = model.to(self.device) if self.device == "cuda" else model
166
+ print("✅ COMET loaded!")
167
+ return model
168
+
169
+ def _load_whisper(self, model_name: str):
170
+ """Load the Whisper model."""
171
+ if not whisper:
172
+ print("⚠️ Whisper is not installed: pip install openai-whisper")
173
+ return None
174
+
175
+ cache = os.path.join(CACHE_PATHS["whisper"], f"{model_name}.pt")
176
+ status = "[Local]" if os.path.exists(cache) else "[Online]"
177
+ print(f"⏳ {status} Loading Whisper: {model_name}")
178
+
179
+ model = whisper.load_model(model_name, device=self.device)
180
+ print("✅ Whisper loaded!")
181
+ return model
182
+
183
+ def _load_bleurt(self, path: Optional[str]):
184
+ """Load the BLEURT model."""
185
+ if not bleurt_score:
186
+ print("⚠️ BLEURT is not installed: pip install bleurt")
187
+ return None
188
+ if not path or not os.path.exists(path):
189
+ print(f"⚠️ Invalid BLEURT path: {path}")
190
+ return None
191
+
192
+ print(f"⏳ [Local] Loading BLEURT: {path}")
193
+ try:
194
+ # Force BLEURT onto the CPU (avoids competing with PyTorch for GPU memory)
195
+ import tensorflow as tf
196
+ tf.config.set_visible_devices([], 'GPU')
197
+
198
+ scorer = bleurt_score.BleurtScorer(path)
199
+ print("✅ BLEURT loaded!")
200
+ return scorer
201
+ except Exception as e:
202
+ print(f"⚠️ Failed to load BLEURT: {e}")
203
+ return None
204
+
205
+ # -------------------- Core functionality --------------------
206
+
207
+ def transcribe(self, audio_paths: List[str]) -> List[str]:
208
+ """Transcribe speech to text."""
209
+ if not self.whisper:
210
+ raise RuntimeError("Set use_whisper=True to enable transcription")
211
+
212
+ print(f"🎤 ASR transcription ({len(audio_paths)} files)...")
213
+ results = []
214
+ for i, path in enumerate(audio_paths, 1):
215
+ if not os.path.exists(path):
216
+ print(f" ⚠️ [{i}] file not found")
217
+ results.append("")
218
+ else:
219
+ try:
220
+ text = self.whisper.transcribe(path, fp16=(self.device == "cuda"))["text"]
221
+ results.append(text.strip())
222
+ print(f" ✓ [{i}/{len(audio_paths)}] {os.path.basename(path)}")
223
+ except Exception:
224
+ results.append("")
225
+ return results
226
+
227
+ def evaluate(
228
+ self,
229
+ hypothesis: List[str],
230
+ reference: List[str],
231
+ source: Optional[List[str]] = None,
232
+ ) -> Dict[str, float]:
233
+ """Compute the evaluation metrics."""
234
+ print("📊 Computing metrics...")
235
+
236
+ results = {
237
+ "sacreBLEU": self._safe_calc(lambda: sacrebleu.corpus_bleu(hypothesis, [reference]).score),
238
+ "chrF++": self._safe_calc(lambda: sacrebleu.corpus_chrf(hypothesis, [reference], word_order=2).score),
239
+ }
240
+
241
+ if self.bleurt:
242
+ results["BLEURT"] = self._safe_calc(
243
+ lambda: float(np.mean(self.bleurt.score(references=reference, candidates=hypothesis)))
244
+ )
245
+
246
+ if self.comet:
247
+ if source:
248
+ data = [{"src": s, "mt": h, "ref": r} for s, h, r in zip(source, hypothesis, reference)]
249
+ gpus = 1 if self.device == "cuda" else 0
250
+ results["COMET"] = self._safe_calc(lambda: self.comet.predict(data, batch_size=8, gpus=gpus).system_score)
251
+ else:
252
+ print("⚠️ COMET requires the source argument")
253
+ results["COMET"] = -1.0
254
+
255
+ return {k: round(v, 4) if v >= 0 else v for k, v in results.items()}
256
+
257
+ def evaluate_file(
258
+ self,
259
+ hypothesis_file: str,
260
+ reference: List[str],
261
+ source: Optional[List[str]] = None,
262
+ ) -> Dict[str, float]:
263
+ """Load translations from a file and evaluate them."""
264
+ print(f"📂 Loading translations: {hypothesis_file}")
265
+ hypothesis = load_hypothesis_from_file(hypothesis_file)
266
+ print(f" Loaded {len(hypothesis)} translations")
267
+ return self.evaluate(hypothesis, reference, source)
268
+
269
+ def evaluate_audio_folder(
270
+ self,
271
+ audio_folder: str,
272
+ reference: List[str],
273
+ source: Optional[List[str]] = None,
274
+ ) -> Dict[str, Union[float, List[str]]]:
275
+ """Transcribe audio from a folder and evaluate the transcripts."""
276
+ print(f"📂 Loading audio folder: {audio_folder}")
277
+ audio_paths = load_audio_from_folder(audio_folder)
278
+ print(f" Found {len(audio_paths)} audio files")
279
+
280
+ hypothesis = self.transcribe(audio_paths)
281
+ results = self.evaluate(hypothesis, reference, source)
282
+ results["hypothesis"] = hypothesis
283
+ return results
284
+
285
+ def evaluate_dataset(
286
+ self,
287
+ dataset,
288
+ hypothesis: Optional[Union[List[str], str]] = None,
289
+ audio_folder: Optional[str] = None,
290
+ ) -> Dict[str, Union[float, List[str]]]:
291
+ """Evaluate against a built-in or custom Dataset."""
292
+ if hypothesis:
293
+ if isinstance(hypothesis, str):
294
+ hyp_list = load_hypothesis_from_file(hypothesis)
295
+ else:
296
+ hyp_list = hypothesis
297
+
298
+ results = self.evaluate(hyp_list, dataset.reference_texts, dataset.source_texts)
299
+ results["hypothesis"] = hyp_list
300
+
301
+ elif audio_folder:
302
+ results = self.evaluate_audio_folder(
303
+ audio_folder,
304
+ dataset.reference_texts,
305
+ dataset.source_texts
306
+ )
307
+
308
+ else:
309
+ raise ValueError("Provide either hypothesis (a list or a file path) or audio_folder")
310
+
311
+ return results
312
+
313
+ # -------------------- Utilities --------------------
314
+
315
+ @staticmethod
316
+ def _safe_calc(fn, default=-1.0) -> float:
317
+ try:
318
+ return fn()
319
+ except Exception:
320
+ return default
@@ -0,0 +1,15 @@
1
+ Metadata-Version: 2.4
2
+ Name: multimetriceval
3
+ Version: 0.1.0
4
+ Summary: Multi-metric translation evaluation toolkit
5
+ Requires-Python: >=3.8
6
+ Requires-Dist: torch>=1.9.0
7
+ Requires-Dist: numpy
8
+ Requires-Dist: sacrebleu>=2.0.0
9
+ Provides-Extra: comet
10
+ Requires-Dist: unbabel-comet>=2.0.0; extra == "comet"
11
+ Provides-Extra: whisper
12
+ Requires-Dist: openai-whisper; extra == "whisper"
13
+ Provides-Extra: all
14
+ Requires-Dist: unbabel-comet>=2.0.0; extra == "all"
15
+ Requires-Dist: openai-whisper; extra == "all"
@@ -0,0 +1,10 @@
1
+ README.md
2
+ pyproject.toml
3
+ src/multimetric_eval/__init__.py
4
+ src/multimetric_eval/dataset.py
5
+ src/multimetric_eval/evaluator.py
6
+ src/multimetriceval.egg-info/PKG-INFO
7
+ src/multimetriceval.egg-info/SOURCES.txt
8
+ src/multimetriceval.egg-info/dependency_links.txt
9
+ src/multimetriceval.egg-info/requires.txt
10
+ src/multimetriceval.egg-info/top_level.txt
@@ -0,0 +1,13 @@
1
+ torch>=1.9.0
2
+ numpy
3
+ sacrebleu>=2.0.0
4
+
5
+ [all]
6
+ unbabel-comet>=2.0.0
7
+ openai-whisper
8
+
9
+ [comet]
10
+ unbabel-comet>=2.0.0
11
+
12
+ [whisper]
13
+ openai-whisper
@@ -0,0 +1 @@
1
+ multimetric_eval