PyPI - mkv-episode-matcher - Versions diffs - 0.3.3__tar.gz → 0.3.5__tar.gz - Mend

mkv-episode-matcher 0.3.3.tar.gz → 0.3.5.tar.gz


Potentially problematic release.

This version of mkv-episode-matcher might be problematic.

Files changed (53)

{mkv_episode_matcher-0.3.3 → mkv_episode_matcher-0.3.5}/.coverage RENAMED

Binary file

mkv_episode_matcher-0.3.5/.github/workflows/tests.yml ADDED

@@ -0,0 +1,40 @@
+name: Tests
+on:
+  push:
+    branches: [main, master]
+  pull_request:
+    branches: [main, master]
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version:
+          - "3.9"
+          - "3.10"
+          - "3.11"
+          - "3.12"
+    steps:
+      - uses: actions/checkout@v4
+      - name: Install uv and set the python version
+        uses: astral-sh/setup-uv@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install dependencies
+        run: |
+          uv venv
+          uv pip install -e .
+      - name: Run tests with pytest and coverage
+        run: |
+          uv run --dev pytest --cov-branch --cov-report=xml
+      - name: Upload coverage reports to Codecov
+        uses: codecov/codecov-action@v5
+        with:
+          token: ${{ secrets.CODECOV_TOKEN }}

{mkv_episode_matcher-0.3.3 → mkv_episode_matcher-0.3.5}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
-Metadata-Version: 2.1
+Metadata-Version: 2.2
 Name: mkv-episode-matcher
-Version: 0.3.3
+Version: 0.3.5
 Summary: The MKV Episode Matcher is a tool for identifying TV series episodes from MKV files and renaming the files accordingly.
 Home-page: https://github.com/Jsakkos/mkv-episode-matcher
 Author: Jonathan Sakkos
@@ -51,6 +51,14 @@ Automatically match and rename your MKV TV episodes using The Movie Database (TM
 - ✨ **Bulk Processing**: Handle entire seasons at once
 - 🧪 **Dry Run Mode**: Test changes before applying
+## Prerequisites
+- Python 3.9 or higher
+- [FFmpeg](https://ffmpeg.org/download.html) installed and available in system PATH
+- [Tesseract OCR](https://github.com/UB-Mannheim/tesseract/wiki) installed (required for image-based subtitle processing)
+- TMDb API key
+- OpenSubtitles account (optional, for subtitle downloads)
 ## Quick Start
 1. Install the package:
@@ -60,37 +68,13 @@ pip install mkv-episode-matcher
 2. Run on your show directory:
 ```bash
-mkv-match --show-dir "path/to/your/show" --season 1
+mkv-match --show-dir "path/to/your/show" --get-subs true
 ```
-## Requirements
-- Python 3.8 or higher
-- TMDb API key
-- OpenSubtitles account (optional, for subtitle downloads)
 ## Documentation
 Full documentation is available at [https://jsakkos.github.io/mkv-episode-matcher/](https://jsakkos.github.io/mkv-episode-matcher/)
-## Basic Usage
-```python
-from mkv_episode_matcher import process_show
-# Process all seasons
-process_show()
-# Process specific season
-process_show(season=1)
-# Test run without making changes
-process_show(season=1, dry_run=True)
-# Process and download subtitles
-process_show(get_subs=True)
-```
 ## Directory Structure
 MKV Episode Matcher expects your TV shows to be organized as follows:
@@ -105,6 +89,23 @@ Show Name/
 │   └── episode2.mkv
 ```
+## Reference Subtitle File Structure
+Subtitle files that are not automatically downloaded using the `--get-subs` flag should be named as follows:
+```
+~/.mkv-episode-matcher/cache/data/Show Name/
+├── Show Name - S01E01.srt
+├── Show Name - S01E02.srt
+└── ...
+```
+On Windows, the cache directory is located at `C:\Users\{username}\.mkv-episode-matcher\cache\data\`
+Reference subtitle files should follow this naming pattern:
+`{show_name} - S{season:02d}E{episode:02d}.srt`
 ## Contributing
 1. Fork the repository

{mkv_episode_matcher-0.3.3 → mkv_episode_matcher-0.3.5}/README.md RENAMED

@@ -22,6 +22,14 @@ Automatically match and rename your MKV TV episodes using The Movie Database (TM
 - ✨ **Bulk Processing**: Handle entire seasons at once
 - 🧪 **Dry Run Mode**: Test changes before applying
+## Prerequisites
+- Python 3.9 or higher
+- [FFmpeg](https://ffmpeg.org/download.html) installed and available in system PATH
+- [Tesseract OCR](https://github.com/UB-Mannheim/tesseract/wiki) installed (required for image-based subtitle processing)
+- TMDb API key
+- OpenSubtitles account (optional, for subtitle downloads)
 ## Quick Start
 1. Install the package:
@@ -31,37 +39,13 @@ pip install mkv-episode-matcher
 2. Run on your show directory:
 ```bash
-mkv-match --show-dir "path/to/your/show" --season 1
+mkv-match --show-dir "path/to/your/show" --get-subs true
 ```
-## Requirements
-- Python 3.8 or higher
-- TMDb API key
-- OpenSubtitles account (optional, for subtitle downloads)
 ## Documentation
 Full documentation is available at [https://jsakkos.github.io/mkv-episode-matcher/](https://jsakkos.github.io/mkv-episode-matcher/)
-## Basic Usage
-```python
-from mkv_episode_matcher import process_show
-# Process all seasons
-process_show()
-# Process specific season
-process_show(season=1)
-# Test run without making changes
-process_show(season=1, dry_run=True)
-# Process and download subtitles
-process_show(get_subs=True)
-```
 ## Directory Structure
 MKV Episode Matcher expects your TV shows to be organized as follows:
@@ -76,6 +60,23 @@ Show Name/
 │   └── episode2.mkv
 ```
+## Reference Subtitle File Structure
+Subtitle files that are not automatically downloaded using the `--get-subs` flag should be named as follows:
+```
+~/.mkv-episode-matcher/cache/data/Show Name/
+├── Show Name - S01E01.srt
+├── Show Name - S01E02.srt
+└── ...
+```
+On Windows, the cache directory is located at `C:\Users\{username}\.mkv-episode-matcher\cache\data\`
+Reference subtitle files should follow this naming pattern:
+`{show_name} - S{season:02d}E{episode:02d}.srt`
 ## Contributing
 1. Fork the repository
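
The reference subtitle naming pattern added in this README hunk can be sketched in Python. This is a minimal illustration, not code from the package: `reference_srt_path` is a hypothetical helper, and the default cache root is simply the path quoted in the README text.

```python
from pathlib import Path
from typing import Optional


def reference_srt_path(show_name: str, season: int, episode: int,
                       cache_root: Optional[Path] = None) -> Path:
    """Hypothetical helper: build the expected reference subtitle path."""
    if cache_root is None:
        # Cache location as documented in the README text of this diff.
        cache_root = Path.home() / ".mkv-episode-matcher" / "cache" / "data"
    # Pattern from the README: {show_name} - S{season:02d}E{episode:02d}.srt
    filename = f"{show_name} - S{season:02d}E{episode:02d}.srt"
    return cache_root / show_name / filename


print(reference_srt_path("Show Name", 1, 2).name)
```

Using `Path.home()` covers both the `~/.mkv-episode-matcher/...` form and the Windows `C:\Users\{username}\...` form mentioned in the README.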

{mkv_episode_matcher-0.3.3 → mkv_episode_matcher-0.3.5}/docs/quickstart.md RENAMED

@@ -41,28 +41,22 @@ Show Name/
 │   ├── episode1.mkv
 │   └── episode2.mkv
 ```
+<!-- Add a note about the .srt reference files -->
-## Python API Usage
+## Reference Subtitle File Structure
-```python
-from mkv_episode_matcher import process_show
+Subtitle files that are not automatically downloaded using the `--get-subs` flag should be named as follows:
-# Process all seasons
-process_show()
-# Process specific season
-process_show(season=1)
-# Test run
-process_show(season=1, dry_run=True)
-# With subtitles
-process_show(season=1, get_subs=True)
+```plaintext
+~/.mkv-episode-matcher/cache/data/Show Name/
+├── Show Name - S01E01.srt
+├── Show Name - S01E02.srt
+└── ...
 ```
 ## Configuration
-Create a configuration file at `~/.mkv-episode-matcher/config.ini`:
+The configuration file is automatically generated at `~/.mkv-episode-matcher/config.ini`:
 ```ini
 [Config]

{mkv_episode_matcher-0.3.3 → mkv_episode_matcher-0.3.5}/mkv_episode_matcher/episode_identification.py RENAMED

@@ -9,6 +9,10 @@ from loguru import logger
 import whisper
 import numpy as np
 import re
+from pathlib import Path
+import chardet
+from loguru import logger
 class EpisodeMatcher:
     def __init__(self, cache_dir, show_name, min_confidence=0.6):
         self.cache_dir = Path(cache_dir)
@@ -50,34 +54,32 @@ class EpisodeMatcher:
         return str(chunk_path)
     def load_reference_chunk(self, srt_file, chunk_idx):
-        """Load reference subtitles for a specific time chunk."""
+        """
+        Load reference subtitles for a specific time chunk with robust encoding handling.
+        Args:
+            srt_file (str or Path): Path to the SRT file
+            chunk_idx (int): Index of the chunk to load
+        Returns:
+            str: Combined text from the subtitle chunk
+        """
         chunk_start = chunk_idx * self.chunk_duration
         chunk_end = chunk_start + self.chunk_duration
-        text_lines = []
-        with open(srt_file, 'r', encoding='utf-8') as f:
-            content = f.read().strip()
+        try:
+            # Read the file content using our robust reader
+            reader = SubtitleReader()
+            content = reader.read_srt_file(srt_file)
-        for block in content.split('\n\n'):
-            lines = block.split('\n')
-            if len(lines) < 3 or '-->' not in lines[1]:  # Skip malformed blocks
-                continue
-            try:
-                timestamp = lines[1]
-                text = ' '.join(lines[2:])
-                end_time = timestamp.split(' --> ')[1].strip()
-                hours, minutes, seconds = map(float, end_time.replace(',','.').split(':'))
-                total_seconds = hours * 3600 + minutes * 60 + seconds
-                if chunk_start <= total_seconds <= chunk_end:
-                    text_lines.append(text)
-            except (IndexError, ValueError):
-                continue
-        return ' '.join(text_lines)
+            # Extract subtitles for the time chunk
+            text_lines = reader.extract_subtitle_chunk(content, chunk_start, chunk_end)
+            return ' '.join(text_lines)
+        except Exception as e:
+            logger.error(f"Error loading reference chunk from {srt_file}: {e}")
+            return ''
     def identify_episode(self, video_file, temp_dir, season_number):
         try:
@@ -147,4 +149,121 @@ class EpisodeMatcher:
         finally:
             # Cleanup temp files
             for file in self.temp_dir.glob("chunk_*.wav"):
-                file.unlink()
+                file.unlink()
+def detect_file_encoding(file_path):
+    """
+    Detect the encoding of a file using chardet.
+    Args:
+        file_path (str or Path): Path to the file
+    Returns:
+        str: Detected encoding, defaults to 'utf-8' if detection fails
+    """
+    try:
+        with open(file_path, 'rb') as f:
+            raw_data = f.read()
+        result = chardet.detect(raw_data)
+        encoding = result['encoding']
+        confidence = result['confidence']
+        logger.debug(f"Detected encoding {encoding} with {confidence:.2%} confidence for {file_path}")
+        return encoding if encoding else 'utf-8'
+    except Exception as e:
+        logger.warning(f"Error detecting encoding for {file_path}: {e}")
+        return 'utf-8'
+def read_file_with_fallback(file_path, encodings=None):
+    """
+    Read a file trying multiple encodings in order of preference.
+    Args:
+        file_path (str or Path): Path to the file
+        encodings (list): List of encodings to try, defaults to common subtitle encodings
+    Returns:
+        str: File contents
+    Raises:
+        ValueError: If file cannot be read with any encoding
+    """
+    if encodings is None:
+        # First try detected encoding, then fallback to common subtitle encodings
+        detected = detect_file_encoding(file_path)
+        encodings = [detected, 'utf-8', 'latin-1', 'cp1252', 'iso-8859-1']
+    file_path = Path(file_path)
+    errors = []
+    for encoding in encodings:
+        try:
+            with open(file_path, 'r', encoding=encoding) as f:
+                content = f.read()
+            logger.debug(f"Successfully read {file_path} using {encoding} encoding")
+            return content
+        except UnicodeDecodeError as e:
+            errors.append(f"{encoding}: {str(e)}")
+            continue
+    error_msg = f"Failed to read {file_path} with any encoding. Errors:\n" + "\n".join(errors)
+    logger.error(error_msg)
+    raise ValueError(error_msg)
+class SubtitleReader:
+    """Helper class for reading and parsing subtitle files."""
+    @staticmethod
+    def parse_timestamp(timestamp):
+        """Parse SRT timestamp into seconds."""
+        hours, minutes, seconds = timestamp.replace(',', '.').split(':')
+        return float(hours) * 3600 + float(minutes) * 60 + float(seconds)
+    @staticmethod
+    def read_srt_file(file_path):
+        """
+        Read an SRT file and return its contents with robust encoding handling.
+        Args:
+            file_path (str or Path): Path to the SRT file
+        Returns:
+            str: Contents of the SRT file
+        """
+        return read_file_with_fallback(file_path)
+    @staticmethod
+    def extract_subtitle_chunk(content, start_time, end_time):
+        """
+        Extract subtitle text for a specific time window.
+        Args:
+            content (str): Full SRT file content
+            start_time (float): Chunk start time in seconds
+            end_time (float): Chunk end time in seconds
+        Returns:
+            list: List of subtitle texts within the time window
+        """
+        text_lines = []
+        for block in content.strip().split('\n\n'):
+            lines = block.split('\n')
+            if len(lines) < 3 or '-->' not in lines[1]:
+                continue
+            try:
+                timestamp = lines[1]
+                text = ' '.join(lines[2:])
+                end_stamp = timestamp.split(' --> ')[1].strip()
+                total_seconds = SubtitleReader.parse_timestamp(end_stamp)
+                if start_time <= total_seconds <= end_time:
+                    text_lines.append(text)
+            except (IndexError, ValueError) as e:
+                logger.warning(f"Error parsing subtitle block: {e}")
+                continue
+        return text_lines
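
To illustrate how `extract_subtitle_chunk` selects text, here is a condensed, standalone restatement of the same logic: timestamp parsing plus the end-time window test. It is a sketch for illustration rather than the packaged code. The sample SRT content is made up, and the 30-second window is an assumption, since `chunk_duration` is not visible in this diff.

```python
def parse_timestamp(timestamp: str) -> float:
    """Convert an SRT timestamp such as '00:01:02,500' to seconds."""
    hours, minutes, seconds = timestamp.replace(",", ".").split(":")
    return float(hours) * 3600 + float(minutes) * 60 + float(seconds)


def extract_chunk(content: str, start_time: float, end_time: float) -> list:
    """Return subtitle texts whose END time falls inside [start_time, end_time]."""
    texts = []
    for block in content.strip().split("\n\n"):
        lines = block.split("\n")
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip malformed blocks
        end_stamp = lines[1].split(" --> ")[1].strip()
        if start_time <= parse_timestamp(end_stamp) <= end_time:
            texts.append(" ".join(lines[2:]))
    return texts


# Made-up sample data for the illustration.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:04,000
Hello there.

2
00:00:28,000 --> 00:00:31,500
This one ends after the first chunk.

3
00:00:40,000 --> 00:00:42,000
Second chunk
continues here.
"""

first = extract_chunk(SAMPLE_SRT, 0, 30)
second = extract_chunk(SAMPLE_SRT, 30, 60)
print(first)
print(second)
```

Because the test uses only the cue's end time, a cue that starts in one window and ends in the next is attributed to the later window, as cue 2 is here.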

mkv_episode_matcher-0.3.5/mkv_episode_matcher/subtitle_utils.py ADDED

@@ -0,0 +1,82 @@
+from typing import List, Optional, Union
+import os
+import re
+def generate_subtitle_patterns(series_name: str, season: int, episode: int) -> List[str]:
+    """
+    Generate various common subtitle filename patterns.
+    Args:
+        series_name (str): Name of the series
+        season (int): Season number
+        episode (int): Episode number
+    Returns:
+        List[str]: List of possible subtitle filenames
+    """
+    patterns = [
+        # Standard format: "Show Name - S01E02.srt"
+        f"{series_name} - S{season:02d}E{episode:02d}.srt",
+        # Season x Episode format: "Show Name - 1x02.srt"
+        f"{series_name} - {season}x{episode:02d}.srt",
+        # Separate season/episode: "Show Name - Season 1 Episode 02.srt"
+        f"{series_name} - Season {season} Episode {episode:02d}.srt",
+        # Compact format: "ShowName.S01E02.srt"
+        f"{series_name.replace(' ', '')}.S{season:02d}E{episode:02d}.srt",
+        # Numbered format: "Show Name 102.srt"
+        f"{series_name} {season:01d}{episode:02d}.srt",
+        # Dot format: "Show.Name.1x02.srt"
+        f"{series_name.replace(' ', '.')}.{season}x{episode:02d}.srt",
+        # Underscore format: "Show_Name_S01E02.srt"
+        f"{series_name.replace(' ', '_')}_S{season:02d}E{episode:02d}.srt",
+    ]
+    return patterns
+def find_existing_subtitle(series_cache_dir: str, series_name: str, season: int, episode: int) -> Optional[str]:
+    """
+    Check for existing subtitle files in various naming formats.
+    Args:
+        series_cache_dir (str): Directory containing subtitle files
+        series_name (str): Name of the series
+        season (int): Season number
+        episode (int): Episode number
+    Returns:
+        Optional[str]: Path to existing subtitle file if found, None otherwise
+    """
+    patterns = generate_subtitle_patterns(series_name, season, episode)
+    for pattern in patterns:
+        filepath = os.path.join(series_cache_dir, pattern)
+        if os.path.exists(filepath):
+            return filepath
+    return None
+def sanitize_filename(filename: str) -> str:
+    """
+    Sanitize filename by removing/replacing invalid characters.
+    Args:
+        filename (str): Original filename
+    Returns:
+        str: Sanitized filename
+    """
+    # Replace problematic characters
+    filename = filename.replace(':', ' -')
+    filename = filename.replace('/', '-')
+    filename = filename.replace('\\', '-')
+    # Remove any other invalid characters
+    filename = re.sub(r'[<>:"/\\|?*]', '', filename)
+    return filename.strip()
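
A quick illustration of how these helpers behave together. The functions below are trimmed restatements of `sanitize_filename` and `find_existing_subtitle` from the file above, keeping only three of the seven patterns. The show name and the temporary directory are made up for the example.

```python
import os
import re
import tempfile
from typing import List, Optional


def sanitize_filename(filename: str) -> str:
    """Same replacement steps as the version in the diff."""
    filename = filename.replace(":", " -")
    filename = filename.replace("/", "-")
    filename = filename.replace("\\", "-")
    filename = re.sub(r'[<>:"/\\|?*]', "", filename)
    return filename.strip()


def candidate_names(series: str, season: int, episode: int) -> List[str]:
    """Subset of the patterns from generate_subtitle_patterns."""
    return [
        f"{series} - S{season:02d}E{episode:02d}.srt",
        f"{series} - {season}x{episode:02d}.srt",
        f"{series.replace(' ', '.')}.{season}x{episode:02d}.srt",
    ]


def find_existing(cache_dir: str, series: str, season: int, episode: int) -> Optional[str]:
    """Return the first candidate that exists on disk, else None."""
    for name in candidate_names(series, season, episode):
        path = os.path.join(cache_dir, name)
        if os.path.exists(path):
            return path
    return None


series = sanitize_filename("Star Trek: Discovery")
with tempfile.TemporaryDirectory() as cache_dir:
    # Simulate a manually placed subtitle that uses the "1x02" style.
    open(os.path.join(cache_dir, f"{series} - 1x02.srt"), "w").close()
    found = find_existing(cache_dir, series, 1, 2)
    found_name = os.path.basename(found) if found else None
    not_found = find_existing(cache_dir, series, 1, 3)

print(series)
print(found_name)
```

The colon is replaced before the lookup, so the cache directory and the filenames stay valid on Windows, which matches how `utils.py` applies `sanitize_filename` to the series name in this release.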

{mkv_episode_matcher-0.3.3 → mkv_episode_matcher-0.3.5}/mkv_episode_matcher/utils.py RENAMED

@@ -10,7 +10,7 @@ from opensubtitlescom import OpenSubtitles
 from mkv_episode_matcher.__main__ import CACHE_DIR, CONFIG_FILE
 from mkv_episode_matcher.config import get_config
 from mkv_episode_matcher.tmdb_client import fetch_season_details
+from mkv_episode_matcher.subtitle_utils import find_existing_subtitle,sanitize_filename
 def get_valid_seasons(show_dir):
     """
     Get all season directories that contain MKV files.
@@ -128,20 +128,17 @@ def get_subtitles(show_id, seasons: set[int]):
     Args:
         show_id (int): The ID of the TV show.
         seasons (Set[int]): A set of season numbers for which subtitles should be retrieved.
-    Returns:
-        None
     """
     logger.info(f"Getting subtitles for show ID {show_id}")
     config = get_config(CONFIG_FILE)
     show_dir = config.get("show_dir")
-    series_name = os.path.basename(show_dir)
+    series_name = sanitize_filename(os.path.basename(show_dir))
     tmdb_api_key = config.get("tmdb_api_key")
     open_subtitles_api_key = config.get("open_subtitles_api_key")
     open_subtitles_user_agent = config.get("open_subtitles_user_agent")
     open_subtitles_username = config.get("open_subtitles_username")
     open_subtitles_password = config.get("open_subtitles_password")
     if not all([
         show_dir,
         tmdb_api_key,
@@ -151,63 +148,66 @@ def get_subtitles(show_id, seasons: set[int]):
         open_subtitles_password,
     ]):
         logger.error("Missing configuration settings. Please run the setup script.")
+        return
     try:
-        # Initialize the OpenSubtitles client
         subtitles = OpenSubtitles(open_subtitles_user_agent, open_subtitles_api_key)
-        # Log in (retrieve auth token)
         subtitles.login(open_subtitles_username, open_subtitles_password)
     except Exception as e:
         logger.error(f"Failed to log in to OpenSubtitles: {e}")
         return
     for season in seasons:
         episodes = fetch_season_details(show_id, season)
         logger.info(f"Found {episodes} episodes in Season {season}")
         for episode in range(1, episodes + 1):
             logger.info(f"Processing Season {season}, Episode {episode}...")
             series_cache_dir = os.path.join(CACHE_DIR, "data", series_name)
             os.makedirs(series_cache_dir, exist_ok=True)
+            # Check for existing subtitle in any supported format
+            existing_subtitle = find_existing_subtitle(
+                series_cache_dir, series_name, season, episode
+            )
+            if existing_subtitle:
+                logger.info(f"Subtitle already exists: {os.path.basename(existing_subtitle)}")
+                continue
+            # Default to standard format for new downloads
             srt_filepath = os.path.join(
                 series_cache_dir,
                 f"{series_name} - S{season:02d}E{episode:02d}.srt",
             )
-            if not os.path.exists(srt_filepath):
-                # get the episode info from TMDB
-                url = f"https://api.themoviedb.org/3/tv/{show_id}/season/{season}/episode/{episode}?api_key={tmdb_api_key}"
-                response = requests.get(url)
-                response.raise_for_status()
-                episode_data = response.json()
-                episode_data["name"]
-                episode_id = episode_data["id"]
-                # search for the subtitle
-                response = subtitles.search(tmdb_id=episode_id, languages="en")
-                if len(response.data) == 0:
-                    logger.warning(
-                        f"No subtitles found for {series_name} - S{season:02d}E{episode:02d}"
-                    )
-                for subtitle in response.data:
-                    subtitle_dict = subtitle.to_dict()
-                    # Remove special characters and convert to uppercase
-                    filename_clean = re.sub(
-                        r"\W+", " ", subtitle_dict["file_name"]
-                    ).upper()
-                    if f"E{episode:02d}" in filename_clean:
-                        logger.info(f"Original filename: {subtitle_dict['file_name']}")
-                        srt_file = subtitles.download_and_save(subtitle)
-                        series_name = series_name.replace(":", " -")
-                        shutil.move(srt_file, srt_filepath)
-                        logger.info(f"Subtitle saved to {srt_filepath}")
-                        break
-                    else:
-                        continue
-            else:
-                logger.info(
-                    f"Subtitle already exists for {series_name} - S{season:02d}E{episode:02d}"
+            # get the episode info from TMDB
+            url = f"https://api.themoviedb.org/3/tv/{show_id}/season/{season}/episode/{episode}?api_key={tmdb_api_key}"
+            response = requests.get(url)
+            response.raise_for_status()
+            episode_data = response.json()
+            episode_id = episode_data["id"]
+            # search for the subtitle
+            response = subtitles.search(tmdb_id=episode_id, languages="en")
+            if len(response.data) == 0:
+                logger.warning(
+                    f"No subtitles found for {series_name} - S{season:02d}E{episode:02d}"
                 )
                 continue
+            for subtitle in response.data:
+                subtitle_dict = subtitle.to_dict()
+                # Remove special characters and convert to uppercase
+                filename_clean = re.sub(r"\W+", " ", subtitle_dict["file_name"]).upper()
+                if f"E{episode:02d}" in filename_clean:
+                    logger.info(f"Original filename: {subtitle_dict['file_name']}")
+                    srt_file = subtitles.download_and_save(subtitle)
+                    shutil.move(srt_file, srt_filepath)
+                    logger.info(f"Subtitle saved to {srt_filepath}")
+                    break
 def cleanup_ocr_files(show_dir):
     """
@@ -236,7 +236,7 @@ def clean_text(text):
     # Strip leading/trailing whitespace
     return cleaned_text.strip()
+@logger.catch
 def process_reference_srt_files(series_name):
     """
     Process reference SRT files for a given series.
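
In `get_subtitles`, the download loop takes the first search result whose cleaned filename contains `E{episode:02d}`. A small standalone sketch of that check, using made-up filenames; `matches_episode` is a hypothetical name for what is an inline test in the diff.

```python
import re


def matches_episode(file_name: str, episode: int) -> bool:
    """Mirror the inline check: collapse non-word runs to spaces, uppercase, search."""
    cleaned = re.sub(r"\W+", " ", file_name).upper()
    return f"E{episode:02d}" in cleaned


same_episode = matches_episode("Test.Show.S01E01.srt", 1)
other_episode = matches_episode("test.show.s01e02.srt", 1)
other_season = matches_episode("Test.Show.S02E01.srt", 1)
print(same_episode, other_episode, other_season)
```

The check looks at the episode number only, so a filename from another season with the same episode number also passes. The loop therefore relies on the search itself being scoped to the right episode through the `tmdb_id` passed to `subtitles.search`.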

{mkv_episode_matcher-0.3.3 → mkv_episode_matcher-0.3.5}/mkv_episode_matcher.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
-Metadata-Version: 2.1
+Metadata-Version: 2.2
 Name: mkv-episode-matcher
-Version: 0.3.3
+Version: 0.3.5
 Summary: The MKV Episode Matcher is a tool for identifying TV series episodes from MKV files and renaming the files accordingly.
 Home-page: https://github.com/Jsakkos/mkv-episode-matcher
 Author: Jonathan Sakkos
@@ -51,6 +51,14 @@ Automatically match and rename your MKV TV episodes using The Movie Database (TM
 - ✨ **Bulk Processing**: Handle entire seasons at once
 - 🧪 **Dry Run Mode**: Test changes before applying
+## Prerequisites
+- Python 3.9 or higher
+- [FFmpeg](https://ffmpeg.org/download.html) installed and available in system PATH
+- [Tesseract OCR](https://github.com/UB-Mannheim/tesseract/wiki) installed (required for image-based subtitle processing)
+- TMDb API key
+- OpenSubtitles account (optional, for subtitle downloads)
 ## Quick Start
 1. Install the package:
@@ -60,37 +68,13 @@ pip install mkv-episode-matcher
 2. Run on your show directory:
 ```bash
-mkv-match --show-dir "path/to/your/show" --season 1
+mkv-match --show-dir "path/to/your/show" --get-subs true
 ```
-## Requirements
-- Python 3.8 or higher
-- TMDb API key
-- OpenSubtitles account (optional, for subtitle downloads)
 ## Documentation
 Full documentation is available at [https://jsakkos.github.io/mkv-episode-matcher/](https://jsakkos.github.io/mkv-episode-matcher/)
-## Basic Usage
-```python
-from mkv_episode_matcher import process_show
-# Process all seasons
-process_show()
-# Process specific season
-process_show(season=1)
-# Test run without making changes
-process_show(season=1, dry_run=True)
-# Process and download subtitles
-process_show(get_subs=True)
-```
 ## Directory Structure
 MKV Episode Matcher expects your TV shows to be organized as follows:
@@ -105,6 +89,23 @@ Show Name/
 │   └── episode2.mkv
 ```
+## Reference Subtitle File Structure
+Subtitle files that are not automatically downloaded using the `--get-subs` flag should be named as follows:
+```
+~/.mkv-episode-matcher/cache/data/Show Name/
+├── Show Name - S01E01.srt
+├── Show Name - S01E02.srt
+└── ...
+```
+On Windows, the cache directory is located at `C:\Users\{username}\.mkv-episode-matcher\cache\data\`
+Reference subtitle files should follow this naming pattern:
+`{show_name} - S{season:02d}E{episode:02d}.srt`
 ## Contributing
 1. Fork the repository

{mkv_episode_matcher-0.3.3 → mkv_episode_matcher-0.3.5}/mkv_episode_matcher.egg-info/SOURCES.txt RENAMED

@@ -12,6 +12,7 @@ uv.lock
 .github/funding.yml
 .github/workflows/documentation.yml
 .github/workflows/python-publish.yml
+.github/workflows/tests.yml
 .vscode/settings.json
 docs/cli.md
 docs/configuration.md
@@ -27,6 +28,7 @@ mkv_episode_matcher/episode_identification.py
 mkv_episode_matcher/episode_matcher.py
 mkv_episode_matcher/mkv_to_srt.py
 mkv_episode_matcher/speech_to_text.py
+mkv_episode_matcher/subtitle_utils.py
 mkv_episode_matcher/tmdb_client.py
 mkv_episode_matcher/utils.py
 mkv_episode_matcher.egg-info/PKG-INFO
@@ -46,5 +48,4 @@ mkv_episode_matcher/libraries/pgs2srt/Libraries/SubZero/SubZero.py
 mkv_episode_matcher/libraries/pgs2srt/Libraries/SubZero/post_processing.py
 mkv_episode_matcher/libraries/pgs2srt/Libraries/SubZero/dictionaries/data.py
 tests/__init__.py
-tests/test_improvements.py
 tests/test_main.py

{mkv_episode_matcher-0.3.3 → mkv_episode_matcher-0.3.5}/pyproject.toml RENAMED

@@ -47,6 +47,7 @@ dev = [
     "pytest-cov>=6.0.0",
     "pytest>=8.3.3",
     "ruff>=0.8.0",
+    "chardet>=5.2.0",
 ]

{mkv_episode_matcher-0.3.3 → mkv_episode_matcher-0.3.5}/setup.cfg RENAMED

@@ -1,6 +1,6 @@
 [metadata]
 name = mkv_episode_matcher
-version = 0.3.3
+version = 0.3.5
 author = Jonathan Sakkos
 author_email = jonathansakkos@gmail.com
 description = The MKV Episode Matcher is a tool for identifying TV series episodes from MKV files and renaming the files accordingly.

{mkv_episode_matcher-0.3.3 → mkv_episode_matcher-0.3.5}/tests/test_main.py RENAMED

@@ -13,6 +13,31 @@ from mkv_episode_matcher.utils import (
 )
 from mkv_episode_matcher.episode_identification import EpisodeMatcher
 from mkv_episode_matcher.config import get_config, set_config
+from unittest.mock import Mock, patch
+# @pytest.fixture
+# def mock_config():
+#     return {
+#         "tmdb_api_key": "test_key",
+#         "show_dir": "/test/path",
+#         "max_threads": 4,
+#         "tesseract_path": "/usr/bin/tesseract",
+#     }
+@pytest.fixture
+def mock_episode_data():
+    return {
+        "name": "Test Episode",
+        "season_number": 1,
+        "episode_number": 1,
+        "overview": "Test overview",
+    }
+@pytest.fixture
+def mock_seasons():
+    return ["/test/path/Season 1"]
 @pytest.fixture
 def temp_show_dir(tmp_path):
@@ -101,8 +126,8 @@ class TestEpisodeMatcher:
         return EpisodeMatcher(tmp_path, "Test Show")
     def test_clean_text(self, matcher):
-        text = "Test [action] <tag> T-t-test"
-        assert matcher.clean_text(text) == "test action tag test"
+        text = "Test [action] T-t-test"
+        assert matcher.clean_text(text) == "test action test"
     def test_chunk_score(self, matcher):
         score = matcher.chunk_score("Test dialogue", "test dialog")
@@ -116,22 +141,27 @@ class TestEpisodeMatcher:
         assert isinstance(chunk, str)
         assert mock_run.called
-class TestProcessShow:
-    @patch('mkv_episode_matcher.episode_matcher.get_valid_seasons')
-    @patch('mkv_episode_matcher.episode_matcher.get_config')
-    def test_process_show_no_seasons(self, mock_config, mock_seasons, mock_config_data):
-        mock_seasons.return_value = []
-        mock_config.return_value = mock_config_data
-        process_show()
-        mock_seasons.assert_called_once()
-    @patch('mkv_episode_matcher.episode_matcher.get_valid_seasons')
-    @patch('mkv_episode_matcher.episode_matcher.get_config')
-    def test_process_show_with_season(self, mock_config, mock_seasons, temp_show_dir, mock_config_data):
-        mock_seasons.return_value = [str(temp_show_dir / "Season 1")]
-        mock_config.return_value = mock_config_data
-        process_show(season=1)
-        mock_seasons.assert_called_once()
+class TestEpisodeMatcher:
+    def test_extract_season_episode(self):
+        from mkv_episode_matcher.utils import extract_season_episode
+        # Test valid filename
+        assert extract_season_episode("Show - S01E02.mkv") == (1, 2)
+        # Test invalid filename
+        assert extract_season_episode("invalid.mkv") == (None, None)
+    @patch("mkv_episode_matcher.tmdb_client.requests.get")
+    def test_fetch_show_id(self, mock_get):
+        from mkv_episode_matcher.tmdb_client import fetch_show_id
+        mock_response = Mock()
+        mock_response.status_code = 200
+        mock_response.json.return_value = {"results": [{"id": 12345}]}
+        mock_get.return_value = mock_response
+        assert fetch_show_id("Test Show") == "12345"
 if __name__ == '__main__':
     pytest.main(['-v'])
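
This file's diff adds a second `class TestEpisodeMatcher` to a module whose earlier hunk header already shows a class of that name. In Python, a later `class` statement with the same name rebinds the module attribute, so a tool that looks tests up in the module namespace (as pytest's collector is generally understood to do) would presumably see only the last definition. A minimal sketch of the rebinding, using an in-memory stand-in module rather than the real test file:

```python
import types

# Stand-in source with two classes of the same name, as in the diffed module.
SOURCE = '''
class TestEpisodeMatcher:
    def test_clean_text(self):
        pass


class TestEpisodeMatcher:  # same name: replaces the class above
    def test_extract_season_episode(self):
        pass
'''

module = types.ModuleType("fake_test_module")
exec(SOURCE, module.__dict__)

# Only the methods of the last definition are reachable by name.
visible_tests = sorted(
    name for name in vars(module.TestEpisodeMatcher) if name.startswith("test_")
)
print(visible_tests)
```

Giving the second class a distinct name would keep both sets of tests reachable.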

{mkv_episode_matcher-0.3.3 → mkv_episode_matcher-0.3.5}/uv.lock RENAMED

@@ -24,6 +24,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/12/90/3c9ff0512038035f59d279fddeb79f5f1eccd8859f06d6163c58798b9487/certifi-2024.8.30-py3-none-any.whl", hash = "sha256:922820b53db7a7257ffbda3f597266d435245903d80737e34f8a45ff3e3230d8", size = 167321 },
 ]
+[[package]]
+name = "chardet"
+version = "5.2.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/f3/0d/f7b6ab21ec75897ed80c17d79b15951a719226b9fababf1e40ea74d69079/chardet-5.2.0.tar.gz", hash = "sha256:1b3b6ff479a8c414bc3fa2c0852995695c4a026dcd6d0633b2dd092ca39c1cf7", size = 2069618 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/38/6f/f5fbc992a329ee4e0f288c1fe0e2ad9485ed064cac731ed2fe47dcc38cbf/chardet-5.2.0-py3-none-any.whl", hash = "sha256:e1cf59446890a00105fe7b7912492ea04b6e6f06d4b742b2c788469e34c82970", size = 199385 },
+]
 [[package]]
 name = "charset-normalizer"
 version = "3.4.0"
@@ -374,7 +383,7 @@ wheels = [
 [[package]]
 name = "mkv-episode-matcher"
-version = "0.3.2.post1.dev0+g2c513fa.d20241126"
+version = "0.3.4.post1.dev3+g95f005b.d20250112"
 source = { editable = "." }
 dependencies = [
     { name = "configparser" },
@@ -391,6 +400,7 @@ dependencies = [
 [package.dev-dependencies]
 dev = [
+    { name = "chardet" },
     { name = "pytest" },
     { name = "pytest-cov" },
     { name = "ruff" },
@@ -412,6 +422,7 @@ requires-dist = [
 [package.metadata.requires-dev]
 dev = [
+    { name = "chardet", specifier = ">=5.2.0" },
     { name = "pytest", specifier = ">=8.3.3" },
     { name = "pytest-cov", specifier = ">=6.0.0" },
     { name = "ruff", specifier = ">=0.8.0" },

mkv_episode_matcher-0.3.3/tests/test_improvements.py DELETED

@@ -1,59 +0,0 @@
-from unittest.mock import Mock, patch
-import pytest
-@pytest.fixture
-def mock_config():
-    return {
-        "tmdb_api_key": "test_key",
-        "show_dir": "/test/path",
-        "max_threads": 4,
-        "tesseract_path": "/usr/bin/tesseract",
-    }
-@pytest.fixture
-def mock_episode_data():
-    return {
-        "name": "Test Episode",
-        "season_number": 1,
-        "episode_number": 1,
-        "overview": "Test overview",
-    }
-class TestEpisodeMatcher:
-    def test_extract_season_episode(self):
-        from mkv_episode_matcher.episode_matcher import extract_season_episode
-        # Test valid filename
-        assert extract_season_episode("Show - S01E02.mkv") == (1, 2)
-        # Test invalid filename
-        assert extract_season_episode("invalid.mkv") == (None, None)
-    @patch("mkv_episode_matcher.tmdb_client.requests.get")
-    def test_fetch_show_id(self, mock_get):
-        from mkv_episode_matcher.tmdb_client import fetch_show_id
-        mock_response = Mock()
-        mock_response.status_code = 200
-        mock_response.json.return_value = {"results": [{"id": 12345}]}
-        mock_get.return_value = mock_response
-        assert fetch_show_id("Test Show") == "12345"
-    @patch("mkv_episode_matcher.utils.OpenSubtitles")
-    def test_get_subtitles(self, mock_subtitles):
-        from mkv_episode_matcher.utils import get_subtitles
-        # Test subtitle download
-        mock_subtitles.return_value.search.return_value.data = [
-            {"file_name": "Test.Show.S01E01.srt"}
-        ]
-        with patch("pathlib.Path.exists", return_value=False):
-            get_subtitles(12345, {1})
-            mock_subtitles.return_value.download_and_save.assert_called_once()