PyPI - videopython - Versions diffs - 0.33.2__tar.gz → 0.33.3__tar.gz - Mend

videopython 0.33.2.tar.gz → 0.33.3.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (62)

videopython-0.33.3/PKG-INFO ADDED

@@ -0,0 +1,133 @@
+Metadata-Version: 2.4
+Name: videopython
+Version: 0.33.3
+Summary: Minimal video generation and processing library.
+Project-URL: Homepage, https://videopython.com
+Project-URL: Repository, https://github.com/bartwojtowicz/videopython/
+Project-URL: Documentation, https://videopython.com
+Author-email: Bartosz Wójtowicz <bartoszwojtowicz@outlook.com>, Bartosz Rudnikowicz <bartoszrudnikowicz840@gmail.com>, Piotr Pukisz <piotr.pukisz@gmail.com>
+License: Apache-2.0
+License-File: LICENSE
+Keywords: ai,editing,generation,movie,opencv,python,shorts,video,videopython
+Classifier: License :: OSI Approved :: Apache Software License
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Requires-Python: <3.14,>=3.10
+Requires-Dist: numpy>=1.25.2
+Requires-Dist: opencv-python-headless>=4.9.0.80
+Requires-Dist: pillow>=12.1.1
+Requires-Dist: pydantic>=2.8.0
+Requires-Dist: tqdm>=4.66.3
+Provides-Extra: ai
+Requires-Dist: accelerate>=0.29.2; extra == 'ai'
+Requires-Dist: chatterbox-tts>=0.1.7; extra == 'ai'
+Requires-Dist: demucs>=4.0.0; extra == 'ai'
+Requires-Dist: diffusers>=0.30.0; extra == 'ai'
+Requires-Dist: hf-transfer>=0.1.9; extra == 'ai'
+Requires-Dist: imagehash>=4.3; extra == 'ai'
+Requires-Dist: llama-cpp-python>=0.3.0; extra == 'ai'
+Requires-Dist: numba>=0.61.0; extra == 'ai'
+Requires-Dist: ollama>=0.4.5; extra == 'ai'
+Requires-Dist: openai-whisper>=20240930; extra == 'ai'
+Requires-Dist: pyannote-audio>=4.0.0; extra == 'ai'
+Requires-Dist: pyloudnorm>=0.1.1; extra == 'ai'
+Requires-Dist: qwen-vl-utils>=0.0.10; extra == 'ai'
+Requires-Dist: scikit-learn>=1.3.0; extra == 'ai'
+Requires-Dist: scipy>=1.10.0; extra == 'ai'
+Requires-Dist: sentencepiece>=0.1.99; extra == 'ai'
+Requires-Dist: silero-vad>=5.1; extra == 'ai'
+Requires-Dist: torch>=2.8.0; extra == 'ai'
+Requires-Dist: torchaudio>=2.8.0; extra == 'ai'
+Requires-Dist: transformers>=5.2.0; extra == 'ai'
+Requires-Dist: transnetv2-pytorch>=1.0.5; extra == 'ai'
+Requires-Dist: ultralytics>=8.0.0; extra == 'ai'
+Description-Content-Type: text/markdown
+# videopython
+[![PyPI](https://img.shields.io/pypi/v/videopython)](https://pypi.org/project/videopython/)
+[![Python](https://img.shields.io/pypi/pyversions/videopython)](https://pypi.org/project/videopython/)
+[![License](https://img.shields.io/github/license/BartWojtowicz/videopython)](LICENSE)
+Minimal, LLM-friendly Python library for programmatic video editing, processing, and AI video workflows.
+Full documentation: [videopython.com](https://videopython.com)
+> **Disclaimer:** This project started as a hand-written hobby project, but most of the code is now produced by LLM agents. Humans still drive direction, approve changes, and own design decisions.
+## Installation
+```bash
+# Install FFmpeg first (macOS: brew install ffmpeg | Debian: apt-get install ffmpeg)
+pip install videopython          # core video/audio editing
+pip install "videopython[ai]"    # + local AI features (GPU recommended)
+```
+Python `>=3.10, <3.14`. AI features run locally — no cloud API keys required, but model weights are downloaded on first use.
+## Quick Start
+### JSON editing plans
+A `VideoEdit` is a multi-segment plan, defined as a dict (or JSON), validated and executed against the source files:
+```python
+from videopython.editing import VideoEdit
+edit = VideoEdit.from_dict({
+    "segments": [{
+        "source": "raw.mp4",
+        "start": 10.0,
+        "end": 20.0,
+        "operations": [
+            {"op": "resize", "width": 1080, "height": 1920},
+            {"op": "color_adjust", "saturation": 1.15, "contrast": 1.05},
+            {"op": "fade", "mode": "in", "duration": 0.5},
+        ],
+    }],
+})
+edit.validate()                  # dry-run via metadata, no frames loaded
+edit.run_to_file("output.mp4")   # streams ffmpeg decode → effects → encode
+```
+`run_to_file()` streams ffmpeg decode → per-frame effects → encode, so memory stays bounded even for hour-long sources. Use `edit.run()` to get a `Video` back in memory instead.
+### AI generation
+```python
+from videopython.ai import TextToImage, ImageToVideo, TextToSpeech
+image = TextToImage().generate_image("A cinematic mountain sunrise")
+video = ImageToVideo().generate_video(image=image)
+audio = TextToSpeech().generate_audio("Welcome to videopython.")
+video.add_audio(audio).save("ai_video.mp4")
+```
+## LLM & AI Agent Integration
+Every operation is a Pydantic model whose fields ARE the JSON wire format. `VideoEdit.json_schema()` returns a JSON Schema with a discriminated union over every registered `Operation` — pass it straight to Anthropic tool use, OpenAI function calling, or any structured-output API. Then `edit.validate()` dry-runs the plan via metadata before any frames are loaded, so a failed LLM output can be fed back as an error and retried cheaply.
+See the [LLM Integration Guide](https://videopython.com/guides/llm-integration/) for end-to-end examples, validation error loops, and operation discovery patterns.
+## Features
+- **`videopython.base`** — `Video`, `VideoMetadata`, `FrameIterator`, `ImageText`, `Transcription`, and shared result types (`BoundingBox`, `FaceTrack`, `SceneBoundary`, ...). No AI dependencies.
+- **`videopython.audio`** — `Audio` with overlay, concat, normalize, time-stretch, silence detection, segment classification.
+- **`videopython.editing`** — `Operation`/`Effect` foundation, `VideoEdit` plan runner with JSON Schema + streaming execution. Transforms (cut, resize, crop, fps, speed, reverse, freeze, silence removal) and effects (blur, zoom, color grading, vignette, Ken Burns, fade, overlays, animated subtitles).
+- **`videopython.ai`** *(install with `[ai]`)* — generation (`TextToVideo`, `ImageToVideo`, `TextToImage`, `TextToSpeech`, `TextToMusic`), understanding (`AudioToText`, `AudioClassifier`, `SceneVLM`, `FaceTracker`, `SemanticSceneDetector`), `FaceTrackingCrop` transform, and the full-pipeline `VideoAnalyzer`.
+- **`videopython.ai.dubbing`** — `VideoDubber` for voice-cloned revoicing with timing sync.
+## Examples
+- [Social Media Clip](https://videopython.com/examples/social-clip/)
+- [AI-Generated Video](https://videopython.com/examples/ai-video/)
+- [Auto-Subtitles](https://videopython.com/examples/auto-subtitles/)
+- [Processing Large Videos](https://videopython.com/examples/large-videos/)
+## Development
+See [`DEVELOPMENT.md`](DEVELOPMENT.md) for local setup, testing, and contribution workflow.

videopython-0.33.3/README.md ADDED

@@ -0,0 +1,84 @@
+# videopython
+[![PyPI](https://img.shields.io/pypi/v/videopython)](https://pypi.org/project/videopython/)
+[![Python](https://img.shields.io/pypi/pyversions/videopython)](https://pypi.org/project/videopython/)
+[![License](https://img.shields.io/github/license/BartWojtowicz/videopython)](LICENSE)
+Minimal, LLM-friendly Python library for programmatic video editing, processing, and AI video workflows.
+Full documentation: [videopython.com](https://videopython.com)
+> **Disclaimer:** This project started as a hand-written hobby project, but most of the code is now produced by LLM agents. Humans still drive direction, approve changes, and own design decisions.
+## Installation
+```bash
+# Install FFmpeg first (macOS: brew install ffmpeg | Debian: apt-get install ffmpeg)
+pip install videopython          # core video/audio editing
+pip install "videopython[ai]"    # + local AI features (GPU recommended)
+```
+Python `>=3.10, <3.14`. AI features run locally — no cloud API keys required, but model weights are downloaded on first use.
+## Quick Start
+### JSON editing plans
+A `VideoEdit` is a multi-segment plan, defined as a dict (or JSON), validated and executed against the source files:
+```python
+from videopython.editing import VideoEdit
+edit = VideoEdit.from_dict({
+    "segments": [{
+        "source": "raw.mp4",
+        "start": 10.0,
+        "end": 20.0,
+        "operations": [
+            {"op": "resize", "width": 1080, "height": 1920},
+            {"op": "color_adjust", "saturation": 1.15, "contrast": 1.05},
+            {"op": "fade", "mode": "in", "duration": 0.5},
+        ],
+    }],
+})
+edit.validate()                  # dry-run via metadata, no frames loaded
+edit.run_to_file("output.mp4")   # streams ffmpeg decode → effects → encode
+```
+`run_to_file()` streams ffmpeg decode → per-frame effects → encode, so memory stays bounded even for hour-long sources. Use `edit.run()` to get a `Video` back in memory instead.
+### AI generation
+```python
+from videopython.ai import TextToImage, ImageToVideo, TextToSpeech
+image = TextToImage().generate_image("A cinematic mountain sunrise")
+video = ImageToVideo().generate_video(image=image)
+audio = TextToSpeech().generate_audio("Welcome to videopython.")
+video.add_audio(audio).save("ai_video.mp4")
+```
+## LLM & AI Agent Integration
+Every operation is a Pydantic model whose fields ARE the JSON wire format. `VideoEdit.json_schema()` returns a JSON Schema with a discriminated union over every registered `Operation` — pass it straight to Anthropic tool use, OpenAI function calling, or any structured-output API. Then `edit.validate()` dry-runs the plan via metadata before any frames are loaded, so a failed LLM output can be fed back as an error and retried cheaply.
+See the [LLM Integration Guide](https://videopython.com/guides/llm-integration/) for end-to-end examples, validation error loops, and operation discovery patterns.
+## Features
+- **`videopython.base`** — `Video`, `VideoMetadata`, `FrameIterator`, `ImageText`, `Transcription`, and shared result types (`BoundingBox`, `FaceTrack`, `SceneBoundary`, ...). No AI dependencies.
+- **`videopython.audio`** — `Audio` with overlay, concat, normalize, time-stretch, silence detection, segment classification.
+- **`videopython.editing`** — `Operation`/`Effect` foundation, `VideoEdit` plan runner with JSON Schema + streaming execution. Transforms (cut, resize, crop, fps, speed, reverse, freeze, silence removal) and effects (blur, zoom, color grading, vignette, Ken Burns, fade, overlays, animated subtitles).
+- **`videopython.ai`** *(install with `[ai]`)* — generation (`TextToVideo`, `ImageToVideo`, `TextToImage`, `TextToSpeech`, `TextToMusic`), understanding (`AudioToText`, `AudioClassifier`, `SceneVLM`, `FaceTracker`, `SemanticSceneDetector`), `FaceTrackingCrop` transform, and the full-pipeline `VideoAnalyzer`.
+- **`videopython.ai.dubbing`** — `VideoDubber` for voice-cloned revoicing with timing sync.
+## Examples
+- [Social Media Clip](https://videopython.com/examples/social-clip/)
+- [AI-Generated Video](https://videopython.com/examples/ai-video/)
+- [Auto-Subtitles](https://videopython.com/examples/auto-subtitles/)
+- [Processing Large Videos](https://videopython.com/examples/large-videos/)
+## Development
+See [`DEVELOPMENT.md`](DEVELOPMENT.md) for local setup, testing, and contribution workflow.

{videopython-0.33.2 → videopython-0.33.3}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "videopython"
-version = "0.33.2"
+version = "0.33.3"
 description = "Minimal video generation and processing library."
 authors = [
     { name = "Bartosz Wójtowicz", email = "bartoszwojtowicz@outlook.com" },
@@ -186,9 +186,11 @@ build-backend = "hatchling.build"
 [tool.hatch.build.targets.wheel]
 packages = ["src/videopython"]
+artifacts = ["src/videopython/base/fonts/*.ttf", "src/videopython/base/fonts/LICENSE_DEJAVU"]
 [tool.hatch.build.targets.sdist]
 include = ["src/videopython", "src/videopython/py.typed"]
+artifacts = ["src/videopython/base/fonts/*.ttf", "src/videopython/base/fonts/LICENSE_DEJAVU"]
 [tool.pytest.ini_options]
 pythonpath = ["src/"]
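
The two `artifacts` entries above appear intended to ship the bundled font and its license inside the built distributions. One way to confirm that for a wheel, which is a zip archive, is to list its members and look for the expected paths. The following is a minimal sketch: the helper name `missing_from_archive` is hypothetical, and the in-memory archive merely stands in for a real wheel so the example runs without a build step. An sdist is a tarball and would need `tarfile` instead.

```python
import io
import zipfile


def missing_from_archive(archive, required_suffixes):
    """Return the required suffixes that no archive member ends with.

    `archive` may be a path to a wheel or an open binary file object.
    """
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
    return [
        suffix
        for suffix in required_suffixes
        if not any(name.endswith(suffix) for name in names)
    ]


# Stand-in for a built wheel: contains the font but not the license file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("videopython/__init__.py", "")
    demo.writestr("videopython/base/fonts/DejaVuSans.ttf", b"")
buffer.seek(0)

missing = missing_from_archive(
    buffer,
    ["base/fonts/DejaVuSans.ttf", "base/fonts/LICENSE_DEJAVU"],
)
print(missing)  # ['base/fonts/LICENSE_DEJAVU']
```

Against a real build, the first argument would be the path to the wheel file produced locally; an empty result means both files were packaged.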

videopython-0.33.3/src/videopython/base/fonts/DejaVuSans.ttf ADDED

Binary file

videopython-0.33.3/src/videopython/base/fonts/LICENSE_DEJAVU ADDED

@@ -0,0 +1,99 @@
+Fonts are (c) Bitstream (see below). DejaVu changes are in public domain.
+Glyphs imported from Arev fonts are (c) Tavmjong Bah (see below)
+Bitstream Vera Fonts Copyright
+------------------------------
+Copyright (c) 2003 by Bitstream, Inc. All Rights Reserved. Bitstream Vera is
+a trademark of Bitstream, Inc.
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of the fonts accompanying this license ("Fonts") and associated
+documentation files (the "Font Software"), to reproduce and distribute the
+Font Software, including without limitation the rights to use, copy, merge,
+publish, distribute, and/or sell copies of the Font Software, and to permit
+persons to whom the Font Software is furnished to do so, subject to the
+following conditions:
+The above copyright and trademark notices and this permission notice shall
+be included in all copies of one or more of the Font Software typefaces.
+The Font Software may be modified, altered, or added to, and in particular
+the designs of glyphs or characters in the Fonts may be modified and
+additional glyphs or characters may be added to the Fonts, only if the fonts
+are renamed to names not containing either the words "Bitstream" or the word
+"Vera".
+This License becomes null and void to the extent applicable to Fonts or Font
+Software that has been modified and is distributed under the "Bitstream
+Vera" names.
+The Font Software may be sold as part of a larger software package but no
+copy of one or more of the Font Software typefaces may be sold by itself.
+THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF COPYRIGHT, PATENT,
+TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL BITSTREAM OR THE GNOME
+FOUNDATION BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, INCLUDING
+ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES,
+WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF
+THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM OTHER DEALINGS IN THE
+FONT SOFTWARE.
+Except as contained in this notice, the names of Gnome, the Gnome
+Foundation, and Bitstream Inc., shall not be used in advertising or
+otherwise to promote the sale, use or other dealings in this Font Software
+without prior written authorization from the Gnome Foundation or Bitstream
+Inc., respectively. For further information, contact: fonts at gnome dot
+org.
+Arev Fonts Copyright
+------------------------------
+Copyright (c) 2006 by Tavmjong Bah. All Rights Reserved.
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of the fonts accompanying this license ("Fonts") and
+associated documentation files (the "Font Software"), to reproduce
+and distribute the modifications to the Bitstream Vera Font Software,
+including without limitation the rights to use, copy, merge, publish,
+distribute, and/or sell copies of the Font Software, and to permit
+persons to whom the Font Software is furnished to do so, subject to
+the following conditions:
+The above copyright and trademark notices and this permission notice
+shall be included in all copies of one or more of the Font Software
+typefaces.
+The Font Software may be modified, altered, or added to, and in
+particular the designs of glyphs or characters in the Fonts may be
+modified and additional glyphs or characters may be added to the
+Fonts, only if the fonts are renamed to names not containing either
+the words "Tavmjong Bah" or the word "Arev".
+This License becomes null and void to the extent applicable to Fonts
+or Font Software that has been modified and is distributed under the
+"Tavmjong Bah Arev" names.
+The Font Software may be sold as part of a larger software package but
+no copy of one or more of the Font Software typefaces may be sold by
+itself.
+THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT
+OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL
+TAVMJONG BAH BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL
+DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM
+OTHER DEALINGS IN THE FONT SOFTWARE.
+Except as contained in this notice, the name of Tavmjong Bah shall not
+be used in advertising or otherwise to promote the sale, use or other
+dealings in this Font Software without prior written authorization
+from Tavmjong Bah. For further information, contact: tavmjong @ free
+. fr.
+$Id: LICENSE 2133 2007-11-28 02:46:28Z lechimp $

videopython-0.33.3/src/videopython/base/fonts/__init__.py ADDED

@@ -0,0 +1,58 @@
+"""Bundled default font and graceful font loading.
+Text operations let callers omit a font path. This module provides a
+reliable resolution chain so rendering never hard-fails on a missing or
+unreadable font:
+1. The explicit ``font_filename`` if given and loadable.
+2. The bundled DejaVu Sans (broad Unicode coverage).
+3. PIL's built-in font (always available).
+"""
+from __future__ import annotations
+from importlib.resources import as_file, files
+from PIL import ImageFont
+__all__ = ["DEFAULT_FONT_FILENAME", "load_font"]
+DEFAULT_FONT_FILENAME = "DejaVuSans.ttf"
+def _try_truetype(path: str, font_size: int) -> ImageFont.FreeTypeFont | None:
+    try:
+        return ImageFont.truetype(path, font_size)
+    except (OSError, ValueError):
+        return None
+def load_font(font_filename: str | None, font_size: int) -> ImageFont.FreeTypeFont | ImageFont.ImageFont:
+    """Load a font, falling back gracefully when one is unavailable.
+    Resolution order: the given ``font_filename`` -> the bundled DejaVu
+    Sans -> PIL's built-in bitmap font. Never raises for a missing or
+    unreadable font, so callers may pass ``None`` to mean "use the
+    default".
+    Args:
+        font_filename: Path to a ``.ttf``/``.otf`` file, or ``None``.
+        font_size: Font size in points.
+    Returns:
+        A loaded PIL font object.
+    """
+    if font_filename:
+        font = _try_truetype(font_filename, font_size)
+        if font is not None:
+            return font
+    try:
+        with as_file(files(__package__).joinpath(DEFAULT_FONT_FILENAME)) as bundled:
+            font = _try_truetype(str(bundled), font_size)
+            if font is not None:
+                return font
+    except (FileNotFoundError, ModuleNotFoundError):
+        pass
+    return ImageFont.load_default(font_size)
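
The resolution chain in `load_font` (explicit path, then bundled font, then built-in font) is an instance of a general "first loadable candidate" pattern. The sketch below reproduces only that control flow with the standard library, so it runs without Pillow. `first_loadable` and `fake_loader` are hypothetical names, and `fake_loader` is a stand-in for `ImageFont.truetype`, not the package's code.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def first_loadable(
    candidates: Iterable[Optional[str]],
    loader: Callable[[str], T],
    last_resort: Callable[[], T],
) -> T:
    """Return the first candidate the loader accepts, else the last resort."""
    for candidate in candidates:
        if not candidate:
            continue  # None or "" means "no preference", as in load_font
        try:
            return loader(candidate)
        except (OSError, ValueError):
            continue  # unreadable or invalid: try the next candidate
    return last_resort()


def fake_loader(path: str) -> str:
    # Hypothetical stand-in for ImageFont.truetype: only one path "exists".
    if path != "bundled.ttf":
        raise OSError(f"cannot open resource: {path}")
    return f"font:{path}"


chosen = first_loadable([None, "missing.ttf", "bundled.ttf"], fake_loader, lambda: "builtin")
fallback = first_loadable(["missing.ttf"], fake_loader, lambda: "builtin")
print(chosen)    # font:bundled.ttf
print(fallback)  # builtin
```

The last resort is a separate callable rather than one more candidate because it is expected never to fail, which is what lets the chain promise that it does not raise for a missing font.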

{videopython-0.33.2 → videopython-0.33.3}/src/videopython/base/image_text.py RENAMED

@@ -16,6 +16,7 @@ import numpy as np
 from PIL import Image, ImageDraw, ImageFont
 from videopython.base.exceptions import OutOfBoundsError
+from videopython.base.fonts import load_font
 __all__ = ["ImageText", "TextAlign", "AnchorPoint"]
@@ -106,7 +107,7 @@ class ImageText:
         # PIL uses (width, height), so we reverse for Image.new
         self.image = Image.new(mode, (image_size[1], image_size[0]), color=background)
         self._draw = ImageDraw.Draw(self.image)
-        self._font_cache: dict[tuple[str, int], ImageFont.FreeTypeFont] = {}  # Cache for font objects
+        self._font_cache: dict[tuple[str | None, int], ImageFont.FreeTypeFont | ImageFont.ImageFont] = {}
     @property
     def img_array(self) -> np.ndarray:
@@ -119,7 +120,7 @@ class ImageText:
             raise ValueError("Filename cannot be empty")
         self.image.save(filename)
-    def _fit_font_width(self, text: str, font: str, max_width: int) -> int:
+    def _fit_font_width(self, text: str, font: str | None, max_width: int) -> int:
         """
         Find the maximum font size where the text width is less than or equal to max_width.
@@ -150,7 +151,7 @@ class ImageText:
             raise ValueError(f"Max width {max_width} is too small for any font size!")
         return max_font_size
-    def _fit_font_height(self, text: str, font: str, max_height: int) -> int:
+    def _fit_font_height(self, text: str, font: str | None, max_height: int) -> int:
         """
         Find the maximum font size where the text height is less than or equal to max_height.
@@ -184,7 +185,7 @@ class ImageText:
     def _get_font_size(
         self,
         text: str,
-        font: str,
+        font: str | None,
         max_width: int | None = None,
         max_height: int | None = None,
     ) -> int:
@@ -333,7 +334,7 @@ class ImageText:
     def write_text(
         self,
         text: str,
-        font_filename: str,
+        font_filename: str | None,
         xy: PositionType,
         font_size: int | None = 11,
         font_border_size: int = 0,
@@ -368,9 +369,6 @@ class ImageText:
         if not text:
             raise ValueError("Text cannot be empty")
-        if not font_filename:
-            raise ValueError("Font filename cannot be empty")
         if font_size is not None and font_size <= 0:
             raise ValueError("Font size must be positive")
@@ -405,12 +403,16 @@ class ImageText:
         self._draw.text((x, y), text, font=font, fill=color)
         return text_dimensions
-    def _get_font(self, font_filename: str, font_size: int) -> ImageFont.FreeTypeFont:
+    def _get_font(self, font_filename: str | None, font_size: int) -> ImageFont.FreeTypeFont | ImageFont.ImageFont:
         """
         Get a font object, using cache if available.
+        Resolves via :func:`videopython.base.fonts.load_font`, so a missing,
+        unreadable, or ``None`` ``font_filename`` gracefully falls back to
+        the bundled default font instead of raising.
         Args:
-            font_filename: Path to the font file
+            font_filename: Path to the font file, or None for the default.
             font_size: Size of the font in points
         Returns:
@@ -418,13 +420,10 @@ class ImageText:
         """
         key = (font_filename, font_size)
         if key not in self._font_cache:
-            try:
-                self._font_cache[key] = ImageFont.truetype(font_filename, font_size)
-            except (OSError, IOError) as e:
-                raise ValueError(f"Error loading font '{font_filename}': {str(e)}")
+            self._font_cache[key] = load_font(font_filename, font_size)
         return self._font_cache[key]
-    def get_text_dimensions(self, font_filename: str, font_size: int, text: str) -> tuple[int, int]:
+    def get_text_dimensions(self, font_filename: str | None, font_size: int, text: str) -> tuple[int, int]:
         """
         Return dimensions (width, height) of the rendered text.
@@ -455,7 +454,11 @@ class ImageText:
             raise ValueError(f"Error measuring text: {str(e)}")
     def _get_font_baseline_offset(
-        self, base_font_filename: str, base_font_size: int, highlight_font_filename: str, highlight_font_size: int
+        self,
+        base_font_filename: str | None,
+        base_font_size: int,
+        highlight_font_filename: str | None,
+        highlight_font_size: int,
     ) -> int:
         """
         Calculate the vertical offset needed to align baselines of different fonts and sizes.
@@ -497,7 +500,7 @@ class ImageText:
     def _split_lines_by_width(
         self,
         text: str,
-        font_filename: str,
+        font_filename: str | None,
         font_size: int,
         box_width: int,
     ) -> list[str]:
@@ -566,7 +569,7 @@ class ImageText:
     def write_text_box(
         self,
         text: str,
-        font_filename: str,
+        font_filename: str | None,
         xy: PositionType,
         box_width: int | float | None = None,
         font_size: int = 11,
@@ -615,9 +618,6 @@ class ImageText:
         if not text:
             raise ValueError("Text cannot be empty")
-        if not font_filename:
-            raise ValueError("Font filename cannot be empty")
         if font_size <= 0:
             raise ValueError("Font size must be positive")
@@ -831,7 +831,7 @@ class ImageText:
     def _write_line_with_highlight(
         self,
         line: str,
-        font_filename: str,
+        font_filename: str | None,
         font_size: int,
         font_border_size: int,
         text_color: RGBColor,
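
The hunks above change behavior as well as types. In 0.33.2 an empty or unloadable `font_filename` raised `ValueError`, while in 0.33.3 the same input falls back to the bundled font. Callers that relied on the error to catch a mistyped path may want an explicit check before calling into the library. The following is a minimal sketch of such a guard; `require_font` is a hypothetical helper, not part of videopython, and it only checks that the file exists, so an existing but corrupt font file would still fall back silently.

```python
import os


def require_font(font_filename):
    """Raise ValueError unless font_filename is None or an existing file."""
    if font_filename is None:
        return None  # caller explicitly asks for the default font
    if not os.path.isfile(font_filename):
        raise ValueError(f"Font file not found: {font_filename!r}")
    return font_filename
```

The returned value can then be passed on as `font_filename`, keeping `None` as the deliberate way to request the default.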

{videopython-0.33.2 → videopython-0.33.3}/src/videopython/editing/effects.py RENAMED

@@ -24,6 +24,7 @@ from pydantic import Field, PrivateAttr, model_validator
 from tqdm import tqdm
 from videopython.base.description import BoundingBox
+from videopython.base.fonts import load_font
 from videopython.editing.operation import Effect
 if TYPE_CHECKING:
@@ -643,12 +644,7 @@ class TextOverlay(Effect):
         return self
     def _get_font(self) -> ImageFont.FreeTypeFont | ImageFont.ImageFont:
-        if self.font_filename:
-            return ImageFont.truetype(self.font_filename, self.font_size)
-        try:
-            return ImageFont.truetype("DejaVuSans.ttf", self.font_size)
-        except OSError:
-            return ImageFont.load_default()
+        return load_font(self.font_filename, self.font_size)
     def _wrap_text(self, text: str, font: ImageFont.FreeTypeFont | ImageFont.ImageFont, max_px: int) -> str:
         lines: list[str] = []

{videopython-0.33.2 → videopython-0.33.3}/src/videopython/editing/transcription_overlay.py RENAMED

@@ -38,7 +38,10 @@ class TranscriptionOverlay(Effect):
     streamable: ClassVar[bool] = False
     requires: ClassVar[tuple[str, ...]] = ("transcription",)
-    font_filename: str = Field(description="Path to a .ttf font file for rendering subtitle text.")
+    font_filename: str | None = Field(
+        None,
+        description="Path to a .ttf font file for rendering subtitle text, or None for the bundled default font.",
+    )
     font_size: int = Field(40, ge=1, description="Base font size in pixels.")
     font_border_size: int = Field(
         2, ge=0, description="Outline thickness around each character in pixels. 0 = no outline."

videopython-0.33.2/PKG-INFO DELETED

@@ -1,258 +0,0 @@
-Metadata-Version: 2.4
-Name: videopython
-Version: 0.33.2
-Summary: Minimal video generation and processing library.
-Project-URL: Homepage, https://videopython.com
-Project-URL: Repository, https://github.com/bartwojtowicz/videopython/
-Project-URL: Documentation, https://videopython.com
-Author-email: Bartosz Wójtowicz <bartoszwojtowicz@outlook.com>, Bartosz Rudnikowicz <bartoszrudnikowicz840@gmail.com>, Piotr Pukisz <piotr.pukisz@gmail.com>
-License: Apache-2.0
-License-File: LICENSE
-Keywords: ai,editing,generation,movie,opencv,python,shorts,video,videopython
-Classifier: License :: OSI Approved :: Apache Software License
-Classifier: Operating System :: OS Independent
-Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Programming Language :: Python :: 3.11
-Classifier: Programming Language :: Python :: 3.12
-Classifier: Programming Language :: Python :: 3.13
-Requires-Python: <3.14,>=3.10
-Requires-Dist: numpy>=1.25.2
-Requires-Dist: opencv-python-headless>=4.9.0.80
-Requires-Dist: pillow>=12.1.1
-Requires-Dist: pydantic>=2.8.0
-Requires-Dist: tqdm>=4.66.3
-Provides-Extra: ai
-Requires-Dist: accelerate>=0.29.2; extra == 'ai'
-Requires-Dist: chatterbox-tts>=0.1.7; extra == 'ai'
-Requires-Dist: demucs>=4.0.0; extra == 'ai'
-Requires-Dist: diffusers>=0.30.0; extra == 'ai'
-Requires-Dist: hf-transfer>=0.1.9; extra == 'ai'
-Requires-Dist: imagehash>=4.3; extra == 'ai'
-Requires-Dist: llama-cpp-python>=0.3.0; extra == 'ai'
-Requires-Dist: numba>=0.61.0; extra == 'ai'
-Requires-Dist: ollama>=0.4.5; extra == 'ai'
-Requires-Dist: openai-whisper>=20240930; extra == 'ai'
-Requires-Dist: pyannote-audio>=4.0.0; extra == 'ai'
-Requires-Dist: pyloudnorm>=0.1.1; extra == 'ai'
-Requires-Dist: qwen-vl-utils>=0.0.10; extra == 'ai'
-Requires-Dist: scikit-learn>=1.3.0; extra == 'ai'
-Requires-Dist: scipy>=1.10.0; extra == 'ai'
-Requires-Dist: sentencepiece>=0.1.99; extra == 'ai'
-Requires-Dist: silero-vad>=5.1; extra == 'ai'
-Requires-Dist: torch>=2.8.0; extra == 'ai'
-Requires-Dist: torchaudio>=2.8.0; extra == 'ai'
-Requires-Dist: transformers>=5.2.0; extra == 'ai'
-Requires-Dist: transnetv2-pytorch>=1.0.5; extra == 'ai'
-Requires-Dist: ultralytics>=8.0.0; extra == 'ai'
-Description-Content-Type: text/markdown
-# videopython
-[![PyPI](https://img.shields.io/pypi/v/videopython)](https://pypi.org/project/videopython/)
-[![Python](https://img.shields.io/pypi/pyversions/videopython)](https://pypi.org/project/videopython/)
-[![License](https://img.shields.io/github/license/BartWojtowicz/videopython)](LICENSE)
-Minimal, LLM-friendly Python library for programmatic video editing, processing, and AI video workflows.
-Full documentation: [videopython.com](https://videopython.com)
-> **Disclaimer:** This project started as a hand-written hobby project, but most of the code is now produced by LLM agents. Humans still drive direction, approve changes, and own design decisions.
-## Installation
-### 1. Install FFmpeg
-```bash
-# macOS
-brew install ffmpeg
-# Ubuntu / Debian
-sudo apt-get install ffmpeg
-# Windows (Chocolatey)
-choco install ffmpeg
-```
-### 2. Install videopython
-```bash
-pip install videopython          # core video/audio editing
-pip install "videopython[ai]"    # + local AI features (GPU recommended)
-```
-Python `>=3.10, <3.14`. AI features run locally - no cloud API keys required, but model weights are downloaded on first use.
-## Quick Start
-### Imperative editing
-Every editing primitive is an `Operation` subclass — a Pydantic model
-whose fields ARE the JSON wire format. Apply one to a `Video`:
-```python
-from videopython.base import Video
-from videopython.editing import CutSeconds, Resize, Fade
-video = Video.from_path("raw.mp4")
-video = CutSeconds(start=10, end=25).apply(video)
-video = Resize(width=1080, height=1920).apply(video)
-video = Fade(mode="in", duration=0.5).apply(video)
-video.save("output.mp4")
-```
-Concatenate clips with `+` (must share fps + dimensions):
-```python
-combined = video_a + video_b
-```
-### JSON editing plans
-Define multi-segment edits as JSON — the format LLM-driven workflows
-generate against. `VideoEdit.json_schema()` returns the schema:
-```python
-from videopython.editing import VideoEdit
-plan = {
-    "segments": [{
-        "source": "raw.mp4",
-        "start": 10.0,
-        "end": 20.0,
-        "operations": [
-            {"op": "resize", "width": 1080, "height": 1920},
-            {"op": "color_adjust", "saturation": 1.15, "contrast": 1.05},
-            {"op": "fade", "mode": "in", "duration": 0.5,
-             "window": {"stop": 0.5}},
-        ],
-    }],
-}
-edit = VideoEdit.from_dict(plan)
-edit.validate()                  # dry-run via metadata, no frames loaded
-edit.run_to_file("output.mp4")   # stream to disk, ~constant memory
-```
-`run_to_file()` pipes ffmpeg decode → per-frame effects → ffmpeg encode,
-so memory stays bounded even for hour-long sources. Use `edit.run()`
-instead if you want the result back in memory as a `Video`.
-### AI generation
-```python
-from videopython.ai import TextToImage, ImageToVideo, TextToSpeech
-from videopython.editing import Resize
-image = TextToImage().generate_image("A cinematic mountain sunrise")
-video = ImageToVideo().generate_video(image=image)
-audio = TextToSpeech().generate_audio("Welcome to videopython.")
-video = Resize(width=1080, height=1920).apply(video)
-video.add_audio(audio).save("ai_video.mp4")
-```
-## LLM & AI Agent Integration
-The library is built for LLM-driven editing. Two surfaces matter:
-**1. Plan schema for tool / structured-output calls.**
-`VideoEdit.json_schema()` returns a JSON Schema covering segments,
-`post_operations`, and a discriminated union over every registered
-`Operation`. Drop it into any LLM API:
-```python
-from videopython.editing import VideoEdit
-schema = VideoEdit.json_schema()
-# Anthropic: tools=[{"name": "edit", "input_schema": schema}]
-# OpenAI:    tools=[{"type": "function",
-#                    "function": {"name": "edit", "parameters": schema}}]
-```
-Validate the LLM's output without touching the filesystem, then run it:
-```python
-edit = VideoEdit.from_dict(plan)
-edit.validate()                  # catches bad ops, time ranges, fps mismatches
-edit.run_to_file("output.mp4")
-```
-**2. Operation discovery for agent loops.**
-Every registered op exposes its own Pydantic schema, so an agent can
-introspect what's available without hardcoded lists:
-```python
-from videopython.editing import Operation, OpCategory
-for op_id, cls in Operation.registry().items():
-    print(f"{op_id}: {(cls.__doc__ or '').splitlines()[0]}")
-schema = Operation.get("color_adjust").model_json_schema()  # per-op schema
-```
-Field constraints (`minimum`, `maximum`, `enum`, `exclusiveMinimum`,
-nullability) flow through to the schema, so LLMs that support
-constrained generation produce valid parameters on the first try.
-For ops that need side-channel data (e.g. `silence_removal` and
-`add_subtitles` need a `Transcription`), pass it via `context`:
-```python
-edit.run(context={"transcription": my_transcription})
-```
-Docs: [Editing Plans](https://videopython.com/api/editing/) | [Operations](https://videopython.com/api/operations/) | [LLM Integration Guide](https://videopython.com/guides/llm-integration/)
-## Features
-### `videopython.base` - data containers + I/O (no AI dependencies)
-| Area | Highlights |
-|---|---|
-| **Video I/O** | `Video`, `VideoMetadata`, `FrameIterator` - load, save, inspect |
-| **Text rendering** | `ImageText` - generic PIL text-on-image primitive |
-| **Transcription** | `Transcription`, `TranscriptionSegment`, `TranscriptionWord` - data classes returned by transcription backends |
-| **Result types** | `BoundingBox`, `DetectedFace`, `FaceTrack`, `SceneBoundary`, `AudioEvent`, `MotionInfo`, ... - shared by editing and AI |
-### `videopython.audio` - audio data container
-| Area | Highlights |
-|---|---|
-| **Audio** | `Audio`, `AudioMetadata` - load/save, overlay, concat, normalize, time-stretch, silence detection, segment classification |
-### `videopython.editing` - editing primitives + plan runner
-| Area | Highlights |
-|---|---|
-| **Operation foundation** | `Operation`, `Effect`, `TimeRange`, `OpCategory` - Pydantic base + auto-registry + discriminated-union schema |
-| **Editing plans** | `VideoEdit`, `SegmentConfig` - JSON/LLM-friendly multi-segment plans with JSON Schema generation, dry-run validation, and streaming `run_to_file` |
-| **Transforms** | Cut (time/frame), resize, crop, FPS resampling, speed change, reverse, freeze frame, silence removal |
-| **Effects** | Blur, zoom, color grading, vignette, Ken Burns, image overlay, fade, text overlay, volume adjust |
-| **Subtitles** | `TranscriptionOverlay` - animated word-by-word subtitle rendering |
-API docs: [Core](https://videopython.com/api/index/) | [Video](https://videopython.com/api/core/video/) | [Audio](https://videopython.com/api/core/audio/) | [Editing Plans](https://videopython.com/api/editing/) | [Operations](https://videopython.com/api/operations/) | [Transforms](https://videopython.com/api/transforms/) | [Effects](https://videopython.com/api/effects/) | [Text](https://videopython.com/api/text/)
-### `videopython.ai` - local AI features (install with `[ai]`)
-| Area | Highlights |
-|---|---|
-| **Generation** | `TextToVideo`, `ImageToVideo`, `TextToImage`, `TextToSpeech`, `TextToMusic` |
-| **Understanding** | `AudioToText` (transcription), `AudioClassifier`, `SceneVLM` (structured visual scene description), `FaceTracker` (per-shot face tracks) |
-| **Scene detection** | `SemanticSceneDetector` (neural scene boundaries) |
-| **Video analysis** | `VideoAnalyzer` - full-pipeline analysis combining multiple AI capabilities |
-| **Transforms** | `FaceTrackingCrop` |
-| **Dubbing** | `VideoDubber` - voice cloning and revoicing with timing sync |
-API docs: [Generation](https://videopython.com/api/ai/generation/) | [Understanding](https://videopython.com/api/ai/understanding/) | [Transforms](https://videopython.com/api/ai/transforms/) | [Dubbing](https://videopython.com/api/ai/dubbing/)
-## Examples
-- [Social Media Clip](https://videopython.com/examples/social-clip/)
-- [AI-Generated Video](https://videopython.com/examples/ai-video/)
-- [Auto-Subtitles](https://videopython.com/examples/auto-subtitles/)
-- [Processing Large Videos](https://videopython.com/examples/large-videos/)
-## Development
-See [`DEVELOPMENT.md`](DEVELOPMENT.md) for local setup, testing, and contribution workflow.
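
Note on the removed text above: the "Operation discovery" passage describes an auto-registry keyed by op id, from which a plan entry such as `{"op": "resize", ...}` is resolved to a class. The general pattern can be sketched with the standard library alone. This is not videopython's implementation (the README says the real one is built on Pydantic); `SketchOperation`, `SketchResize`, `SketchFade`, and this `from_dict` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, fields
from typing import Any, ClassVar


class SketchOperation:
    """Hypothetical base class: subclasses register themselves by op id."""

    op: ClassVar[str] = ""
    _registry: ClassVar[dict[str, type["SketchOperation"]]] = {}

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        if cls.op:  # only register subclasses that declare an op id
            SketchOperation._registry[cls.op] = cls

    @classmethod
    def registry(cls) -> dict[str, type["SketchOperation"]]:
        # Return a copy so callers cannot mutate the shared registry.
        return dict(cls._registry)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "SketchOperation":
        """Resolve the "op" discriminator, then build the matching class."""
        payload = dict(data)
        op_id = payload.pop("op", None)
        if op_id not in cls._registry:
            raise ValueError(f"unknown op: {op_id!r}")
        target = cls._registry[op_id]
        unknown = set(payload) - {f.name for f in fields(target)}
        if unknown:
            raise ValueError(f"unexpected fields for {op_id!r}: {sorted(unknown)}")
        return target(**payload)


@dataclass
class SketchResize(SketchOperation):
    op = "resize"  # class attribute, not a dataclass field
    width: int
    height: int


@dataclass
class SketchFade(SketchOperation):
    op = "fade"
    mode: str
    duration: float


resized = SketchOperation.from_dict({"op": "resize", "width": 1080, "height": 1920})
print(sorted(SketchOperation.registry()), resized)
```

Because registration happens in `__init_subclass__`, defining a new subclass is enough to make it discoverable; no central list has to be edited, which is the property the README attributes to its own registry.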

videopython-0.33.2/README.md DELETED

@@ -1,209 +0,0 @@
-# videopython
-[![PyPI](https://img.shields.io/pypi/v/videopython)](https://pypi.org/project/videopython/)
-[![Python](https://img.shields.io/pypi/pyversions/videopython)](https://pypi.org/project/videopython/)
-[![License](https://img.shields.io/github/license/BartWojtowicz/videopython)](LICENSE)
-Minimal, LLM-friendly Python library for programmatic video editing, processing, and AI video workflows.
-Full documentation: [videopython.com](https://videopython.com)
-> **Disclaimer:** This project started as a hand-written hobby project, but most of the code is now produced by LLM agents. Humans still drive direction, approve changes, and own design decisions.
-## Installation
-### 1. Install FFmpeg
-```bash
-# macOS
-brew install ffmpeg
-# Ubuntu / Debian
-sudo apt-get install ffmpeg
-# Windows (Chocolatey)
-choco install ffmpeg
-```
-### 2. Install videopython
-```bash
-pip install videopython          # core video/audio editing
-pip install "videopython[ai]"    # + local AI features (GPU recommended)
-```
-Python `>=3.10, <3.14`. AI features run locally - no cloud API keys required, but model weights are downloaded on first use.
-## Quick Start
-### Imperative editing
-Every editing primitive is an `Operation` subclass — a Pydantic model
-whose fields ARE the JSON wire format. Apply one to a `Video`:
-```python
-from videopython.base import Video
-from videopython.editing import CutSeconds, Resize, Fade
-video = Video.from_path("raw.mp4")
-video = CutSeconds(start=10, end=25).apply(video)
-video = Resize(width=1080, height=1920).apply(video)
-video = Fade(mode="in", duration=0.5).apply(video)
-video.save("output.mp4")
-```
-Concatenate clips with `+` (must share fps + dimensions):
-```python
-combined = video_a + video_b
-```
-### JSON editing plans
-Define multi-segment edits as JSON — the format LLM-driven workflows
-generate against. `VideoEdit.json_schema()` returns the schema:
-```python
-from videopython.editing import VideoEdit
-plan = {
-    "segments": [{
-        "source": "raw.mp4",
-        "start": 10.0,
-        "end": 20.0,
-        "operations": [
-            {"op": "resize", "width": 1080, "height": 1920},
-            {"op": "color_adjust", "saturation": 1.15, "contrast": 1.05},
-            {"op": "fade", "mode": "in", "duration": 0.5,
-             "window": {"stop": 0.5}},
-        ],
-    }],
-}
-edit = VideoEdit.from_dict(plan)
-edit.validate()                  # dry-run via metadata, no frames loaded
-edit.run_to_file("output.mp4")   # stream to disk, ~constant memory
-```
-`run_to_file()` pipes ffmpeg decode → per-frame effects → ffmpeg encode,
-so memory stays bounded even for hour-long sources. Use `edit.run()`
-instead if you want the result back in memory as a `Video`.
-### AI generation
-```python
-from videopython.ai import TextToImage, ImageToVideo, TextToSpeech
-from videopython.editing import Resize
-image = TextToImage().generate_image("A cinematic mountain sunrise")
-video = ImageToVideo().generate_video(image=image)
-audio = TextToSpeech().generate_audio("Welcome to videopython.")
-video = Resize(width=1080, height=1920).apply(video)
-video.add_audio(audio).save("ai_video.mp4")
-```
-## LLM & AI Agent Integration
-The library is built for LLM-driven editing. Two surfaces matter:
-**1. Plan schema for tool / structured-output calls.**
-`VideoEdit.json_schema()` returns a JSON Schema covering segments,
-`post_operations`, and a discriminated union over every registered
-`Operation`. Drop it into any LLM API:
-```python
-from videopython.editing import VideoEdit
-schema = VideoEdit.json_schema()
-# Anthropic: tools=[{"name": "edit", "input_schema": schema}]
-# OpenAI:    tools=[{"type": "function",
-#                    "function": {"name": "edit", "parameters": schema}}]
-```
-Validate the LLM's output without touching the filesystem, then run it:
-```python
-edit = VideoEdit.from_dict(plan)
-edit.validate()                  # catches bad ops, time ranges, fps mismatches
-edit.run_to_file("output.mp4")
-```
-**2. Operation discovery for agent loops.**
-Every registered op exposes its own Pydantic schema, so an agent can
-introspect what's available without hardcoded lists:
-```python
-from videopython.editing import Operation, OpCategory
-for op_id, cls in Operation.registry().items():
-    print(f"{op_id}: {(cls.__doc__ or '').splitlines()[0]}")
-schema = Operation.get("color_adjust").model_json_schema()  # per-op schema
-```
-Field constraints (`minimum`, `maximum`, `enum`, `exclusiveMinimum`,
-nullability) flow through to the schema, so LLMs that support
-constrained generation produce valid parameters on the first try.
-For ops that need side-channel data (e.g. `silence_removal` and
-`add_subtitles` need a `Transcription`), pass it via `context`:
-```python
-edit.run(context={"transcription": my_transcription})
-```
-Docs: [Editing Plans](https://videopython.com/api/editing/) | [Operations](https://videopython.com/api/operations/) | [LLM Integration Guide](https://videopython.com/guides/llm-integration/)
-## Features
-### `videopython.base` - data containers + I/O (no AI dependencies)
-| Area | Highlights |
-|---|---|
-| **Video I/O** | `Video`, `VideoMetadata`, `FrameIterator` - load, save, inspect |
-| **Text rendering** | `ImageText` - generic PIL text-on-image primitive |
-| **Transcription** | `Transcription`, `TranscriptionSegment`, `TranscriptionWord` - data classes returned by transcription backends |
-| **Result types** | `BoundingBox`, `DetectedFace`, `FaceTrack`, `SceneBoundary`, `AudioEvent`, `MotionInfo`, ... - shared by editing and AI |
-### `videopython.audio` - audio data container
-| Area | Highlights |
-|---|---|
-| **Audio** | `Audio`, `AudioMetadata` - load/save, overlay, concat, normalize, time-stretch, silence detection, segment classification |
-### `videopython.editing` - editing primitives + plan runner
-| Area | Highlights |
-|---|---|
-| **Operation foundation** | `Operation`, `Effect`, `TimeRange`, `OpCategory` - Pydantic base + auto-registry + discriminated-union schema |
-| **Editing plans** | `VideoEdit`, `SegmentConfig` - JSON/LLM-friendly multi-segment plans with JSON Schema generation, dry-run validation, and streaming `run_to_file` |
-| **Transforms** | Cut (time/frame), resize, crop, FPS resampling, speed change, reverse, freeze frame, silence removal |
-| **Effects** | Blur, zoom, color grading, vignette, Ken Burns, image overlay, fade, text overlay, volume adjust |
-| **Subtitles** | `TranscriptionOverlay` - animated word-by-word subtitle rendering |
-API docs: [Core](https://videopython.com/api/index/) | [Video](https://videopython.com/api/core/video/) | [Audio](https://videopython.com/api/core/audio/) | [Editing Plans](https://videopython.com/api/editing/) | [Operations](https://videopython.com/api/operations/) | [Transforms](https://videopython.com/api/transforms/) | [Effects](https://videopython.com/api/effects/) | [Text](https://videopython.com/api/text/)
-### `videopython.ai` - local AI features (install with `[ai]`)
-| Area | Highlights |
-|---|---|
-| **Generation** | `TextToVideo`, `ImageToVideo`, `TextToImage`, `TextToSpeech`, `TextToMusic` |
-| **Understanding** | `AudioToText` (transcription), `AudioClassifier`, `SceneVLM` (structured visual scene description), `FaceTracker` (per-shot face tracks) |
-| **Scene detection** | `SemanticSceneDetector` (neural scene boundaries) |
-| **Video analysis** | `VideoAnalyzer` - full-pipeline analysis combining multiple AI capabilities |
-| **Transforms** | `FaceTrackingCrop` |
-| **Dubbing** | `VideoDubber` - voice cloning and revoicing with timing sync |
-API docs: [Generation](https://videopython.com/api/ai/generation/) | [Understanding](https://videopython.com/api/ai/understanding/) | [Transforms](https://videopython.com/api/ai/transforms/) | [Dubbing](https://videopython.com/api/ai/dubbing/)
-## Examples
-- [Social Media Clip](https://videopython.com/examples/social-clip/)
-- [AI-Generated Video](https://videopython.com/examples/ai-video/)
-- [Auto-Subtitles](https://videopython.com/examples/auto-subtitles/)
-- [Processing Large Videos](https://videopython.com/examples/large-videos/)
-## Development
-See [`DEVELOPMENT.md`](DEVELOPMENT.md) for local setup, testing, and contribution workflow.
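
Note on the removed text above: the README says `edit.validate()` performs a dry run from metadata and catches bad ops and time ranges before any frames are loaded. A rough, stdlib-only sketch of that kind of pre-check follows. The rules, the `KNOWN_OPS` set, the `check_plan` function, and the `durations` mapping are all assumptions made for illustration; they are not taken from videopython's code and may differ from what the library actually enforces.

```python
from typing import Any

# Hypothetical set of known op ids. A real caller would derive this from the
# library's operation registry instead of hardcoding it.
KNOWN_OPS = {"resize", "color_adjust", "fade"}


def check_plan(plan: dict[str, Any], durations: dict[str, float]) -> list[str]:
    """Return the problems found in a plan using only metadata.

    `durations` maps each source path to its length in seconds and stands in
    for whatever metadata lookup a real validator would perform.
    """
    problems: list[str] = []
    segments = plan.get("segments", [])
    if not segments:
        problems.append("plan has no segments")
    for i, seg in enumerate(segments):
        source = seg.get("source")
        if source not in durations:
            problems.append(f"segment {i}: unknown source {source!r}")
            continue
        start, end = seg.get("start"), seg.get("end")
        if not isinstance(start, (int, float)) or not isinstance(end, (int, float)):
            problems.append(f"segment {i}: start and end must be numbers")
        elif not 0 <= start < end <= durations[source]:
            problems.append(f"segment {i}: bad time range {start}..{end}")
        for op in seg.get("operations", []):
            if op.get("op") not in KNOWN_OPS:
                problems.append(f"segment {i}: unknown op {op.get('op')!r}")
    return problems


plan = {
    "segments": [{
        "source": "raw.mp4",
        "start": 10.0,
        "end": 20.0,
        "operations": [{"op": "resize", "width": 1080, "height": 1920}],
    }],
}
print(check_plan(plan, {"raw.mp4": 60.0}))
```

Collecting every problem into a list, rather than raising on the first one, gives an LLM-driven loop a complete error report to correct against in a single round trip.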