videopython 0.33.2__tar.gz → 0.33.3__tar.gz

This diff compares the contents of two publicly released package versions as they appear in their public registry. It is provided for informational purposes only.
Files changed (62)
  1. videopython-0.33.3/PKG-INFO +133 -0
  2. videopython-0.33.3/README.md +84 -0
  3. {videopython-0.33.2 → videopython-0.33.3}/pyproject.toml +3 -1
  4. videopython-0.33.3/src/videopython/base/fonts/DejaVuSans.ttf +0 -0
  5. videopython-0.33.3/src/videopython/base/fonts/LICENSE_DEJAVU +99 -0
  6. videopython-0.33.3/src/videopython/base/fonts/__init__.py +58 -0
  7. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/base/image_text.py +22 -22
  8. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/editing/effects.py +2 -6
  9. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/editing/transcription_overlay.py +4 -1
  10. videopython-0.33.2/PKG-INFO +0 -258
  11. videopython-0.33.2/README.md +0 -209
  12. {videopython-0.33.2 → videopython-0.33.3}/.gitignore +0 -0
  13. {videopython-0.33.2 → videopython-0.33.3}/LICENSE +0 -0
  14. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/__init__.py +0 -0
  15. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/__init__.py +0 -0
  16. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/_device.py +0 -0
  17. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/dubbing/__init__.py +0 -0
  18. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/dubbing/config.py +0 -0
  19. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/dubbing/dubber.py +0 -0
  20. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/dubbing/expressiveness.py +0 -0
  21. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/dubbing/loudness.py +0 -0
  22. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/dubbing/models.py +0 -0
  23. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/dubbing/pipeline.py +0 -0
  24. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/dubbing/quality.py +0 -0
  25. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/dubbing/remux.py +0 -0
  26. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/dubbing/timing.py +0 -0
  27. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/dubbing/voice_sample.py +0 -0
  28. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/generation/__init__.py +0 -0
  29. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/generation/audio.py +0 -0
  30. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/generation/image.py +0 -0
  31. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/generation/qwen3.py +0 -0
  32. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/generation/translation.py +0 -0
  33. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/generation/video.py +0 -0
  34. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/transforms.py +0 -0
  35. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/understanding/__init__.py +0 -0
  36. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/understanding/audio.py +0 -0
  37. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/understanding/faces.py +0 -0
  38. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/understanding/image.py +0 -0
  39. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/understanding/separation.py +0 -0
  40. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/understanding/temporal.py +0 -0
  41. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/video_analysis/__init__.py +0 -0
  42. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/video_analysis/analyzer.py +0 -0
  43. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/video_analysis/models.py +0 -0
  44. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/video_analysis/sampling.py +0 -0
  45. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/ai/video_analysis/stages.py +0 -0
  46. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/audio/__init__.py +0 -0
  47. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/audio/analysis.py +0 -0
  48. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/audio/audio.py +0 -0
  49. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/base/__init__.py +0 -0
  50. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/base/_dimensions.py +0 -0
  51. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/base/_ffmpeg.py +0 -0
  52. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/base/_video_io.py +0 -0
  53. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/base/description.py +0 -0
  54. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/base/exceptions.py +0 -0
  55. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/base/transcription.py +0 -0
  56. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/base/video.py +0 -0
  57. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/editing/__init__.py +0 -0
  58. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/editing/operation.py +0 -0
  59. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/editing/streaming.py +0 -0
  60. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/editing/transforms.py +0 -0
  61. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/editing/video_edit.py +0 -0
  62. {videopython-0.33.2 → videopython-0.33.3}/src/videopython/py.typed +0 -0
@@ -0,0 +1,133 @@
1
+ Metadata-Version: 2.4
2
+ Name: videopython
3
+ Version: 0.33.3
4
+ Summary: Minimal video generation and processing library.
5
+ Project-URL: Homepage, https://videopython.com
6
+ Project-URL: Repository, https://github.com/bartwojtowicz/videopython/
7
+ Project-URL: Documentation, https://videopython.com
8
+ Author-email: Bartosz Wójtowicz <bartoszwojtowicz@outlook.com>, Bartosz Rudnikowicz <bartoszrudnikowicz840@gmail.com>, Piotr Pukisz <piotr.pukisz@gmail.com>
9
+ License: Apache-2.0
10
+ License-File: LICENSE
11
+ Keywords: ai,editing,generation,movie,opencv,python,shorts,video,videopython
12
+ Classifier: License :: OSI Approved :: Apache Software License
13
+ Classifier: Operating System :: OS Independent
14
+ Classifier: Programming Language :: Python :: 3
15
+ Classifier: Programming Language :: Python :: 3.10
16
+ Classifier: Programming Language :: Python :: 3.11
17
+ Classifier: Programming Language :: Python :: 3.12
18
+ Classifier: Programming Language :: Python :: 3.13
19
+ Requires-Python: <3.14,>=3.10
20
+ Requires-Dist: numpy>=1.25.2
21
+ Requires-Dist: opencv-python-headless>=4.9.0.80
22
+ Requires-Dist: pillow>=12.1.1
23
+ Requires-Dist: pydantic>=2.8.0
24
+ Requires-Dist: tqdm>=4.66.3
25
+ Provides-Extra: ai
26
+ Requires-Dist: accelerate>=0.29.2; extra == 'ai'
27
+ Requires-Dist: chatterbox-tts>=0.1.7; extra == 'ai'
28
+ Requires-Dist: demucs>=4.0.0; extra == 'ai'
29
+ Requires-Dist: diffusers>=0.30.0; extra == 'ai'
30
+ Requires-Dist: hf-transfer>=0.1.9; extra == 'ai'
31
+ Requires-Dist: imagehash>=4.3; extra == 'ai'
32
+ Requires-Dist: llama-cpp-python>=0.3.0; extra == 'ai'
33
+ Requires-Dist: numba>=0.61.0; extra == 'ai'
34
+ Requires-Dist: ollama>=0.4.5; extra == 'ai'
35
+ Requires-Dist: openai-whisper>=20240930; extra == 'ai'
36
+ Requires-Dist: pyannote-audio>=4.0.0; extra == 'ai'
37
+ Requires-Dist: pyloudnorm>=0.1.1; extra == 'ai'
38
+ Requires-Dist: qwen-vl-utils>=0.0.10; extra == 'ai'
39
+ Requires-Dist: scikit-learn>=1.3.0; extra == 'ai'
40
+ Requires-Dist: scipy>=1.10.0; extra == 'ai'
41
+ Requires-Dist: sentencepiece>=0.1.99; extra == 'ai'
42
+ Requires-Dist: silero-vad>=5.1; extra == 'ai'
43
+ Requires-Dist: torch>=2.8.0; extra == 'ai'
44
+ Requires-Dist: torchaudio>=2.8.0; extra == 'ai'
45
+ Requires-Dist: transformers>=5.2.0; extra == 'ai'
46
+ Requires-Dist: transnetv2-pytorch>=1.0.5; extra == 'ai'
47
+ Requires-Dist: ultralytics>=8.0.0; extra == 'ai'
48
+ Description-Content-Type: text/markdown
49
+
50
+ # videopython
51
+
52
+ [![PyPI](https://img.shields.io/pypi/v/videopython)](https://pypi.org/project/videopython/)
53
+ [![Python](https://img.shields.io/pypi/pyversions/videopython)](https://pypi.org/project/videopython/)
54
+ [![License](https://img.shields.io/github/license/BartWojtowicz/videopython)](LICENSE)
55
+
56
+ Minimal, LLM-friendly Python library for programmatic video editing, processing, and AI video workflows.
57
+
58
+ Full documentation: [videopython.com](https://videopython.com)
59
+
60
+ > **Disclaimer:** This project started as a hand-written hobby project, but most of the code is now produced by LLM agents. Humans still drive direction, approve changes, and own design decisions.
61
+
62
+ ## Installation
63
+
64
+ ```bash
65
+ # Install FFmpeg first (macOS: brew install ffmpeg | Debian: apt-get install ffmpeg)
66
+ pip install videopython # core video/audio editing
67
+ pip install "videopython[ai]" # + local AI features (GPU recommended)
68
+ ```
69
+
70
+ Python `>=3.10, <3.14`. AI features run locally — no cloud API keys required, but model weights are downloaded on first use.
71
+
72
+ ## Quick Start
73
+
74
+ ### JSON editing plans
75
+
76
+ A `VideoEdit` is a multi-segment plan, defined as a dict (or JSON), validated and executed against the source files:
77
+
78
+ ```python
79
+ from videopython.editing import VideoEdit
80
+
81
+ edit = VideoEdit.from_dict({
82
+ "segments": [{
83
+ "source": "raw.mp4",
84
+ "start": 10.0,
85
+ "end": 20.0,
86
+ "operations": [
87
+ {"op": "resize", "width": 1080, "height": 1920},
88
+ {"op": "color_adjust", "saturation": 1.15, "contrast": 1.05},
89
+ {"op": "fade", "mode": "in", "duration": 0.5},
90
+ ],
91
+ }],
92
+ })
93
+ edit.validate() # dry-run via metadata, no frames loaded
94
+ edit.run_to_file("output.mp4") # streams ffmpeg decode → effects → encode
95
+ ```
96
+
97
+ `run_to_file()` streams ffmpeg decode → per-frame effects → encode, so memory stays bounded even for hour-long sources. Use `edit.run()` to get a `Video` back in memory instead.
98
+
99
+ ### AI generation
100
+
101
+ ```python
102
+ from videopython.ai import TextToImage, ImageToVideo, TextToSpeech
103
+
104
+ image = TextToImage().generate_image("A cinematic mountain sunrise")
105
+ video = ImageToVideo().generate_video(image=image)
106
+ audio = TextToSpeech().generate_audio("Welcome to videopython.")
107
+ video.add_audio(audio).save("ai_video.mp4")
108
+ ```
109
+
110
+ ## LLM & AI Agent Integration
111
+
112
+ Every operation is a Pydantic model whose fields ARE the JSON wire format. `VideoEdit.json_schema()` returns a JSON Schema with a discriminated union over every registered `Operation` — pass it straight to Anthropic tool use, OpenAI function calling, or any structured-output API. Then `edit.validate()` dry-runs the plan via metadata before any frames are loaded, so a failed LLM output can be fed back as an error and retried cheaply.
113
+
114
+ See the [LLM Integration Guide](https://videopython.com/guides/llm-integration/) for end-to-end examples, validation error loops, and operation discovery patterns.
115
+
116
+ ## Features
117
+
118
+ - **`videopython.base`** — `Video`, `VideoMetadata`, `FrameIterator`, `ImageText`, `Transcription`, and shared result types (`BoundingBox`, `FaceTrack`, `SceneBoundary`, ...). No AI dependencies.
119
+ - **`videopython.audio`** — `Audio` with overlay, concat, normalize, time-stretch, silence detection, segment classification.
120
+ - **`videopython.editing`** — `Operation`/`Effect` foundation, `VideoEdit` plan runner with JSON Schema + streaming execution. Transforms (cut, resize, crop, fps, speed, reverse, freeze, silence removal) and effects (blur, zoom, color grading, vignette, Ken Burns, fade, overlays, animated subtitles).
121
+ - **`videopython.ai`** *(install with `[ai]`)* — generation (`TextToVideo`, `ImageToVideo`, `TextToImage`, `TextToSpeech`, `TextToMusic`), understanding (`AudioToText`, `AudioClassifier`, `SceneVLM`, `FaceTracker`, `SemanticSceneDetector`), `FaceTrackingCrop` transform, and the full-pipeline `VideoAnalyzer`.
122
+ - **`videopython.ai.dubbing`** — `VideoDubber` for voice-cloned revoicing with timing sync.
123
+
124
+ ## Examples
125
+
126
+ - [Social Media Clip](https://videopython.com/examples/social-clip/)
127
+ - [AI-Generated Video](https://videopython.com/examples/ai-video/)
128
+ - [Auto-Subtitles](https://videopython.com/examples/auto-subtitles/)
129
+ - [Processing Large Videos](https://videopython.com/examples/large-videos/)
130
+
131
+ ## Development
132
+
133
+ See [`DEVELOPMENT.md`](DEVELOPMENT.md) for local setup, testing, and contribution workflow.
@@ -0,0 +1,84 @@
1
+ # videopython
2
+
3
+ [![PyPI](https://img.shields.io/pypi/v/videopython)](https://pypi.org/project/videopython/)
4
+ [![Python](https://img.shields.io/pypi/pyversions/videopython)](https://pypi.org/project/videopython/)
5
+ [![License](https://img.shields.io/github/license/BartWojtowicz/videopython)](LICENSE)
6
+
7
+ Minimal, LLM-friendly Python library for programmatic video editing, processing, and AI video workflows.
8
+
9
+ Full documentation: [videopython.com](https://videopython.com)
10
+
11
+ > **Disclaimer:** This project started as a hand-written hobby project, but most of the code is now produced by LLM agents. Humans still drive direction, approve changes, and own design decisions.
12
+
13
+ ## Installation
14
+
15
+ ```bash
16
+ # Install FFmpeg first (macOS: brew install ffmpeg | Debian: apt-get install ffmpeg)
17
+ pip install videopython # core video/audio editing
18
+ pip install "videopython[ai]" # + local AI features (GPU recommended)
19
+ ```
20
+
21
+ Python `>=3.10, <3.14`. AI features run locally — no cloud API keys required, but model weights are downloaded on first use.
22
+
23
+ ## Quick Start
24
+
25
+ ### JSON editing plans
26
+
27
+ A `VideoEdit` is a multi-segment plan, defined as a dict (or JSON), validated and executed against the source files:
28
+
29
+ ```python
30
+ from videopython.editing import VideoEdit
31
+
32
+ edit = VideoEdit.from_dict({
33
+ "segments": [{
34
+ "source": "raw.mp4",
35
+ "start": 10.0,
36
+ "end": 20.0,
37
+ "operations": [
38
+ {"op": "resize", "width": 1080, "height": 1920},
39
+ {"op": "color_adjust", "saturation": 1.15, "contrast": 1.05},
40
+ {"op": "fade", "mode": "in", "duration": 0.5},
41
+ ],
42
+ }],
43
+ })
44
+ edit.validate() # dry-run via metadata, no frames loaded
45
+ edit.run_to_file("output.mp4") # streams ffmpeg decode → effects → encode
46
+ ```
47
+
48
+ `run_to_file()` streams ffmpeg decode → per-frame effects → encode, so memory stays bounded even for hour-long sources. Use `edit.run()` to get a `Video` back in memory instead.
49
+
50
+ ### AI generation
51
+
52
+ ```python
53
+ from videopython.ai import TextToImage, ImageToVideo, TextToSpeech
54
+
55
+ image = TextToImage().generate_image("A cinematic mountain sunrise")
56
+ video = ImageToVideo().generate_video(image=image)
57
+ audio = TextToSpeech().generate_audio("Welcome to videopython.")
58
+ video.add_audio(audio).save("ai_video.mp4")
59
+ ```
60
+
61
+ ## LLM & AI Agent Integration
62
+
63
+ Every operation is a Pydantic model whose fields ARE the JSON wire format. `VideoEdit.json_schema()` returns a JSON Schema with a discriminated union over every registered `Operation` — pass it straight to Anthropic tool use, OpenAI function calling, or any structured-output API. Then `edit.validate()` dry-runs the plan via metadata before any frames are loaded, so a failed LLM output can be fed back as an error and retried cheaply.
64
+
65
+ See the [LLM Integration Guide](https://videopython.com/guides/llm-integration/) for end-to-end examples, validation error loops, and operation discovery patterns.
66
+
67
+ ## Features
68
+
69
+ - **`videopython.base`** — `Video`, `VideoMetadata`, `FrameIterator`, `ImageText`, `Transcription`, and shared result types (`BoundingBox`, `FaceTrack`, `SceneBoundary`, ...). No AI dependencies.
70
+ - **`videopython.audio`** — `Audio` with overlay, concat, normalize, time-stretch, silence detection, segment classification.
71
+ - **`videopython.editing`** — `Operation`/`Effect` foundation, `VideoEdit` plan runner with JSON Schema + streaming execution. Transforms (cut, resize, crop, fps, speed, reverse, freeze, silence removal) and effects (blur, zoom, color grading, vignette, Ken Burns, fade, overlays, animated subtitles).
72
+ - **`videopython.ai`** *(install with `[ai]`)* — generation (`TextToVideo`, `ImageToVideo`, `TextToImage`, `TextToSpeech`, `TextToMusic`), understanding (`AudioToText`, `AudioClassifier`, `SceneVLM`, `FaceTracker`, `SemanticSceneDetector`), `FaceTrackingCrop` transform, and the full-pipeline `VideoAnalyzer`.
73
+ - **`videopython.ai.dubbing`** — `VideoDubber` for voice-cloned revoicing with timing sync.
74
+
75
+ ## Examples
76
+
77
+ - [Social Media Clip](https://videopython.com/examples/social-clip/)
78
+ - [AI-Generated Video](https://videopython.com/examples/ai-video/)
79
+ - [Auto-Subtitles](https://videopython.com/examples/auto-subtitles/)
80
+ - [Processing Large Videos](https://videopython.com/examples/large-videos/)
81
+
82
+ ## Development
83
+
84
+ See [`DEVELOPMENT.md`](DEVELOPMENT.md) for local setup, testing, and contribution workflow.
@@ -1,6 +1,6 @@
1
1
  [project]
2
2
  name = "videopython"
3
- version = "0.33.2"
3
+ version = "0.33.3"
4
4
  description = "Minimal video generation and processing library."
5
5
  authors = [
6
6
  { name = "Bartosz Wójtowicz", email = "bartoszwojtowicz@outlook.com" },
@@ -186,9 +186,11 @@ build-backend = "hatchling.build"
186
186
 
187
187
  [tool.hatch.build.targets.wheel]
188
188
  packages = ["src/videopython"]
189
+ artifacts = ["src/videopython/base/fonts/*.ttf", "src/videopython/base/fonts/LICENSE_DEJAVU"]
189
190
 
190
191
  [tool.hatch.build.targets.sdist]
191
192
  include = ["src/videopython", "src/videopython/py.typed"]
193
+ artifacts = ["src/videopython/base/fonts/*.ttf", "src/videopython/base/fonts/LICENSE_DEJAVU"]
192
194
 
193
195
  [tool.pytest.ini_options]
194
196
  pythonpath = ["src/"]
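The `artifacts` entries added above are meant to ship the bundled font and its license in both the wheel and the sdist. One way to check that a built wheel really contains them is to list the archive members. The following is a minimal sketch using only the standard library; the helper name `missing_from_archive` is hypothetical and is not part of videopython or hatchling.

```python
import zipfile
from fnmatch import fnmatch


def missing_from_archive(archive_path, patterns):
    """Return the glob patterns that match no member of the zip archive.

    Hypothetical helper for illustration; a wheel is a plain zip file,
    so zipfile can list its contents directly.
    """
    with zipfile.ZipFile(archive_path) as zf:
        names = zf.namelist()
    # fnmatch's "*" also matches "/" so patterns may span directories.
    return [p for p in patterns if not any(fnmatch(n, p) for n in names)]
```

Assuming the wheel maps `src/videopython` to a top-level `videopython/` directory, calling the helper on the built wheel with the patterns `videopython/base/fonts/*.ttf` and `videopython/base/fonts/LICENSE_DEJAVU` should return an empty list when the artifacts were packaged.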
@@ -0,0 +1,99 @@
1
+ Fonts are (c) Bitstream (see below). DejaVu changes are in public domain.
2
+ Glyphs imported from Arev fonts are (c) Tavmjong Bah (see below)
3
+
4
+ Bitstream Vera Fonts Copyright
5
+ ------------------------------
6
+
7
+ Copyright (c) 2003 by Bitstream, Inc. All Rights Reserved. Bitstream Vera is
8
+ a trademark of Bitstream, Inc.
9
+
10
+ Permission is hereby granted, free of charge, to any person obtaining a copy
11
+ of the fonts accompanying this license ("Fonts") and associated
12
+ documentation files (the "Font Software"), to reproduce and distribute the
13
+ Font Software, including without limitation the rights to use, copy, merge,
14
+ publish, distribute, and/or sell copies of the Font Software, and to permit
15
+ persons to whom the Font Software is furnished to do so, subject to the
16
+ following conditions:
17
+
18
+ The above copyright and trademark notices and this permission notice shall
19
+ be included in all copies of one or more of the Font Software typefaces.
20
+
21
+ The Font Software may be modified, altered, or added to, and in particular
22
+ the designs of glyphs or characters in the Fonts may be modified and
23
+ additional glyphs or characters may be added to the Fonts, only if the fonts
24
+ are renamed to names not containing either the words "Bitstream" or the word
25
+ "Vera".
26
+
27
+ This License becomes null and void to the extent applicable to Fonts or Font
28
+ Software that has been modified and is distributed under the "Bitstream
29
+ Vera" names.
30
+
31
+ The Font Software may be sold as part of a larger software package but no
32
+ copy of one or more of the Font Software typefaces may be sold by itself.
33
+
34
+ THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
35
+ OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF MERCHANTABILITY,
36
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF COPYRIGHT, PATENT,
37
+ TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL BITSTREAM OR THE GNOME
38
+ FOUNDATION BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, INCLUDING
39
+ ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES,
40
+ WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF
41
+ THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM OTHER DEALINGS IN THE
42
+ FONT SOFTWARE.
43
+
44
+ Except as contained in this notice, the names of Gnome, the Gnome
45
+ Foundation, and Bitstream Inc., shall not be used in advertising or
46
+ otherwise to promote the sale, use or other dealings in this Font Software
47
+ without prior written authorization from the Gnome Foundation or Bitstream
48
+ Inc., respectively. For further information, contact: fonts at gnome dot
49
+ org.
50
+
51
+ Arev Fonts Copyright
52
+ ------------------------------
53
+
54
+ Copyright (c) 2006 by Tavmjong Bah. All Rights Reserved.
55
+
56
+ Permission is hereby granted, free of charge, to any person obtaining
57
+ a copy of the fonts accompanying this license ("Fonts") and
58
+ associated documentation files (the "Font Software"), to reproduce
59
+ and distribute the modifications to the Bitstream Vera Font Software,
60
+ including without limitation the rights to use, copy, merge, publish,
61
+ distribute, and/or sell copies of the Font Software, and to permit
62
+ persons to whom the Font Software is furnished to do so, subject to
63
+ the following conditions:
64
+
65
+ The above copyright and trademark notices and this permission notice
66
+ shall be included in all copies of one or more of the Font Software
67
+ typefaces.
68
+
69
+ The Font Software may be modified, altered, or added to, and in
70
+ particular the designs of glyphs or characters in the Fonts may be
71
+ modified and additional glyphs or characters may be added to the
72
+ Fonts, only if the fonts are renamed to names not containing either
73
+ the words "Tavmjong Bah" or the word "Arev".
74
+
75
+ This License becomes null and void to the extent applicable to Fonts
76
+ or Font Software that has been modified and is distributed under the
77
+ "Tavmjong Bah Arev" names.
78
+
79
+ The Font Software may be sold as part of a larger software package but
80
+ no copy of one or more of the Font Software typefaces may be sold by
81
+ itself.
82
+
83
+ THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
84
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF
85
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT
86
+ OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL
87
+ TAVMJONG BAH BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
88
+ INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL
89
+ DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
90
+ FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM
91
+ OTHER DEALINGS IN THE FONT SOFTWARE.
92
+
93
+ Except as contained in this notice, the name of Tavmjong Bah shall not
94
+ be used in advertising or otherwise to promote the sale, use or other
95
+ dealings in this Font Software without prior written authorization
96
+ from Tavmjong Bah. For further information, contact: tavmjong @ free
97
+ . fr.
98
+
99
+ $Id: LICENSE 2133 2007-11-28 02:46:28Z lechimp $
@@ -0,0 +1,58 @@
1
+ """Bundled default font and graceful font loading.
2
+
3
+ Text operations let callers omit a font path. This module provides a
4
+ reliable resolution chain so rendering never hard-fails on a missing or
5
+ unreadable font:
6
+
7
+ 1. The explicit ``font_filename`` if given and loadable.
8
+ 2. The bundled DejaVu Sans (broad Unicode coverage).
9
+ 3. PIL's built-in font (always available).
10
+ """
11
+
12
+ from __future__ import annotations
13
+
14
+ from importlib.resources import as_file, files
15
+
16
+ from PIL import ImageFont
17
+
18
+ __all__ = ["DEFAULT_FONT_FILENAME", "load_font"]
19
+
20
+ DEFAULT_FONT_FILENAME = "DejaVuSans.ttf"
21
+
22
+
23
+ def _try_truetype(path: str, font_size: int) -> ImageFont.FreeTypeFont | None:
24
+ try:
25
+ return ImageFont.truetype(path, font_size)
26
+ except (OSError, ValueError):
27
+ return None
28
+
29
+
30
+ def load_font(font_filename: str | None, font_size: int) -> ImageFont.FreeTypeFont | ImageFont.ImageFont:
31
+ """Load a font, falling back gracefully when one is unavailable.
32
+
33
+ Resolution order: the given ``font_filename`` -> the bundled DejaVu
34
+ Sans -> PIL's built-in bitmap font. Never raises for a missing or
35
+ unreadable font, so callers may pass ``None`` to mean "use the
36
+ default".
37
+
38
+ Args:
39
+ font_filename: Path to a ``.ttf``/``.otf`` file, or ``None``.
40
+ font_size: Font size in points.
41
+
42
+ Returns:
43
+ A loaded PIL font object.
44
+ """
45
+ if font_filename:
46
+ font = _try_truetype(font_filename, font_size)
47
+ if font is not None:
48
+ return font
49
+
50
+ try:
51
+ with as_file(files(__package__).joinpath(DEFAULT_FONT_FILENAME)) as bundled:
52
+ font = _try_truetype(str(bundled), font_size)
53
+ if font is not None:
54
+ return font
55
+ except (FileNotFoundError, ModuleNotFoundError):
56
+ pass
57
+
58
+ return ImageFont.load_default(font_size)
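The resolution chain in `load_font` (explicit path, then bundled font, then built-in font) is an instance of a general "first candidate that loads" pattern. The sketch below restates that pattern without Pillow so it can run anywhere; `first_loadable`, `loader`, and `last_resort` are hypothetical names chosen for illustration and do not appear in the package.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def first_loadable(
    candidates: Iterable[Optional[str]],
    loader: Callable[[str], T],
    last_resort: Callable[[], T],
) -> T:
    """Return the first candidate that loads, else the last-resort value."""
    for candidate in candidates:
        if not candidate:
            continue  # None or "" means "not specified": skip it
        try:
            return loader(candidate)
        except (OSError, ValueError):
            continue  # missing or unreadable: try the next candidate
    return last_resort()
```

With Pillow, `loader` could be `lambda p: ImageFont.truetype(p, size)` and `last_resort` could be `lambda: ImageFont.load_default(size)`, which roughly mirrors the behaviour of the module above. The design trades a loud error for a guaranteed render, so a caller who needs to know that the requested font was not used has to check for that separately.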
@@ -16,6 +16,7 @@ import numpy as np
16
16
  from PIL import Image, ImageDraw, ImageFont
17
17
 
18
18
  from videopython.base.exceptions import OutOfBoundsError
19
+ from videopython.base.fonts import load_font
19
20
 
20
21
  __all__ = ["ImageText", "TextAlign", "AnchorPoint"]
21
22
 
@@ -106,7 +107,7 @@ class ImageText:
106
107
  # PIL uses (width, height), so we reverse for Image.new
107
108
  self.image = Image.new(mode, (image_size[1], image_size[0]), color=background)
108
109
  self._draw = ImageDraw.Draw(self.image)
109
- self._font_cache: dict[tuple[str, int], ImageFont.FreeTypeFont] = {} # Cache for font objects
110
+ self._font_cache: dict[tuple[str | None, int], ImageFont.FreeTypeFont | ImageFont.ImageFont] = {}
110
111
 
111
112
  @property
112
113
  def img_array(self) -> np.ndarray:
@@ -119,7 +120,7 @@ class ImageText:
119
120
  raise ValueError("Filename cannot be empty")
120
121
  self.image.save(filename)
121
122
 
122
- def _fit_font_width(self, text: str, font: str, max_width: int) -> int:
123
+ def _fit_font_width(self, text: str, font: str | None, max_width: int) -> int:
123
124
  """
124
125
  Find the maximum font size where the text width is less than or equal to max_width.
125
126
 
@@ -150,7 +151,7 @@ class ImageText:
150
151
  raise ValueError(f"Max width {max_width} is too small for any font size!")
151
152
  return max_font_size
152
153
 
153
- def _fit_font_height(self, text: str, font: str, max_height: int) -> int:
154
+ def _fit_font_height(self, text: str, font: str | None, max_height: int) -> int:
154
155
  """
155
156
  Find the maximum font size where the text height is less than or equal to max_height.
156
157
 
@@ -184,7 +185,7 @@ class ImageText:
184
185
  def _get_font_size(
185
186
  self,
186
187
  text: str,
187
- font: str,
188
+ font: str | None,
188
189
  max_width: int | None = None,
189
190
  max_height: int | None = None,
190
191
  ) -> int:
@@ -333,7 +334,7 @@ class ImageText:
333
334
  def write_text(
334
335
  self,
335
336
  text: str,
336
- font_filename: str,
337
+ font_filename: str | None,
337
338
  xy: PositionType,
338
339
  font_size: int | None = 11,
339
340
  font_border_size: int = 0,
@@ -368,9 +369,6 @@ class ImageText:
368
369
  if not text:
369
370
  raise ValueError("Text cannot be empty")
370
371
 
371
- if not font_filename:
372
- raise ValueError("Font filename cannot be empty")
373
-
374
372
  if font_size is not None and font_size <= 0:
375
373
  raise ValueError("Font size must be positive")
376
374
 
@@ -405,12 +403,16 @@ class ImageText:
405
403
  self._draw.text((x, y), text, font=font, fill=color)
406
404
  return text_dimensions
407
405
 
408
- def _get_font(self, font_filename: str, font_size: int) -> ImageFont.FreeTypeFont:
406
+ def _get_font(self, font_filename: str | None, font_size: int) -> ImageFont.FreeTypeFont | ImageFont.ImageFont:
409
407
  """
410
408
  Get a font object, using cache if available.
411
409
 
410
+ Resolves via :func:`videopython.base.fonts.load_font`, so a missing,
411
+ unreadable, or ``None`` ``font_filename`` gracefully falls back to
412
+ the bundled default font instead of raising.
413
+
412
414
  Args:
413
- font_filename: Path to the font file
415
+ font_filename: Path to the font file, or None for the default.
414
416
  font_size: Size of the font in points
415
417
 
416
418
  Returns:
@@ -418,13 +420,10 @@ class ImageText:
418
420
  """
419
421
  key = (font_filename, font_size)
420
422
  if key not in self._font_cache:
421
- try:
422
- self._font_cache[key] = ImageFont.truetype(font_filename, font_size)
423
- except (OSError, IOError) as e:
424
- raise ValueError(f"Error loading font '{font_filename}': {str(e)}")
423
+ self._font_cache[key] = load_font(font_filename, font_size)
425
424
  return self._font_cache[key]
426
425
 
427
- def get_text_dimensions(self, font_filename: str, font_size: int, text: str) -> tuple[int, int]:
426
+ def get_text_dimensions(self, font_filename: str | None, font_size: int, text: str) -> tuple[int, int]:
428
427
  """
429
428
  Return dimensions (width, height) of the rendered text.
430
429
 
@@ -455,7 +454,11 @@ class ImageText:
455
454
  raise ValueError(f"Error measuring text: {str(e)}")
456
455
 
457
456
  def _get_font_baseline_offset(
458
- self, base_font_filename: str, base_font_size: int, highlight_font_filename: str, highlight_font_size: int
457
+ self,
458
+ base_font_filename: str | None,
459
+ base_font_size: int,
460
+ highlight_font_filename: str | None,
461
+ highlight_font_size: int,
459
462
  ) -> int:
460
463
  """
461
464
  Calculate the vertical offset needed to align baselines of different fonts and sizes.
@@ -497,7 +500,7 @@ class ImageText:
497
500
  def _split_lines_by_width(
498
501
  self,
499
502
  text: str,
500
- font_filename: str,
503
+ font_filename: str | None,
501
504
  font_size: int,
502
505
  box_width: int,
503
506
  ) -> list[str]:
@@ -566,7 +569,7 @@ class ImageText:
566
569
  def write_text_box(
567
570
  self,
568
571
  text: str,
569
- font_filename: str,
572
+ font_filename: str | None,
570
573
  xy: PositionType,
571
574
  box_width: int | float | None = None,
572
575
  font_size: int = 11,
@@ -615,9 +618,6 @@ class ImageText:
615
618
  if not text:
616
619
  raise ValueError("Text cannot be empty")
617
620
 
618
- if not font_filename:
619
- raise ValueError("Font filename cannot be empty")
620
-
621
621
  if font_size <= 0:
622
622
  raise ValueError("Font size must be positive")
623
623
 
@@ -831,7 +831,7 @@ class ImageText:
831
831
  def _write_line_with_highlight(
832
832
  self,
833
833
  line: str,
834
- font_filename: str,
834
+ font_filename: str | None,
835
835
  font_size: int,
836
836
  font_border_size: int,
837
837
  text_color: RGBColor,
@@ -24,6 +24,7 @@ from pydantic import Field, PrivateAttr, model_validator
24
24
  from tqdm import tqdm
25
25
 
26
26
  from videopython.base.description import BoundingBox
27
+ from videopython.base.fonts import load_font
27
28
  from videopython.editing.operation import Effect
28
29
 
29
30
  if TYPE_CHECKING:
@@ -643,12 +644,7 @@ class TextOverlay(Effect):
          return self
  
      def _get_font(self) -> ImageFont.FreeTypeFont | ImageFont.ImageFont:
-         if self.font_filename:
-             return ImageFont.truetype(self.font_filename, self.font_size)
-         try:
-             return ImageFont.truetype("DejaVuSans.ttf", self.font_size)
-         except OSError:
-             return ImageFont.load_default()
+         return load_font(self.font_filename, self.font_size)
  
      def _wrap_text(self, text: str, font: ImageFont.FreeTypeFont | ImageFont.ImageFont, max_px: int) -> str:
          lines: list[str] = []
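
Reviewer note on the `_get_font` hunk above: the removed lines implemented a small fallback chain (try a font, fall back on `OSError`), which now lives behind `load_font`. The body of `load_font` is not part of this hunk, so the following is only a generic sketch of that pattern and not the package's implementation. `load_first_available` and `fake_loader` are hypothetical names, and the loader is injected so the example runs without Pillow.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def load_first_available(candidates: Iterable[str], loader: Callable[[str], T]) -> T:
    """Return loader(path) for the first candidate that loads without OSError."""
    last_error: Optional[OSError] = None
    for path in candidates:
        try:
            return loader(path)
        except OSError as exc:
            last_error = exc  # remember the failure and try the next candidate
    raise OSError("no candidate font could be loaded") from last_error


# Stand-in loader for illustration only. With Pillow one might instead pass
# lambda path: ImageFont.truetype(path, font_size).
def fake_loader(path: str) -> str:
    if path != "DejaVuSans.ttf":
        raise OSError(f"cannot open {path}")
    return f"font:{path}"


font = load_first_available(["missing.ttf", "DejaVuSans.ttf"], fake_loader)
```

Note that the removed code did not fall back when an explicit `font_filename` failed to load: that `OSError` propagated to the caller. Whether `load_font` keeps that behaviour cannot be told from this hunk.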
{videopython-0.33.2 → videopython-0.33.3}/src/videopython/editing/transcription_overlay.py
@@ -38,7 +38,10 @@ class TranscriptionOverlay(Effect):
      streamable: ClassVar[bool] = False
      requires: ClassVar[tuple[str, ...]] = ("transcription",)
  
-     font_filename: str = Field(description="Path to a .ttf font file for rendering subtitle text.")
+     font_filename: str | None = Field(
+         None,
+         description="Path to a .ttf font file for rendering subtitle text, or None for the bundled default font.",
+     )
      font_size: int = Field(40, ge=1, description="Base font size in pixels.")
      font_border_size: int = Field(
          2, ge=0, description="Outline thickness around each character in pixels. 0 = no outline."
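
Reviewer note on the `TranscriptionOverlay` hunk above: `font_filename` changes from a required field to an optional one with a `None` default, so callers that already pass a path are unaffected and new callers can omit it. The sketch below shows the same shape with a stdlib dataclass as a stand-in for the Pydantic model. `OverlayConfig` is a hypothetical name, and the dataclass does not reproduce Pydantic's validation (for example `ge=1` on `font_size`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OverlayConfig:
    """Hypothetical stand-in for the changed fields; not the real TranscriptionOverlay."""

    font_filename: Optional[str] = None  # None: fall back to the bundled default font
    font_size: int = 40


default_cfg = OverlayConfig()  # newly possible: no font path given
custom_cfg = OverlayConfig(font_filename="fonts/Custom.ttf")  # existing usage still works
```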
videopython-0.33.2/PKG-INFO
@@ -1,258 +0,0 @@
- Metadata-Version: 2.4
- Name: videopython
- Version: 0.33.2
- Summary: Minimal video generation and processing library.
- Project-URL: Homepage, https://videopython.com
- Project-URL: Repository, https://github.com/bartwojtowicz/videopython/
- Project-URL: Documentation, https://videopython.com
- Author-email: Bartosz Wójtowicz <bartoszwojtowicz@outlook.com>, Bartosz Rudnikowicz <bartoszrudnikowicz840@gmail.com>, Piotr Pukisz <piotr.pukisz@gmail.com>
- License: Apache-2.0
- License-File: LICENSE
- Keywords: ai,editing,generation,movie,opencv,python,shorts,video,videopython
- Classifier: License :: OSI Approved :: Apache Software License
- Classifier: Operating System :: OS Independent
- Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.10
- Classifier: Programming Language :: Python :: 3.11
- Classifier: Programming Language :: Python :: 3.12
- Classifier: Programming Language :: Python :: 3.13
- Requires-Python: <3.14,>=3.10
- Requires-Dist: numpy>=1.25.2
- Requires-Dist: opencv-python-headless>=4.9.0.80
- Requires-Dist: pillow>=12.1.1
- Requires-Dist: pydantic>=2.8.0
- Requires-Dist: tqdm>=4.66.3
- Provides-Extra: ai
- Requires-Dist: accelerate>=0.29.2; extra == 'ai'
- Requires-Dist: chatterbox-tts>=0.1.7; extra == 'ai'
- Requires-Dist: demucs>=4.0.0; extra == 'ai'
- Requires-Dist: diffusers>=0.30.0; extra == 'ai'
- Requires-Dist: hf-transfer>=0.1.9; extra == 'ai'
- Requires-Dist: imagehash>=4.3; extra == 'ai'
- Requires-Dist: llama-cpp-python>=0.3.0; extra == 'ai'
- Requires-Dist: numba>=0.61.0; extra == 'ai'
- Requires-Dist: ollama>=0.4.5; extra == 'ai'
- Requires-Dist: openai-whisper>=20240930; extra == 'ai'
- Requires-Dist: pyannote-audio>=4.0.0; extra == 'ai'
- Requires-Dist: pyloudnorm>=0.1.1; extra == 'ai'
- Requires-Dist: qwen-vl-utils>=0.0.10; extra == 'ai'
- Requires-Dist: scikit-learn>=1.3.0; extra == 'ai'
- Requires-Dist: scipy>=1.10.0; extra == 'ai'
- Requires-Dist: sentencepiece>=0.1.99; extra == 'ai'
- Requires-Dist: silero-vad>=5.1; extra == 'ai'
- Requires-Dist: torch>=2.8.0; extra == 'ai'
- Requires-Dist: torchaudio>=2.8.0; extra == 'ai'
- Requires-Dist: transformers>=5.2.0; extra == 'ai'
- Requires-Dist: transnetv2-pytorch>=1.0.5; extra == 'ai'
- Requires-Dist: ultralytics>=8.0.0; extra == 'ai'
- Description-Content-Type: text/markdown
-
- # videopython
-
- [![PyPI](https://img.shields.io/pypi/v/videopython)](https://pypi.org/project/videopython/)
- [![Python](https://img.shields.io/pypi/pyversions/videopython)](https://pypi.org/project/videopython/)
- [![License](https://img.shields.io/github/license/BartWojtowicz/videopython)](LICENSE)
-
- Minimal, LLM-friendly Python library for programmatic video editing, processing, and AI video workflows.
-
- Full documentation: [videopython.com](https://videopython.com)
-
- > **Disclaimer:** This project started as a hand-written hobby project, but most of the code is now produced by LLM agents. Humans still drive direction, approve changes, and own design decisions.
-
- ## Installation
-
- ### 1. Install FFmpeg
-
- ```bash
- # macOS
- brew install ffmpeg
-
- # Ubuntu / Debian
- sudo apt-get install ffmpeg
-
- # Windows (Chocolatey)
- choco install ffmpeg
- ```
-
- ### 2. Install videopython
-
- ```bash
- pip install videopython # core video/audio editing
- pip install "videopython[ai]" # + local AI features (GPU recommended)
- ```
-
- Python `>=3.10, <3.14`. AI features run locally - no cloud API keys required, but model weights are downloaded on first use.
-
- ## Quick Start
-
- ### Imperative editing
-
- Every editing primitive is an `Operation` subclass — a Pydantic model
- whose fields ARE the JSON wire format. Apply one to a `Video`:
-
- ```python
- from videopython.base import Video
- from videopython.editing import CutSeconds, Resize, Fade
-
- video = Video.from_path("raw.mp4")
- video = CutSeconds(start=10, end=25).apply(video)
- video = Resize(width=1080, height=1920).apply(video)
- video = Fade(mode="in", duration=0.5).apply(video)
- video.save("output.mp4")
- ```
-
- Concatenate clips with `+` (must share fps + dimensions):
-
- ```python
- combined = video_a + video_b
- ```
-
- ### JSON editing plans
-
- Define multi-segment edits as JSON — the format LLM-driven workflows
- generate against. `VideoEdit.json_schema()` returns the schema:
-
- ```python
- from videopython.editing import VideoEdit
-
- plan = {
-     "segments": [{
-         "source": "raw.mp4",
-         "start": 10.0,
-         "end": 20.0,
-         "operations": [
-             {"op": "resize", "width": 1080, "height": 1920},
-             {"op": "color_adjust", "saturation": 1.15, "contrast": 1.05},
-             {"op": "fade", "mode": "in", "duration": 0.5,
-              "window": {"stop": 0.5}},
-         ],
-     }],
- }
-
- edit = VideoEdit.from_dict(plan)
- edit.validate() # dry-run via metadata, no frames loaded
- edit.run_to_file("output.mp4") # stream to disk, ~constant memory
- ```
-
- `run_to_file()` pipes ffmpeg decode → per-frame effects → ffmpeg encode,
- so memory stays bounded even for hour-long sources. Use `edit.run()`
- instead if you want the result back in memory as a `Video`.
-
- ### AI generation
-
- ```python
- from videopython.ai import TextToImage, ImageToVideo, TextToSpeech
- from videopython.editing import Resize
-
- image = TextToImage().generate_image("A cinematic mountain sunrise")
- video = ImageToVideo().generate_video(image=image)
- audio = TextToSpeech().generate_audio("Welcome to videopython.")
-
- video = Resize(width=1080, height=1920).apply(video)
- video.add_audio(audio).save("ai_video.mp4")
- ```
-
- ## LLM & AI Agent Integration
-
- The library is built for LLM-driven editing. Two surfaces matter:
-
- **1. Plan schema for tool / structured-output calls.**
- `VideoEdit.json_schema()` returns a JSON Schema covering segments,
- `post_operations`, and a discriminated union over every registered
- `Operation`. Drop it into any LLM API:
-
- ```python
- from videopython.editing import VideoEdit
-
- schema = VideoEdit.json_schema()
- # Anthropic: tools=[{"name": "edit", "input_schema": schema}]
- # OpenAI: tools=[{"type": "function",
- # "function": {"name": "edit", "parameters": schema}}]
- ```
-
- Validate the LLM's output without touching the filesystem, then run it:
-
- ```python
- edit = VideoEdit.from_dict(plan)
- edit.validate() # catches bad ops, time ranges, fps mismatches
- edit.run_to_file("output.mp4")
- ```
-
- **2. Operation discovery for agent loops.**
- Every registered op exposes its own Pydantic schema, so an agent can
- introspect what's available without hardcoded lists:
-
- ```python
- from videopython.editing import Operation, OpCategory
-
- for op_id, cls in Operation.registry().items():
-     print(f"{op_id}: {(cls.__doc__ or '').splitlines()[0]}")
-
- schema = Operation.get("color_adjust").model_json_schema() # per-op schema
- ```
-
- Field constraints (`minimum`, `maximum`, `enum`, `exclusiveMinimum`,
- nullability) flow through to the schema, so LLMs that support
- constrained generation produce valid parameters on the first try.
-
- For ops that need side-channel data (e.g. `silence_removal` and
- `add_subtitles` need a `Transcription`), pass it via `context`:
-
- ```python
- edit.run(context={"transcription": my_transcription})
- ```
-
- Docs: [Editing Plans](https://videopython.com/api/editing/) | [Operations](https://videopython.com/api/operations/) | [LLM Integration Guide](https://videopython.com/guides/llm-integration/)
-
- ## Features
-
- ### `videopython.base` - data containers + I/O (no AI dependencies)
-
- | Area | Highlights |
- |---|---|
- | **Video I/O** | `Video`, `VideoMetadata`, `FrameIterator` - load, save, inspect |
- | **Text rendering** | `ImageText` - generic PIL text-on-image primitive |
- | **Transcription** | `Transcription`, `TranscriptionSegment`, `TranscriptionWord` - data classes returned by transcription backends |
- | **Result types** | `BoundingBox`, `DetectedFace`, `FaceTrack`, `SceneBoundary`, `AudioEvent`, `MotionInfo`, ... - shared by editing and AI |
-
- ### `videopython.audio` - audio data container
-
- | Area | Highlights |
- |---|---|
- | **Audio** | `Audio`, `AudioMetadata` - load/save, overlay, concat, normalize, time-stretch, silence detection, segment classification |
-
- ### `videopython.editing` - editing primitives + plan runner
-
- | Area | Highlights |
- |---|---|
- | **Operation foundation** | `Operation`, `Effect`, `TimeRange`, `OpCategory` - Pydantic base + auto-registry + discriminated-union schema |
- | **Editing plans** | `VideoEdit`, `SegmentConfig` - JSON/LLM-friendly multi-segment plans with JSON Schema generation, dry-run validation, and streaming `run_to_file` |
- | **Transforms** | Cut (time/frame), resize, crop, FPS resampling, speed change, reverse, freeze frame, silence removal |
- | **Effects** | Blur, zoom, color grading, vignette, Ken Burns, image overlay, fade, text overlay, volume adjust |
- | **Subtitles** | `TranscriptionOverlay` - animated word-by-word subtitle rendering |
-
- API docs: [Core](https://videopython.com/api/index/) | [Video](https://videopython.com/api/core/video/) | [Audio](https://videopython.com/api/core/audio/) | [Editing Plans](https://videopython.com/api/editing/) | [Operations](https://videopython.com/api/operations/) | [Transforms](https://videopython.com/api/transforms/) | [Effects](https://videopython.com/api/effects/) | [Text](https://videopython.com/api/text/)
-
- ### `videopython.ai` - local AI features (install with `[ai]`)
-
- | Area | Highlights |
- |---|---|
- | **Generation** | `TextToVideo`, `ImageToVideo`, `TextToImage`, `TextToSpeech`, `TextToMusic` |
- | **Understanding** | `AudioToText` (transcription), `AudioClassifier`, `SceneVLM` (structured visual scene description), `FaceTracker` (per-shot face tracks) |
- | **Scene detection** | `SemanticSceneDetector` (neural scene boundaries) |
- | **Video analysis** | `VideoAnalyzer` - full-pipeline analysis combining multiple AI capabilities |
- | **Transforms** | `FaceTrackingCrop` |
- | **Dubbing** | `VideoDubber` - voice cloning and revoicing with timing sync |
-
- API docs: [Generation](https://videopython.com/api/ai/generation/) | [Understanding](https://videopython.com/api/ai/understanding/) | [Transforms](https://videopython.com/api/ai/transforms/) | [Dubbing](https://videopython.com/api/ai/dubbing/)
-
- ## Examples
-
- - [Social Media Clip](https://videopython.com/examples/social-clip/)
- - [AI-Generated Video](https://videopython.com/examples/ai-video/)
- - [Auto-Subtitles](https://videopython.com/examples/auto-subtitles/)
- - [Processing Large Videos](https://videopython.com/examples/large-videos/)
-
- ## Development
-
- See [`DEVELOPMENT.md`](DEVELOPMENT.md) for local setup, testing, and contribution workflow.
videopython-0.33.2/README.md
@@ -1,209 +0,0 @@
- # videopython
-
- [![PyPI](https://img.shields.io/pypi/v/videopython)](https://pypi.org/project/videopython/)
- [![Python](https://img.shields.io/pypi/pyversions/videopython)](https://pypi.org/project/videopython/)
- [![License](https://img.shields.io/github/license/BartWojtowicz/videopython)](LICENSE)
-
- Minimal, LLM-friendly Python library for programmatic video editing, processing, and AI video workflows.
-
- Full documentation: [videopython.com](https://videopython.com)
-
- > **Disclaimer:** This project started as a hand-written hobby project, but most of the code is now produced by LLM agents. Humans still drive direction, approve changes, and own design decisions.
-
- ## Installation
-
- ### 1. Install FFmpeg
-
- ```bash
- # macOS
- brew install ffmpeg
-
- # Ubuntu / Debian
- sudo apt-get install ffmpeg
-
- # Windows (Chocolatey)
- choco install ffmpeg
- ```
-
- ### 2. Install videopython
-
- ```bash
- pip install videopython # core video/audio editing
- pip install "videopython[ai]" # + local AI features (GPU recommended)
- ```
-
- Python `>=3.10, <3.14`. AI features run locally - no cloud API keys required, but model weights are downloaded on first use.
-
- ## Quick Start
-
- ### Imperative editing
-
- Every editing primitive is an `Operation` subclass — a Pydantic model
- whose fields ARE the JSON wire format. Apply one to a `Video`:
-
- ```python
- from videopython.base import Video
- from videopython.editing import CutSeconds, Resize, Fade
-
- video = Video.from_path("raw.mp4")
- video = CutSeconds(start=10, end=25).apply(video)
- video = Resize(width=1080, height=1920).apply(video)
- video = Fade(mode="in", duration=0.5).apply(video)
- video.save("output.mp4")
- ```
-
- Concatenate clips with `+` (must share fps + dimensions):
-
- ```python
- combined = video_a + video_b
- ```
-
- ### JSON editing plans
-
- Define multi-segment edits as JSON — the format LLM-driven workflows
- generate against. `VideoEdit.json_schema()` returns the schema:
-
- ```python
- from videopython.editing import VideoEdit
-
- plan = {
-     "segments": [{
-         "source": "raw.mp4",
-         "start": 10.0,
-         "end": 20.0,
-         "operations": [
-             {"op": "resize", "width": 1080, "height": 1920},
-             {"op": "color_adjust", "saturation": 1.15, "contrast": 1.05},
-             {"op": "fade", "mode": "in", "duration": 0.5,
-              "window": {"stop": 0.5}},
-         ],
-     }],
- }
-
- edit = VideoEdit.from_dict(plan)
- edit.validate() # dry-run via metadata, no frames loaded
- edit.run_to_file("output.mp4") # stream to disk, ~constant memory
- ```
-
- `run_to_file()` pipes ffmpeg decode → per-frame effects → ffmpeg encode,
- so memory stays bounded even for hour-long sources. Use `edit.run()`
- instead if you want the result back in memory as a `Video`.
-
- ### AI generation
-
- ```python
- from videopython.ai import TextToImage, ImageToVideo, TextToSpeech
- from videopython.editing import Resize
-
- image = TextToImage().generate_image("A cinematic mountain sunrise")
- video = ImageToVideo().generate_video(image=image)
- audio = TextToSpeech().generate_audio("Welcome to videopython.")
-
- video = Resize(width=1080, height=1920).apply(video)
- video.add_audio(audio).save("ai_video.mp4")
- ```
-
- ## LLM & AI Agent Integration
-
- The library is built for LLM-driven editing. Two surfaces matter:
-
- **1. Plan schema for tool / structured-output calls.**
- `VideoEdit.json_schema()` returns a JSON Schema covering segments,
- `post_operations`, and a discriminated union over every registered
- `Operation`. Drop it into any LLM API:
-
- ```python
- from videopython.editing import VideoEdit
-
- schema = VideoEdit.json_schema()
- # Anthropic: tools=[{"name": "edit", "input_schema": schema}]
- # OpenAI: tools=[{"type": "function",
- # "function": {"name": "edit", "parameters": schema}}]
- ```
-
- Validate the LLM's output without touching the filesystem, then run it:
-
- ```python
- edit = VideoEdit.from_dict(plan)
- edit.validate() # catches bad ops, time ranges, fps mismatches
- edit.run_to_file("output.mp4")
- ```
-
- **2. Operation discovery for agent loops.**
- Every registered op exposes its own Pydantic schema, so an agent can
- introspect what's available without hardcoded lists:
-
- ```python
- from videopython.editing import Operation, OpCategory
-
- for op_id, cls in Operation.registry().items():
-     print(f"{op_id}: {(cls.__doc__ or '').splitlines()[0]}")
-
- schema = Operation.get("color_adjust").model_json_schema() # per-op schema
- ```
-
- Field constraints (`minimum`, `maximum`, `enum`, `exclusiveMinimum`,
- nullability) flow through to the schema, so LLMs that support
- constrained generation produce valid parameters on the first try.
-
- For ops that need side-channel data (e.g. `silence_removal` and
- `add_subtitles` need a `Transcription`), pass it via `context`:
-
- ```python
- edit.run(context={"transcription": my_transcription})
- ```
-
- Docs: [Editing Plans](https://videopython.com/api/editing/) | [Operations](https://videopython.com/api/operations/) | [LLM Integration Guide](https://videopython.com/guides/llm-integration/)
-
- ## Features
-
- ### `videopython.base` - data containers + I/O (no AI dependencies)
-
- | Area | Highlights |
- |---|---|
- | **Video I/O** | `Video`, `VideoMetadata`, `FrameIterator` - load, save, inspect |
- | **Text rendering** | `ImageText` - generic PIL text-on-image primitive |
- | **Transcription** | `Transcription`, `TranscriptionSegment`, `TranscriptionWord` - data classes returned by transcription backends |
- | **Result types** | `BoundingBox`, `DetectedFace`, `FaceTrack`, `SceneBoundary`, `AudioEvent`, `MotionInfo`, ... - shared by editing and AI |
-
- ### `videopython.audio` - audio data container
-
- | Area | Highlights |
- |---|---|
- | **Audio** | `Audio`, `AudioMetadata` - load/save, overlay, concat, normalize, time-stretch, silence detection, segment classification |
-
- ### `videopython.editing` - editing primitives + plan runner
-
- | Area | Highlights |
- |---|---|
- | **Operation foundation** | `Operation`, `Effect`, `TimeRange`, `OpCategory` - Pydantic base + auto-registry + discriminated-union schema |
- | **Editing plans** | `VideoEdit`, `SegmentConfig` - JSON/LLM-friendly multi-segment plans with JSON Schema generation, dry-run validation, and streaming `run_to_file` |
- | **Transforms** | Cut (time/frame), resize, crop, FPS resampling, speed change, reverse, freeze frame, silence removal |
- | **Effects** | Blur, zoom, color grading, vignette, Ken Burns, image overlay, fade, text overlay, volume adjust |
- | **Subtitles** | `TranscriptionOverlay` - animated word-by-word subtitle rendering |
-
- API docs: [Core](https://videopython.com/api/index/) | [Video](https://videopython.com/api/core/video/) | [Audio](https://videopython.com/api/core/audio/) | [Editing Plans](https://videopython.com/api/editing/) | [Operations](https://videopython.com/api/operations/) | [Transforms](https://videopython.com/api/transforms/) | [Effects](https://videopython.com/api/effects/) | [Text](https://videopython.com/api/text/)
-
- ### `videopython.ai` - local AI features (install with `[ai]`)
-
- | Area | Highlights |
- |---|---|
- | **Generation** | `TextToVideo`, `ImageToVideo`, `TextToImage`, `TextToSpeech`, `TextToMusic` |
- | **Understanding** | `AudioToText` (transcription), `AudioClassifier`, `SceneVLM` (structured visual scene description), `FaceTracker` (per-shot face tracks) |
- | **Scene detection** | `SemanticSceneDetector` (neural scene boundaries) |
- | **Video analysis** | `VideoAnalyzer` - full-pipeline analysis combining multiple AI capabilities |
- | **Transforms** | `FaceTrackingCrop` |
- | **Dubbing** | `VideoDubber` - voice cloning and revoicing with timing sync |
-
- API docs: [Generation](https://videopython.com/api/ai/generation/) | [Understanding](https://videopython.com/api/ai/understanding/) | [Transforms](https://videopython.com/api/ai/transforms/) | [Dubbing](https://videopython.com/api/ai/dubbing/)
-
- ## Examples
-
- - [Social Media Clip](https://videopython.com/examples/social-clip/)
- - [AI-Generated Video](https://videopython.com/examples/ai-video/)
- - [Auto-Subtitles](https://videopython.com/examples/auto-subtitles/)
- - [Processing Large Videos](https://videopython.com/examples/large-videos/)
-
- ## Development
-
- See [`DEVELOPMENT.md`](DEVELOPMENT.md) for local setup, testing, and contribution workflow.
File without changes
File without changes