subcap 0.1.0__tar.gz
- subcap-0.1.0/.gitignore +4 -0
- subcap-0.1.0/LICENSE +21 -0
- subcap-0.1.0/PKG-INFO +93 -0
- subcap-0.1.0/README.md +83 -0
- subcap-0.1.0/docs/superpowers/plans/2026-04-11-subcap.md +1245 -0
- subcap-0.1.0/docs/superpowers/specs/2026-04-11-subcap-design.md +126 -0
- subcap-0.1.0/pyproject.toml +23 -0
- subcap-0.1.0/src/subcap/__init__.py +1 -0
- subcap-0.1.0/src/subcap/align.py +82 -0
- subcap-0.1.0/src/subcap/cli.py +116 -0
- subcap-0.1.0/src/subcap/detect.py +78 -0
- subcap-0.1.0/src/subcap/encode.py +68 -0
- subcap-0.1.0/src/subcap/segment.py +93 -0
- subcap-0.1.0/src/subcap/styles.py +139 -0
- subcap-0.1.0/src/subcap/types.py +17 -0
- subcap-0.1.0/tests/__init__.py +0 -0
- subcap-0.1.0/tests/test_align.py +150 -0
- subcap-0.1.0/tests/test_cli.py +50 -0
- subcap-0.1.0/tests/test_detect.py +115 -0
- subcap-0.1.0/tests/test_encode.py +48 -0
- subcap-0.1.0/tests/test_segment.py +249 -0
- subcap-0.1.0/tests/test_styles.py +154 -0
subcap-0.1.0/.gitignore
ADDED
subcap-0.1.0/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Joseph Nordqvist
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
subcap-0.1.0/PKG-INFO
ADDED
@@ -0,0 +1,93 @@
+Metadata-Version: 2.4
+Name: subcap
+Version: 0.1.0
+Summary: Burn precisely-timed captions into video using forced alignment.
+License-Expression: MIT
+License-File: LICENSE
+Requires-Python: >=3.10
+Requires-Dist: stable-ts>=2.16
+Description-Content-Type: text/markdown
+
+# subcap
+
+Burn precisely-timed captions into video. Give it a video and a transcript — it handles alignment, styling, and encoding.
+
+Unlike speech-to-text tools that guess both *what* is said and *when*, subcap uses **forced alignment**: you provide the transcript, and it maps each word to its exact position in the audio waveform. The result is phoneme-level timing accuracy.
+
+## Install
+
+```
+pip install subcap
+```
+
+Requires [ffmpeg](https://ffmpeg.org/) with libass support.
+
+## Usage
+
+```bash
+# Align a transcript and burn captions in
+subcap video.mov transcript.txt -o output.mp4
+
+# Use an existing SRT file (skips alignment)
+subcap video.mov subtitles.srt -o output.mp4
+
+# Choose a style
+subcap video.mov transcript.txt --style outline
+
+# ProRes output for editing
+subcap video.mov transcript.txt --quality studio -o output.mov
+
+# Portrait/vertical video (auto-detected)
+subcap shorts.mp4 transcript.txt -o shorts_captioned.mp4
+```
+
+## Options
+
+```
+subcap <video> <transcript> [options]
+
+  -o, --output     Output path (default: <input>_captioned.mp4)
+  --style          modern | outline | minimal | bold (default: modern)
+  --quality        standard | high | studio (default: standard)
+  --max-lines      Max lines per subtitle (default: 2)
+  --max-chars      Max characters per line (default: auto)
+  --line-spacing   Gap between lines in px (default: auto)
+  --position       bottom | center | top (default: bottom)
+```
+
+### Styles
+
+| Preset | Look |
+|--------|------|
+| `modern` | White bold text, semi-transparent dark box |
+| `outline` | White text with black outline |
+| `minimal` | Lighter weight, subtle shadow |
+| `bold` | Large text, opaque dark box |
+
+### Quality
+
+| Preset | Codec | Use case |
+|--------|-------|----------|
+| `standard` | H.264 | Sharing, uploading |
+| `high` | H.265 | Smaller files |
+| `studio` | ProRes 422 | Editing, broadcast |
+
+## How it works
+
+1. Extracts audio from the video
+2. Runs forced alignment via [stable-ts](https://github.com/jianfch/stable-ts) to map each word to its exact position in the audio
+3. Segments words into readable subtitle chunks
+4. Generates styled ASS subtitles adapted to the video's aspect ratio
+5. Burns captions into the video via ffmpeg
+
+## Acknowledgments
+
+Built on:
+
+- **[stable-ts](https://github.com/jianfch/stable-ts)** — Stabilized Whisper timestamps and forced alignment
+- **[OpenAI Whisper](https://github.com/openai/whisper)** — Speech recognition model used as the acoustic backbone
+- **[ffmpeg](https://ffmpeg.org/)** — Video encoding and subtitle rendering via libass
+
+## License
+
+MIT
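Step 3 of the "How it works" pipeline (grouping aligned words into readable subtitle chunks) can be sketched in plain Python. This greedy chunker illustrates the general technique under assumed defaults (roughly 42 characters per cue, a 0.6 s pause threshold); the `Word` type and `segment` function are illustrative and are not subcap's actual `segment.py`.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One aligned word with its start/end time in seconds."""
    text: str
    start: float
    end: float


def segment(words, max_chars=42, max_gap=0.6):
    """Greedily group word timings into subtitle cues.

    A new cue starts when appending the next word would exceed
    max_chars, or when the silence before it exceeds max_gap seconds.
    Returns (text, start, end) tuples, one per cue.
    """
    cues, current = [], []
    for w in words:
        if current:
            joined = " ".join(t.text for t in current) + " " + w.text
            gap = w.start - current[-1].end
            if len(joined) > max_chars or gap > max_gap:
                cues.append(current)
                current = []
        current.append(w)
    if current:
        cues.append(current)
    return [(" ".join(w.text for w in c), c[0].start, c[-1].end) for c in cues]
```

A long pause splits a cue even when the text would still fit, which keeps captions from lingering over silence.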
subcap-0.1.0/README.md
ADDED
|
@@ -0,0 +1,83 @@
@@ -0,0 +1,83 @@
+# subcap
+
+Burn precisely-timed captions into video. Give it a video and a transcript — it handles alignment, styling, and encoding.
+
+Unlike speech-to-text tools that guess both *what* is said and *when*, subcap uses **forced alignment**: you provide the transcript, and it maps each word to its exact position in the audio waveform. The result is phoneme-level timing accuracy.
+
+## Install
+
+```
+pip install subcap
+```
+
+Requires [ffmpeg](https://ffmpeg.org/) with libass support.
+
+## Usage
+
+```bash
+# Align a transcript and burn captions in
+subcap video.mov transcript.txt -o output.mp4
+
+# Use an existing SRT file (skips alignment)
+subcap video.mov subtitles.srt -o output.mp4
+
+# Choose a style
+subcap video.mov transcript.txt --style outline
+
+# ProRes output for editing
+subcap video.mov transcript.txt --quality studio -o output.mov
+
+# Portrait/vertical video (auto-detected)
+subcap shorts.mp4 transcript.txt -o shorts_captioned.mp4
+```
+
+## Options
+
+```
+subcap <video> <transcript> [options]
+
+  -o, --output     Output path (default: <input>_captioned.mp4)
+  --style          modern | outline | minimal | bold (default: modern)
+  --quality        standard | high | studio (default: standard)
+  --max-lines      Max lines per subtitle (default: 2)
+  --max-chars      Max characters per line (default: auto)
+  --line-spacing   Gap between lines in px (default: auto)
+  --position       bottom | center | top (default: bottom)
+```
+
+### Styles
+
+| Preset | Look |
+|--------|------|
+| `modern` | White bold text, semi-transparent dark box |
+| `outline` | White text with black outline |
+| `minimal` | Lighter weight, subtle shadow |
+| `bold` | Large text, opaque dark box |
+
+### Quality
+
+| Preset | Codec | Use case |
+|--------|-------|----------|
+| `standard` | H.264 | Sharing, uploading |
+| `high` | H.265 | Smaller files |
+| `studio` | ProRes 422 | Editing, broadcast |
+
+## How it works
+
+1. Extracts audio from the video
+2. Runs forced alignment via [stable-ts](https://github.com/jianfch/stable-ts) to map each word to its exact position in the audio
+3. Segments words into readable subtitle chunks
+4. Generates styled ASS subtitles adapted to the video's aspect ratio
+5. Burns captions into the video via ffmpeg
+
+## Acknowledgments
+
+Built on:
+
+- **[stable-ts](https://github.com/jianfch/stable-ts)** — Stabilized Whisper timestamps and forced alignment
+- **[OpenAI Whisper](https://github.com/openai/whisper)** — Speech recognition model used as the acoustic backbone
+- **[ffmpeg](https://ffmpeg.org/)** — Video encoding and subtitle rendering via libass
+
+## License
+
+MIT
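Step 4 of the pipeline emits ASS subtitles, whose event times use the `H:MM:SS.cc` format (centisecond precision, single-digit hours). A minimal formatter for that format, written as a standalone sketch rather than subcap's own code:

```python
def ass_timestamp(seconds: float) -> str:
    """Format a time in seconds as an ASS event time: H:MM:SS.cc."""
    cs = round(seconds * 100)          # total centiseconds
    h, rem = divmod(cs, 360_000)       # 360,000 cs per hour
    m, rem = divmod(rem, 6_000)        # 6,000 cs per minute
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"
```

Rounding to centiseconds first, then splitting into fields, avoids the off-by-one drift you get from formatting hours, minutes, and seconds with independent floating-point arithmetic.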