star-reader 0.1.8__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- star/CHANGELOG.md +549 -0
- star/LICENSE +674 -0
- star/README.md +2045 -0
- star/__init__.py +28 -0
- star/__main__.py +5 -0
- star/_runtime.py +507 -0
- star/annotations.py +146 -0
- star/app.py +256 -0
- star/braille.py +175 -0
- star/cache.py +71 -0
- star/citations.py +271 -0
- star/convert.py +276 -0
- star/diagnostics.py +366 -0
- star/documents.py +1790 -0
- star/feeds.py +58 -0
- star/flashcards.py +78 -0
- star/gui.py +5653 -0
- star/markup.py +748 -0
- star/render.py +335 -0
- star/search.py +194 -0
- star/settings.py +227 -0
- star/spellcheck.py +101 -0
- star/stats.py +305 -0
- star/summarize.py +72 -0
- star/themes.py +256 -0
- star/transcribe.py +75 -0
- star/translate.py +68 -0
- star/tts.py +2942 -0
- star/ttstext.py +826 -0
- star/tui.py +4404 -0
- star/vocab.py +56 -0
- star/watch.py +316 -0
- star_reader-0.1.8.dist-info/METADATA +2127 -0
- star_reader-0.1.8.dist-info/RECORD +38 -0
- star_reader-0.1.8.dist-info/WHEEL +5 -0
- star_reader-0.1.8.dist-info/entry_points.txt +2 -0
- star_reader-0.1.8.dist-info/licenses/LICENSE +674 -0
- star_reader-0.1.8.dist-info/top_level.txt +1 -0
star/CHANGELOG.md
ADDED
@@ -0,0 +1,549 @@
# Changelog

All notable changes to **star: Speaking Terminal Access Reader** are documented
in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project aims to follow [Semantic Versioning](https://semver.org/).

---

## [0.1.8] - 2026-06-23

### Added

- **Published to PyPI.** star is now installable with `pip install star-reader`
  (or `pipx install star-reader`), with no manual wheel download. The release
  workflow publishes the wheel and sdist via PyPI **trusted publishing** (OIDC,
  no stored API token): pre-release tags (e.g. `v0.1.8-rc1`) go to TestPyPI and
  final tags to PyPI.
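
The tag rule described above (pre-release tags to TestPyPI, final tags to PyPI) could be expressed roughly as follows. This is a minimal sketch, not the release workflow itself: the function name `publish_target`, the tag patterns, and the returned labels are assumptions made for illustration.

```python
import re

# Assumed tag shapes (not taken from the workflow): v1.2.3 is final,
# v1.2.3-rc1 / v1.2.3a2 / v1.2.3.dev4 are pre-releases.
FINAL_TAG = re.compile(r"^v\d+\.\d+\.\d+$")
PRE_TAG = re.compile(r"^v\d+\.\d+\.\d+[-.]?(a|b|rc|dev)\d+$")


def publish_target(tag: str) -> str:
    """Return a label for the index a release tag would be published to."""
    if FINAL_TAG.match(tag):
        return "pypi"
    if PRE_TAG.match(tag):
        return "testpypi"
    # Anything else is rejected rather than silently published somewhere.
    raise ValueError(f"not a release tag: {tag!r}")
```

Rejecting unknown tag shapes is a design choice of this sketch; the actual workflow may treat them differently.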

### Build & CI

- **Continuous integration.** A GitHub Actions test matrix (Linux / Windows /
  macOS × Python 3.11–3.13, with one leg that installs the optional packages so
  the real-behaviour tests run) and a non-blocking `ruff` lint gate run on every
  push and pull request.
- **Automated releases.** A tag-triggered workflow builds the universal wheel +
  sdist, the Windows `star.pyz`, and the Windows `star.exe`, and publishes a
  GitHub Release with generated notes.
- **Optional lean Windows build.** The Windows `star.exe` still bundles the
  offline dictation stack (Whisper + PyTorch + the `base` model) **by default**,
  so users get voice dictation out of the box. A new `-Lean` switch on
  `tools/build-windows.ps1` (or the release workflow's `lean: true` input) skips
  that multi-GB stack for a fast, small build, useful for quick test builds and
  CI iteration; a lean `star.exe` reports dictation as unavailable in
  `star --deps` and is otherwise fully functional.

---

## [0.1.7] - 2026-06-23

### Added

- **Document translation.** A new **Tools ▸ Translate Document…**
  (`Ctrl+Shift+X`) translates the open document into any of 15 common languages
  via Google Translate (no API key, no account). A picker dialog chooses the
  target language and shows the result in a read-only pane; the network call
  runs on a background thread so the window stays responsive, and the input is
  capped at 15 000 characters per request to stay within rate limits. Requires
  the optional `deep-translator` package; the menu item prompts to install it
  when absent.
- **RSS / Atom feed reading.** **File ▸ Open Feed…** (`Ctrl+Shift+M`) fetches a
  feed URL, lists its articles, and opens the chosen one in the reader through
  star's normal URL-loading path. Useful for tracking arXiv, PubMed, bioRxiv,
  or journal feeds without leaving star. Requires the optional `feedparser`
  package; the menu item prompts to install it when absent.
- **Difficult-word overlay.** **View ▸ Reading Aids ▸ Highlight Difficult
  Words** (`Ctrl+Alt+O`) tints uncommon / academic vocabulary by word
  frequency, giving a visual pre-scan of dense terminology before reading. The
  overlay is non-destructive (it rides the existing extra-selection pipeline,
  sitting under user highlights and the TTS word highlight), persists across
  sessions (`qt_vocab_highlight`), and recomputes on each document load.
  Requires the optional `wordfreq` package.
- **Dependency status report.** A new `star --deps` flag prints the
  availability of every optional dependency, grouped by area, with a one-line
  description and a copy-paste install hint for anything missing. It is backed
  by a new `star.diagnostics` module that is the single source of truth for
  star's optional dependencies.
- **New optional-dependency groups.** `translate` (`deep-translator`), `feeds`
  (`feedparser`), and `vocab` (`wordfreq`), all folded into the `all` extra and
  mirrored in `requirements-optional.txt`.
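
A dependency status report of the kind `star --deps` prints could be built along these lines. This is a sketch under stated assumptions, not the contents of `star.diagnostics`: the `OPTIONAL_DEPS` registry layout, the function names, and the output format are all hypothetical; only the package names come from the entries above.

```python
from importlib.util import find_spec

# Hypothetical registry: group -> [(import name, pip name, description)].
OPTIONAL_DEPS = {
    "translate": [("deep_translator", "deep-translator", "document translation")],
    "feeds": [("feedparser", "feedparser", "RSS / Atom feed reading")],
    "vocab": [("wordfreq", "wordfreq", "difficult-word overlay")],
}


def is_available(module: str) -> bool:
    """True if the module can be located, without importing it."""
    try:
        return find_spec(module) is not None
    except (ImportError, ValueError):
        return False


def deps_report(registry=OPTIONAL_DEPS) -> str:
    """Render one line per dependency, grouped by area, with an install hint."""
    lines = []
    for group, deps in registry.items():
        lines.append(f"[{group}]")
        for module, pip_name, desc in deps:
            if is_available(module):
                lines.append(f"  ok       {pip_name}: {desc}")
            else:
                lines.append(
                    f"  missing  {pip_name}: {desc} (pip install {pip_name})"
                )
    return "\n".join(lines)
```

Using `find_spec` keeps the report cheap because nothing heavy is actually imported; a stricter check would import each module, which is what the consistency test described under Tests asserts.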

### Tests

- **General dependency harness.** `tests/test_dependencies.py` verifies the new
  diagnostics registry against the codebase: a completeness check fails if any
  import guard is ever added without being registered, and a per-dependency
  consistency check asserts that anything reported as available really does
  import. `tests/test_features.py` covers the translation, feed, and
  difficult-word logic, including their graceful-degradation paths.
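
One way a completeness check like the one described above might find import guards is to walk the source with `ast` and collect modules imported inside `try` blocks that catch `ImportError`. The helper names below are hypothetical and the real `tests/test_dependencies.py` may work differently; this only illustrates the idea.

```python
import ast

_GUARD_NAMES = {"ImportError", "ModuleNotFoundError", "Exception"}


def _catches_import_error(handler: ast.ExceptHandler) -> bool:
    """True for bare except, or an except clause naming an import-related error."""
    if handler.type is None:
        return True
    if isinstance(handler.type, ast.Tuple):
        candidates = handler.type.elts
    else:
        candidates = [handler.type]
    return any(isinstance(n, ast.Name) and n.id in _GUARD_NAMES for n in candidates)


def guarded_imports(source: str) -> set:
    """Top-level module names imported directly inside a guarded try block."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Try):
            continue
        if not any(_catches_import_error(h) for h in node.handlers):
            continue
        for stmt in node.body:
            if isinstance(stmt, ast.Import):
                found.update(alias.name.split(".")[0] for alias in stmt.names)
            elif isinstance(stmt, ast.ImportFrom) and stmt.module and stmt.level == 0:
                found.add(stmt.module.split(".")[0])
    return found


def unregistered(source: str, registry) -> set:
    """Guarded imports that the (hypothetical) registry does not know about."""
    return guarded_imports(source) - set(registry)
```

A test would then assert that `unregistered(...)` is empty for every module in the package.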

### Notes

- The three new commands use `Ctrl+Shift+X`, `Ctrl+Shift+M`, and `Ctrl+Alt+O`;
  the more intuitive `Ctrl+Shift+L/F` and `Ctrl+Alt+W` were already bound
  (live preview, themes folder, text spacing). All three are also reachable
  from the F2 command palette, which now additionally lists Summarize, Anki
  export, and Check Spelling for completeness.

---

## [0.1.6] - 2026-06-23

### Added

- **Document summarization.** A new **Tools ▸ Summarize Document**
  (`Ctrl+Shift+U`) condenses the open document to its key sentences using the
  LexRank algorithm (via the optional `sumy` package) and shows the result in a
  read-only dialog. The number of sentences is configurable through the
  `summary_sentences` setting (default 7). Summarization runs on a background
  thread so the window stays responsive on long documents. Requires
  `pip install sumy`; the menu item prompts to install it when absent. The
  NLTK sentence-tokenizer data it needs is fetched automatically on first use.
- **Anki flashcard export.** **File ▸ Export ▸ Anki Flashcards…**
  (`Ctrl+Alt+H`) turns the current document's notes into an Anki deck
  (`.apkg`): each note becomes one card with the highlighted passage on the
  front and your note on the back. Requires the optional `genanki` package;
  the menu item prompts to install it when absent, and prompts you to add a
  note first if the document has none.
- **Spell checking in edit mode.** While editing a document's Markdown source,
  misspelled words are underlined with a red squiggle, rechecked as you type.
  **Edit ▸ Check Spelling** (`F7`) counts the misspellings and lists them in a
  dialog, in or out of edit mode. Both use the optional `pyspellchecker`
  package and degrade gracefully when it is absent: edit mode stays fully
  usable, and the menu item prompts to install it.
- **New optional-dependency groups.** `summarize` (`sumy`), `flashcards`
  (`genanki`), and `spellcheck` (`pyspellchecker`), all folded into the `all`
  extra, plus a comment-annotated `requirements-optional.txt` mirroring the
  optional packages for `pip install -r` users.
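
The graceful-degradation pattern these optional features rely on can be sketched as a guarded import plus a function that signals "unavailable" instead of failing. The function name `find_misspellings` and the `None` return convention are assumptions for illustration; `SpellChecker().unknown(...)` is the `pyspellchecker` call for words missing from its dictionary.

```python
import re

try:  # pyspellchecker is optional; its import name is `spellchecker`
    from spellchecker import SpellChecker
except ImportError:
    SpellChecker = None

WORD = re.compile(r"[A-Za-z']+")


def find_misspellings(text: str):
    """Return sorted misspelled words, or None when pyspellchecker is absent."""
    if SpellChecker is None:
        return None  # caller can prompt to install instead of crashing
    words = [w.lower() for w in WORD.findall(text)]
    return sorted(SpellChecker().unknown(words))
```

A caller can then treat `None` as "offer the install hint" and a list, possibly empty, as a real result.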

### Fixed

- **Reading highlight no longer runs ahead of eSpeak-NG speech.** In its
  in-process (libespeak-ng) mode, eSpeak synthesizes a whole sentence's audio
  in a burst and reports all of that sentence's word events at once, well
  before the words are actually heard, which made the highlight race up to a
  sentence ahead of the audio. star now paces each word event to the word's
  real audio position (which the engine reports per event) and only advances
  the highlight when that moment arrives, so the highlight follows playback
  instead of synthesis. The highlight timer also tracks these playback-accurate
  events tightly (within a single word) for this backend. A new
  `espeak_highlight_offset_ms` setting (default 120) compensates for audio
  output latency: raise it if highlights still lead the audio, lower it toward
  0 if they lag.
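
The pacing idea in this fix (queue word events as they arrive in a burst, release each one only when playback reaches its audio position plus a latency offset) can be illustrated with a small queue. `WordPacer` and its methods are hypothetical names, and this is a simplified sketch rather than star's timer code.

```python
from collections import deque


class WordPacer:
    """Hold word events until playback time reaches them."""

    def __init__(self, offset_ms: int = 120):
        self.offset_ms = offset_ms  # plays the role of espeak_highlight_offset_ms
        self._pending = deque()     # (due_ms, word_index), in arrival order

    def add_event(self, word_index: int, audio_ms: int) -> None:
        """Record an engine event; audio_ms is the word's position in the stream."""
        self._pending.append((audio_ms + self.offset_ms, word_index))

    def poll(self, elapsed_ms: int):
        """Return the latest word whose moment has arrived, or None if none is due."""
        current = None
        while self._pending and self._pending[0][0] <= elapsed_ms:
            current = self._pending.popleft()[1]
        return current
```

A UI timer would call `poll()` with the elapsed playback time and move the highlight whenever it returns a word index. A larger offset delays the highlight, which matches the tuning advice above.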

### Notes

- Summarize Document uses `Ctrl+Shift+U` rather than `Ctrl+Shift+S`, which was
  already bound to Reading Statistics. Every new command has both a menu entry
  and a keyboard shortcut, keeping star fully keyboard-drivable.

---

## [0.1.5] - 2026-06-22

### Added

- **In-process eSpeak-NG via libespeak-ng (ctypes).** A new backend drives
  eSpeak-NG through its C library instead of the `espeak-ng` command line. The
  library reports a per-word event for every spoken word, tagged with the
  word's audio position (milliseconds into the output stream), which `star`
  forwards to the reading highlight. It is preferred automatically when the
  shared library is available (the bundled `libespeak-ng.dll` in the
  self-contained Windows build, or a system `libespeak-ng` on Linux/macOS) and
  falls back to the `espeak-ng` command-line backend otherwise. Speech is
  synthesized in short sentence-sized chunks, so pausing, stopping, or switching
  away silences playback promptly instead of running on in the background.
- **Bundled libespeak-ng in the self-contained Windows build.**
  `tools/build-vendor.py` now fetches eSpeak-NG (1.52.0) and vendors its 64-bit
  `libespeak-ng.dll` plus the `espeak-ng-data` tree, so `star.exe` speaks with
  eSpeak, and gets the playback-synced highlight, with no separate install.
- **Batch conversion.** Convert many documents (selected files or a whole
  folder) to one output format (Markdown, plain text, or Braille/BRF) in a
  single step, via **File ▸ Batch Convert** (`Ctrl+Shift+C`) in the Qt GUI or
  `M-x batch-convert` in the terminal UI. Each file runs through the existing
  single-file load→export pipeline; a corrupt, password-protected, or
  unsupported file is recorded and skipped instead of aborting the run. Outputs
  reuse the source basename (collisions disambiguated, never overwritten), and a
  timestamped summary (what succeeded, what failed and why, and where outputs
  went) is written alongside the outputs.
- **Hot-folder watching.** Watch a folder and convert files dropped into it,
  unattended: `star --watch <input_dir> --output <output_dir> --format <fmt>`
  for headless use, or **File ▸ Watch Folder** (`Ctrl+Shift+W`, a toggle) from
  the Qt GUI. Built on the batch core (same formats and validation). Files are
  debounced (processed only once their size has stabilised, so a file still
  being copied in is never read half-written); every attempt is logged with a
  timestamp to `<output>/star-watch.log`; successful sources move to
  `<input>/processed/` and failures to `<input>/failed/` (collisions
  disambiguated, never overwritten); and Ctrl+C / SIGTERM shut it down cleanly
  without interrupting a file mid-conversion. Uses `watchdog` when installed
  (the new `[watch]` optional-dependency group) and falls back to directory
  polling otherwise.
- **Every Qt menu item now has a keyboard shortcut**, keeping star fully
  keyboard-drivable.
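
The size-stability debounce described for hot-folder watching can be sketched for the polling fallback as follows. `StabilityTracker`, `snapshot`, and the "unchanged for N polls" rule are assumptions of this example; the changelog only states that a file is processed once its size has stabilised.

```python
import os


def snapshot(folder: str) -> dict:
    """Map each regular file directly inside `folder` to its current size."""
    with os.scandir(folder) as entries:
        return {e.path: e.stat().st_size for e in entries if e.is_file()}


class StabilityTracker:
    """Report a file as ready once its size is unchanged for N consecutive polls."""

    def __init__(self, stable_polls: int = 2):
        self.stable_polls = stable_polls
        self._seen = {}  # path -> (last_size, unchanged_count)

    def observe(self, sizes: dict) -> list:
        """Feed one poll result; return the paths that just became stable."""
        ready = []
        for path, size in sizes.items():
            last, count = self._seen.get(path, (None, 0))
            count = count + 1 if size == last else 0
            self._seen[path] = (size, count)
            if count >= self.stable_polls:
                ready.append(path)
        # Forget files that were reported or that vanished from the folder.
        for path in list(self._seen):
            if path in ready or path not in sizes:
                del self._seen[path]
        return sorted(ready)
```

A polling loop would call `tracker.observe(snapshot(input_dir))` at a fixed interval and hand the returned paths to the converter. The caller is expected to move each file out of the folder afterwards, otherwise it would be tracked and reported again.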

### Changed

- **The Qt GUI is now star's primary interface.** Ongoing development is focused
  on the Qt GUI, so it is the default and the recommended way to run star. The
  curses terminal UI (`--tui`) remains fully supported and keyboard-driven as a
  secondary interface for headless or text-only environments.

### Fixed

- **Reading highlight no longer runs ahead of the audio.** The highlight is
  now anchored to the engine's actual word progress rather than a free-running
  words-per-minute estimate:
  - With eSpeak-NG driven through libespeak-ng, the highlight follows each
    word's reported audio position, so it tracks playback across the whole
    document instead of drifting further ahead over time.
  - For backends that report real word events (pyttsx3 and libespeak-ng), the
    highlight now waits for the first event before starting, so it begins when
    audio actually begins rather than when synthesis was requested, removing a
    constant head start.
  - Note: the previous attempt to read `<mark>` events from the `espeak-ng`
    command line could never work (the CLI does not emit them), so the
    command-line backend remains timer-paced; use libespeak-ng for synced
    highlighting.

---

## [0.1.4] - 2026-06-22

### Added

- **Fat zipapp build (`star.pyz`).** A new `build_zipapp.py` produces a
  single-file `star.pyz` that bundles star together with its Python
  dependencies (the `[all]` extras group). It is self-extracting: on first run
  it unpacks its bundled packages into the per-user config directory (so
  compiled packages such as PyQt6 and PyMuPDF load from real files on disk),
  then starts normally. This removes the `pip install` step: running star this
  way needs only a Python interpreter plus the external engines (ffmpeg,
  Tesseract, liblouis, eSpeak-NG, DECtalk) on `PATH`. Because it carries
  compiled packages, the artifact is **platform-specific** (build one per target
  platform). It is additive and does not replace the self-contained Windows
  `star.exe`, which additionally bundles the external engines.
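
The first-run self-extraction step can be illustrated with the standard `zipfile` module. This is a sketch, not the code `build_zipapp.py` generates: the `bundled/` prefix, the `.unpacked` marker file, and the function name `ensure_unpacked` are assumptions.

```python
import sys
import zipfile
from pathlib import Path


def ensure_unpacked(archive: str, target: Path, prefix: str = "bundled/") -> Path:
    """Extract members under `prefix` into `target` once, then expose them on sys.path."""
    marker = target / ".unpacked"
    if not marker.exists():
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            members = [n for n in zf.namelist() if n.startswith(prefix)]
            zf.extractall(target, members)
        marker.write_text("ok")  # later runs skip the extraction
    site = target / prefix.rstrip("/")
    if str(site) not in sys.path:
        sys.path.insert(0, str(site))
    return site
```

Extracting to real files matters because compiled extension modules cannot be imported directly from inside a zip archive. A production version would also key the marker on the archive's version so that upgrades re-extract.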

### Changed

- **Minimum supported Python is now 3.11** (previously 3.8). The
  `requires-python` constraint, the installer and build scripts, and the build
  documentation were updated to match.

---

## [0.1.3] - 2026-06-16

A focused round of reading, speech, and study-workflow additions, all built on
the existing single-file architecture: `star.py` still runs with zero extras
installed.

### Added

- **Sentence-level highlight option.** A new **highlight granularity**
  control lets the spoken text be highlighted by **word** (default), by whole
  **sentence** (much less visual flicker for readers who find rapid word-by-word
  movement distracting), or **both** (a soft sentence band with the current word
  marked on top). Works in **both** the Qt GUI and the curses TUI. Set it from
  **View → Reading Aids → Karaoke Highlight…** (new *Granularity* selector) or
  `M-x highlight-granularity word|sentence|both` in the TUI. New setting:
  `highlight_granularity` (default `word`).
- **Timestamped subtitle export (SRT / VTT).** Audio export can now emit a
  synchronized caption track so the highlight "travels" with the audio into any
  media player. Export captions on their own (**File → Export → Export Subtitles
  (SRT / VTT)…**, or `M-x export-subtitles`), or have them written automatically
  alongside every audio export (`M-x subtitles-with-audio`). Captions are grouped
  into readable sentence-length cues by default, or one cue per word with
  `M-x subtitle-word-level`. Timing is estimated from the synthesized audio's
  duration, so it needs no external tools. New settings: `subtitle_format`
  (`srt`/`vtt`), `subtitle_word_level`, `export_subtitles_with_audio`. New TUI
  commands: `export-subtitles`, `subtitle-format`, `subtitle-word-level`,
  `subtitles-with-audio`.
- **A keyboard shortcut for every GUI menu item.** Every command in the Qt
  menus now has a shortcut shown beside it and listed in **Help → Keyboard
  Shortcuts** (`F3`). Bindings follow a consistent scheme: `Ctrl+letter`
  (forward/primary), `Ctrl+Shift+letter` (backward/secondary), `Alt+punct`
  (sentences), and `Ctrl+Alt+letter` (exports, citations, tools, reading aids).
  Each is owned by exactly one action, eliminating the previous duplicate
  toolbar/window bindings that risked Qt “ambiguous shortcut” conflicts. New:
  highlight colors (`Ctrl+Shift+1`…`5`), export commands (`Ctrl+Alt+M/P/B/A/U`),
  citation commands (`Ctrl+Alt+I/E/C/D/R/G`), reading aids, and more. All
  bindings remain remappable via **Help → Customize Shortcuts…**.
- **Tap `Ctrl` to play/pause (JAWS habit).** Pressing and releasing the `Ctrl`
  key on its own toggles speech, mirroring the JAWS “Ctrl silences speech”
  reflex. Using Ctrl as a modifier in a chord never triggers it. New setting:
  `qt_ctrl_pause` (default `true`).
- **Reading statistics & progress tracking.** STAR now records time read,
  furthest word reached, progress %, and session count per document while
  speech plays, and surfaces them in a dashboard with overall totals and a
  most-read list: **Tools → Reading Statistics…** (`Ctrl+Shift+S`) in the Qt
  GUI and `M-x reading-stats` in the TUI. New setting: `reading_stats`.
- **Library / bookshelf view.** Every opened document is remembered with
  its title, format, progress, and last-opened time. **File → Library /
  Bookshelf…** (`Ctrl+Shift+B`) opens a searchable list (Enter / double-click
  reopens a document); the TUI offers `M-x library`. New setting: `library`.
- **Live HTML preview while editing.** In edit mode a split pane can show a
  live-rendered HTML preview of the Markdown source beside the editor,
  re-rendering as you type (debounced). Toggle it with **View → Live HTML
  Preview** (`Ctrl+Shift+L`); turning it on outside edit mode enters edit mode.
  New setting: `qt_edit_preview`.
- **Voice & profile presets.** Save the current voice, rate, volume, theme,
  font, spacing, and highlight settings as a named profile (e.g. “Skim”, “Deep
  Study”, “Low-Light”) and switch between them in one step. A new **Profiles**
  menu offers **Save Current Settings as Profile…** (`Ctrl+Shift+K`), **Load
  Profile…** (`Ctrl+Shift+J`), and **Delete Profile…** (`Ctrl+Shift+Y`); the TUI
  adds `M-x profile-save`, `profile-load`, `profile-list`, and `profile-delete`.
  New setting: `profiles`.
- **Pronunciation lexicon editor.** A user-editable dictionary maps domain
  terms (drug names, anatomy, acronyms) to a spoken form so TTS says them
  correctly and consistently across every backend. Edit it from **Speech →
  Pronunciation Lexicon…** (`Ctrl+Shift+I`) in the Qt GUI, or `M-x pron-add`,
  `pron-list`, `pron-remove`, and `pronunciations` (on/off) in the TUI.
  Pronunciation overrides are applied first, before abbreviation and number
  normalization. New settings: `pronunciations`, `use_pronunciations`.
- **Piper neural TTS backend.** A new optional **`piper`** backend brings
  free, offline, neural-quality voices via the standalone
  [Piper](https://github.com/rhasspy/piper) binary: no Python package, no
  subscription, no network. Point STAR at a `.onnx` voice model with the new
  `piper_model` setting (or the `PIPER_MODEL` env var, or by dropping models in
  a Piper voice directory) and select it from **Speech → Choose TTS Engine…**
  (new GUI engine picker) or `M-x tts-backend piper`. Like Coqui, it is opt-in
  and never chosen in `auto` mode. New setting: `piper_model`.
- **Fully self-contained Windows binary.** The portable `star.exe` can now
  bundle the native engines that previously had to be installed separately, so
  a single file does *everything* on a clean PC:
  - **ffmpeg**: MP3 / OGG / MP4 audio export
  - **Tesseract** + English language data: OCR of images and scanned PDFs
  - **liblouis** + translation tables: Grade 2 (contracted) Braille
  - **Pandoc**: high-fidelity markup conversion (RST, Org, MediaWiki,
    AsciiDoc, Textile, LaTeX, legacy `.doc`, …)
  - **DECtalk**: the classic “Perfect Paul” voice, via the bundled
    `DECtalk.dll` + dictionary driven **in-process through ctypes** (no
    separate CLI required); the architecture-matched 64-/32-bit engine is
    selected automatically. On the self-contained Windows build DECtalk is now
    the **default engine** and **Perfect Paul the default voice**, and all
    nine classic speakers (Perfect Paul, Beautiful Betty, Huge Harry, Frail
    Frank, Doctor Dennis, Kit the Kid, Uppity Ursula, Rough Rita, Whispering
    Wendy) appear in the voice picker (**Speech → Choose Voice…**,
    `Ctrl+Shift+V`). DECtalk is only chosen automatically when the engine
    actually starts (a real startup is probed once), so machines without a
    working DECtalk fall back to pyttsx3/SAPI as before.

  `star.py` locates each bundled engine via a new `_vendor_dir()` resolver
  (`sys._MEIPASS` when frozen) and falls back to a system install when a tool
  is absent, so running from source still needs nothing extra. For Pandoc the
  bundled binary is also exposed to `pypandoc` via `$PYPANDOC_PANDOC`; for
  DECtalk a `say`/`dtalk` CLI (on `PATH` or via `DECTALK_BIN`) still works as a
  fallback. A new `build-vendor.py` helper downloads and assembles the engines
  into `vendor/`, which `star.spec` packs into the bundle (see `BUILD.md`). The
  fully self-contained build is ~300+ MB (Pandoc alone adds ~150 MB); a lean
  build without `vendor/` remains ~90–100 MB.
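
The resolver behaviour described above (look in the bundle when frozen, otherwise fall back to a system install) might look roughly like this. It is a sketch and not star's `_vendor_dir()`: the `vendor/<tool>/<tool>[.exe]` layout, the working-directory fallback, and the `find_tool` helper are assumptions. `sys._MEIPASS` is the attribute PyInstaller sets in a frozen app.

```python
import shutil
import sys
from pathlib import Path
from typing import Optional


def vendor_dir() -> Path:
    """Bundle root when frozen by PyInstaller, else `vendor/` in the working directory."""
    base = getattr(sys, "_MEIPASS", None)  # only present in a frozen build
    root = Path(base) if base else Path.cwd()
    return root / "vendor"


def find_tool(name: str, root: Optional[Path] = None) -> Optional[str]:
    """Prefer a bundled executable; otherwise fall back to whatever is on PATH."""
    root = root if root is not None else vendor_dir()
    exe = name + (".exe" if sys.platform == "win32" else "")
    bundled = root / name / exe  # assumed layout: vendor/<tool>/<tool>[.exe]
    if bundled.is_file():
        return str(bundled)
    return shutil.which(name)  # None when the tool is not installed at all
```

Returning `None` rather than raising lets each feature decide how to degrade when its engine is missing.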

### Packaging & architecture

- **`star.py` can now be split into an importable `star/` package.** A new
  [`tools/split_star.py`](tools/split_star.py) refactors the monolithic
  `star.py` into logical submodules (`tts`, `tui`, `gui`, `documents`,
  `markup`, `render`, `braille`, `citations`, …) under a `star/` package,
  with shared foundational state (stdlib imports, vendored-tool wiring,
  optional-dependency flags, paths, metadata) in `star/_runtime.py` and
  re-exported via `from ._runtime import *`. The tool moves exact source by
  top-level AST node (**nothing is re-typed**) and computes the
  cross-module imports automatically, so the package stays byte-for-byte
  faithful to `star.py`. `star.py` remains the canonical single-file source
  and still runs with zero extras; the generated `star/` package is what the
  wheel ships and what `python -m star` / the `star` console command import.
- **Pure-Python wheel for macOS / Linux / Windows.** A new
  [`pyproject.toml`](pyproject.toml) builds a single `py3-none-any` wheel
  (`star_reader-<version>-py3-none-any.whl`) that installs `star` and its
  `star` command into any environment. Recommended dependencies (Qt GUI, TTS,
  common document loaders) install by default; the optional features are
  available as extras: `[ocr]`, `[formats]`, `[markup]`, `[braille]`,
  `[audio]`, `[transcribe]`, and `[all]`. Build it with
  `python -m build --wheel` (see `BUILD.md`).
- **macOS / Linux native-engine bootstrap.** A new
  [`tools/install_native.py`](tools/install_native.py) is the cross-platform
  counterpart of `build-vendor.py`: it installs the native engines (ffmpeg,
  Tesseract + English data, liblouis, Pandoc, and eSpeak-NG on Linux) through
  the system package manager (Homebrew / apt / dnf / pacman / zypper),
  installing only what is missing. Supports `--dry-run` and per-engine
  selection.
- **Voice dictation & transcription now bundled in the Windows binary.** The
  self-contained `star.exe` ships the full Whisper stack (`openai-whisper`
  with its PyTorch backend, `sounddevice` for microphone capture, and the
  Whisper **`base` model**), so **Tools → Dictate Note** and **Transcribe
  Audio File** work **offline, with no install and no first-run download** on a
  clean machine. A PyInstaller runtime hook
  ([`tools/rthook_star.py`](tools/rthook_star.py)) puts the bundled ffmpeg on
  `PATH` (Whisper decodes audio through it) and points Whisper's model cache at
  the bundled `base` model; `tools/build-windows.ps1` installs the dictation
  dependencies and stages the model automatically. PyTorch makes this the
  largest single addition to the bundle (the binary grows to ~600+ MB); the
  dependencies are guarded, so a build without them still succeeds and the
  feature falls back to its “requires Whisper” hint. The frozen entry point is
  now [`run_star.py`](run_star.py), which imports `star.app.main` from the
  generated package.
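
Moving "exact source by top-level AST node" can be illustrated with the standard `ast` module: parse the file, then slice the original text by each node's line span so nothing is regenerated from the tree. The function below is a hypothetical sketch, not `tools/split_star.py`; it ignores comments that sit between top-level nodes, which a real splitter would need to carry along.

```python
import ast


def top_level_segments(source: str):
    """Yield (node kind, name or None, exact source text) per top-level statement."""
    lines = source.splitlines(keepends=True)
    for node in ast.parse(source).body:
        start = node.lineno
        # Decorators sit above the def/class line, so widen the span to include them.
        for deco in getattr(node, "decorator_list", []):
            start = min(start, deco.lineno)
        text = "".join(lines[start - 1 : node.end_lineno])
        yield type(node).__name__, getattr(node, "name", None), text
```

Because the text is sliced from the original lines rather than produced by `ast.unparse`, formatting and inline comments inside each node survive unchanged. Routing each segment to a submodule and computing cross-module imports would be layered on top of this.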

### Notes for upgrading users

- All new settings have safe defaults, so existing `settings.json` files keep
  working unchanged; the new keys are added on next save.
- Subtitle timing is *estimated* (proportional to spoken-token length) because
  file-based TTS synthesis exposes no per-word callbacks. It is accurate enough
  for review and study recordings.
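
Estimating cue times in proportion to text length, as the note above describes, can be sketched as follows. The function names are hypothetical and character count stands in for "spoken-token length", which is an assumption of this example; the timestamp layout is the standard SRT `HH:MM:SS,mmm` form.

```python
def srt_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp, e.g. 01:02:03,004."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02}:{minutes:02}:{seconds:02},{millis:03}"


def estimate_cues(sentences, total_ms: int):
    """Split total_ms across sentences in proportion to their character length."""
    total_chars = sum(len(s) for s in sentences) or 1
    cues, spent = [], 0
    for sentence in sentences:
        start = round(total_ms * spent / total_chars)
        spent += len(sentence)
        end = round(total_ms * spent / total_chars)
        cues.append((start, end, sentence))
    return cues


def to_srt(cues) -> str:
    """Render (start_ms, end_ms, text) cues as numbered SRT blocks."""
    blocks = [
        f"{i}\n{srt_timestamp(a)} --> {srt_timestamp(b)}\n{text}\n"
        for i, (a, b, text) in enumerate(cues, 1)
    ]
    return "\n".join(blocks)
```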

---

## [0.1.2] - 2026-06-14

A substantial revision focused on **reliable, accessible defaults out of the
box**: native speech on every platform, dependency-free Braille export, smoother
word-highlight tracking, a more professional default look, and a new set of
reading-accessibility aids. The single-file architecture is unchanged:
`star.py` still runs with zero extras installed.

### Added

- **Reading accessibility aids (Qt GUI).** A new **View → Reading Aids** submenu
  collects low-friction, high-impact accommodations:
  - **Adjustable text spacing** (WCAG 1.4.12): independently tune line height,
    letter spacing, and word spacing from a live-preview dialog. New settings:
    `qt_line_height` (default `1.5`), `qt_letter_spacing`, `qt_word_spacing`.
  - **Dyslexia-friendly font preference**: opt in to an installed
    OpenDyslexic / Atkinson Hyperlegible / Lexend / Comic Sans face, with a
    graceful fallback and prompt when none is installed. New setting:
    `qt_dyslexia_font`.
  - **Bionic reading**: embolden the leading part of each word to pull the
    eye forward. New setting: `qt_bionic_reading`.
  - **Current-line focus band**: tint the line being read. New setting:
    `qt_current_line_highlight`.
  - **Karaoke highlight tuning**: choose the spoken-word highlight style
    (`background`, `underline`, `box`, `bold`, `color`), color, pacing
    (`highlight_speed`), and a lead/lag offset. New settings: `highlight_style`,
    `highlight_lead_words`.
- **Notes dock stays hidden by default.** The Qt **Notes** panel is hidden on
  launch (`qt_show_notes` defaults to `false`) to maximize the reading area; it
  opens on demand via **Ctrl+Shift+N**, **View → Toggle Notes Panel**, or when a
  note is added.
|
|
429
|
+
- **Annotations / notes β now in both interfaces, with tags & search.** The Qt
|
|
430
|
+
**Notes** dock and a new curses TUI notes pager share one per-document store.
|
|
431
|
+
Add at the cursor/selection (`Ctrl+Shift+A` in Qt, `a` in the TUI), attach
|
|
432
|
+
**tags**, and **filter with full-text or `#tag` search** (filter box in Qt;
|
|
433
|
+
`M-x annotations-search` / `annotations-list` in the TUI). Notes carry both a
|
|
434
|
+
Qt char offset and a `word_idx`, so a note made in one interface navigates
|
|
435
|
+
correctly in the other. Export to Markdown, JSON, BibTeX, or RIS. New TUI
|
|
436
|
+
commands: `annotate`, `annotations-list`, `annotations-search`,
|
|
437
|
+
`annotation-goto`, `annotation-delete`, `annotations-export`.
|
|
438
|
+
- **Citation manager (Qt GUI).** A shared citation library with **import and
|
|
439
|
+
export of BibTeX, RIS, and CSL-JSON** (`Citations` menu). Add references by
|
|
440
|
+
hand, browse/copy/delete them, and **link a citation to a note** so exported
|
|
441
|
+
study notes carry attribution. New setting: `citations`.
|
|
442
|
+
- **Whisper voice dictation & transcription (optional).** `Tools β Transcribe
|
|
443
|
+
Audio Fileβ¦` opens a transcription as a new document; `Tools β Dictate Noteβ¦`
|
|
444
|
+
records from the microphone and saves the transcription as a note. Backed by
|
|
445
|
+
`openai-whisper` or `faster-whisper` (+ `sounddevice`/`numpy` for the mic);
|
|
446
|
+
fully guarded when absent. New setting: `whisper_model`.
|
|
447
|
+
- **Keyboard cheat sheet & GUI/TUI parity.** A canonical shortcut scheme is now
|
|
448
|
+
documented in one place and shown in-app (`Help → Keyboard Shortcuts` in Qt,
|
|
449
|
+
`?` / `M-x shortcuts` in the TUI).
|
|
450
|
+
- **Full menu coverage (Qt GUI).** New **Speech**, **Navigate**, **Edit**,
|
|
451
|
+
**Citations**, **Tools**, and **Help** menus put every command within reach
|
|
452
|
+
of the mouse, important for users who don't drive the app by keyboard.
|
|
453
|
+
- **macOS native speech backend (`applesay`).** A new TTS backend drives the
|
|
454
|
+
built-in `/usr/bin/say` command, giving Mac users Apple's high-quality system
|
|
455
|
+
voices with **no extra dependencies** (no `pyobjc`, no Homebrew, no eSpeak).
|
|
456
|
+
In `auto` mode it is ranked **above eSpeak** so a Mac never silently falls
|
|
457
|
+
back to the robotic eSpeak voice.
|
|
458
|
+
- **Preferred default voice resolution.** When no voice is explicitly set,
|
|
459
|
+
star now auto-selects a voice matching the new `tts_prefer_voice` setting
|
|
460
|
+
(default `"eloquence"`), favoring a US-English variant. This makes the
|
|
461
|
+
**Eloquence (US English)** voices bundled with recent macOS the default when
|
|
462
|
+
present. A user's explicit voice choice is never overridden.
|
|
463
|
+
- **Built-in, dependency-free Braille (Grade 1) translator.** BRF export now
|
|
464
|
+
works out of the box with a pure-Python North-American Braille-ASCII (NABCC)
|
|
465
|
+
translator: letters, capital signs, number signs, common punctuation, and
|
|
466
|
+
standard 40-cell / 25-line page geometry with form feeds.
|
|
467
|
+
- **Settings migration.** Settings files written by earlier versions are
|
|
468
|
+
upgraded on load: the deprecated serif `Georgia` font and the lagging `0.85`
|
|
469
|
+
highlight speed are replaced with the new defaults (only when they exactly
|
|
470
|
+
match the old default, so deliberate choices are preserved).
|
|
471
|
+
- New settings keys: `tts_prefer_voice`, `braille_grade2`,
|
|
472
|
+
`audio_export_format`.
|
|
473
|
+
- **Cross-platform installers**: `install.sh` (Linux/macOS) and `install.ps1`
|
|
474
|
+
(Windows) with `minimal` / `recommended` / `all` profiles, virtual-env by
|
|
475
|
+
default, and platform-aware dependency hints (incl. `pyobjc` on macOS,
|
|
476
|
+
`windows-curses` on Windows).
|
|
477
|
+
- **Portable Windows binary build.** A PyInstaller recipe (`star.spec`) and a
|
|
478
|
+
one-command wrapper (`build-windows.ps1`) produce a single, self-contained
|
|
479
|
+
`dist\star.exe` that runs on Windows machines with no Python or dependencies
|
|
480
|
+
installed, ideal for demos. Bundles the Qt GUI, SAPI5 speech, and the core
|
|
481
|
+
document loaders. Documented in `BUILD.md`.
|
|
482
|
+
- New documentation: `CHANGELOG.md` and `BUILD.md` (portable Windows binary).
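
The settings-migration entry above (old defaults are replaced only when they exactly match) can be sketched as follows. This is a minimal illustration, not star's actual code: the function name `migrate_settings`, the `font_family` key, and the choice of `DejaVu Sans` as the replacement font are assumptions made for the example; only `highlight_speed`, `Georgia`, `0.85`, and `1.0` come from the changelog.

```python
# Hypothetical sketch of an "upgrade only untouched defaults" migration.
# Key names other than `highlight_speed` are assumptions for illustration.

# Values that earlier versions wrote as defaults.
OLD_DEFAULTS = {"font_family": "Georgia", "highlight_speed": 0.85}

# Replacement defaults (the real font default is platform-dependent;
# "DejaVu Sans" stands in for the Linux case here).
NEW_DEFAULTS = {"font_family": "DejaVu Sans", "highlight_speed": 1.0}


def migrate_settings(settings):
    """Return a copy of `settings` with stale defaults upgraded.

    A value is replaced only when it equals the old default exactly,
    so anything the user changed on purpose is left alone.
    """
    migrated = dict(settings)
    for key, old_value in OLD_DEFAULTS.items():
        if migrated.get(key) == old_value:
            migrated[key] = NEW_DEFAULTS[key]
    return migrated
```

With this shape, a file containing `"highlight_speed": 0.9` keeps `0.9`, while one still holding `0.85` is moved to `1.0`.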
|
|
483
|
+
|
|
484
|
+
### Changed
|
|
485
|
+
|
|
486
|
+
- **Word-highlight tracking is smooth and continuous.** The highlight timer no
|
|
487
|
+
longer freezes mid-document when SAPI5 word callbacks arrive late or stop
|
|
488
|
+
firing. The pacing guard now allows the highlight to run up to **4 words**
|
|
489
|
+
ahead of the last confirmed audio position and is **bypassed after 1.5 s**
|
|
490
|
+
of callback silence, so the cursor keeps following speech instead of getting
|
|
491
|
+
stuck. (Builds on the timer-generation race fixes already in place.)
|
|
492
|
+
- **Word-position map is monotonic and column-aware.** Repeated common words
|
|
493
|
+
(`the`, `a`, `and`) on a single line are matched in document order instead of
|
|
494
|
+
always snapping back to the first occurrence, and the search position never
|
|
495
|
+
moves backward, eliminating the "highlight stuck several lines back" effect.
|
|
496
|
+
- **Audio export now defaults to WAV** (`audio_export_format`). WAV needs no
|
|
497
|
+
external tools; MP3/OGG/MP4 still work when `ffmpeg` or `pydub` is present.
|
|
498
|
+
- **Default display font is now sans-serif** (`Helvetica Neue` on macOS,
|
|
499
|
+
`Segoe UI` on Windows, `DejaVu Sans` on Linux). Serif faces are discouraged
|
|
500
|
+
for on-screen reading accessibility.
|
|
501
|
+
- **Polished default dark theme** with a modern, professional neutral-dark
|
|
502
|
+
palette (Zed/Ghostty-inspired) for the Qt GUI editor, HTML rendering, and the
|
|
503
|
+
seeded `dark.css` theme.
|
|
504
|
+
- **`highlight_speed` default is now `1.0`** (match speech rate exactly); the
|
|
505
|
+
pacing guard is the real throttle, so the highlight stays tight to the audio.
|
|
506
|
+
- BRF export gained a `braille_grade2` opt-in for contracted Grade 2 via
|
|
507
|
+
liblouis (when installed and the table resolves).
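
The pacing guard described in the word-highlight entry above can be sketched roughly as below. The two constants (4 words of lead, 1.5 s of callback silence) come from the changelog; the class name `PacingGuard`, its methods, and the idea of passing timestamps in explicitly are assumptions for illustration and are not taken from star's source.

```python
from dataclasses import dataclass

MAX_LEAD_WORDS = 4       # highlight may run this far ahead of confirmed audio
SILENCE_BYPASS_S = 1.5   # after this much callback silence, stop clamping


@dataclass
class PacingGuard:
    """Hypothetical guard that keeps a timer-driven highlight near the audio."""

    confirmed_idx: int = 0       # last word index confirmed by a TTS callback
    last_callback_t: float = 0.0  # time (seconds) of that callback

    def on_word_callback(self, word_idx, now):
        # Record the newest confirmed position; never move it backward.
        self.confirmed_idx = max(self.confirmed_idx, word_idx)
        self.last_callback_t = now

    def clamp(self, timer_idx, now):
        """Return the word index the highlight is allowed to show."""
        if now - self.last_callback_t >= SILENCE_BYPASS_S:
            # Callbacks have gone quiet: trust the timer so the
            # highlight keeps moving instead of freezing.
            return timer_idx
        return min(timer_idx, self.confirmed_idx + MAX_LEAD_WORDS)
```

For example, after a callback confirming word 10, a timer estimate of word 20 is held at word 14 while callbacks are fresh, and passes through unchanged once 1.5 s elapse without a new callback.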
|
|
508
|
+
|
|
509
|
+
### Fixed
|
|
510
|
+
|
|
511
|
+
- **Qt GUI now runs on PyQt6.** `QAction` was imported from `QtWidgets`, but
|
|
512
|
+
PyQt6 moved it to `QtGui`; the bad import made the whole PyQt6 branch fail, so
|
|
513
|
+
star silently fell back to PyQt5 (and could not start the GUI at all on a
|
|
514
|
+
PyQt6-only machine). `QAction` is now imported from `QtGui` under PyQt6, and a
|
|
515
|
+
couple of PyQt6 enum-to-int conversions (`line height`, full-width selection)
|
|
516
|
+
were hardened. This also makes the frozen Windows binary work.
|
|
517
|
+
- **BRF export no longer crashes the app.** Previously, exporting Braille with
|
|
518
|
+
`louis` installed but a translation table missing could make liblouis call
|
|
519
|
+
`exit()` at the C level, abruptly closing the window. liblouis is now opt-in
|
|
520
|
+
and fully guarded; the built-in Grade 1 translator is the reliable default and
|
|
521
|
+
can never terminate the process.
|
|
522
|
+
- **macOS no longer defaults to eSpeak.** With the new `applesay` backend ranked
|
|
523
|
+
above eSpeak, Macs speak with a native Apple voice by default.
|
|
524
|
+
- Highlighting that previously got "stuck" and stopped advancing down the page
|
|
525
|
+
while speech continued now tracks reading position reliably.
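
The PyQt6 import fix above hinges on `QAction` living in `QtGui` under PyQt6 and in `QtWidgets` under PyQt5. One way to express that lookup is sketched below; the helper name `resolve_qaction` and its return shape are assumptions for the example, and star's own import code may be structured differently.

```python
import importlib


def resolve_qaction():
    """Return (binding_name, QAction class), or (None, None) if no Qt binding loads.

    PyQt6 is tried first; PyQt5 is the fallback.
    """
    # QAction moved from QtWidgets (PyQt5) to QtGui (PyQt6).
    candidates = (("PyQt6", "QtGui"), ("PyQt5", "QtWidgets"))
    for binding, module_name in candidates:
        try:
            module = importlib.import_module(f"{binding}.{module_name}")
        except ImportError:
            continue  # binding not installed (or not loadable); try the next
        return binding, module.QAction
    return None, None
```

Keeping each binding's lookup in its own `try` means a failure in one branch cannot mask the other, which is the failure mode the entry describes.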
|
|
526
|
+
|
|
527
|
+
### Notes for upgrading users
|
|
528
|
+
|
|
529
|
+
- Existing `settings.json` files are migrated automatically (see above). To
|
|
530
|
+
adopt the new dark palette, delete `themes/dark.css` in your config directory
|
|
531
|
+
(a fresh, updated copy is regenerated) or pick it from **View → Theme**.
|
|
532
|
+
- macOS users who want pyttsx3's word-boundary callbacks (rather than the
|
|
533
|
+
timer-based highlight used by `say`) can `pip install pyobjc pyttsx3`.
|
|
534
|
+
|
|
535
|
+
---
|
|
536
|
+
|
|
537
|
+
## [0.1.1] – earlier
|
|
538
|
+
|
|
539
|
+
Initial public lineage of star prior to the 0.1.2 revision: single-file Qt GUI
|
|
540
|
+
and curses TUI, multi-format document loading (PDF, EPUB, DAISY/DTBook, DOCX,
|
|
541
|
+
PPTX, ODT, HTML, Markdown, LaTeX, RST and many markup formats, CSV/XLSX, images
|
|
542
|
+
via OCR, notebooks, source code), multiple TTS backends, themes, search,
|
|
543
|
+
bookmarks, reading-position memory, speed presets, Speech Cursor mode,
|
|
544
|
+
table-of-contents navigation, user highlights, audio export, document caching,
|
|
545
|
+
and screen-reader compatibility.
|
|
546
|
+
|
|
547
|
+
[0.1.3]: #013--2026-06-16
|
|
548
|
+
[0.1.2]: #012--2026-06-14
|
|
549
|
+
[0.1.1]: #011--earlier
|