star-reader 0.1.8__py3-none-any.whl

star/CHANGELOG.md ADDED
@@ -0,0 +1,549 @@
1
+ # 📜 Changelog
2
+
3
+ All notable changes to **star: Speaking Terminal Access Reader** are documented
4
+ in this file.
5
+
6
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
7
+ and this project aims to follow [Semantic Versioning](https://semver.org/).
8
+
9
+ ---
10
+
11
+ ## [0.1.8] 2026-06-23
12
+
13
+ ### ✨ Added
14
+
15
+ - **Published to PyPI.** star is now installable with `pip install star-reader`
16
+ (or `pipx install star-reader`), with no manual wheel download. The release
17
+ workflow publishes the wheel and sdist via PyPI **trusted publishing** (OIDC,
18
+ no stored API token): pre-release tags (e.g. `v0.1.8-rc1`) go to TestPyPI and
19
+ final tags to PyPI.
20
+
21
+ ### 🏗️ Build & CI
22
+
23
+ - **Continuous integration.** A GitHub Actions test matrix (Linux / Windows /
24
+ macOS × Python 3.11–3.13, with one leg that installs the optional packages so
25
+ the real-behaviour tests run) and a non-blocking `ruff` lint gate run on every
26
+ push and pull request.
27
+ - **Automated releases.** A tag-triggered workflow builds the universal wheel +
28
+ sdist, the Windows `star.pyz`, and the Windows `star.exe`, and publishes a
29
+ GitHub Release with generated notes.
30
+ - **Optional lean Windows build.** The Windows `star.exe` still bundles the
31
+ offline dictation stack (Whisper + PyTorch + the `base` model) **by default**,
32
+ so users get voice dictation out of the box. A new `-Lean` switch on
33
+ `tools/build-windows.ps1` (or the release workflow's `lean: true` input) skips
34
+ that multi-GB stack for a fast, small build, useful for quick test builds and
35
+ CI iteration; a lean `star.exe` reports dictation as unavailable in
36
+ `star --deps` and is otherwise fully functional.
37
+
38
+ ---
39
+
40
+ ## [0.1.7] 2026-06-23
41
+
42
+ ### ✨ Added
43
+
44
+ - **Document translation.** A new **Tools ▸ Translate Document…**
45
+ (`Ctrl+Shift+X`) translates the open document into any of 15 common languages
46
+ via Google Translate (no API key, no account). A picker dialog chooses the
47
+ target language and shows the result in a read-only pane; the network call
48
+ runs on a background thread so the window stays responsive, and the input is
49
+ capped at 15 000 characters per request to stay within rate limits. Requires
50
+ the optional `deep-translator` package; the menu item prompts to install it
51
+ when absent.
52
+ - **RSS / Atom feed reading.** **File ▸ Open Feed…** (`Ctrl+Shift+M`) fetches a
53
+ feed URL, lists its articles, and opens the chosen one in the reader through
54
+ star's normal URL-loading path. Useful for tracking arXiv, PubMed, bioRxiv,
55
+ or journal feeds without leaving star. Requires the optional `feedparser`
56
+ package; the menu item prompts to install it when absent.
57
+ - **Difficult-word overlay.** **View ▸ Reading Aids ▸ Highlight Difficult
58
+ Words** (`Ctrl+Alt+O`) tints uncommon / academic vocabulary by word
59
+ frequency, giving a visual pre-scan of dense terminology before reading. The
60
+ overlay is non-destructive (it rides the existing extra-selection pipeline,
61
+ sitting under user highlights and the TTS word highlight), persists across
62
+ sessions (`qt_vocab_highlight`), and recomputes on each document load.
63
+ Requires the optional `wordfreq` package.
64
+ - **Dependency status report.** A new `star --deps` flag prints the
65
+ availability of every optional dependency, grouped by area, with a one-line
66
+ description and a copy-paste install hint for anything missing, backed by a
67
+ new `star.diagnostics` module that is the single source of truth for star's
68
+ optional dependencies.
69
+ - **New optional-dependency groups.** `translate` (`deep-translator`), `feeds`
70
+ (`feedparser`), and `vocab` (`wordfreq`), all folded into the `all` extra and
71
+ mirrored in `requirements-optional.txt`.
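A registry like the one described for `star --deps` can be approximated with `importlib.util.find_spec`, which checks importability without importing the package. This is a minimal sketch, not the contents of `star.diagnostics`: `OPTIONAL_DEPS`, `dependency_status`, and `format_report` are hypothetical names, and only the three group/package pairs come from the entries above.

```python
import importlib.util

# Hypothetical registry: extras group -> (import name, pip package name).
# The three groups mirror the ones named in this release; the structure is assumed.
OPTIONAL_DEPS = {
    "translate": ("deep_translator", "deep-translator"),
    "feeds": ("feedparser", "feedparser"),
    "vocab": ("wordfreq", "wordfreq"),
}


def dependency_status(registry=OPTIONAL_DEPS):
    """Return {group: (pip_name, available)} without importing the packages."""
    status = {}
    for group, (module, pip_name) in registry.items():
        available = importlib.util.find_spec(module) is not None
        status[group] = (pip_name, available)
    return status


def format_report(status):
    """Render one line per group, with an install hint for anything missing."""
    lines = []
    for group, (pip_name, available) in sorted(status.items()):
        if available:
            lines.append(f"[ok]      {group}: {pip_name}")
        else:
            lines.append(f"[missing] {group}: pip install {pip_name}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(dependency_status()))
```

Keeping the registry as plain data is what would let a test compare it against the import guards in the codebase, as the dependency harness in this release is described as doing.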
72
+
73
+ ### 🧪 Tests
74
+
75
+ - **General dependency harness.** `tests/test_dependencies.py` verifies the new
76
+ diagnostics registry against the codebase: a completeness check fails if any
77
+ import guard is ever added without being registered, and a per-dependency
78
+ consistency check asserts that anything reported as available really does
79
+ import. `tests/test_features.py` covers the translation, feed, and
80
+ difficult-word logic, including their graceful-degradation paths.
81
+
82
+ ### 📝 Notes
83
+
84
+ - The three new commands use `Ctrl+Shift+X`, `Ctrl+Shift+M`, and `Ctrl+Alt+O`
85
+ because the more intuitive `Ctrl+Shift+L/F` and `Ctrl+Alt+W` were already bound
86
+ (live preview, themes folder, text spacing). All three are also reachable
87
+ from the F2 command palette, which now additionally lists Summarize, Anki
88
+ export, and Check Spelling for completeness.
89
+
90
+ ---
91
+
92
+ ## [0.1.6] 2026-06-23
93
+
94
+ ### ✨ Added
95
+
96
+ - **Document summarization.** A new **Tools ▸ Summarize Document**
97
+ (`Ctrl+Shift+U`) condenses the open document to its key sentences using the
98
+ LexRank algorithm (via the optional `sumy` package) and shows the result in a
99
+ read-only dialog. The number of sentences is configurable through the
100
+ `summary_sentences` setting (default 7). Summarization runs on a background
101
+ thread so the window stays responsive on long documents. Requires
102
+ `pip install sumy`; the menu item prompts to install it when absent. The
103
+ NLTK sentence-tokenizer data it needs is fetched automatically on first use.
104
+ - **Anki flashcard export.** **File ▸ Export ▸ Anki Flashcards…**
105
+ (`Ctrl+Alt+H`) turns the current document's notes into an Anki deck
106
+ (`.apkg`): each note becomes one card with the highlighted passage on the
107
+ front and your note on the back. Requires the optional `genanki` package;
108
+ the menu item prompts to install it when absent, and prompts you to add a
109
+ note first if the document has none.
110
+ - **Spell checking in edit mode.** While editing a document's Markdown source,
111
+ misspelled words are underlined with a red squiggle, rechecked as you type.
112
+ **Edit ▸ Check Spelling** (`F7`) counts the misspellings and lists them in a
113
+ dialog, in or out of edit mode. Both use the optional `pyspellchecker`
114
+ package and degrade gracefully (edit mode stays fully usable, and the menu
115
+ item prompts to install it) when it is absent.
116
+ - **New optional-dependency groups.** `summarize` (`sumy`), `flashcards`
117
+ (`genanki`), and `spellcheck` (`pyspellchecker`), all folded into the `all`
118
+ extra, plus a comment-annotated `requirements-optional.txt` mirroring the
119
+ optional packages for `pip install -r` users.
120
+
121
+ ### 🐛 Fixed
122
+
123
+ - **Reading highlight no longer runs ahead of eSpeak-NG speech.** In its
124
+ in-process (libespeak-ng) mode, eSpeak synthesizes a whole sentence's audio
125
+ in a burst and reports all of that sentence's word events at once (well
126
+ before the words are actually heard), which made the highlight race up to a
127
+ sentence ahead of the audio. star now paces each word event to the word's
128
+ real audio position (which the engine reports per event) and only advances
129
+ the highlight when that moment arrives, so the highlight follows playback
130
+ instead of synthesis. The highlight timer also tracks these playback-accurate
131
+ events tightly (within a single word) for this backend. A new
132
+ `espeak_highlight_offset_ms` setting (default 120) compensates for audio
133
+ output latency: raise it if highlights still lead the audio, lower it toward
134
+ 0 if they lag.
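The pacing rule described above can be sketched as a lookup over the per-word audio positions. This is an illustration under stated assumptions, not star's implementation: `word_to_highlight` is a hypothetical helper, the positions are assumed to arrive in ascending order, and the offset is assumed to be added to each word's position so that a larger value delays the highlight.

```python
from bisect import bisect_right


def word_to_highlight(event_positions_ms, elapsed_ms, offset_ms=120):
    """Return the index of the word that should be highlighted now, or None.

    event_positions_ms: ascending audio positions (ms) reported per word event.
    elapsed_ms: time since playback of this chunk started.
    offset_ms: assumed output latency; a word becomes due at position + offset.
    """
    # A word is due once elapsed_ms >= position + offset_ms, i.e. its position
    # is <= elapsed_ms - offset_ms. bisect_right counts the due events.
    due = bisect_right(event_positions_ms, elapsed_ms - offset_ms)
    return due - 1 if due else None


events = [0, 300, 650, 1000]
print(word_to_highlight(events, 100))                # None: first word not yet audible
print(word_to_highlight(events, 500))                # 1
print(word_to_highlight(events, 700, offset_ms=0))   # 2
```

Because every event is already known when the burst arrives, the lookup depends only on elapsed playback time, which is what keeps the highlight tied to playback rather than synthesis.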
135
+
136
+ ### 📝 Notes
137
+
138
+ - Summarize Document uses `Ctrl+Shift+U` rather than `Ctrl+Shift+S`, which was
139
+ already bound to Reading Statistics. Every new command has both a menu entry
140
+ and a keyboard shortcut, keeping star fully keyboard-drivable.
141
+
142
+ ---
143
+
144
+ ## [0.1.5] 2026-06-22
145
+
146
+ ### ✨ Added
147
+
148
+ - **In-process eSpeak-NG via libespeak-ng (ctypes).** A new backend drives
149
+ eSpeak-NG through its C library instead of the `espeak-ng` command line. The
150
+ library reports a per-word event for every spoken word, tagged with the
151
+ word's audio position (milliseconds into the output stream), which `star`
152
+ forwards to the reading highlight. It is preferred automatically when the
153
+ shared library is available (the bundled `libespeak-ng.dll` in the
154
+ self-contained Windows build, or a system `libespeak-ng` on Linux/macOS) and
155
+ falls back to the `espeak-ng` command-line backend otherwise. Speech is
156
+ synthesized in short sentence-sized chunks, so pausing, stopping, or switching
157
+ away silences playback promptly instead of running on in the background.
158
+ - **Bundled libespeak-ng in the self-contained Windows build.**
159
+ `tools/build-vendor.py` now fetches eSpeak-NG (1.52.0) and vendors its 64-bit
160
+ `libespeak-ng.dll` plus the `espeak-ng-data` tree, so `star.exe` speaks with
161
+ eSpeak (and the playback-synced highlight) with no separate install.
162
+ - **Batch conversion.** Convert many documents (selected files or a whole
163
+ folder) to one output format (Markdown, plain text, or Braille/BRF) in a
164
+ single step, via **File ▸ Batch Convert** (`Ctrl+Shift+C`) in the Qt GUI or
165
+ `M-x batch-convert` in the terminal UI. Each file runs through the existing
166
+ single-file load→export pipeline; a corrupt, password-protected, or
167
+ unsupported file is recorded and skipped instead of aborting the run. Outputs
168
+ reuse the source basename (collisions disambiguated, never overwritten), and a
169
+ timestamped summary (what succeeded, what failed and why, and where outputs
170
+ went) is written alongside the outputs.
171
+ - **Hot-folder watching.** Watch a folder and convert files dropped into it,
172
+ unattended: `star --watch <input_dir> --output <output_dir> --format <fmt>`
173
+ for headless use, or **File ▸ Watch Folder** (`Ctrl+Shift+W`, a toggle) from
174
+ the Qt GUI. Built on the batch core (same formats and validation). Files are
175
+ debounced (processed only once their size has stabilised, so a file still
176
+ being copied in is never read half-written); every attempt is logged with a
177
+ timestamp to `<output>/star-watch.log`; successful sources move to
178
+ `<input>/processed/` and failures to `<input>/failed/` (collisions
179
+ disambiguated, never overwritten); and Ctrl+C / SIGTERM shut it down cleanly
180
+ without interrupting a file mid-conversion. Uses `watchdog` when installed
181
+ (the new `[watch]` optional-dependency group) and falls back to directory
182
+ polling otherwise.
183
+ - **Every Qt menu item now has a keyboard shortcut**, keeping star fully
184
+ keyboard-drivable.
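Two of the hot-folder behaviours described above, waiting for a file's size to stabilise and never overwriting an existing output, can be sketched with the standard library alone. Both helper names (`wait_until_stable`, `unique_destination`) and the `-1`, `-2` suffix scheme are assumptions made for illustration; the changelog does not say how star names disambiguated files.

```python
import os
import time
from pathlib import Path


def wait_until_stable(path, interval=0.5, checks=2, timeout=30.0):
    """Return True once the size has matched the previous poll `checks` times in a row."""
    deadline = time.monotonic() + timeout
    last_size, stable = -1, 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            return False  # file vanished or is unreadable
        stable = stable + 1 if size == last_size else 0
        if stable >= checks:
            return True
        last_size = size
        time.sleep(interval)
    return False


def unique_destination(directory, name):
    """Pick a path in `directory` that does not exist yet: name, name-1, name-2, ..."""
    directory = Path(directory)
    stem, suffix = Path(name).stem, Path(name).suffix
    candidate = directory / name
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}-{counter}{suffix}"
        counter += 1
    return candidate
```

The existence check and the later write are not atomic, so a watcher with several workers would need a lock or an exclusive-create open on top of this.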
185
+
186
+ ### 🔧 Changed
187
+
188
+ - **The Qt GUI is now star's primary interface.** Ongoing development is focused
189
+ on the Qt GUI, so it is the default and the recommended way to run star. The
190
+ curses terminal UI (`--tui`) remains fully supported and keyboard-driven as a
191
+ secondary interface for headless or text-only environments.
192
+
193
+ ### 🐛 Fixed
194
+
195
+ - **Reading highlight no longer runs ahead of the audio.** The highlight is
196
+ now anchored to the engine's actual word progress rather than a free-running
197
+ words-per-minute estimate:
198
+ - With eSpeak-NG driven through libespeak-ng, the highlight follows each
199
+ word's reported audio position, so it tracks playback across the whole
200
+ document instead of drifting further ahead over time.
201
+ - For backends that report real word events (pyttsx3 and libespeak-ng), the
202
+ highlight now waits for the first event before starting, so it begins when
203
+ audio actually begins rather than when synthesis was requested, removing a
204
+ constant head start.
205
+ - Note: the previous attempt to read `<mark>` events from the `espeak-ng`
206
+ command line could never work (the CLI does not emit them), so the
207
+ command-line backend remains timer-paced; use libespeak-ng for synced
208
+ highlighting.
209
+
210
+ ---
211
+
212
+ ## [0.1.4] – 2026-06-22
213
+
214
+ ### ✨ Added
215
+
216
+ - **Fat zipapp build (`star.pyz`).** A new `build_zipapp.py` produces a
217
+ single-file `star.pyz` that bundles star together with its Python
218
+ dependencies (the `[all]` extras group). It is self-extracting: on first run
219
+ it unpacks its bundled packages into the per-user config directory (so
220
+ compiled packages such as PyQt6 and PyMuPDF load from real files on disk),
221
+ then starts normally. This removes the `pip install` step: running star this
222
+ way needs only a Python interpreter plus the external engines (ffmpeg,
223
+ Tesseract, liblouis, eSpeak-NG, DECtalk) on `PATH`. Because it carries
224
+ compiled packages, the artifact is **platform-specific** (build one per target
225
+ platform). It is additive and does not replace the self-contained Windows
226
+ `star.exe`, which additionally bundles the external engines.
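The packaging half of a zipapp build can be done with the standard-library `zipapp` module. The sketch below covers only that step and is not the contents of `build_zipapp.py`: `build_pyz` and the `app:main` entry point are hypothetical, and the first-run unpacking of bundled packages described above is not shown.

```python
import zipapp
from pathlib import Path


def build_pyz(source_dir, target, main="app:main"):
    """Pack `source_dir` into a single runnable archive at `target`.

    `main` names the entry point as "module:function"; zipapp generates the
    archive's __main__.py from it, so the source tree must not contain one.
    """
    zipapp.create_archive(
        Path(source_dir),
        target=Path(target),
        interpreter="/usr/bin/env python3",
        main=main,
    )
    return Path(target)
```

Dependencies would typically be staged into the source directory first (for example with `pip install --target`), which is also why an archive carrying compiled packages ends up platform-specific.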
227
+
228
+ ### 🔧 Changed
229
+
230
+ - **Minimum supported Python is now 3.11** (previously 3.8). The
231
+ `requires-python` constraint, the installer and build scripts, and the build
232
+ documentation were updated to match.
233
+
234
+ ---
235
+
236
+ ## [0.1.3] – 2026-06-16
237
+
238
+ A focused round of reading, speech, and study-workflow additions, all built on
239
+ the existing single-file architecture: `star.py` still runs with zero extras
240
+ installed.
241
+
242
+ ### ✨ Added
243
+
244
+ - **Sentence-level highlight option.** A new **highlight granularity**
245
+ control lets the spoken text be highlighted by **word** (default), by whole
246
+ **sentence** (much less visual flicker for readers who find rapid word-by-word
247
+ movement distracting), or **both** (a soft sentence band with the current word
248
+ marked on top). Works in **both** the Qt GUI and the curses TUI. Set it from
249
+ **View → Reading Aids → Karaoke Highlight…** (new *Granularity* selector) or
250
+ `M-x highlight-granularity word|sentence|both` in the TUI. New setting:
251
+ `highlight_granularity` (default `word`).
252
+ - **Timestamped subtitle export (SRT / VTT).** Audio export can now emit a
253
+ synchronized caption track so the highlight "travels" with the audio into any
254
+ media player. Export captions on their own (**File → Export → Export Subtitles
255
+ (SRT / VTT)…**, or `M-x export-subtitles`), or have them written automatically
256
+ alongside every audio export (`M-x subtitles-with-audio`). Captions are grouped
257
+ into readable sentence-length cues by default, or one cue per word with
258
+ `M-x subtitle-word-level`. Timing is estimated from the synthesized audio's
259
+ duration, so it needs no external tools. New settings: `subtitle_format`
260
+ (`srt`/`vtt`), `subtitle_word_level`, `export_subtitles_with_audio`. New TUI
261
+ commands: `export-subtitles`, `subtitle-format`, `subtitle-word-level`,
262
+ `subtitles-with-audio`.
263
+ - **A keyboard shortcut for every GUI menu item.** Every command in the Qt
264
+ menus now has a shortcut shown beside it and listed in **Help → Keyboard
265
+ Shortcuts** (`F3`). Bindings follow a consistent scheme: `Ctrl+letter`
266
+ (forward/primary), `Ctrl+Shift+letter` (backward/secondary), `Alt+punct`
267
+ (sentences), `Ctrl+Alt+letter` (exports, citations, tools, reading aids);
268
+ and each is owned by exactly one action, eliminating the previous duplicate
269
+ toolbar/window bindings that risked Qt “ambiguous shortcut” conflicts. New:
270
+ highlight colors (`Ctrl+Shift+1`…`5`), export commands (`Ctrl+Alt+M/P/B/A/U`),
271
+ citation commands (`Ctrl+Alt+I/E/C/D/R/G`), reading aids, and more. All
272
+ bindings remain remappable via **Help → Customize Shortcuts…**.
273
+ - **Tap `Ctrl` to play/pause (JAWS habit).** Pressing and releasing the `Ctrl`
274
+ key on its own toggles speech, mirroring the JAWS “Ctrl silences speech”
275
+ reflex. Using Ctrl as a modifier in a chord never triggers it. New setting:
276
+ `qt_ctrl_pause` (default `true`).
277
+ - **Reading statistics & progress tracking.** STAR now records time read,
278
+ furthest word reached, progress %, and session count per document while
279
+ speech plays, and surfaces them in a dashboard, **Tools → Reading
280
+ Statistics…** (`Ctrl+Shift+S`) in the Qt GUI and `M-x reading-stats` in the
281
+ TUI, with overall totals and a most-read list. New setting: `reading_stats`.
282
+ - **Library / bookshelf view.** Every opened document is remembered with
283
+ its title, format, progress, and last-opened time. **File → Library /
284
+ Bookshelf…** (`Ctrl+Shift+B`) opens a searchable list (Enter / double-click
285
+ reopens a document); the TUI offers `M-x library`. New setting: `library`.
286
+ - **Live HTML preview while editing.** In edit mode a split pane can show a
287
+ live-rendered HTML preview of the Markdown source beside the editor,
288
+ re-rendering as you type (debounced). Toggle it with **View → Live HTML
289
+ Preview** (`Ctrl+Shift+L`); turning it on outside edit mode enters edit mode.
290
+ New setting: `qt_edit_preview`.
291
+ - **Voice & profile presets.** Save the current voice, rate, volume, theme,
292
+ font, spacing, and highlight settings as a named profile (e.g. “Skim”, “Deep
293
+ Study”, “Low-Light”) and switch between them in one step. A new **Profiles**
294
+ menu offers **Save Current Settings as Profile…** (`Ctrl+Shift+K`), **Load
295
+ Profile…** (`Ctrl+Shift+J`), and **Delete Profile…** (`Ctrl+Shift+Y`); the TUI
296
+ adds `M-x profile-save`, `profile-load`, `profile-list`, and `profile-delete`.
297
+ New setting: `profiles`.
298
+ - **Pronunciation lexicon editor.** A user-editable dictionary maps domain
299
+ terms (drug names, anatomy, acronyms) to a spoken form so TTS says them
300
+ correctly and consistently across every backend. Edit it from **Speech →
301
+ Pronunciation Lexicon…** (`Ctrl+Shift+I`) in the Qt GUI, or `M-x pron-add`,
302
+ `pron-list`, `pron-remove`, and `pronunciations` (on/off) in the TUI.
303
+ Pronunciation overrides are applied first, before abbreviation and number
304
+ normalization. New settings: `pronunciations`, `use_pronunciations`.
305
+ - **Piper neural TTS backend.** A new optional **`piper`** backend brings
306
+ free, offline, neural-quality voices via the standalone
307
+ [Piper](https://github.com/rhasspy/piper) binary: no Python package, no
308
+ subscription, no network. Point STAR at a `.onnx` voice model with the new
309
+ `piper_model` setting (or the `PIPER_MODEL` env var, or by dropping models in
310
+ a Piper voice directory) and select it from **Speech → Choose TTS Engine…**
311
+ (new GUI engine picker) or `M-x tts-backend piper`. Like Coqui, it is opt-in
312
+ and never chosen in `auto` mode. New setting: `piper_model`.
313
+ - **Fully self-contained Windows binary.** The portable `star.exe` can now
314
+ bundle the native engines that previously had to be installed separately, so
315
+ a single file does *everything* on a clean PC:
316
+ - **ffmpeg** → MP3 / OGG / MP4 audio export
317
+ - **Tesseract** + English language data → OCR of images and scanned PDFs
318
+ - **liblouis** + translation tables → Grade 2 (contracted) Braille
319
+ - **Pandoc** → high-fidelity markup conversion (RST, Org, MediaWiki,
320
+ AsciiDoc, Textile, LaTeX, legacy `.doc`, …)
321
+ - **DECtalk** → the classic “Perfect Paul” voice, via the bundled
322
+ `DECtalk.dll` + dictionary driven **in-process through ctypes** (no
323
+ separate CLI required); the architecture-matched 64-/32-bit engine is
324
+ selected automatically. On the self-contained Windows build DECtalk is now
325
+ the **default engine** and **Perfect Paul the default voice**, and all
326
+ nine classic speakers (Perfect Paul, Beautiful Betty, Huge Harry, Frail
327
+ Frank, Doctor Dennis, Kit the Kid, Uppity Ursula, Rough Rita, Whispering
328
+ Wendy) appear in the voice picker (**Speech → Choose Voice…**,
329
+ `Ctrl+Shift+V`). DECtalk is only chosen automatically when the engine
330
+ actually starts (a real startup is probed once), so machines without a
331
+ working DECtalk fall back to pyttsx3/SAPI as before.
332
+
333
+ `star.py` locates each bundled engine via a new `_vendor_dir()` resolver
334
+ (`sys._MEIPASS` when frozen) and falls back to a system install when a tool
335
+ is absent, so running from source still needs nothing extra. For Pandoc the
336
+ bundled binary is also exposed to `pypandoc` via `$PYPANDOC_PANDOC`; for
337
+ DECtalk a `say`/`dtalk` CLI (on `PATH` or via `DECTALK_BIN`) still works as a
338
+ fallback. A new `build-vendor.py` helper downloads and assembles the engines
339
+ into `vendor/`, which `star.spec` packs into the bundle (see `BUILD.md`). The
340
+ fully self-contained build is ~300+ MB (Pandoc alone adds ~150 MB); a lean
341
+ build without `vendor/` remains ~90–100 MB.
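The bundled-then-system lookup described above follows a common PyInstaller pattern. This is a minimal sketch, not star's `_vendor_dir()`: the function names, the flat `vendor/<tool>` layout, and the use of the script's directory when running from source are all assumptions. `sys._MEIPASS` is the attribute PyInstaller sets in a frozen app.

```python
import shutil
import sys
from pathlib import Path


def vendor_dir():
    """Directory holding bundled tools: the PyInstaller unpack dir when frozen."""
    base = getattr(sys, "_MEIPASS", None)  # set by PyInstaller at runtime
    # Assumption: when running from source, vendor/ sits beside the script.
    root = Path(base) if base else Path(sys.argv[0]).resolve().parent
    return root / "vendor"


def find_tool(name, vendor=None):
    """Prefer a bundled executable, then fall back to one on PATH (or None)."""
    vendor = Path(vendor) if vendor is not None else vendor_dir()
    for candidate in (vendor / name, vendor / f"{name}.exe"):
        if candidate.is_file():
            return str(candidate)
    return shutil.which(name)
```

Returning `None` when neither location has the tool leaves the caller free to disable the feature and show an install hint instead of failing.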
342
+
343
+ ### 🏗️ Packaging & architecture
344
+
345
+ - **`star.py` can now be split into an importable `star/` package.** A new
346
+ [`tools/split_star.py`](tools/split_star.py) refactors the monolithic
347
+ `star.py` into logical submodules (`tts`, `tui`, `gui`, `documents`,
348
+ `markup`, `render`, `braille`, `citations`, …) under a `star/` package,
349
+ with shared foundational state (stdlib imports, vendored-tool wiring,
350
+ optional-dependency flags, paths, metadata) in `star/_runtime.py` and
351
+ re-exported via `from ._runtime import *`. The tool moves exact source by
352
+ top-level AST node (**nothing is re-typed**) and computes the
353
+ cross-module imports automatically, so the package stays byte-for-byte
354
+ faithful to `star.py`. `star.py` remains the canonical single-file source
355
+ and still runs with zero extras; the generated `star/` package is what the
356
+ wheel ships and what `python -m star` / the `star` console command import.
357
+ - **Pure-Python wheel for macOS / Linux / Windows.** A new
358
+ [`pyproject.toml`](pyproject.toml) builds a single `py3-none-any` wheel
359
+ (`star_reader-<version>-py3-none-any.whl`) that installs `star` and its
360
+ `star` command into any environment. Recommended dependencies (Qt GUI, TTS,
361
+ common document loaders) install by default; the optional features are
362
+ available as extras: `[ocr]`, `[formats]`, `[markup]`, `[braille]`,
363
+ `[audio]`, `[transcribe]`, and `[all]`. Build it with `python -m build
364
+ --wheel` (see `BUILD.md`).
365
+ - **macOS / Linux native-engine bootstrap.** A new
366
+ [`tools/install_native.py`](tools/install_native.py) is the cross-platform
367
+ counterpart of `build-vendor.py`: it installs the native engines (ffmpeg,
368
+ Tesseract + English data, liblouis, Pandoc, and eSpeak-NG on Linux) through
369
+ the system package manager (Homebrew / apt / dnf / pacman / zypper),
370
+ installing only what is missing. Supports `--dry-run` and per-engine
371
+ selection.
372
+ - **Voice dictation & transcription now bundled in the Windows binary.** The
373
+ self-contained `star.exe` ships the full Whisper stack (`openai-whisper`
374
+ with its PyTorch backend, `sounddevice` for microphone capture, and the
375
+ Whisper **`base` model**), so **Tools → Dictate Note** and **Transcribe
376
+ Audio File** work **offline, with no install and no first-run download** on a
377
+ clean machine. A PyInstaller runtime hook
378
+ ([`tools/rthook_star.py`](tools/rthook_star.py)) puts the bundled ffmpeg on
379
+ `PATH` (Whisper decodes audio through it) and points Whisper's model cache at
380
+ the bundled `base` model; `tools/build-windows.ps1` installs the dictation
381
+ dependencies and stages the model automatically. PyTorch makes this the
382
+ largest single addition to the bundle (the binary grows to ~600+ MB); the
383
+ dependencies are guarded, so a build without them still succeeds and the
384
+ feature falls back to its “requires Whisper” hint. The frozen entry point is
385
+ now [`run_star.py`](run_star.py), which imports `star.app.main` from the
386
+ generated package.
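Moving exact source by top-level AST node, as described for `tools/split_star.py`, can be illustrated by slicing the original lines with the positions the `ast` module reports. The sketch below is a simplified, hypothetical helper and not the tool itself: it ignores comments and blank lines between nodes and does no import analysis.

```python
import ast


def top_level_chunks(source):
    """Yield (node_type, exact_source_text) for each top-level statement."""
    lines = source.splitlines(keepends=True)
    for node in ast.parse(source).body:
        # Decorators sit above `def`/`class`, so start from the earliest one.
        start = min(
            [node.lineno] + [d.lineno for d in getattr(node, "decorator_list", [])]
        )
        # Slice the original text instead of regenerating code from the tree.
        yield type(node).__name__, "".join(lines[start - 1 : node.end_lineno])
```

Because each chunk is a slice of the input rather than unparsed output, formatting and quoting survive unchanged, which is the property the entry above calls byte-for-byte faithful.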
387
+
388
+ ### 📝 Notes for upgrading users
389
+
390
+ - All new settings have safe defaults, so existing `settings.json` files keep
391
+ working unchanged; the new keys are added on next save.
392
+ - Subtitle timing is *estimated* (proportional to spoken-token length) because
393
+ file-based TTS synthesis exposes no per-word callbacks. It is accurate enough
394
+ for review and study recordings.
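Proportional timing of this kind can be sketched as follows: each token receives a share of the total audio duration equal to its share of the total character count. The function names and the use of raw character length as the weight are assumptions; star may weight spoken tokens differently.

```python
def srt_timestamp(ms):
    """Format milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(ms))
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def estimate_cues(tokens, total_ms):
    """Split total_ms across tokens in proportion to their character length."""
    weights = [max(len(t), 1) for t in tokens]
    total_weight = sum(weights)
    cues, consumed = [], 0
    for token, weight in zip(tokens, weights):
        start = total_ms * consumed / total_weight
        consumed += weight
        end = total_ms * consumed / total_weight
        cues.append((srt_timestamp(start), srt_timestamp(end), token))
    return cues


for start, end, text in estimate_cues(["Hi", "there"], 7000):
    print(f"{start} --> {end}  {text}")
```

Cue boundaries are computed from the running total rather than by summing rounded durations, so the last cue always ends exactly at the audio's length.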
395
+
396
+ ---
397
+
398
+ ## [0.1.2] – 2026-06-14
399
+
400
+ A substantial revision focused on **reliable, accessible defaults out of the
401
+ box**: native speech on every platform, dependency-free Braille export, smoother
402
+ word-highlight tracking, a more professional default look, and a new set of
403
+ reading-accessibility aids. The single-file architecture is unchanged:
404
+ `star.py` still runs with zero extras installed.
405
+
406
+ ### ✨ Added
407
+
408
+ - **Reading accessibility aids (Qt GUI).** A new **View → Reading Aids** submenu
409
+ collects low-friction, high-impact accommodations:
410
+ - **Adjustable text spacing** (WCAG 1.4.12): independently tune line height,
411
+ letter spacing, and word spacing from a live-preview dialog. New settings:
412
+ `qt_line_height` (default `1.5`), `qt_letter_spacing`, `qt_word_spacing`.
413
+ - **Dyslexia-friendly font preference**: opt in to an installed
414
+ OpenDyslexic / Atkinson Hyperlegible / Lexend / Comic Sans face, with a
415
+ graceful fallback and prompt when none is installed. New setting:
416
+ `qt_dyslexia_font`.
417
+ - **Bionic reading**: embolden the leading part of each word to pull the
418
+ eye forward. New setting: `qt_bionic_reading`.
419
+ - **Current-line focus band**: tint the line being read. New setting:
420
+ `qt_current_line_highlight`.
421
+ - **Karaoke highlight tuning**: choose the spoken-word highlight style
422
+ (`background`, `underline`, `box`, `bold`, `color`), color, pacing
423
+ (`highlight_speed`), and a lead/lag offset. New settings: `highlight_style`,
424
+ `highlight_lead_words`.
425
+ - **Notes dock stays hidden by default.** The Qt **Notes** panel is hidden on
426
+ launch (`qt_show_notes` defaults to `false`) to maximize the reading area; it
427
+ opens on demand via **Ctrl+Shift+N**, **View → Toggle Notes Panel**, or when a
428
+ note is added.
429
+ - **Annotations / notes: now in both interfaces, with tags & search.** The Qt
430
+ **Notes** dock and a new curses TUI notes pager share one per-document store.
431
+ Add at the cursor/selection (`Ctrl+Shift+A` in Qt, `a` in the TUI), attach
432
+ **tags**, and **filter with full-text or `#tag` search** (filter box in Qt;
433
+ `M-x annotations-search` / `annotations-list` in the TUI). Notes carry both a
434
+ Qt char offset and a `word_idx`, so a note made in one interface navigates
435
+ correctly in the other. Export to Markdown, JSON, BibTeX, or RIS. New TUI
436
+ commands: `annotate`, `annotations-list`, `annotations-search`,
437
+ `annotation-goto`, `annotation-delete`, `annotations-export`.
438
+ - **Citation manager (Qt GUI).** A shared citation library with **import and
439
+ export of BibTeX, RIS, and CSL-JSON** (`Citations` menu). Add references by
440
+ hand, browse/copy/delete them, and **link a citation to a note** so exported
441
+ study notes carry attribution. New setting: `citations`.
442
+ - **Whisper voice dictation & transcription (optional).** `Tools → Transcribe
443
+ Audio File…` opens a transcription as a new document; `Tools → Dictate Note…`
444
+ records from the microphone and saves the transcription as a note. Backed by
445
+ `openai-whisper` or `faster-whisper` (+ `sounddevice`/`numpy` for the mic);
446
+ fully guarded when absent. New setting: `whisper_model`.
447
+ - **Keyboard cheat sheet & GUI/TUI parity.** A canonical shortcut scheme is now
448
+ documented in one place and shown in-app (`Help → Keyboard Shortcuts` in Qt,
449
+ `?` / `M-x shortcuts` in the TUI).
450
+ - **Full menu coverage (Qt GUI).** New **Speech**, **Navigate**, **Edit**,
451
+ **Citations**, **Tools**, and **Help** menus put every command within reach
452
+ of the mouse, which is important for users who don't drive the app by keyboard.
453
+ - **macOS native speech backend (`applesay`).** A new TTS backend drives the
454
+ built-in `/usr/bin/say` command, giving Mac users Apple's high-quality system
455
+ voices with **no extra dependencies** (no `pyobjc`, no Homebrew, no eSpeak).
456
+ In `auto` mode it is ranked **above eSpeak** so a Mac never silently falls
457
+ back to the robotic eSpeak voice.
458
+ - **Preferred default voice resolution.** When no voice is explicitly set,
459
+ star now auto-selects a voice matching the new `tts_prefer_voice` setting
460
+ (default `"eloquence"`), favoring a US-English variant. This makes the
461
+ **Eloquence (US English)** voices bundled with recent macOS the default when
462
+ present. A user's explicit voice choice is never overridden.
463
+ - **Built-in, dependency-free Braille (Grade 1) translator.** BRF export now
464
+ works out of the box with a pure-Python North-American Braille-ASCII (NABCC)
465
+ translator: letters, capital signs, number signs, common punctuation, and
466
+ standard 40-cell / 25-line page geometry with form feeds.
467
+ - **Settings migration.** Settings files written by earlier versions are
468
+ upgraded on load: the deprecated serif `Georgia` font and the lagging `0.85`
469
+ highlight speed are replaced with the new defaults (only when they exactly
470
+ match the old default, so deliberate choices are preserved).
471
+ - New settings keys: `tts_prefer_voice`, `braille_grade2`,
472
+ `audio_export_format`.
473
+ - **Cross-platform installers**: `install.sh` (Linux/macOS) and `install.ps1`
474
+ (Windows) with `minimal` / `recommended` / `all` profiles, virtual-env by
475
+ default, and platform-aware dependency hints (incl. `pyobjc` on macOS,
476
+ `windows-curses` on Windows).
477
+ - **Portable Windows binary build.** A PyInstaller recipe (`star.spec`) and a
478
+ one-command wrapper (`build-windows.ps1`) produce a single, self-contained
479
+ `dist\star.exe` that runs on Windows machines with no Python or dependencies
480
+ installed, ideal for demos. Bundles the Qt GUI, SAPI5 speech, and the core
481
+ document loaders. Documented in `BUILD.md`.
482
+ - New documentation: `CHANGELOG.md` and `BUILD.md` (portable Windows binary).
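The settings-migration rule above (upgrade a value only when it exactly equals the old default) is small enough to sketch directly. The old defaults `Georgia` and `0.85` and the new `highlight_speed` default of `1.0` come from this changelog; the `font_family` key name, the `"sans-serif"` placeholder, and the function name are assumptions, since the real font default is platform-dependent.

```python
# Key -> (old default, new default). The `font_family` key name and the
# "sans-serif" value are placeholders, not star's actual settings schema.
MIGRATIONS = {
    "highlight_speed": (0.85, 1.0),
    "font_family": ("Georgia", "sans-serif"),
}


def migrate_settings(settings, migrations=MIGRATIONS):
    """Return a copy with stale defaults upgraded; deliberate choices are kept."""
    upgraded = dict(settings)
    for key, (old_default, new_default) in migrations.items():
        # Only an exact match with the old default is treated as "never changed".
        if upgraded.get(key) == old_default:
            upgraded[key] = new_default
    return upgraded
```

A user who had deliberately picked the old default is indistinguishable from one who never touched the setting, so this rule upgrades both; that is the trade-off of exact-match migration.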
483
+
484
+ ### 🔧 Changed
485
+
486
+ - **Word-highlight tracking is smooth and continuous.** The highlight timer no
487
+ longer freezes mid-document when SAPI5 word callbacks arrive late or stop
488
+ firing. The pacing guard now allows the highlight to run up to **4 words**
489
+ ahead of the last confirmed audio position and is **bypassed after 1.5 s**
490
+ of callback silence, so the cursor keeps following speech instead of getting
491
+ stuck. (Builds on the timer-generation race fixes already in place.)
492
+ - **Word-position map is monotonic and column-aware.** Repeated common words
493
+ (`the`, `a`, `and`) on a single line are matched in document order instead of
494
+ always snapping back to the first occurrence, and the search position never
495
+ moves backward, eliminating the "highlight stuck several lines back" effect.
496
+ - **Audio export now defaults to WAV** (`audio_export_format`). WAV needs no
497
+ external tools; MP3/OGG/MP4 still work when `ffmpeg` or `pydub` is present.
498
+ - **Default display font is now sans-serif** (`Helvetica Neue` on macOS,
499
+ `Segoe UI` on Windows, `DejaVu Sans` on Linux). Serif faces are discouraged
500
+ for on-screen reading accessibility.
501
+ - **Polished default dark theme** with a modern, professional neutral-dark
502
+ palette (Zed/Ghostty-inspired) for the Qt GUI editor, HTML rendering, and the
503
+ seeded `dark.css` theme.
504
+ - **`highlight_speed` default is now `1.0`** (match speech rate exactly); the
505
+ pacing guard is the real throttle, so the highlight stays tight to the audio.
506
+ - BRF export gained a `braille_grade2` opt-in for contracted Grade 2 via
507
+ liblouis (when installed and the table resolves).
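The monotonic word-position map described above comes down to searching forward from the end of the previous match. This is a minimal sketch with a hypothetical `map_words` helper; it uses plain substring search, whereas a real implementation would match on word boundaries and track columns.

```python
def map_words(text, words):
    """Return the start offset of each word in `text`, searching forward only."""
    offsets, cursor = [], 0
    for word in words:
        found = text.find(word, cursor)
        if found == -1:
            offsets.append(None)  # not found: keep the cursor where it is
            continue
        offsets.append(found)
        cursor = found + len(word)  # never search before a matched word again
    return offsets


text = "the cat and the dog and the bird"
print(map_words(text, text.split()))
```

With a search that always starts at offset 0, every `the` would map to position 0; carrying the cursor forward is what gives each repeated word its own location in document order.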
508
+
509
+ ### 🐛 Fixed
510
+
511
+ - **Qt GUI now runs on PyQt6.** `QAction` was imported from `QtWidgets`, but
512
+ PyQt6 moved it to `QtGui`; the bad import made the whole PyQt6 branch fail, so
513
+ star silently fell back to PyQt5 (and could not start the GUI at all on a
514
+ PyQt6-only machine). `QAction` is now imported from `QtGui` under PyQt6, and a
515
+ couple of PyQt6 enum-to-int conversions (`line height`, full-width selection)
516
+ were hardened. This also makes the frozen Windows binary work.
517
+ - **BRF export no longer crashes the app.** Previously, exporting Braille with
518
+ `louis` installed but a translation table missing could make liblouis call
519
+ `exit()` at the C level, abruptly closing the window. liblouis is now opt-in
520
+ and fully guarded; the built-in Grade 1 translator is the reliable default and
521
+ can never terminate the process.
522
+ - **macOS no longer defaults to eSpeak.** With the new `applesay` backend ranked
523
+ above eSpeak, Macs speak with a native Apple voice by default.
524
+ - Highlighting that previously got "stuck" and stopped advancing down the page
525
+ while speech continued now tracks reading position reliably.
526
+
527
+ ### 📝 Notes for upgrading users
528
+
529
+ - Existing `settings.json` files are migrated automatically (see above). To
530
+ adopt the new dark palette, delete `themes/dark.css` in your config directory
531
+ (a fresh, updated copy is regenerated) or pick it from **View → Theme**.
532
+ - macOS users who want pyttsx3's word-boundary callbacks (rather than the
533
+ timer-based highlight used by `say`) can `pip install pyobjc pyttsx3`.
534
+
535
+ ---
536
+
537
+ ## [0.1.1] – earlier
538
+
539
+ Initial public lineage of star prior to the 0.1.2 revision: single-file Qt GUI
540
+ and curses TUI, multi-format document loading (PDF, EPUB, DAISY/DTBook, DOCX,
541
+ PPTX, ODT, HTML, Markdown, LaTeX, RST and many markup formats, CSV/XLSX, images
542
+ via OCR, notebooks, source code), multiple TTS backends, themes, search,
543
+ bookmarks, reading-position memory, speed presets, Speech Cursor mode,
544
+ table-of-contents navigation, user highlights, audio export, document caching,
545
+ and screen-reader compatibility.
546
+
547
+ [0.1.3]: #013--2026-06-16
548
+ [0.1.2]: #012--2026-06-14
549
+ [0.1.1]: #011--earlier