mediahound 0.2.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (54)
  1. mediahound-0.2.0/CHANGELOG.md +97 -0
  2. mediahound-0.2.0/LICENSE +21 -0
  3. mediahound-0.2.0/MANIFEST.in +3 -0
  4. mediahound-0.2.0/PKG-INFO +326 -0
  5. mediahound-0.2.0/README.md +292 -0
  6. mediahound-0.2.0/mediahound/__init__.py +3 -0
  7. mediahound-0.2.0/mediahound/cli.py +178 -0
  8. mediahound-0.2.0/mediahound/config.example.toml +73 -0
  9. mediahound-0.2.0/mediahound/config.py +108 -0
  10. mediahound-0.2.0/mediahound/csvio.py +151 -0
  11. mediahound-0.2.0/mediahound/identify/__init__.py +21 -0
  12. mediahound-0.2.0/mediahound/identify/base.py +27 -0
  13. mediahound-0.2.0/mediahound/identify/cloud.py +112 -0
  14. mediahound-0.2.0/mediahound/identify/ollama.py +54 -0
  15. mediahound-0.2.0/mediahound/identify/tesseract.py +107 -0
  16. mediahound-0.2.0/mediahound/imaging.py +108 -0
  17. mediahound-0.2.0/mediahound/intro.py +50 -0
  18. mediahound-0.2.0/mediahound/links.py +23 -0
  19. mediahound-0.2.0/mediahound/metadata/__init__.py +29 -0
  20. mediahound-0.2.0/mediahound/metadata/base.py +63 -0
  21. mediahound-0.2.0/mediahound/metadata/musicbrainz.py +93 -0
  22. mediahound-0.2.0/mediahound/metadata/omdb.py +76 -0
  23. mediahound-0.2.0/mediahound/metadata/tmdb.py +100 -0
  24. mediahound-0.2.0/mediahound/metadata/wikidata.py +186 -0
  25. mediahound-0.2.0/mediahound/pipeline.py +864 -0
  26. mediahound-0.2.0/mediahound/resale.py +54 -0
  27. mediahound-0.2.0/mediahound/serve.py +214 -0
  28. mediahound-0.2.0/mediahound/store.py +174 -0
  29. mediahound-0.2.0/mediahound/streaming.py +86 -0
  30. mediahound-0.2.0/mediahound/web/assets/css/styles.css +417 -0
  31. mediahound-0.2.0/mediahound/web/assets/js/app.js +705 -0
  32. mediahound-0.2.0/mediahound/web/assets/js/identify.js +114 -0
  33. mediahound-0.2.0/mediahound/web/identify.html +45 -0
  34. mediahound-0.2.0/mediahound/web/index.html +186 -0
  35. mediahound-0.2.0/mediahound.egg-info/PKG-INFO +326 -0
  36. mediahound-0.2.0/mediahound.egg-info/SOURCES.txt +52 -0
  37. mediahound-0.2.0/mediahound.egg-info/dependency_links.txt +1 -0
  38. mediahound-0.2.0/mediahound.egg-info/entry_points.txt +2 -0
  39. mediahound-0.2.0/mediahound.egg-info/requires.txt +10 -0
  40. mediahound-0.2.0/mediahound.egg-info/top_level.txt +1 -0
  41. mediahound-0.2.0/pyproject.toml +61 -0
  42. mediahound-0.2.0/setup.cfg +4 -0
  43. mediahound-0.2.0/tests/test_cli.py +28 -0
  44. mediahound-0.2.0/tests/test_csvio.py +41 -0
  45. mediahound-0.2.0/tests/test_imaging.py +43 -0
  46. mediahound-0.2.0/tests/test_intro.py +40 -0
  47. mediahound-0.2.0/tests/test_metadata.py +68 -0
  48. mediahound-0.2.0/tests/test_music.py +48 -0
  49. mediahound-0.2.0/tests/test_pipeline.py +68 -0
  50. mediahound-0.2.0/tests/test_pipeline_internals.py +142 -0
  51. mediahound-0.2.0/tests/test_serve.py +130 -0
  52. mediahound-0.2.0/tests/test_store.py +84 -0
  53. mediahound-0.2.0/tests/test_streaming.py +51 -0
  54. mediahound-0.2.0/tests/test_tmdb.py +80 -0
@@ -0,0 +1,97 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project are documented here. The format is based on
4
+ [Keep a Changelog](https://keepachangelog.com/), and this project adheres to
5
+ [Semantic Versioning](https://semver.org/).
6
+
7
+ ## [0.2.0] — 2026-06-11 — "MediaHound"
8
+
9
+ Renamed **ReelShelf → MediaHound** and grew from a movie catalog into a **multi-media** catalog
10
+ (movies *and* music).
11
+
12
+ ### Added
13
+ - **Music support** — catalog CDs, vinyl and cassettes. New `media_type` discriminator and a
14
+ `MusicMeta` model; **MusicBrainz + Cover Art Archive** metadata provider (open, zero-key); keyless
15
+ **Spotify / Apple Music / YouTube Music** "where-to-listen" links.
16
+ - **Raw-image folder convention** — `RawImages/video/` → movies, `RawImages/audio/` → music; the
17
+ build routes each photo to the right identify/enrich path by its subfolder, and `init` scaffolds both.
18
+ - **CSV import/export** — `mediahound import catalog.csv` bulk-adds movies & music (no photos
19
+ needed), optionally enriched online; `mediahound export -o catalog.csv` backs up the catalog.
20
+ - **Frontend** — a **🎬 Movies / 🎵 Music** segmented filter, per-media-type cards (artist / label /
21
+ tracklist / ♫ listen for music), and music-aware search.
22
+ - Provider factory now routes by media type: `get_metadata_provider(cfg, media_type)`.
23
+ - **`mediahound serve`** — preview the generated site over http (no more file:// fetch limits).
24
+ - **`mediahound serve --admin`** — a localhost-only write API so admin-portal edits save **straight
25
+ into `data/corrections.json`** (and `seen-overrides.json`) as you make them. No "Export changes →
26
+ drop file in" step; edits persist immediately and **survive every future build** (the long-standing
27
+ cause of "my manual title fix reverted on rebuild"). Cross-origin writes are refused; a **↻ Rebuild**
28
+ button re-bakes the catalog in place. See `mediahound/serve.py` and
29
+ [docs/EDITING.md](docs/EDITING.md).
30
+ - **Move a title between Movies & Music from the admin screen** — the inline editor now has a
31
+ 🎬 Movie / 🎵 Music selector (with an Artist field and CD/Vinyl/Cassette formats for music).
32
+ Switching type sets a `media_type` correction, clears the previous type's exclusive fields, and
33
+ auto-ticks re-query so the next `--online` build re-enriches with the correct provider
34
+ (movie ↔ music). New `_apply_meta_to_music()`; `_apply_corrections` is now media-type-aware.
35
+ On the next build the move also **relocates the source cover photo** into the matching
36
+ `RawImages/video` or `RawImages/audio` folder (idempotent, path-traversal-guarded), so a
37
+ reclassified title is correct at the source and won't revert if `corrections.json` is cleared.
38
+ - **⬆ Import list (admin screen)** — under `serve --admin`, a new button + `POST /api/import` lets you
39
+ paste or upload a CSV and bulk-add titles (optionally enriched online), with the site rebuilt in
40
+ place. Same importer as the CLI: only `title` is required.
41
+
42
+ ### Changed
43
+ - **Filters are now media-type-aware** — the Format / Genre / Studio·Label / Language dropdowns
44
+ narrow to the active 🎬 Movies or 🎵 Music tab (and show everything under *All*), so you no longer
45
+ see movie-only formats while browsing music.
46
+ - Rebrand across package, CLI (`mediahound`), JS data global, `localStorage` keys, branding and docs.
47
+ - Default site title/subtitle are media-generic.
48
+
49
+ ### Fixed
50
+ - **“Export changes” now merges** with the site's existing `data/corrections.json` before downloading,
51
+ so exporting can never silently drop a previously-saved correction (which would make that title
52
+ revert on the next build).
53
+ - **Format is normalised when a title changes type** — a movie-only format left on a music item
54
+ (e.g. a CD that was catalogued as a `DVD`) is reset to the new type's default, so the card's format
55
+ badge and meta line no longer show `DVD` on music (or `CD` on a movie). Applied both in the build
56
+ and live in the portal.
57
+
58
+ ## [0.1.0] — 2026-06-09
59
+
60
+ First public release.
61
+
62
+ ### Added
63
+ - **CLI** — `mediahound init <dir>` scaffolds a site; `mediahound build` turns a folder of cover
64
+ photos into a static catalog. Incremental (only new photos are processed, tracked by sha256).
65
+ - **Offline-first** — builds never touch the network unless `--online` is passed.
66
+ - **Pluggable providers**
67
+ - Identify: `tesseract` (default, zero-key OCR), `claude` (vision), `ollama` (local).
68
+ - Metadata: `wikidata` (default, zero-key), `tmdb`, `omdb`.
69
+ - Where-to-watch via JustWatch (no key); resale estimates with eBay sold-listings links.
70
+ - **Static web app** (vanilla JS, no build step)
71
+ - Read-only **default view** + password-protected **admin view** (read/write).
72
+ - Dense, aligned cards: poster, title·year, rating·format·runtime·language, genres, director +
73
+ cast, studio, where-to-watch, intro hook, estimated resale value.
74
+ - Clickable genre / person / studio filters; streaming-service filter; adjustable columns;
75
+ responsive web + mobile.
76
+ - Multi-photo galleries with ‹ › arrows, click-to-zoom, rotate, and set-default.
77
+ - Admin: edit name/year/format/studio, mark seen, delete a title or a photo, manual
78
+ identification (name or discard unidentified covers), and configure the library name, logo,
79
+ description, shown fields, and default columns.
80
+ - Edits round-trip through small JSON files (`corrections.json`, `seen-overrides.json`,
81
+ `view-config.json`, `identify-queue.json`) applied on the next build.
82
+ - Works from `file://` via an embedded `data/bundle.js`; content-hash cache-busting for updates.
83
+ - **Demo** — `--mock` builds a 10-title sample catalog (real posters hotlinked for illustration;
84
+ no copyrighted images stored in the repo). Hosted live on GitHub Pages.
85
+ - **Robustness** — metadata caching, soft-failing providers, and a plausible-title guard that
86
+ rejects fuzzy mismatches so they can't corrupt your data.
87
+ - **Docs & CI** — README, ARCHITECTURE, DEPLOYMENT (with free-hosting options), SECURITY,
88
+ CONTRIBUTING; GitHub Actions CI (tests + mock build + `pip-audit`); Dependabot.
89
+
90
+ ### Security
91
+ - All rendered data is HTML-escaped; links are restricted to `http(s)` / site-relative URLs.
92
+ - Photo-rotation corrections are guarded against path traversal; poster downloads are `http(s)`-only.
93
+ - Secrets live only in a gitignored `.env`; only the admin password **hash** is published. The admin
94
+ gate is a convenience control, not server-side auth (see [SECURITY.md](SECURITY.md)).
95
+
96
+ [0.2.0]: https://github.com/jchirayath/mediahound/releases/tag/v0.2.0
97
+ [0.1.0]: https://github.com/jchirayath/mediahound/releases/tag/v0.1.0
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Jacob Chirayath
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,3 @@
1
+ recursive-include mediahound/web *
2
+ include mediahound/config.example.toml
3
+ include README.md LICENSE CHANGELOG.md
@@ -0,0 +1,326 @@
1
+ Metadata-Version: 2.4
2
+ Name: mediahound
3
+ Version: 0.2.0
4
+ Summary: Turn photos of your movie & music collection into a sleek, searchable web catalog.
5
+ Author: Jacob Chirayath
6
+ License: MIT
7
+ Project-URL: Homepage, https://github.com/jchirayath/mediahound
8
+ Project-URL: Source, https://github.com/jchirayath/mediahound
9
+ Project-URL: Demo, https://jchirayath.github.io/mediahound/
10
+ Project-URL: Issues, https://github.com/jchirayath/mediahound/issues
11
+ Project-URL: Changelog, https://github.com/jchirayath/mediahound/blob/main/CHANGELOG.md
12
+ Keywords: dvd,vhs,movies,catalog,collection,ocr,tmdb,wikidata
13
+ Classifier: Development Status :: 4 - Beta
14
+ Classifier: Environment :: Console
15
+ Classifier: Intended Audience :: End Users/Desktop
16
+ Classifier: License :: OSI Approved :: MIT License
17
+ Classifier: Operating System :: OS Independent
18
+ Classifier: Programming Language :: Python :: 3
19
+ Classifier: Programming Language :: Python :: 3.11
20
+ Classifier: Programming Language :: Python :: 3.12
21
+ Classifier: Topic :: Multimedia :: Video
22
+ Requires-Python: >=3.11
23
+ Description-Content-Type: text/markdown
24
+ License-File: LICENSE
25
+ Requires-Dist: requests>=2.28
26
+ Requires-Dist: Pillow>=9.0
27
+ Provides-Extra: ocr
28
+ Requires-Dist: pytesseract>=0.3.10; extra == "ocr"
29
+ Provides-Extra: dev
30
+ Requires-Dist: pytest>=7.0; extra == "dev"
31
+ Requires-Dist: pytest-cov>=4.0; extra == "dev"
32
+ Requires-Dist: ruff>=0.4; extra == "dev"
33
+ Dynamic: license-file
34
+
35
+ # 🎬🎵 MediaHound
36
+
37
+ [![CI](https://github.com/jchirayath/mediahound/actions/workflows/ci.yml/badge.svg)](https://github.com/jchirayath/mediahound/actions/workflows/ci.yml)
38
+ [![CodeQL](https://github.com/jchirayath/mediahound/actions/workflows/codeql.yml/badge.svg)](https://github.com/jchirayath/mediahound/actions/workflows/codeql.yml)
39
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
40
+ [![Python 3.11+](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://www.python.org/)
41
+ [![Live demo](https://img.shields.io/badge/live-demo-ff5252.svg)](https://jchirayath.github.io/mediahound/)
42
+
43
+ **Turn photos of your movie *and* music collection into a sleek, searchable web catalog.**
44
+
45
+ Point MediaHound at a folder of cover photos — DVDs, VHS, Blu-ray, **CDs, vinyl, cassettes** — or
46
+ **import a CSV**. It identifies each item, pulls in cover art, genres, cast/artist, studio/label,
47
+ runtime/tracklist and ratings, writes a short enticing intro, estimates the used resale value, links
48
+ where to **watch** (movies) or **listen** (music), and generates a polished static website you can
49
+ search, filter by **🎬 Movies / 🎵 Music**, sort, and curate — with a password-protected admin mode.
50
+
51
+ Movies are identified/enriched via TMDB / OMDb / Wikidata + JustWatch; music via **MusicBrainz +
52
+ Cover Art Archive** (open, zero-key) with keyless Spotify / Apple Music / YouTube Music links.
53
+
54
+ **▶ [Live demo](https://jchirayath.github.io/mediahound/)** — explore a sample catalog in your browser (admin password: `changeme`).
55
+
56
+ [![MediaHound screenshot](docs/screenshot.jpg)](https://jchirayath.github.io/mediahound/)
57
+
58
+ > ℹ️ The demo shows **real movie posters and album covers** (hotlinked from IMDb/OMDb and Cover Art
59
+ > Archive / Apple) so you can see what a finished catalog looks like — no cover images are stored in
60
+ > this repo. Extra gallery photos are generated placeholders standing in for *your own* photos. Your
61
+ > real catalog pulls art from TMDB / OMDb / Wikidata (movies) and MusicBrainz / Cover Art Archive
62
+ > (music), or falls back to the photos you take.
63
+
64
+ - **Runs for anybody with zero API keys** — open-source OCR + open data by default.
65
+ - **Offline-first** — never contacts the internet unless you explicitly ask (`--online`).
66
+ - **Static output** — deploy anywhere (Netlify, GitHub Pages, S3, Vercel) or just open the HTML file.
67
+ - **No secrets in the repo** — keys live in a gitignored `.env`; your catalog is generated output.
68
+
69
+ > MIT-licensed. Your photos, keys and catalog never get committed to this tool's repo.
70
+
71
+ ---
72
+
73
+ ## See it in 30 seconds (no photos, no keys)
74
+
75
+ ```bash
76
+ git clone https://github.com/jchirayath/mediahound && cd mediahound
77
+ pip install -e .
78
+
79
+ mediahound init demo
80
+ mediahound build --config demo/config.toml --mock # generates a 10-title sample catalog
81
+ cd demo && python3 -m http.server 8000 # open http://localhost:8000
82
+ ```
83
+
84
+ That's the screenshot above. Click **🔒 Admin** and sign in with **`changeme`** to try the
85
+ read/write admin tools. Everything you see is generated by `--mock` — no internet, no API keys.
86
+
87
+ > Once published to PyPI you can skip the clone and just `pip install mediahound` (then
88
+ > `mediahound init …`). Maintainers: see [RELEASING.md](RELEASING.md) for the one-time publish setup.
89
+
90
+ ---
91
+
92
+ ## Cataloguing your own collection
93
+
94
+ ```bash
95
+ pip install -e ".[ocr]" # adds the default OCR identifier
96
+ # Install the Tesseract engine for OCR:
97
+ # macOS: brew install tesseract
98
+ # Debian: sudo apt-get install tesseract-ocr
99
+
100
+ mediahound init mysite # scaffolds mysite/ (RawImages/{video,audio}/, config.toml, web template)
101
+
102
+ # Sort your cover photos by media type:
103
+ cp ~/Pictures/dvd-covers/*.jpg mysite/RawImages/video/ # 🎬 movies (DVD/VHS/Blu-ray/LaserDisc)
104
+ cp ~/Pictures/album-covers/*.jpg mysite/RawImages/audio/ # 🎵 music (CD/vinyl/cassette)
105
+
106
+ mediahound build --config mysite/config.toml --online # identify + enrich (see Providers below)
107
+ cd mysite && python3 -m http.server 8000
108
+ ```
109
+
110
+ #### Raw-image folder convention
111
+
112
+ Photos are sorted into **media-type subfolders** so MediaHound knows how to identify each item:
113
+
114
+ | Folder | Media type | Identified/enriched via |
115
+ |---|---|---|
116
+ | `RawImages/video/` | 🎬 movies | TMDB / OMDb / Wikidata + JustWatch |
117
+ | `RawImages/audio/` | 🎵 music | MusicBrainz + Cover Art Archive + listen links |
118
+ | `RawImages/` (root) | defaults to movies | — |
119
+
120
+ (`movies/` and `music/` are accepted aliases.) Add more photos anytime and re-run `build` — only the
121
+ **new** ones are processed (state is tracked by content hash in `data/manifest.json`).
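
As an illustration of the convention above, here is a minimal Python sketch of folder-based routing plus content-hash tracking. It is not MediaHound's actual code: the names `media_type_for`, `sha256_of` and `unprocessed` are hypothetical, the manifest is assumed to be a mapping keyed by SHA-256 hex digest, and treating an unknown subfolder as movies is an assumption rather than documented behaviour.

```python
import hashlib
from pathlib import PurePosixPath

# Folder names come from the table above; `movies/` and `music/` are the aliases.
FOLDER_TO_MEDIA_TYPE = {
    "video": "movie",
    "movies": "movie",
    "audio": "music",
    "music": "music",
}


def media_type_for(photo_path: str, raw_root: str = "RawImages") -> str:
    """Return "movie" or "music" for a photo path relative to the site folder."""
    parts = PurePosixPath(photo_path).parts
    # Expect RawImages/<subfolder>/<file>. Photos directly under the root
    # default to movies; unknown subfolders do too (an assumption).
    if len(parts) >= 3 and parts[0] == raw_root:
        return FOLDER_TO_MEDIA_TYPE.get(parts[1].lower(), "movie")
    return "movie"


def sha256_of(data: bytes) -> str:
    """Hex digest used as the identity of a photo's content."""
    return hashlib.sha256(data).hexdigest()


def unprocessed(photos: dict[str, bytes], manifest: dict[str, object]) -> list[str]:
    """Return the paths whose content hash is not yet recorded in the manifest.

    `photos` maps path -> file bytes; `manifest` is assumed to be keyed by hash.
    """
    return [path for path, data in photos.items() if sha256_of(data) not in manifest]
```

Keying on the content hash rather than the filename means a renamed or moved photo is not reprocessed, while a new photo under an old name is.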
122
+
123
+ ### Or import from a CSV (no photos)
124
+
125
+ ```bash
126
+ mediahound import catalog.csv --config mysite/config.toml # add rows offline
127
+ mediahound import catalog.csv --config mysite/config.toml --online # …and fetch cover art + metadata
128
+ mediahound export --config mysite/config.toml -o backup.csv # dump the whole catalog back to CSV
129
+ ```
130
+
131
+ Columns (case-insensitive; extras ignored): `media_type, title, artist, director, year, format,
132
+ label, studio, genres, rating, barcode, cover_url, intro`. **Only `title` is required** — any missing
133
+ fields are left blank (or filled by `--online`); even a one-column list of titles works. `media_type`
134
+ is inferred (`music` if an `artist` is given, else `movie`). See
135
+ [`examples/sample-import.csv`](examples/sample-import.csv).
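
The column rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the importer in `mediahound/csvio.py`: `parse_catalog_csv` is a hypothetical name, and silently skipping rows without a title is an assumption (the real importer may report them instead).

```python
import csv
import io

KNOWN_COLUMNS = {
    "media_type", "title", "artist", "director", "year", "format", "label",
    "studio", "genres", "rating", "barcode", "cover_url", "intro",
}


def parse_catalog_csv(text: str) -> list[dict[str, str]]:
    """Parse CSV text into row dicts following the documented column rules."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {}
        for key, value in raw.items():
            if key is None:  # surplus cells beyond the header row
                continue
            name = key.strip().lower()  # headers are case-insensitive
            if name in KNOWN_COLUMNS:  # extra columns are ignored
                row[name] = (value or "").strip()
        if not row.get("title"):
            continue  # title is the only required column (skipping is assumed)
        if not row.get("media_type"):
            # Inferred: music if an artist is given, else movie.
            row["media_type"] = "music" if row.get("artist") else "movie"
        rows.append(row)
    return rows
```

A one-column file containing only a `title` header and a list of titles parses to movie rows with every other field absent, which matches the "even a one-column list of titles works" behaviour described above.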
136
+
137
+ Prefer a UI? Under **`mediahound serve --admin`** the admin screen has an **⬆ Import list** button —
138
+ paste or upload the same CSV, optionally tick *enrich online*, and the titles are added and the site
139
+ rebuilt in place.
140
+
141
+ ---
142
+
143
+ ## Features
144
+
145
+ ### The catalog
146
+ - **Search** title / genre / cast / studio / intro, **sort** by title, year, recently-added, value or rating.
147
+ - **Filters**: format, genre, studio, **streaming service**, language, category, seen / unseen.
148
+ - **Dense, aligned cards** showing poster, title·year, ★rating · format · runtime · language,
149
+ genres, director + cast, studio, where-to-watch, intro hook, and estimated resale value.
150
+ - **Clickable everything**: a genre, person, or studio filters the grid to matching titles.
151
+ - **Adjustable density**: viewers pick how many titles per row; responsive on web & mobile.
152
+
153
+ ### Photos
154
+ - **Multi-photo galleries** — flip through every photo of a title with ‹ › arrows.
155
+ - **Click-to-zoom** lightbox; set any photo as the default; rotate photos (baked in on rebuild).
156
+ - Auto-uprights sideways/landscape cover photos to portrait.
157
+
158
+ ### Where to watch & resale
159
+ - **Where to watch** — is it on Netflix / Amazon Prime / Hulu? A clickable ▶ badge + pills link
160
+ straight to the title (via JustWatch, no key). A filter narrows to a specific service.
161
+ - **Resale value** — a heuristic estimate plus a live link to eBay sold/completed listings.
162
+
163
+ ### Two views
164
+ - **Default view** — public, read-only.
165
+ - **Admin view** — password-protected, read/write. Edit a title's name, year, format, studio &
166
+ distributor; **move a title between 🎬 Movies and 🎵 Music**; mark seen; rotate / set-default /
167
+ delete a photo; delete a title; and configure the **library name, description, logo, which fields
168
+ are shown, and default columns**.
169
+
170
+ ### Editing & persisting your changes
171
+
172
+ Your edits are recorded as small **corrections** (keyed by title id). There are two ways to make
173
+ them permanent so they **survive every future `mediahound build`** — pick one:
174
+
175
+ **A. Live admin server (recommended — zero manual steps)**
176
+
177
+ ```bash
178
+ mediahound serve --admin # serves the site at http://127.0.0.1:8765
179
+ ```
180
+
181
+ Open the site, unlock admin, and edit. Every change is written **straight into
182
+ `data/corrections.json`** (and `seen-overrides.json`) as you go — the badge shows
183
+ *“✓ Saved to disk.”* Click **↻ Rebuild** to re-bake the catalog and reload. Because the edit is
184
+ already in `data/`, the next `mediahound build` (and any re-query) keeps it — **edits never revert.**
185
+ The write API is **localhost-only** and refuses cross-origin requests; never expose it publicly.
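
One way a localhost-only, same-origin write check could look is sketched below. It is a hedged illustration, not the logic in `mediahound/serve.py`: `is_allowed_write` is a hypothetical helper, header names are assumed to arrive in canonical case, and allowing requests with no `Origin` header (for example from a local command-line client) is an assumption.

```python
from typing import Mapping
from urllib.parse import urlsplit

LOOPBACK_ADDRESSES = {"127.0.0.1", "::1"}


def is_allowed_write(headers: Mapping[str, str], client_ip: str) -> bool:
    """Allow a write only from a loopback client whose Origin matches Host."""
    if client_ip not in LOOPBACK_ADDRESSES:
        return False  # never accept writes from other machines
    origin = headers.get("Origin")
    if origin is None:
        return True  # non-browser local client (assumed acceptable)
    # A browser page on another site sends its own origin, which will not
    # match the Host the admin server was reached on.
    return urlsplit(origin).netloc == headers.get("Host", "")
```

Comparing `Origin` against `Host` rejects a page on another site that tries to POST to `http://127.0.0.1:8765` from the user's browser, which is the cross-origin case the paragraph above refers to.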
186
+
187
+ **B. Static export (for read-only/CDN hosting like Netlify or GitHub Pages)**
188
+
189
+ When the site is served as plain files (no admin server), edits live in your browser. Click
190
+ **Export changes** / **Export seen** — the download is **merged with the site's existing
191
+ `data/corrections.json`** so nothing already saved is lost — then drop the file into `data/` and run
192
+ `mediahound build`.
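
The merge-before-download behaviour can be pictured with a short sketch. The real merge runs in the browser, so this Python version is only an analogy; `merge_corrections` is a hypothetical name, and the shape of `corrections.json` (title id mapped to a dictionary of corrected fields) is an assumption based on "keyed by title id" above.

```python
def merge_corrections(
    existing: dict[str, dict], exported: dict[str, dict]
) -> dict[str, dict]:
    """Combine saved corrections with new browser edits, field by field.

    Corrections already on disk are kept; a newer edit to the same field
    of the same title wins.
    """
    merged = {title_id: dict(fields) for title_id, fields in existing.items()}
    for title_id, fields in exported.items():
        merged.setdefault(title_id, {}).update(fields)
    return merged
```

Merging per field rather than replacing the whole file is what keeps a previously saved correction from being dropped when a later export touches only other titles.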
193
+
194
+ > Either way the source of truth is `data/corrections.json`. A title you fix only in the browser
195
+ > (without server-admin or an export) shows locally but **reverts on the next rebuild**, because the
196
+ > build regenerates the catalog from `data/`.
197
+
198
+ ### Manual identification
199
+ - Covers that couldn't be read are grouped on `identify.html`, where you **name** them (queued for
200
+ discovery on the next online build) or **discard** them (e.g. blank tapes).
201
+
202
+ ---
203
+
204
+ ## How it compares
205
+
206
+ Most movie-collection tools add items by **barcode scan or manual entry** and keep your catalog in
207
+ **their cloud** or a dated desktop app. MediaHound is the only one that identifies titles from
208
+ **photos of the covers** and generates a **modern static website you own and host for free** —
209
+ offline-first and open-source. It's also one of the few that handles **VHS** (tapes usually have no
210
+ scannable barcode listed in the disc databases other tools rely on).
211
+
212
+ | | MediaHound | CLZ / Libib | Tellico / GCstar | Plex / Jellyfin |
213
+ |---|---|---|---|---|
214
+ | Add by **photo of cover** | ✅ OCR/AI | ❌ barcode/manual | ❌ search/manual | ❌ scans video files |
215
+ | Modern **website you host free** | ✅ | ❌ their cloud | ◻︎ dated HTML export | ❌ private server |
216
+ | Open-source / offline / no account | ✅ | ❌ | ✅ desktop | ✅ (Jellyfin) |
217
+ | For a **physical** shelf | ✅ | ✅ | ✅ | ❌ digital files |
218
+
219
+ See **[COMPARISON.md](COMPARISON.md)** for the full, honest analysis — including when a barcode app
220
+ (CLZ/Libib), an OSS desktop cataloger (Tellico/Data Crow), or a media server (Plex/Jellyfin) is the
221
+ better choice.
222
+
223
+ ---
224
+
225
+ ## Providers (how titles get identified & enriched)
226
+
227
+ Both paths are first-class — pick them per-site in `config.toml`. The default needs **zero keys**.
228
+
229
+ | Concern | Default (no key) | Optional upgrade |
230
+ |---|---|---|
231
+ | **Identify** title from a cover | `tesseract` — open-source OCR | `claude` (Anthropic vision, also writes the intro) · `ollama` (local model) |
232
+ | **Movie** metadata + poster | `wikidata` — Wikidata + Wikipedia + Wikimedia | `tmdb` (free key) · `omdb` (free key) |
233
+ | **Music** metadata + cover art | `musicbrainz` — MusicBrainz + Cover Art Archive | `discogs` *(planned)* |
234
+ | **Where to watch / listen** | `justwatch` (movies) · keyless Spotify/Apple/YouTube search (music) | Spotify / Apple Music keys *(planned)* |
235
+ | **Resale** | eBay sold-listings link + estimate | Discogs price *(planned, music)* |
236
+
237
+ Switch to a premium provider in `config.toml`:
238
+
239
+ ```toml
240
+ [identify]
241
+ provider = "claude" # needs ANTHROPIC_API_KEY
242
+ [metadata]
243
+ provider = "tmdb" # needs TMDB_API_KEY (or use "omdb" + OMDB_API_KEY)
244
+ ```
245
+
246
+ …and create a **gitignored** `.env` next to `config.toml`:
247
+
248
+ ```
249
+ ANTHROPIC_API_KEY=sk-ant-...
250
+ TMDB_API_KEY=...
251
+ ```
252
+
253
+ Robustness built in: results are cached (`data/.metadata-cache.json`) so rebuilds never re-hit a
254
+ rate-limited free key, providers fail soft (a bad lookup never drops a title), and a fuzzy match
255
+ that returns the wrong film is rejected so it can't corrupt your names.
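
The caching and fail-soft behaviour described above might be sketched as follows. This is an illustration under assumptions, not MediaHound's provider code: `cached_lookup` and its `fetch` callback are hypothetical, and the cache is assumed to be a flat JSON object keyed by the lookup string.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def cached_lookup(
    cache_path: Path, key: str, fetch: Callable[[str], dict]
) -> Optional[dict]:
    """Return metadata for `key`, calling `fetch` only on a cache miss."""
    cache = {}
    if cache_path.exists():
        cache = json.loads(cache_path.read_text(encoding="utf-8"))
    if key in cache:
        return cache[key]  # rebuilds never re-hit the provider
    try:
        result = fetch(key)
    except Exception:
        return None  # fail soft: the caller keeps the title without metadata
    cache[key] = result
    cache_path.write_text(json.dumps(cache, indent=2), encoding="utf-8")
    return result
```

Returning `None` instead of raising lets the build keep a title with blank metadata and retry it on a later `--online` run.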
256
+
257
+ ---
258
+
259
+ ## Offline by default
260
+
261
+ `mediahound build` is **offline** — it regenerates the site from existing data and never contacts
262
+ the internet. Add `--online` to allow identification / metadata / where-to-watch lookups:
263
+
264
+ ```bash
265
+ mediahound build --config mysite/config.toml # offline: just rebuild the site
266
+ mediahound build --config mysite/config.toml --online # online: identify + enrich new titles
267
+ mediahound build --config mysite/config.toml --online --refresh-streaming # also re-check where-to-watch
268
+ ```
269
+
270
+ Useful flags: `--mock` (demo data), `--force` (reprocess everything), `--limit N`, `--reidentify <sha256>`.
271
+
272
+ ---
273
+
274
+ ## Deploy
275
+
276
+ The generated site folder (`mysite/`) is plain static files (`index.html`, `identify.html`,
277
+ `assets/`, `data/`, `posters/`, `originals/`), so you can **host it free**
278
+ on **GitHub Pages, Cloudflare Pages, Netlify, Vercel, Render, or Surge.sh** — no server, database,
279
+ or build step required. Quickest:
280
+
281
+ ```bash
282
+ cd mysite && npx netlify deploy --prod # Netlify
283
+ cd mysite && npx wrangler pages deploy . # Cloudflare Pages
284
+ cd mysite && npx surge . # Surge.sh
285
+ ```
286
+
287
+ See **[DEPLOYMENT.md](DEPLOYMENT.md)** for the full free-hosting comparison plus GitHub Pages, Vercel
288
+ and S3 instructions. The live demo above is itself hosted free on GitHub Pages via a workflow.
289
+
290
+ It even works by **double-clicking `index.html`** — the build embeds the catalog in `data/bundle.js`
291
+ so it loads without a web server.
292
+
293
+ ---
294
+
295
+ ## Architecture
296
+
297
+ See **[ARCHITECTURE.md](ARCHITECTURE.md)** for the full picture. In short:
298
+
299
+ ```
300
+ RawImages/*.jpg ─▶ identify (OCR / vision) ─▶ enrich (poster, genres, cast, studio, rating)
301
+
302
+ confidence too low? ──────┴─▶ data/unidentified.json → identify.html
303
+
304
+ + intro + resale + where-to-watch ─▶ data/collection.json ─▶ static site (index.html)
305
+ ```
306
+
307
+ Python CLI (`mediahound/`) builds the data; a dependency-free vanilla-JS frontend (`mediahound/web/`) renders it.
308
+
309
+ ---
310
+
311
+ ## Attribution & licensing
312
+
313
+ - Code: **MIT** (see [LICENSE](LICENSE)).
314
+ - Default data: **Wikidata** (CC0), **Wikipedia** text (CC BY-SA), images via **Wikimedia Commons**.
315
+ - If you enable **TMDB**: uses the TMDB API but is not endorsed or certified by TMDB.
316
+ - Where-to-watch data via **JustWatch**; resale links to **eBay** sold listings (estimates are heuristics).
317
+
318
+ ## Security
319
+
320
+ MediaHound output is a **static site** — read-only to everyone, with no server to attack. The admin
321
+ password is a convenience gate, **not** an access-control boundary (the published catalog can't be
322
+ changed without rebuilding + redeploying). API keys stay in a gitignored `.env`; only the password
323
+ hash ships. All rendered data is HTML-escaped and links are scheme-restricted. See
324
+ **[SECURITY.md](SECURITY.md)** for the full threat model and reporting instructions.
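
To make "HTML-escaped and scheme-restricted" concrete, here is a minimal sketch of both checks. The real rendering happens in the JavaScript frontend, so this Python version only illustrates the idea; `safe_link` and `render_link` are hypothetical names, and rejecting protocol-relative URLs such as `//host/path` is an assumption.

```python
import html
from typing import Optional
from urllib.parse import urlsplit


def safe_link(url: str) -> Optional[str]:
    """Return the URL if it is http(s) or site-relative, otherwise None."""
    url = url.strip()
    parts = urlsplit(url)
    if parts.scheme in ("http", "https") and parts.netloc:
        return url
    if url and not parts.scheme and not parts.netloc:
        return url  # site-relative, e.g. posters/alien.jpg
    return None  # javascript:, data:, protocol-relative, empty, ...


def render_link(label: str, url: str) -> str:
    """Render an escaped anchor, or just the escaped label if the URL is unsafe."""
    text = html.escape(label)
    target = safe_link(url)
    if target is None:
        return text
    return f'<a href="{html.escape(target, quote=True)}">{text}</a>'
```

Escaping the label and the `href` value separately keeps the two concerns independent: markup in a title cannot break out of the text, and a `javascript:` URL never reaches the attribute.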
325
+
326
+ Contributions welcome — see [CONTRIBUTING.md](CONTRIBUTING.md).