ear2finger-1.0.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (36)
  1. ear2finger-1.0.0/LICENSE +21 -0
  2. ear2finger-1.0.0/MANIFEST.in +2 -0
  3. ear2finger-1.0.0/PKG-INFO +365 -0
  4. ear2finger-1.0.0/README.md +330 -0
  5. ear2finger-1.0.0/pyproject.toml +55 -0
  6. ear2finger-1.0.0/setup.cfg +4 -0
  7. ear2finger-1.0.0/src/ear2finger/__init__.py +3 -0
  8. ear2finger-1.0.0/src/ear2finger/__main__.py +3 -0
  9. ear2finger-1.0.0/src/ear2finger/app.py +86 -0
  10. ear2finger-1.0.0/src/ear2finger/auth.py +87 -0
  11. ear2finger-1.0.0/src/ear2finger/cli.py +37 -0
  12. ear2finger-1.0.0/src/ear2finger/config.py +4 -0
  13. ear2finger-1.0.0/src/ear2finger/database.py +261 -0
  14. ear2finger-1.0.0/src/ear2finger/models/__init__.py +1 -0
  15. ear2finger-1.0.0/src/ear2finger/routers/__init__.py +1 -0
  16. ear2finger-1.0.0/src/ear2finger/routers/auth.py +130 -0
  17. ear2finger-1.0.0/src/ear2finger/routers/dictation.py +51 -0
  18. ear2finger-1.0.0/src/ear2finger/routers/health.py +9 -0
  19. ear2finger-1.0.0/src/ear2finger/routers/learning_progress.py +451 -0
  20. ear2finger-1.0.0/src/ear2finger/routers/lesson_sessions.py +170 -0
  21. ear2finger-1.0.0/src/ear2finger/routers/playlists.py +283 -0
  22. ear2finger-1.0.0/src/ear2finger/routers/user_config.py +169 -0
  23. ear2finger-1.0.0/src/ear2finger/routers/users.py +167 -0
  24. ear2finger-1.0.0/src/ear2finger/routers/youtube.py +242 -0
  25. ear2finger-1.0.0/src/ear2finger/services/__init__.py +1 -0
  26. ear2finger-1.0.0/src/ear2finger/services/youtube_processor.py +566 -0
  27. ear2finger-1.0.0/src/ear2finger/web/dist/assets/index-BNJPfZCv.css +1 -0
  28. ear2finger-1.0.0/src/ear2finger/web/dist/assets/index-D9cDb7wN.js +174 -0
  29. ear2finger-1.0.0/src/ear2finger/web/dist/icon.png +0 -0
  30. ear2finger-1.0.0/src/ear2finger/web/dist/index.html +14 -0
  31. ear2finger-1.0.0/src/ear2finger.egg-info/PKG-INFO +365 -0
  32. ear2finger-1.0.0/src/ear2finger.egg-info/SOURCES.txt +34 -0
  33. ear2finger-1.0.0/src/ear2finger.egg-info/dependency_links.txt +1 -0
  34. ear2finger-1.0.0/src/ear2finger.egg-info/entry_points.txt +2 -0
  35. ear2finger-1.0.0/src/ear2finger.egg-info/requires.txt +13 -0
  36. ear2finger-1.0.0/src/ear2finger.egg-info/top_level.txt +1 -0
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Hang Yin
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
@@ -0,0 +1,2 @@
+ # Ensure bundled SPA is included in sdist/wheel (package-data covers wheel; sdist needs this)
+ recursive-include src/ear2finger/web *
@@ -0,0 +1,365 @@
+ Metadata-Version: 2.4
+ Name: ear2finger
+ Version: 1.0.0
+ Summary: YouTube subtitle dictation practice — FastAPI backend and bundled React UI
+ Author: Stephen Yin
+ License-Expression: MIT
+ Project-URL: Homepage, https://github.com/stephenyin/Ear2Finger
+ Project-URL: Repository, https://github.com/stephenyin/Ear2Finger
+ Keywords: dictation,youtube,esl,fastapi,education
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Environment :: Web Environment
+ Classifier: Framework :: FastAPI
+ Classifier: Intended Audience :: Education
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Topic :: Education
+ Requires-Python: >=3.10
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: fastapi==0.109.0
+ Requires-Dist: uvicorn[standard]==0.27.0
+ Requires-Dist: pydantic==2.5.3
+ Requires-Dist: python-multipart==0.0.6
+ Requires-Dist: yt-dlp==2026.2.4
+ Requires-Dist: sqlalchemy==2.0.25
+ Requires-Dist: aiosqlite==0.19.0
+ Requires-Dist: passlib[bcrypt]==1.7.4
+ Requires-Dist: python-jose[cryptography]==3.3.0
+ Provides-Extra: dev
+ Requires-Dist: build>=1.0.0; extra == "dev"
+ Requires-Dist: twine>=5.0.0; extra == "dev"
+ Dynamic: license-file
+
+ # Ear2Finger
+
+ A locally deployable web application that helps users improve their English listening and dictation skills, **with an AI coach that analyzes your practice history and recommends what to study next**.
+
+ ## Tech Stack
+
+ ### Backend
+ - **Python 3.10+**
+ - **FastAPI** - Modern, fast web framework for building APIs
+ - **Uvicorn** - ASGI server
+ - **yt-dlp** - YouTube video and subtitle extraction
+ - **SQLAlchemy** - Database ORM
+ - **SQLite** - Database for storing videos and sentences
+ - **NLTK** - Natural language processing for sentence segmentation
+ - **Qdrant** - Vector database for storing sentence and learning-history embeddings (AI coach)
+ - **sentence-transformers** - Local embedding models for sentence and history vectors
+ - **Gemini** (via LangChain) - LLM provider powering the AI coach feedback
+
+ ### Frontend
+ - **React 18** - UI library
+ - **TypeScript** - Type-safe JavaScript
+ - **Tailwind CSS** - Utility-first CSS framework
+ - **Vite** - Fast build tool and dev server
+ - **Axios** - HTTP client
+
+ ## Project Structure
+
+ ```
+ Ear2Finger/
+ ├── pyproject.toml # Python package metadata (PyPI) + dependencies
+ ├── src/ear2finger/ # Installable application package
+ │ ├── app.py # FastAPI app (serves /api + bundled UI from web/dist when present)
+ │ ├── database.py, auth.py, … # Core modules and routers/, services/
+ │ └── web/dist/ # Production frontend build (copy from frontend/dist before releases)
+ ├── backend/
+ │ ├── main.py # Thin shim: uvicorn main:app (adds ../src to PYTHONPATH)
+ │ └── requirements.txt # Points to pyproject.toml; use pip install -e ..
+ ├── frontend/ # React frontend
+ │ ├── src/
+ │ │ ├── App.tsx # Main React component with tab navigation
+ │ │ ├── components/ # React components
+ │ │ │ ├── Workspace.tsx # Dictation workspace with per-word input + AI coach panel
+ │ │ │ ├── Dashboard.tsx # Practice dashboard with AI coach summary and tips
+ │ │ │ ├── LessonHistory.tsx # Per-lesson session history with “Ask coach” integration
+ │ │ │ └── YouTubeProcessor.tsx # YouTube video processing UI
+ │ │ ├── main.tsx # React entry point
+ │ │ └── index.css # Global styles with Tailwind
+ │ ├── package.json # Node.js dependencies
+ │ ├── vite.config.ts # Vite configuration
+ │ ├── tsconfig.json # TypeScript configuration
+ │ └── tailwind.config.js # Tailwind CSS configuration
+ └── README.md # This file
+ ```
+
+ ## Prerequisites
+
+ - **Python 3.10+** and pip
+ - **Node.js 18+** and npm (or yarn/pnpm)
+ - **FFmpeg** (required for MP3 audio conversion from YouTube videos)
+ - Install on macOS: `brew install ffmpeg`
+ - Install on Ubuntu/Debian: `sudo apt-get install ffmpeg`
+ - Install on Windows: Download from the [FFmpeg website](https://ffmpeg.org/download.html)
+
+ ## Setup Instructions
+
+ ### Backend Setup
+
+ 1. Navigate to the backend directory:
+ ```bash
+ cd backend
+ ```
+
+ 2. Create a virtual environment (recommended):
+ ```bash
+ python -m venv venv
+ ```
+
+ 3. Activate the virtual environment:
+ - On macOS/Linux:
+ ```bash
+ source venv/bin/activate
+ ```
+ - On Windows:
+ ```bash
+ venv\Scripts\activate
+ ```
+
+ 4. Install dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 5. (Recommended) Copy environment variables:
+ ```bash
+ cp .env.example .env
+ ```
+ Edit `.env` to configure:
+ - Database, Qdrant URL/API key, and embedding model
+ - Gemini API key and `GEMINI_MODEL` (required for the AI coach)
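As a rough sketch of what such a `.env` might contain (the variable names and values below are assumptions for illustration; `.env.example` in the repository is authoritative):

```bash
# Hypothetical .env sketch -- check .env.example for the real key names.
DATABASE_URL=sqlite+aiosqlite:///./ear2finger.db
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=                  # leave empty for a local, unauthenticated Qdrant
EMBEDDING_MODEL=all-MiniLM-L6-v2
GEMINI_API_KEY=your-key-here
GEMINI_MODEL=gemini-1.5-flash
```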
+
+ 6. Run the development server:
+ ```bash
+ uvicorn main:app --reload --host 0.0.0.0 --port 8000
+ ```
+
+ The API will be available at `http://localhost:8000`
+ - API documentation: `http://localhost:8000/docs` (Swagger UI)
+ - Alternative docs: `http://localhost:8000/redoc`
+
+ ### Frontend Setup
+
+ 1. Navigate to the frontend directory:
+ ```bash
+ cd frontend
+ ```
+
+ 2. Install dependencies:
+ ```bash
+ npm install
+ ```
+ (or use `yarn install` or `pnpm install`)
+
+ 3. Start the development server:
+ ```bash
+ npm run dev
+ ```
+
+ The frontend will be available at `http://localhost:3000`
+
+ ## PyPI package
+
+ The Python distribution name is **`ear2finger`** (see `pyproject.toml`). Build wheels locally as below, then publish to PyPI with Twine when you are ready.
+
+ **Install and run** (bundled UI + API on one port):
+
+ ```bash
+ pip install ear2finger
+ ear2finger --host 0.0.0.0 --port 8000
+ # Open http://127.0.0.1:8000
+ ```
+
+ **Develop from a git clone** (editable install):
+
+ ```bash
+ pip install -e .
+ uvicorn ear2finger.app:app --reload --host 0.0.0.0 --port 8000
+ ```
+
+ **Refresh the bundled UI before building a release wheel** (after `npm run build` in `frontend/`):
+
+ ```bash
+ rm -rf src/ear2finger/web/dist && cp -R frontend/dist src/ear2finger/web/dist
+ ```
+
+ **Build sdist + wheel** (from repo root, in a virtualenv):
+
+ ```bash
+ pip install build
+ python -m build
+ ```
+
+ **Upload** (Twine + PyPI credentials):
+
+ ```bash
+ pip install twine
+ twine upload dist/*
+ ```
+
+ ## Running the Application
+
+ 1. **Start the backend** (from `backend/` with venv, after `pip install -e ..`):
+ ```bash
+ uvicorn ear2finger.app:app --reload
+ ```
+ Or, without installing the package: `cd backend && uvicorn main:app --reload` (the shim loads `src/`).
+
+ 2. **Start the frontend** (from the `frontend/` directory, in a new terminal):
+ ```bash
+ npm run dev
+ ```
+
+ 3. Open your browser and navigate to `http://localhost:3000`
+
+ ## Features
+
+ ### Core Learning Flow
+ - **YouTube import**: Paste a YouTube URL and turn it into a structured dictation lesson.
+ - **Extract subtitles**: Automatically extract subtitles from YouTube videos using yt-dlp.
+ - **Download MP3 audio**: Download audio-only MP3 files from YouTube videos (requires FFmpeg).
+ - **Sentence segmentation**: Intelligently segment subtitles into individual sentences using NLTK.
+ - **Timestamp storage**: Store each sentence with precise start and end timestamps.
+ - **Database storage**: Store processed videos, sentences, audio paths, and learning events in SQLite.
+ - **Dictation workspace**: Practice sentence-by-sentence with per-word inputs, hints, and keyboard shortcuts.
+ - **Lesson playlists**: Organize imported videos into playlists and track progress per lesson.
+
+ ### **AI Coach / AI Agent (highlight)**
+
+ The AI coach is a **personalized language-learning agent** that reads your practice history and:
+
+ - **Summarizes your progress**: Explains what you are doing well and where you are struggling, based on:
+ - Per-word spelling difficulty
+ - Hint usage
+ - Error rates over time
+ - **Generates tailored advice**: Produces 3–5 concrete, numbered suggestions for what to practice next.
+ - **Recommends sentences to review**: Uses Qdrant to find sentences containing your weakest words and surfaces them as practice recommendations.
+ - **Respects your data**: Uses only your own practice stats and sentence history; embeddings and vectors are stored in your own Qdrant instance.
+
+ Where you see the AI coach in the UI:
+
+ - **Dashboard**:
+ - `Dashboard.tsx` shows an **AI Language Coach** card with lightweight tips and recommended YouTube channels.
+ - You can open a **full-screen AI coach modal** to read detailed feedback and see recommended lessons.
+ - **Workspace**:
+ - `Workspace.tsx` can automatically open an **AI coach side panel** when you finish a lesson.
+ - The panel shows a session recap and lets you request **practice recommendations** for the current video.
+ - **Lesson history**:
+ - `LessonHistory.tsx` adds an **“Ask coach”** button to each past session so you can get feedback on specific practice days.
+
+ AI coach plumbing:
+
+ - Backend endpoints:
+ - `/api/user/progress` + `/api/user/stats` aggregate fine-grained word- and sentence-level stats.
+ - `/api/ai/coach/feedback` generates natural-language feedback via Gemini.
+ - `/api/ai/coach/recommend-practice` queries Qdrant for similar sentences based on your weakest words.
+ - Vector store:
+ - `qdrant_client.py` ingests:
+ - Per-sentence learning events (`LearningProgress`) as **user learning events**.
+ - All lesson sentences as **sentence embeddings** for semantic search.
+ - Qdrant can run locally (default `http://localhost:6333`) or via Qdrant Cloud.
+ - LLM + embeddings:
+ - `ai_client_factory.py` builds:
+ - A Gemini chat model (configurable via `GEMINI_MODEL` and API key in `.env`).
+ - A local `sentence-transformers` embedding model for Qdrant.
+
+ To **enable the AI coach**, you need:
+
+ - A running **Qdrant** instance (local or cloud) reachable from the backend.
+ - A valid **Gemini API key** and model name configured in `backend/.env`.
+ - A logged-in user who has practiced at least a few sentences, so that stats and vectors exist.
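Once those pieces are in place, the coach endpoints are plain JSON over HTTP. A minimal sketch of building a request to `/api/ai/coach/feedback` (the request body shape here is hypothetical; see `/docs` for the real schema):

```python
import json
import urllib.request

# Hypothetical request body -- the actual fields are defined by the backend's
# Pydantic models; this only illustrates the JSON-over-HTTP shape.
body = json.dumps({"user_id": 1}).encode()

req = urllib.request.Request(
    "http://localhost:8000/api/ai/coach/feedback",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send it once the server is running.
print(req.get_method(), req.full_url)
```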
+
+ ### How It Works
+ 1. User submits a YouTube video URL through the web interface.
+ 2. Backend uses yt-dlp to extract video metadata and subtitles (supports both manual and auto-generated subtitles).
+ 3. Subtitles are parsed from WebVTT format and segmented into sentences.
+ 4. Each sentence is stored with its timestamp information in the database.
+ 5. Users can browse processed videos and view all sentences with timestamps.
+ 6. While practicing, per-word correctness, hints, and error characters are sent to `/api/user/progress`, aggregated by `/api/user/stats`, and ingested into Qdrant.
+ 7. The AI coach uses these stats and vectors to generate feedback and practice recommendations.
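Step 6 can be illustrated with a sketch of the kind of per-word event the frontend might post to `/api/user/progress`. The field names below are illustrative assumptions, not the package's actual schema (which lives in its Pydantic models):

```python
import json

# Hypothetical per-word progress event -- field names are illustrative only;
# the real schema is defined by the backend's Pydantic models.
event = {
    "video_id": "dQw4w9WgXcQ",     # YouTube video identifier
    "sentence_id": 42,             # sentence row in SQLite
    "word": "pronunciation",
    "correct": False,
    "hints_used": 2,
    "error_chars": ["o", "u"],     # characters the learner got wrong
}

# Serialized as JSON for the POST body.
payload = json.dumps(event)
print(payload)
```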
+
+ ## API Endpoints
+
+ ### Health
+ - `GET /api/health` - Health check endpoint
+
+ ### Dictation (Legacy)
+ - `GET /api/dictations` - Get all dictation exercises
+ - `GET /api/dictations/{id}` - Get a specific dictation exercise
+ - `POST /api/dictations` - Create a new dictation exercise
+
+ ### YouTube Processing
+ - `POST /api/youtube/process` - Process a YouTube video (extract subtitles, download MP3 audio, and segment)
+ - `GET /api/youtube/videos` - Get all processed videos
+ - `GET /api/youtube/videos/{video_id}` - Get a specific video
+ - `GET /api/youtube/videos/{video_id}/sentences` - Get all sentences for a video
+ - `GET /api/youtube/videos/{video_id}/audio` - Download the MP3 audio file for a video
+ - `DELETE /api/youtube/videos/{video_id}` - Delete a video, its sentences, and its audio file
+
+ ### Learning Progress & Stats
+ - `GET /api/user/progress` - Get raw learning progress events for the current user
+ - `POST /api/user/progress` - Upsert a learning progress event for a sentence/video
+ - `GET /api/user/stats` - Get aggregated user stats (totals, distributions, and top tricky words)
+
+ ### AI Coach / AI Agent
+ - `POST /api/ai/coach/feedback` - Generate personalized, LLM-based feedback from aggregated user stats
+ - `POST /api/ai/coach/recommend-practice` - Recommend sentences/videos to review based on weak words and Qdrant search
+
+ See the interactive API documentation at `http://localhost:8000/docs` for more details.
+
+ ## Development
+
+ ### Backend Development
+
+ - The backend uses FastAPI with automatic API documentation.
+ - Code is organized into routers for different features.
+ - Add new endpoints by creating routers in `backend/routers/`.
+ - AI coach behavior lives primarily in:
+ - `routers/learning_progress.py` (stats aggregation)
+ - `routers/ai_coach.py` (AI coach endpoints)
+ - `services/qdrant_client.py` (vector store)
+ - `services/ai_client_factory.py` (LLM + embeddings)
+
+ ### Frontend Development
+
+ - The frontend uses Vite for fast hot module replacement.
+ - TypeScript provides type safety.
+ - Tailwind CSS is configured and ready to use.
+ - Components are in `frontend/src/`, with the AI coach UI in:
+ - `components/Dashboard.tsx`
+ - `components/Workspace.tsx`
+ - `components/LessonHistory.tsx`
+
+ ## Building for Production
+
+ ### Backend
+
+ The backend can be run with uvicorn in production mode:
+ ```bash
+ uvicorn main:app --host 0.0.0.0 --port 8000
+ ```
+
+ For production, consider using a process manager such as systemd, supervisor, or Docker.
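A minimal systemd unit sketch, assuming the package is installed into a virtualenv at `/opt/ear2finger/venv` and run via the `ear2finger` console script (the paths and service user are illustrative):

```ini
# /etc/systemd/system/ear2finger.service -- illustrative paths and user.
[Unit]
Description=Ear2Finger dictation app
After=network.target

[Service]
User=ear2finger
WorkingDirectory=/opt/ear2finger
ExecStart=/opt/ear2finger/venv/bin/ear2finger --host 0.0.0.0 --port 8000
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with `sudo systemctl enable --now ear2finger`.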
348
+
349
+ ### Frontend
350
+
351
+ Build the frontend for production:
352
+ ```bash
353
+ cd frontend
354
+ npm run build
355
+ ```
356
+
357
+ The built files will be in `frontend/dist/` and can be served by any static file server or integrated with the backend.
358
+
359
+ ## License
360
+
361
+ See LICENSE file for details.
362
+
363
+ ## Contributing
364
+
365
+ Contributions are welcome! Please feel free to submit a Pull Request.