PyPI - revoxx - Versions diffs - 1.0.0.dev22__tar.gz → 1.0.2__tar.gz - Mend

revoxx 1.0.0.dev22__tar.gz → 1.0.2__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (133)

{revoxx-1.0.0.dev22 → revoxx-1.0.2}/MANIFEST.in RENAMED

@@ -1,6 +1,7 @@
 include README.md
 include LICENSE
 include requirements.txt
+recursive-include doc *.md *.png
 recursive-include revoxx/resources *
 recursive-include revoxx/resources/templates *
 global-exclude __pycache__

{revoxx-1.0.0.dev22/revoxx.egg-info → revoxx-1.0.2}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: revoxx
-Version: 1.0.0.dev22
+Version: 1.0.2
 Summary: Speech recording application for creating high-quality speech datasets
 Author-email: Grammatek ehf <info@grammatek.com>
 Maintainer-email: Grammatek ehf <info@grammatek.com>
@@ -31,15 +31,19 @@ Requires-Dist: matplotlib>=3.8.0
 Requires-Dist: sounddevice>=0.5.1
 Requires-Dist: soundfile>=0.12.0
 Requires-Dist: tqdm>=4.65.0
+Requires-Dist: markdown2>2.5.1
+Requires-Dist: tkinterweb>4.4.1
 Provides-Extra: vad
 Requires-Dist: torch>=2.0.0; extra == "vad"
 Requires-Dist: silero-vad>=5.0; extra == "vad"
+Requires-Dist: torchaudio<2.8.0; extra == "vad"
 Provides-Extra: dev
 Requires-Dist: black>=22.0.0; extra == "dev"
 Requires-Dist: isort>=5.10.0; extra == "dev"
 Requires-Dist: flake8>=6.0.0; extra == "dev"
 Requires-Dist: pytest>=7.0.0; extra == "dev"
 Requires-Dist: pytest-cov>=3.0.0; extra == "dev"
+Requires-Dist: versioningit>=2.0.0; extra == "dev"
 Dynamic: license-file
 # Revoxx - Record Voices
@@ -72,8 +76,10 @@ This repository provides **Revoxx**, a graphical recording application for recor
 ## System Requirements
 - **Operating System:** Linux/OS-X, should work on Windows
+- **Python:** 3.9 - 3.13 with Tkinter support
 - **Recording:** Audio Interface, good voice microphone and headphones
 - **Linux:** Requires PortAudio library (`sudo apt-get install portaudio19-dev` on Ubuntu/Debian)
+- **GUI:** Tkinter (usually included with Python, see installation notes below)
 ## Description
@@ -102,23 +108,57 @@ the Icelandic emotional speech dataset, and created this tool to minimize hassle
 - Recordings are organized into **Recording Sessions**
   - Record emotional sessions for each speaker or record more traditional LJSpeech-style sessions
   - Seamless transitions between different recording sessions with automatic progress tracking: continue where you left-off
-  - Offers advanced search and navigation capabilities for utterances, with flexible sorting by label, emotion, text
-    content, and recorded takes
+  - Offers advanced search and navigation capabilities for utterances, with flexible sorting and ordering by label, emotion, text
+    content, text length and recorded takes
   - Consistent audio settings & metadata for all recordings
 - **Real-time monitoring** including toggable recording levels, mel spectrograms, maximum frequency detection, and more
   - Customizable **industry-standard presets for Peak/RMS levels**
   - Dedicated **Monitoring mode** for precise input calibration
 - **Multi-Screen Support**
   - You can use multiple monitors to **separate recording view from speaker view**
-  - We support Apple's "Continuity" feature for a **convenient dual screen setup with an external iPad**
+  - We support Apple's [Sidecar](https://support.apple.com/en-us/102597) feature for a **convenient dual screen setup with an external iPad**
   - Each screen appearance can be individually configured
-  - All screen layouts, placement & configuration is preserved at exit
+  - All screen layouts, placement & configuration are preserved at exit
 - Export Dataset
   - Facilitates **batch export of multiple sessions** into T3 (Talrómur3) dataset format
   - Groups different recording sessions of the same speaker into a common dataset
+  - **Add voice timestamps, if VAD is enabled**
 ## Installation
+<details>
+<summary><b>Prerequisites</b></summary>
+### Tkinter
+Revoxx requires Tkinter for its graphical user interface. Tkinter is usually included with Python, but may need separate installation on some systems:
+**macOS**: Tkinter should be included with Python.org installers and Homebrew Python, but integration issues can occur. If you encounter problems:
+- For Homebrew Python: Try `brew install python-tk`
+- For Python.org installers: Reinstall Python with the official installer
+- Consider using a virtual environment with a fresh Python installation
+**Linux**:
+```bash
+# Ubuntu/Debian
+sudo apt-get install python3-tk
+# Fedora
+sudo dnf install python3-tkinter
+# Arch Linux
+sudo pacman -S tk
+```
+**Windows**: Tkinter is included with the standard Python installer.
+**Verify Tkinter installation**:
+```bash
+python3 -c "import tkinter; print('Tkinter is installed')"
+```
+</details>
 <details>
 <summary><b>Basic Installation</b></summary>
@@ -273,11 +313,15 @@ revoxx --show-devices            # List available audio devices
 revoxx --session path/to/session # Open specific session
 ```
+## Usage
+For a guide on using Revoxx, please see the [User Guide](https://github.com/icelandic-lt/revoxx/blob/main/revoxx/doc/USER_GUIDE.md).
 ## Prepare recordings
 Before you start recording, you need to prepare an utterance script with the utterances you want to record. This can be simplified by using the "Import Text to Script" Dialog:
-<img src="https://raw.githubusercontent.com/icelandic-lt/revoxx/main/doc/import_raw_text.png" alt="Raw text import dialog" width="30%"/>
+<img src="https://raw.githubusercontent.com/icelandic-lt/revoxx/main/doc/import_raw_text.png" alt="Raw text import dialog" width="50%"/>
 This dialog takes an input script of raw text and converts it into an utterance script. You can redo this for the same input text as many times you want, e.g. if you want to use separate emotional levels for different speakers.
@@ -297,17 +341,28 @@ For a script without emotion levels. This format was used for recording our non-
 ( <unique id> "<utterance>" )
 ```
-You can see for both formats an example in the directory [t3_scripts](t3_scripts).
+You can see for both formats an example in the directory [t3_scripts](https://github.com/icelandic-lt/revoxx/tree/main/t3_scripts).
 The emotion levels can be from any monotonic numerical value range you want. If you want to follow Talrómur 3 conventions, you can use emotion intensity levels 1-5 and 6 emotions: neutral, happy, sad, angry, surprised, and helpful.
 The emotion intensity levels are used to control the emotion intensity of the speech in combination with the specific emotion.
 Neutral speech is treated as intensity level 0 at dataset export.
-## Record dataset
+## Known Issues
-to be defined
+### macOS: System Python 3.9 Icon Loading Issue
-## Known Issues
+On macOS with the system-provided Python 3.9 (3.9.6), the application icon may fail to load with the error:
+- "couldn't recognize data in image file"
+- "Error: too many values to unpack (expected 2)"
+**Affected versions:**
+- macOS system Python 3.9.6 (default installation)
+**Solution:**
+- Use Python 3.9.23 or newer (available via Homebrew, uv or python.org)
+- Alternatively, use Python 3.10 or newer
+This issue is related to Tkinter's PNG handling in the macOS system Python 3.9.6 and does not affect newer Python versions.
 ### Linux: USB Audio Output Devices

{revoxx-1.0.0.dev22 → revoxx-1.0.2}/README.md RENAMED

@@ -28,8 +28,10 @@ This repository provides **Revoxx**, a graphical recording application for recor
 ## System Requirements
 - **Operating System:** Linux/OS-X, should work on Windows
+- **Python:** 3.9 - 3.13 with Tkinter support
 - **Recording:** Audio Interface, good voice microphone and headphones
 - **Linux:** Requires PortAudio library (`sudo apt-get install portaudio19-dev` on Ubuntu/Debian)
+- **GUI:** Tkinter (usually included with Python, see installation notes below)
 ## Description
@@ -58,23 +60,57 @@ the Icelandic emotional speech dataset, and created this tool to minimize hassle
 - Recordings are organized into **Recording Sessions**
   - Record emotional sessions for each speaker or record more traditional LJSpeech-style sessions
   - Seamless transitions between different recording sessions with automatic progress tracking: continue where you left-off
-  - Offers advanced search and navigation capabilities for utterances, with flexible sorting by label, emotion, text
-    content, and recorded takes
+  - Offers advanced search and navigation capabilities for utterances, with flexible sorting and ordering by label, emotion, text
+    content, text length and recorded takes
   - Consistent audio settings & metadata for all recordings
 - **Real-time monitoring** including toggable recording levels, mel spectrograms, maximum frequency detection, and more
   - Customizable **industry-standard presets for Peak/RMS levels**
   - Dedicated **Monitoring mode** for precise input calibration
 - **Multi-Screen Support**
   - You can use multiple monitors to **separate recording view from speaker view**
-  - We support Apple's "Continuity" feature for a **convenient dual screen setup with an external iPad**
+  - We support Apple's [Sidecar](https://support.apple.com/en-us/102597) feature for a **convenient dual screen setup with an external iPad**
   - Each screen appearance can be individually configured
-  - All screen layouts, placement & configuration is preserved at exit
+  - All screen layouts, placement & configuration are preserved at exit
 - Export Dataset
   - Facilitates **batch export of multiple sessions** into T3 (Talrómur3) dataset format
   - Groups different recording sessions of the same speaker into a common dataset
+  - **Add voice timestamps, if VAD is enabled**
 ## Installation
+<details>
+<summary><b>Prerequisites</b></summary>
+### Tkinter
+Revoxx requires Tkinter for its graphical user interface. Tkinter is usually included with Python, but may need separate installation on some systems:
+**macOS**: Tkinter should be included with Python.org installers and Homebrew Python, but integration issues can occur. If you encounter problems:
+- For Homebrew Python: Try `brew install python-tk`
+- For Python.org installers: Reinstall Python with the official installer
+- Consider using a virtual environment with a fresh Python installation
+**Linux**:
+```bash
+# Ubuntu/Debian
+sudo apt-get install python3-tk
+# Fedora
+sudo dnf install python3-tkinter
+# Arch Linux
+sudo pacman -S tk
+```
+**Windows**: Tkinter is included with the standard Python installer.
+**Verify Tkinter installation**:
+```bash
+python3 -c "import tkinter; print('Tkinter is installed')"
+```
+</details>
 <details>
 <summary><b>Basic Installation</b></summary>
@@ -229,11 +265,15 @@ revoxx --show-devices            # List available audio devices
 revoxx --session path/to/session # Open specific session
 ```
+## Usage
+For a guide on using Revoxx, please see the [User Guide](https://github.com/icelandic-lt/revoxx/blob/main/revoxx/doc/USER_GUIDE.md).
 ## Prepare recordings
 Before you start recording, you need to prepare an utterance script with the utterances you want to record. This can be simplified by using the "Import Text to Script" Dialog:
-<img src="https://raw.githubusercontent.com/icelandic-lt/revoxx/main/doc/import_raw_text.png" alt="Raw text import dialog" width="30%"/>
+<img src="https://raw.githubusercontent.com/icelandic-lt/revoxx/main/doc/import_raw_text.png" alt="Raw text import dialog" width="50%"/>
 This dialog takes an input script of raw text and converts it into an utterance script. You can redo this for the same input text as many times you want, e.g. if you want to use separate emotional levels for different speakers.
@@ -253,17 +293,28 @@ For a script without emotion levels. This format was used for recording our non-
 ( <unique id> "<utterance>" )
 ```
-You can see for both formats an example in the directory [t3_scripts](t3_scripts).
+You can see for both formats an example in the directory [t3_scripts](https://github.com/icelandic-lt/revoxx/tree/main/t3_scripts).
 The emotion levels can be from any monotonic numerical value range you want. If you want to follow Talrómur 3 conventions, you can use emotion intensity levels 1-5 and 6 emotions: neutral, happy, sad, angry, surprised, and helpful.
 The emotion intensity levels are used to control the emotion intensity of the speech in combination with the specific emotion.
 Neutral speech is treated as intensity level 0 at dataset export.
-## Record dataset
+## Known Issues
-to be defined
+### macOS: System Python 3.9 Icon Loading Issue
-## Known Issues
+On macOS with the system-provided Python 3.9 (3.9.6), the application icon may fail to load with the error:
+- "couldn't recognize data in image file"
+- "Error: too many values to unpack (expected 2)"
+**Affected versions:**
+- macOS system Python 3.9.6 (default installation)
+**Solution:**
+- Use Python 3.9.23 or newer (available via Homebrew, uv or python.org)
+- Alternatively, use Python 3.10 or newer
+This issue is related to Tkinter's PNG handling in the macOS system Python 3.9.6 and does not affect newer Python versions.
 ### Linux: USB Audio Output Devices

revoxx-1.0.2/doc/import_raw_text.png ADDED

Binary file

revoxx-1.0.2/doc/screenshot1.png ADDED

Binary file

{revoxx-1.0.0.dev22 → revoxx-1.0.2}/pyproject.toml RENAMED

@@ -1,10 +1,10 @@
 [build-system]
-requires = ["setuptools>=65.5.1", "wheel"]
+requires = ["setuptools>=65.5.1", "wheel", "versioningit>=2.0.0"]
 build-backend = "setuptools.build_meta"
 [project]
 name = "revoxx"
-version = "1.0.0.dev22"
+dynamic = ["version"]
 description = "Speech recording application for creating high-quality speech datasets"
 readme = "README.md"
 license = "Apache-2.0"
@@ -37,12 +37,15 @@ dependencies = [
     "sounddevice>=0.5.1",  # Updated for better Linux USB audio support
     "soundfile>=0.12.0",
     "tqdm>=4.65.0",
+    "markdown2>2.5.1",  # User guide dependencies
+    "tkinterweb>4.4.1"
 ]
 [project.optional-dependencies]
 vad = [
     "torch>=2.0.0",
     "silero-vad>=5.0",
+    "torchaudio<2.8.0"
 ]
 # Note: For CPU-only PyTorch (smaller download), install with:
 # pip install torch --index-url https://download.pytorch.org/whl/cpu
@@ -53,6 +56,7 @@ dev = [
     "flake8>=6.0.0",
     "pytest>=7.0.0",
     "pytest-cov>=3.0.0",
+    "versioningit>=2.0.0",  # For dynamic version detection during development
 ]
 [project.urls]
@@ -68,15 +72,17 @@ revoxx-vadiate = "scripts_module.vadiate:main"
 [tool.setuptools]
 packages = ["revoxx", "revoxx.audio", "revoxx.audio.processors", "revoxx.controllers",
-           "revoxx.dataset", "revoxx.resources", "revoxx.resources.templates",
+           "revoxx.dataset", "revoxx.doc", "revoxx.resources", "revoxx.resources.templates",
            "revoxx.session", "revoxx.ui", "revoxx.ui.dialogs", "revoxx.ui.level_meter",
            "revoxx.ui.menus", "revoxx.ui.spectrogram", "revoxx.ui.spectrogram.controllers",
            "revoxx.utils", "scripts_module"]
+include-package-data = true
 [tool.setuptools.package-data]
 revoxx = [
     "resources/*.png",
     "resources/templates/*.txt",
+    "doc/*.md",
 ]
 [tool.black]
@@ -108,4 +114,21 @@ include_trailing_comma = true
 force_grid_wrap = 0
 use_parentheses = true
 ensure_newline_before_comments = true
-line_length = 88
+line_length = 88
+[tool.versioningit]
+default-version = "1.0.0"
+[tool.versioningit.format]
+# Format used when there have been commits since the most recent tag:
+distance = "{base_version}.post{distance}+{vcs}{rev}"
+# Example formatted version: 1.2.3.post42+ge174a1f
+# Format used when there are uncommitted changes:
+dirty = "{base_version}+d{build_date:%Y%m%d}"
+# Example formatted version: 1.2.3+d20230922
+# Format used when there are both commits and uncommitted changes:
+distance-dirty = "{base_version}.post{distance}+{vcs}{rev}.d{build_date:%Y%m%d}"
+# Example formatted version: 1.2.3.post42+ge174a1f.d20230922
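
Note on the `[tool.versioningit.format]` templates above: they are written with Python `str.format` placeholders, so their expansion can be checked without a Git checkout. The sketch below is not part of the package. `FORMATS` copies the three templates from the diff, while `render` and the sample field values are illustrative assumptions chosen to match the example versions given in the comments.

```python
from datetime import datetime, timezone

# Templates copied from the [tool.versioningit.format] table in the diff.
FORMATS = {
    "distance": "{base_version}.post{distance}+{vcs}{rev}",
    "dirty": "{base_version}+d{build_date:%Y%m%d}",
    "distance-dirty": "{base_version}.post{distance}+{vcs}{rev}.d{build_date:%Y%m%d}",
}


def render(state: str, **fields) -> str:
    """Expand one template with plain str.format (hypothetical helper)."""
    # str.format ignores keyword arguments that a template does not use.
    return FORMATS[state].format(**fields)


# Sample values (assumptions) mirroring the examples in the comments above.
sample = dict(
    base_version="1.2.3",
    distance=42,
    vcs="g",
    rev="e174a1f",
    build_date=datetime(2023, 9, 22, tzinfo=timezone.utc),
)

for state in FORMATS:
    print(f"{state}: {render(state, **sample)}")
```

With these sample values the three templates expand to the version strings quoted in the comments of the diff, which suggests the comments and templates agree.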

{revoxx-1.0.0.dev22 → revoxx-1.0.2}/revoxx/__init__.py RENAMED

@@ -1,6 +1,14 @@
 """Revoxx Recorder - A tool for recording emotional speech."""
-__version__ = "1.0.0"
+try:
+    # Try to use versioningit for dynamic version detection
+    from versioningit import get_version
+    __version__ = get_version(root="../", config={})
+except (ImportError, Exception):
+    # Fallback if versioningit is not installed or fails
+    __version__ = "1.0.0+dev"
 __author__ = "Grammatek"
 # Only import main entry point to avoid circular imports
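
Note on the version fallback above: because `ImportError` is a subclass of `Exception`, the clause `except (ImportError, Exception)` behaves the same as `except Exception`, so any failure of the version lookup ends in the fallback string. The sketch below isolates that pattern. `resolve_version` and `missing_dependency` are hypothetical names for illustration only and do not call `versioningit`.

```python
from typing import Callable

FALLBACK_VERSION = "1.0.0+dev"  # fallback string taken from the diff


def resolve_version(getter: Callable[[], str], fallback: str = FALLBACK_VERSION) -> str:
    """Return getter() or the fallback if the lookup fails for any reason."""
    try:
        return getter()
    except Exception:  # also covers ImportError, which subclasses Exception
        return fallback


def missing_dependency() -> str:
    """Stand-in for a lookup whose dependency is not installed."""
    raise ImportError("versioningit is not installed")


print(resolve_version(lambda: "1.0.2"))
print(resolve_version(missing_dependency))
```

One consequence of the broad clause is that a misconfigured lookup fails silently and reports the fallback version, which may be worth keeping in mind when debugging version strings.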

{revoxx-1.0.0.dev22 → revoxx-1.0.2}/revoxx/app.py RENAMED

@@ -688,6 +688,12 @@ class Revoxx:
         # Tkinter might have changed it during setup
         self.cleanup_manager.refresh_sigint_handler()
+        # Show user guide dialog if configured
+        if self.settings_manager.get_setting("show_user_guide_at_startup", True):
+            from .ui.dialogs.user_guide_dialog import UserGuideDialog
+            UserGuideDialog(self.window.window, self.settings_manager)
         self.window.focus_window()
         self.window.window.mainloop()

{revoxx-1.0.0.dev22 → revoxx-1.0.2}/revoxx/controllers/display_controller.py RENAMED

@@ -220,6 +220,36 @@ class DisplayController:
         """Reset the level meter display."""
         self.reset_level_meters()
+    def format_take_status(self, label: str) -> str:
+        """Format the take status display string for a given label.
+        This returns current take information in the status bar.
+        Args:
+            label: The utterance label (e.g., "utterance_001")
+        Returns:
+            - Empty string if label is None or empty
+            - Just the label if no active_recordings exist
+            - Just the label if no takes exist for this utterance
+            - "label - Take X/Y" if takes exist, where X is the position of the
+              current take in the list and Y is the total number of takes
+        """
+        if not label:
+            return ""
+        if not self.app.active_recordings:
+            return label
+        current_take = self.app.state.recording.get_current_take(label)
+        existing_takes = self.app.active_recordings.get_existing_takes(label)
+        if existing_takes and current_take in existing_takes:
+            position = existing_takes.index(current_take) + 1
+            return f"{label} - Take {position}/{len(existing_takes)}"
+        return label
     def set_status(self, status: str, msg_type: MsgType = MsgType.TEMPORARY) -> None:
         """Set the status bar text.

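Note on `format_take_status` in `display_controller.py`: the status text depends only on the label, the current take number, and the list of existing takes. The standalone sketch below restates that logic as a pure function so the cases in the docstring can be checked in isolation. Its signature is an assumption for illustration; the method in the diff reads these values from `self.app` instead of taking them as arguments.

```python
from typing import Optional, Sequence


def format_take_status(
    label: Optional[str],
    current_take: int,
    existing_takes: Optional[Sequence[int]],
) -> str:
    """Pure-function restatement of the status formatting (illustrative)."""
    if not label:
        return ""
    if existing_takes and current_take in existing_takes:
        # Position is 1-based within the list of takes, not the take number.
        position = list(existing_takes).index(current_take) + 1
        return f"{label} - Take {position}/{len(existing_takes)}"
    return label


print(format_take_status("utterance_001", 3, [1, 3, 7]))
print(format_take_status("utterance_001", 0, []))
```

Note that "Take X/Y" reports the position in the list of takes rather than the take number itself: take 3 in the list `[1, 3, 7]` is shown as take 2 of 3.
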
{revoxx-1.0.0.dev22 → revoxx-1.0.2}/revoxx/controllers/navigation_controller.py RENAMED

@@ -2,7 +2,7 @@
 from typing import TYPE_CHECKING
-from ..constants import FileConstants
+from ..constants import FileConstants, MsgType
 if TYPE_CHECKING:
     from ..app import Revoxx
@@ -134,10 +134,6 @@ class NavigationController:
             # Update info overlay if visible
             if self.app.window.info_panel_visible:
                 self.app.display_controller.update_info_panel()
-        else:
-            # No more takes in that direction
-            direction_text = "forward" if direction > 0 else "backward"
-            self.app.display_controller.set_status(f"No more takes {direction_text}")
     def find_utterance(self, index: int) -> None:
         """Navigate directly to a specific utterance by index.
@@ -252,15 +248,8 @@ class NavigationController:
         if not current_label:
             return
-        current_take = self.app.state.recording.get_current_take(current_label)
-        if not self.app.active_recordings:
-            existing_takes = []
-        else:
-            existing_takes = self.app.active_recordings.get_existing_takes(
-                current_label
-            )
         # Update label with filename if we have a recording
+        current_take = self.app.state.recording.get_current_take(current_label)
         if current_take > 0:
             filename = f"take_{current_take:03d}{FileConstants.AUDIO_FILE_EXTENSION}"
             self.app.window.update_label_with_filename(current_label, filename)
@@ -277,18 +266,8 @@ class NavigationController:
                 if second:
                     second.update_label_with_filename(current_label)
-        if existing_takes and current_take in existing_takes:
-            # Find position in the list
-            position = existing_takes.index(current_take) + 1
-            total = len(existing_takes)
-            self.app.display_controller.set_status(
-                f"{current_label} - Take {position}/{total}"
-            )
-        elif not existing_takes:
-            # Show label even without recordings
-            self.app.display_controller.set_status(f"{current_label}")
-        else:
-            self.app.display_controller.set_status(f"{current_label}")
+        status_text = self.app.display_controller.format_take_status(current_label)
+        self.app.display_controller.set_status(status_text, MsgType.DEFAULT)
     def after_recording_saved(self, label: str) -> None:
         """Called after a recording has been saved to disk.

{revoxx-1.0.0.dev22 → revoxx-1.0.2}/revoxx/controllers/process_manager.py RENAMED

@@ -77,6 +77,9 @@ class ProcessManager:
         self.set_audio_queue_active(False)
         self.set_save_path(None)
+        # Check for VAD availability
+        self._check_vad_availability()
     def start_processes(self) -> None:
         """Start background recording and playback processes."""
         if self.app.debug:
@@ -322,3 +325,34 @@ class ProcessManager:
             and self.playback_process is not None
             and self.playback_process.is_alive()
         )
+    def _check_vad_availability(self) -> None:
+        """Check if VAD support is available and store in manager_dict."""
+        try:
+            # Try to import the VAD module from scripts_module
+            from scripts_module import vadiate  # noqa: F401
+            from silero_vad import load_silero_vad  # noqa: F401
+            vad_available = True
+            if self.app.debug:
+                print("[ProcessManager] VAD support is available")
+        except ImportError:
+            vad_available = False
+            if self.app.debug:
+                print("[ProcessManager] VAD support is not available")
+        if self.manager_dict is not None:
+            self.manager_dict["vad_available"] = vad_available
+    def is_vad_available(self) -> bool:
+        """Check if VAD support is available.
+        Returns:
+            True if VAD is available
+        """
+        if self.manager_dict:
+            try:
+                return self.manager_dict.get("vad_available", False)
+            except (AttributeError, KeyError):
+                return False
+        return False
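
Note on `_check_vad_availability` above: the check imports the optional modules and treats `ImportError` as "not available". A lighter variant, shown here only as a sketch, asks the import system whether the modules can be located without executing them. `modules_available` is a hypothetical helper and is not part of revoxx. Unlike the import-based check in the diff, it does not detect modules that are installed but fail while importing.

```python
from importlib.util import find_spec


def modules_available(*names: str) -> bool:
    """Return True if every top-level module name can be located."""
    # find_spec returns None for a top-level module that is not installed.
    return all(find_spec(name) is not None for name in names)


# Module names taken from the imports in the diff.
vad_available = modules_available("scripts_module", "silero_vad")
print(f"VAD support available: {vad_available}")
```

The trade-off is speed against certainty: locating a module is cheap, while importing it, as the diff does, also confirms that its own dependencies load.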

{revoxx-1.0.0.dev22 → revoxx-1.0.2}/revoxx/controllers/session_controller.py RENAMED

@@ -147,10 +147,7 @@ class SessionController:
         self.reload_script_and_recordings()
         # Then apply saved sort settings from session (after data is loaded)
-        if session:
-            self.app.active_recordings.set_sort(
-                session.sort_column, session.sort_reverse
-            )
+        self.app.active_recordings.set_sort(session.sort_column, session.sort_reverse)
         self.app.window.window.title(f"Revoxx - {session.name}")
         self.app.menu.update_recent_sessions()
