supervoxtral 0.1.2 (tar.gz) → 0.1.4 (tar.gz)

Files changed (27)

{supervoxtral-0.1.2 → supervoxtral-0.1.4}/AGENTS.md RENAMED

@@ -13,7 +13,7 @@ supervoxtral/
 │   │   ├── audio.py               # Recording, ffmpeg detection/conversion
 │   │   ├── config.py              # Structured Config dataclasses, loading, resolution, logging setup
 │   │   ├── pipeline.py            # Centralized RecordingPipeline for CLI/GUI unification
-│   │   ├── prompt.py              # Prompt resolution (via Config)
+│   │   ├── prompt.py              # Prompt resolution (supports multiple prompts via Config dict, key-based)
 │   │   └── storage.py             # Save transcripts and raw JSON (conditional on keep_transcript_files)
 │   ├── providers/                 # API integrations
 │   │   ├── __init__.py            # Provider registry (get_provider with Config support)
@@ -33,10 +33,10 @@ supervoxtral/
 ## Typical Execution Flow
 - **Entry**: `svx/cli.py` Typer `record` command parses args (e.g., --prompt, --save-all, --gui, --transcribe).
-- **Config & Prompt**: Load `Config` via `Config.load()` (`core/config.py`); if transcribe_mode, skip prompt resolution; else resolve prompt with `cfg.resolve_prompt()` (`core/prompt.py`).
-- **Pipeline**: Run `RecordingPipeline` (`core/pipeline.py`): record WAV/stop (`core/audio.py`), optional conversion (ffmpeg), get provider/init (`providers/__init__.py`, e.g., `mistral.py` from `cfg`); if transcribe_mode (CLI only): no prompt, model override to voxtral-mini-latest (with warning if changed), pass transcribe_mode to provider.transcribe; for GUI: --transcribe ignored (warning), recording starts immediately, uses modular record()/process()/clean() with dynamic mode (Transcribe: no prompt, model override; Prompt: resolved prompt); transcribe, conditional save (`core/storage.py` based on `keep_*`/`save_all`), clipboard copy, logging setup.
+- **Config & Prompt**: Load `Config` via `Config.load()` (`core/config.py`); supports dict of prompts in config.toml (e.g., [prompt.default], [prompt.other]); if transcribe_mode, skip prompt resolution; else resolve prompt with `cfg.resolve_prompt(key="default" for CLI, or selected key for GUI)` (`core/prompt.py`).
+- **Pipeline**: Run `RecordingPipeline` (`core/pipeline.py`): record WAV/stop (`core/audio.py`), optional conversion (ffmpeg), get provider/init (`providers/__init__.py`, e.g., `mistral.py` from `cfg`); if transcribe_mode (CLI only): no prompt, model override to voxtral-mini-latest (with warning if changed), pass transcribe_mode to provider.transcribe; for GUI: --transcribe ignored (warning), recording starts immediately, uses modular record()/process()/clean() with dynamic mode (Transcribe: no prompt, model override; Prompt key: resolved prompt for selected key); transcribe, conditional save (`core/storage.py` based on `keep_*`/`save_all`), clipboard copy, logging setup.
 - **Cleanup**: Temp files auto-deleted (tempfile) if `keep_*=false`; dirs created only if persistence enabled.
-- **End**: Return `{"text": str, "raw": dict, "duration": float, "paths": dict}`; CLI prints result, GUI emits progress/updates via callback (buttons: 'Transcribe' for stop/transcribe without prompt; 'Prompt' for stop/use resolved prompt; default 'Prompt' on Esc/close).
+- **End**: Return `{"text": str, "raw": dict, "duration": float, "paths": dict}`; CLI prints result (uses "default" prompt), GUI emits progress/updates via callback (buttons: 'Transcribe' for no prompt; capitalized prompt keys (e.g., 'Default', 'Test') for selected prompt; 'Cancel'; Esc/close cancels).
 ## Build & test
 ```bash

{supervoxtral-0.1.2 → supervoxtral-0.1.4}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: supervoxtral
-Version: 0.1.2
+Version: 0.1.4
 Summary: CLI/GUI audio recorder and transcription client using Mistral Voxtral (chat with audio and transcription).
 License: MIT
 License-File: LICENSE

{supervoxtral-0.1.2 → supervoxtral-0.1.4}/README.md RENAMED

@@ -22,16 +22,17 @@ The GUI is minimal, launches fast, and can be bound to a system hotkey. Upon sto
 The package is available on PyPI. We recommend using `uv` (a fast Python package installer) for a simple, global tool installation—no virtual environment setup required.
-- For core CLI functionality:
-  ```
-  uv tool install supervoxtral
-  ```
 - For GUI support (includes PySide6):
   ```
   uv tool install "supervoxtral[gui]"
   ```
+- For core CLI only functionality:
+  ```
+  uv tool install supervoxtral
+  ```
 This installs the `svx` command globally. If you don't have `uv`, install it first via `curl -LsSf https://astral.sh/uv/install.sh | sh` (or from https://docs.astral.sh/uv/getting-started/installation/).
 **Alternative: Using pip with a virtual environment**
@@ -53,13 +54,10 @@ If you prefer not to use uv, you can install via pip in a virtual environment:
      ```
 2. Install the package:
-   ```
-   pip install supervoxtral
-   ```
    For GUI support (includes PySide6):
    ```
-   pip install supervoxtral[gui]
+   pip install "supervoxtral[gui]"
    ```
 This installs the `svx` command within the virtual environment. Make sure to activate the environment before running `svx`.
@@ -86,7 +84,7 @@ To get started quickly with SuperVoxtral:
    ```
 3. Launch the GUI: `svx record --gui`
-   This opens the minimal GUI, starts recording immediately; click 'Transcribe' for pure transcription (no prompt) or 'Prompt' for prompted transcription (resolved prompt); --transcribe ignored with warning (results copied to clipboard).
+   This opens the minimal GUI, starts recording immediately; click 'Transcribe' for pure transcription (no prompt) or a button for each configured prompt (e.g., 'Default', 'Mail', 'Translate') for prompted transcription using the selected prompt; --transcribe ignored with warning (results copied to clipboard).
 ### macOS Shortcuts Integration
@@ -173,13 +171,18 @@ copy = true
 # Log level: "DEBUG" | "INFO" | "WARNING" | "ERROR"
 log_level = "INFO"
-[prompt]
+[prompt.default]
 # Default user prompt source:
 # - Option 1: Use a file (recommended)
 file = "~/.config/supervoxtral/prompt/user.md"
 #
 # - Option 2: Inline prompt (less recommended for long text)
 # text = "Please transcribe the audio and provide a concise summary in French."
+[prompt.test]
+# Example additional prompt
+# file = "/path/to/another_prompt.md"
+# text = "Summarize the meeting in bullet points."
 ```
 **Configuration is centralized via a structured `Config` object loaded from your user configuration file (`config.toml`). CLI arguments override select values (e.g., prompt, log level), but most defaults (provider, model, keep flags) come from `config.toml`. No environment variables are used for API keys or settings.**
@@ -221,13 +224,17 @@ svx record [OPTIONS]
   - Interactive mode: recording starts immediately; click 'Transcribe' (pure transcription, no prompt) or 'Prompt' (with resolved prompt); --transcribe ignored with warning. GUI respects config.toml and CLI flags (e.g., `--gui --save-all`).
 **Prompt Resolution Priority** (for non-transcribe mode):
+By default in CLI, uses the 'default' prompt from config.toml [prompt.default]. For overrides:
 1. CLI `--user-prompt` or `--user-prompt-file`
-2. config.toml [prompt] section (text or file)
-3. User prompt file (user.md in config dir)
-4. Fallback: "What's in this audio?"
+2. Specified prompt key (future: via --prompt-key; currently implicit 'default')
+3. config.toml [prompt.default] (text or file)
+4. User prompt file (user.md in config dir)
+5. Fallback: "What's in this audio?"
 ## Changelog
+- 0.1.4: Support for multiple prompts in config.toml with dynamic GUI buttons for each prompt key
+- 0.1.3: Minor style update
 - 0.1.2: Interactive mode in GUI (choose transcribe / prompt / cancel while recording)
 - 0.1.1: Minor updates to default config and default prompt

{supervoxtral-0.1.2 → supervoxtral-0.1.4}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "supervoxtral"
-version = "0.1.2"
+version = "0.1.4"
 description = "CLI/GUI audio recorder and transcription client using Mistral Voxtral (chat with audio and transcription)."
 requires-python = ">=3.11"
 license = { text = "MIT" }

supervoxtral-0.1.4/supervoxtral.gif ADDED

Binary file

{supervoxtral-0.1.2 → supervoxtral-0.1.4}/svx/cli.py RENAMED

@@ -72,7 +72,7 @@ def config_show() -> None:
     user_prompt_file = cfg.user_prompt_dir / "user.md"
     defaults_section = asdict(cfg.defaults)
-    prompt_section = asdict(cfg.prompt)
+    prompt_section = {k: asdict(e) for k, e in cfg.prompt.prompts.items()}
     # Resolve prompt source (same logic as record command, but read-only)
     resolved_prompt = cfg.resolve_prompt(None, None)

{supervoxtral-0.1.2 → supervoxtral-0.1.4}/svx/core/config.py RENAMED

@@ -242,13 +242,15 @@ def init_user_config(force: bool = False, prompt_file: Path | None = None) -> Pa
         "copy = true\n\n"
         '# Log level: "DEBUG" | "INFO" | "WARNING" | "ERROR"\n'
         'log_level = "INFO"\n\n'
-        "[prompt]\n"
+        "[prompt.default]\n"
         "# Default user prompt source:\n"
         "# - Option 1: Use a file (recommended)\n"
         f'file = "{str(prompt_file)}"\n'
         "#\n"
         "# - Option 2: Inline prompt (less recommended for long text)\n"
         '# text = "Please transcribe the audio and provide a concise summary in French."\n'
+        "#\n"
+        "# For multiple prompts in future, add [prompt.other] sections.\n"
     )
     if not USER_CONFIG_FILE.exists() or force:
@@ -282,11 +284,16 @@ class DefaultsConfig:
 @dataclass
-class PromptConfig:
+class PromptEntry:
     text: str | None = None
     file: str | None = None
+@dataclass
+class PromptConfig:
+    prompts: dict[str, PromptEntry] = field(default_factory=lambda: {"default": PromptEntry()})
 @dataclass
 class Config:
     providers: dict[str, ProviderConfig] = field(default_factory=dict)
@@ -356,11 +363,39 @@ class Config:
                 providers_data[name] = ProviderConfig(api_key=api_key)
         # Prompt
         prompt_raw = user_config.get("prompt", {})
-        prompt_data = {
-            "text": prompt_raw.get("text") if isinstance(prompt_raw.get("text"), str) else None,
-            "file": prompt_raw.get("file") if isinstance(prompt_raw.get("file"), str) else None,
-        }
-        prompt = PromptConfig(**prompt_data)
+        prompts_data: dict[str, PromptEntry] = {}
+        if isinstance(prompt_raw, dict):
+            if any(k in prompt_raw for k in ["text", "file"]):  # old flat style
+                logging.warning(
+                    "Old [prompt] format detected in %s; "
+                    "please migrate to [prompt.default] manually.",
+                    USER_CONFIG_FILE,
+                )
+                entry = PromptEntry(
+                    text=prompt_raw.get("text")
+                    if isinstance(prompt_raw.get("text"), str)
+                    else None,
+                    file=prompt_raw.get("file")
+                    if isinstance(prompt_raw.get("file"), str)
+                    else None,
+                )
+                prompts_data["default"] = entry
+            else:  # new nested style
+                for key, entry_raw in prompt_raw.items():
+                    if isinstance(entry_raw, dict):
+                        entry = PromptEntry(
+                            text=entry_raw.get("text")
+                            if isinstance(entry_raw.get("text"), str)
+                            else None,
+                            file=entry_raw.get("file")
+                            if isinstance(entry_raw.get("file"), str)
+                            else None,
+                        )
+                        prompts_data[key] = entry
+        # Ensure "default" always exists
+        if "default" not in prompts_data:
+            prompts_data["default"] = PromptEntry()
+        prompt = PromptConfig(prompts=prompts_data)
         data = {
             "defaults": defaults,
             "providers": providers_data,
@@ -376,7 +411,7 @@ class Config:
     def resolve_prompt(self, inline: str | None = None, file_path: Path | None = None) -> str:
         from svx.core.prompt import resolve_user_prompt
-        return resolve_user_prompt(self, inline, file_path, self.user_prompt_dir)
+        return resolve_user_prompt(self, inline, file_path, self.user_prompt_dir, key="default")
     def get_provider_config(self, name: str) -> dict[str, Any]:
         return asdict(self.providers.get(name, ProviderConfig()))

{supervoxtral-0.1.2 → supervoxtral-0.1.4}/svx/core/prompt.py RENAMED

@@ -14,7 +14,7 @@ from __future__ import annotations
 import logging
 from pathlib import Path
-from .config import USER_PROMPT_DIR, Config
+from .config import USER_PROMPT_DIR, Config, PromptEntry
 __all__ = [
     "read_text_file",
@@ -68,16 +68,16 @@ def resolve_user_prompt(
     inline: str | None = None,
     file: Path | None = None,
     user_prompt_dir: Path | None = None,
+    key: str | None = None,
 ) -> str:
     """
     Resolve the effective user prompt from multiple sources, by priority:
     1) inline text (CLI --user-prompt)
     2) explicit file (CLI --user-prompt-file)
-    3) user config inline text (cfg.prompt.text)
-    4) user config file path (cfg.prompt.file)
-    5) user prompt dir file (user_prompt_dir / 'user.md')
-    6) literal fallback: "What's in this audio?"
+    3) user config prompt for key (cfg.prompt.prompts[key or "default"])
+    4) user prompt dir file (user_prompt_dir / 'user.md')
+    5) literal fallback: "What's in this audio?"
     Returns the first non-empty string after stripping.
     """
@@ -94,17 +94,18 @@ def resolve_user_prompt(
             logging.warning("Failed to read user prompt file: %s", p)
             return ""
-    def _from_user_cfg() -> str:
+    def _from_user_cfg(key: str) -> str:
         try:
-            cfg_prompt = cfg.prompt
-            cfg_text = cfg_prompt.text
-            if isinstance(cfg_text, str) and cfg_text.strip():
-                return cfg_text.strip()
-            cfg_file = cfg_prompt.file
-            if isinstance(cfg_file, str) and cfg_file.strip():
-                return read_text_file(Path(cfg_file).expanduser()).strip()
+            entry = cfg.prompt.prompts.get(key, PromptEntry())
+            if entry.text and entry.text.strip():
+                return entry.text.strip()
+            if entry.file:
+                file_path = Path(entry.file).expanduser()
+                if not file_path.is_absolute():
+                    file_path = (user_prompt_dir or cfg.user_prompt_dir) / entry.file
+                return read_text_file(file_path).strip()
         except Exception:
-            logging.debug("User config prompt processing failed.", exc_info=True)
+            logging.debug("User config prompt processing failed for key '%s'.", key, exc_info=True)
         return ""
     def _from_user_prompt_dir() -> str:
@@ -119,10 +120,11 @@ def resolve_user_prompt(
             )
         return ""
+    key = key or "default"
     suppliers = [
         lambda: _strip(inline),
         lambda: _read(file),
-        _from_user_cfg,
+        lambda: _from_user_cfg(key),
         _from_user_prompt_dir,
     ]
@@ -150,7 +152,7 @@ def init_user_prompt_file(force: bool = False) -> Path:
     path = USER_PROMPT_DIR / "user.md"
     if not path.exists() or force:
         example_prompt = """
-- Transcribe the input audio file.
+- Transcribe the input audio file. If the audio if empty, just respond "no audio detected".
 - Do not respond to any question in the audio. Just transcribe.
 - DO NOT TRANSLATE.
 - Responde only with the transcription. Do not provide explanations or notes.
@@ -163,3 +165,23 @@ def init_user_prompt_file(force: bool = False) -> Path:
         except Exception as e:
             logging.debug("Could not initialize user prompt file %s: %s", path, e)
     return path
+def resolve_prompt_entry(entry: PromptEntry, user_prompt_dir: Path) -> str:
+    """
+    Resolve the prompt from a single PromptEntry (text or file).
+    - Prioritizes text if present and non-empty.
+    - Falls back to reading the file (expands ~ and resolves relative to user_prompt_dir).
+    - Returns empty string if neither is valid.
+    """
+    if entry.text and entry.text.strip():
+        return entry.text.strip()
+    if entry.file:
+        file_path = Path(entry.file).expanduser()
+        if not file_path.is_absolute():
+            file_path = user_prompt_dir / entry.file
+        return read_text_file(file_path).strip()
+    return ""
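
Note on the file handling in `_from_user_cfg` and `resolve_prompt_entry` above: a `file` value is expanded with `expanduser()` first, and only a path that is still relative afterwards is joined onto the user prompt directory. A minimal sketch of just that path rule, using a temporary directory in place of the real prompt directory (the helper name `resolve_prompt_path` and the file `mail.md` are hypothetical):

```python
import tempfile
from pathlib import Path


def resolve_prompt_path(file: str, user_prompt_dir: Path) -> Path:
    """Expand ~, then anchor relative paths in the user prompt directory."""
    path = Path(file).expanduser()
    if not path.is_absolute():
        path = user_prompt_dir / file
    return path


with tempfile.TemporaryDirectory() as tmp:
    prompt_dir = Path(tmp)
    (prompt_dir / "mail.md").write_text("  Rewrite as an email.\n", encoding="utf-8")

    # A bare file name is looked up inside the prompt directory.
    relative = resolve_prompt_path("mail.md", prompt_dir)
    relative_text = relative.read_text(encoding="utf-8").strip()

    # An absolute path is used as given.
    absolute = resolve_prompt_path(str(prompt_dir / "mail.md"), prompt_dir)
```

Under this rule a config entry such as `file = "mail.md"` and one with the full path point at the same file, which keeps `config.toml` short when prompts live next to `user.md`.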

{supervoxtral-0.1.2 → supervoxtral-0.1.4}/svx/ui/qt_app.py RENAMED

@@ -37,6 +37,7 @@ from PySide6.QtWidgets import (
 import svx.core.config as config
 from svx.core.config import Config
 from svx.core.pipeline import RecordingPipeline
+from svx.core.prompt import resolve_user_prompt
 __all__ = ["RecorderWindow", "run_gui"]
@@ -65,32 +66,32 @@ QLabel#info_label {
 /* Stop button */
 QPushButton {
-    background-color: #1f6feb;
+    background-color: #1e40af;
     color: #ffffff;
     border: none;
-    border-radius: 6px;
-    padding: 8px 14px;
+    border-radius: 2px;
+    padding: 4px 8px;
     margin: 6px;
-    min-width: 80px;
+    min-width: 60px;
 }
 QPushButton:disabled {
-    background-color: #274a7a;
-    color: #9fb8e6;
+    background-color: #374151;
+    color: #9ca3af;
 }
 QPushButton:hover {
-    background-color: #2a78ff;
+    background-color: #1d4ed8;
 }
 /* Cancel button */
 QPushButton#cancel_btn {
-    background-color: #da3633;
+    background-color: #b91c1c;
 }
 QPushButton#cancel_btn:hover {
-    background-color: #f85149;
+    background-color: #ef4444;
 }
 QPushButton#cancel_btn:disabled {
-    background-color: #8b0000;
-    color: #9fb8e6;
+    background-color: #4b5563;
+    color: #9ca3af;
 }
 /* Small window border effect (subtle) */
@@ -239,11 +240,11 @@ class RecorderWorker(QObject):
         self.cancel_requested = True
         self._stop_event.set()
-    def _resolve_user_prompt(self) -> str:
+    def _resolve_user_prompt(self, key: str) -> str:
         """
-        Determine the final user prompt using the shared resolver.
+        Determine the final user prompt using the shared resolver for the given key.
         """
-        return self.cfg.resolve_prompt(self.user_prompt, self.user_prompt_file)
+        return resolve_user_prompt(self.cfg, None, None, self.cfg.user_prompt_dir, key=key)
     def run(self) -> None:
         """
@@ -275,7 +276,7 @@ class RecorderWorker(QObject):
             while self.mode is None:
                 time.sleep(0.05)
             transcribe_mode = self.mode == "transcribe"
-            user_prompt = None if transcribe_mode else self._resolve_user_prompt()
+            user_prompt = None if transcribe_mode else self._resolve_user_prompt(self.mode)
             result = pipeline.process(wav_path, duration, transcribe_mode, user_prompt)
             keep_audio = self.save_all or self.cfg.defaults.keep_audio_files
             pipeline.clean(wav_path, result["paths"], keep_audio)
@@ -310,6 +311,7 @@ class RecorderWindow(QWidget):
         self.user_prompt_file = user_prompt_file
         self.save_all = save_all
         self.outfile_prefix = outfile_prefix
+        self.prompt_keys = sorted(self.cfg.prompt.prompts.keys())
         # Background worker (create early for signal connections)
         self._worker = RecorderWorker(
@@ -381,12 +383,15 @@ class RecorderWindow(QWidget):
         button_layout.addStretch()
         self._transcribe_btn = QPushButton("Transcribe")
         self._transcribe_btn.setToolTip("Stop and transcribe without prompt")
-        self._transcribe_btn.clicked.connect(lambda: self._on_button_clicked("transcribe"))
+        self._transcribe_btn.clicked.connect(lambda: self._on_mode_selected("transcribe"))
         button_layout.addWidget(self._transcribe_btn)
-        self._prompt_btn = QPushButton("Prompt")
-        self._prompt_btn.setToolTip("Stop and transcribe with prompt")
-        self._prompt_btn.clicked.connect(lambda: self._on_button_clicked("prompt"))
-        button_layout.addWidget(self._prompt_btn)
+        self._prompt_buttons: dict[str, QPushButton] = {}
+        for key in self.prompt_keys:
+            btn = QPushButton(key.capitalize())
+            btn.setToolTip(f"Stop and transcribe with '{key}' prompt")
+            btn.clicked.connect(lambda k=key: self._on_mode_selected(k))
+            self._prompt_buttons[key] = btn
+            button_layout.addWidget(btn)
         self._cancel_btn = QPushButton("Cancel")
         self._cancel_btn.setObjectName("cancel_btn")
         self._cancel_btn.setToolTip("Stop recording and quit without processing")
@@ -397,6 +402,8 @@ class RecorderWindow(QWidget):
         button_widget.setLayout(button_layout)
         layout.addWidget(button_widget, 0, Qt.AlignmentFlag.AlignCenter)
+        self._action_buttons = [self._transcribe_btn] + list(self._prompt_buttons.values())
         # Keyboard shortcut: Esc to stop
         stop_action = QAction(self)
         stop_action.setShortcut(QKeySequence.StandardKey.Cancel)  # Esc
@@ -456,17 +463,17 @@ class RecorderWindow(QWidget):
         self._worker.cancel()
         super().closeEvent(event)
-    def _on_button_clicked(self, mode: str) -> None:
-        self._transcribe_btn.setEnabled(False)
-        self._prompt_btn.setEnabled(False)
+    def _on_mode_selected(self, mode: str) -> None:
+        for btn in self._action_buttons:
+            btn.setEnabled(False)
         self._cancel_btn.setEnabled(False)
         self._status_label.setText("Stopping and processing...")
         self._worker.set_mode(mode)
         self._worker.stop()
     def _on_cancel_clicked(self) -> None:
-        self._transcribe_btn.setEnabled(False)
-        self._prompt_btn.setEnabled(False)
+        for btn in self._action_buttons:
+            btn.setEnabled(False)
         self._cancel_btn.setEnabled(False)
         self._status_label.setText("Canceling...")
         self._worker.cancel()
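
Note on the button loop above: the labels come from `key.capitalize()` over the sorted prompt keys, and the `lambda k=key:` form binds each key at definition time. The sketch below shows both points in plain Python, without Qt. The key names and the `on_mode_selected` stand-in are hypothetical. Qt's `clicked` signal also carries a `checked` boolean; this sketch does not model how the binding maps that argument onto the lambda's parameters.

```python
from collections.abc import Callable

prompt_keys = sorted(["test", "default", "mail"])  # hypothetical config keys
labels = [key.capitalize() for key in prompt_keys]

selected: list[str] = []


def on_mode_selected(mode: str) -> None:
    """Stand-in for the window's handler; records which key was chosen."""
    selected.append(mode)


late_bound: list[Callable[[], None]] = []
early_bound: list[Callable[[], None]] = []
for key in prompt_keys:
    # Closes over the loop variable itself: every handler sees its final value.
    late_bound.append(lambda: on_mode_selected(key))
    # Default argument is evaluated now: each handler keeps its own key.
    early_bound.append(lambda k=key: on_mode_selected(k))

for handler in late_bound:
    handler()
late_result = selected.copy()

selected.clear()
for handler in early_bound:
    handler()
early_result = selected.copy()
```

Note that `str.capitalize()` also lowercases the rest of the string, so a key such as `myPrompt` would be labeled `Myprompt`.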