
fow-cli 0.2.0.tar.gz → 0.3.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (51)

{fow_cli-0.2.0 → fow_cli-0.3.0}/CHANGELOG.md RENAMED

@@ -2,6 +2,15 @@
 All notable changes to Fly on the Wall are documented here.
+## [0.3.0] - 2026-06-13
+### Added
+- Added glossary management with `fow glossary` commands.
+- Added glossary and known-person hints for ElevenLabs transcription keyterms.
+- Added glossary guidance to OpenAI cleanup, analysis, and title generation.
+- Added Obsidian `participants` frontmatter links for known meeting speakers.
 ## [0.2.0] - 2026-06-09
 ### Added

{fow_cli-0.2.0 → fow_cli-0.3.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: fow-cli
-Version: 0.2.0
+Version: 0.3.0
 Summary: Personal CLI note-taker for turning meeting audio into cleaned meeting manuscripts.
 Project-URL: Repository, https://github.com/henriksvensson/fly-on-the-wall
 License-Expression: MIT
@@ -50,6 +50,8 @@ Issues and suggestions are welcome via GitHub Issues, but the project is provide
 Audio is sent to configured transcription/AI providers during processing. Optional speaker identity embeddings run locally when installed with the `identity` extra. External providers may charge usage-based fees depending on your provider account, pricing plan, and processing volume.
+Glossary/keyterm hints are sent to ElevenLabs when processing new recordings. ElevenLabs currently documents this as a billable add-on to speech-to-text usage.
 ## Development Transparency
 This project was developed as an agentic coding project using [OpenCode](https://opencode.ai/) with [OpenAI](https://openai.com/) GPT-5.5. Code quality checks were supported by CodeScene's [CodeHealth](https://codescene.com/product/code-health) analysis.
@@ -271,6 +273,32 @@ fow people embeddings status
 fow people embeddings backfill
 ```
+## Glossary
+Use the glossary for names, company names, project names, product names, acronyms, and domain-specific phrases that transcription or cleanup models may spell incorrectly.
+Add terms with optional context:
+```bash
+fow glossary add "Hejare" --description "Company name"
+fow glossary add "Datadrivna" --description "The phrase data driven in Swedish"
+fow glossary add "Ants" --description "Company name"
+fow glossary add "TT" --description "Company name, short for Theodora Tech"
+```
+Manage terms:
+```bash
+fow glossary list
+fow glossary show "Hejare"
+fow glossary update "TT" --description "Company name, short for Theodora Tech"
+fow glossary disable "Ants"
+fow glossary enable "Ants"
+fow glossary remove "Ants"
+```
+During processing, `fow` combines enabled glossary terms with known people names. The combined list is sent to ElevenLabs as transcription keyterms for new transcriptions, and to OpenAI cleanup, analysis, and title generation as spelling/context guidance. Corrections are model-mediated; `fow` does not do deterministic search-and-replace from the glossary.
 ## Watched Folders
 Fly on the Wall can watch local folders, mounted Dropbox/rclone folders, and removable recorder folders.

{fow_cli-0.2.0 → fow_cli-0.3.0}/README.md RENAMED

@@ -20,6 +20,8 @@ Issues and suggestions are welcome via GitHub Issues, but the project is provide
 Audio is sent to configured transcription/AI providers during processing. Optional speaker identity embeddings run locally when installed with the `identity` extra. External providers may charge usage-based fees depending on your provider account, pricing plan, and processing volume.
+Glossary/keyterm hints are sent to ElevenLabs when processing new recordings. ElevenLabs currently documents this as a billable add-on to speech-to-text usage.
 ## Development Transparency
 This project was developed as an agentic coding project using [OpenCode](https://opencode.ai/) with [OpenAI](https://openai.com/) GPT-5.5. Code quality checks were supported by CodeScene's [CodeHealth](https://codescene.com/product/code-health) analysis.
@@ -241,6 +243,32 @@ fow people embeddings status
 fow people embeddings backfill
 ```
+## Glossary
+Use the glossary for names, company names, project names, product names, acronyms, and domain-specific phrases that transcription or cleanup models may spell incorrectly.
+Add terms with optional context:
+```bash
+fow glossary add "Hejare" --description "Company name"
+fow glossary add "Datadrivna" --description "The phrase data driven in Swedish"
+fow glossary add "Ants" --description "Company name"
+fow glossary add "TT" --description "Company name, short for Theodora Tech"
+```
+Manage terms:
+```bash
+fow glossary list
+fow glossary show "Hejare"
+fow glossary update "TT" --description "Company name, short for Theodora Tech"
+fow glossary disable "Ants"
+fow glossary enable "Ants"
+fow glossary remove "Ants"
+```
+During processing, `fow` combines enabled glossary terms with known people names. The combined list is sent to ElevenLabs as transcription keyterms for new transcriptions, and to OpenAI cleanup, analysis, and title generation as spelling/context guidance. Corrections are model-mediated; `fow` does not do deterministic search-and-replace from the glossary.
 ## Watched Folders
 Fly on the Wall can watch local folders, mounted Dropbox/rclone folders, and removable recorder folders.
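
The merge described in the README's Glossary section can be sketched roughly as follows. This is an illustration only, not the package's code: `merge_prompt_lines` and its tuple-based inputs are hypothetical, and the sketch assumes case-insensitive de-duplication with glossary entries taking precedence over people names.

```python
from __future__ import annotations


def merge_prompt_lines(
    glossary: list[tuple[str, str | None]],
    people: list[str],
) -> list[str]:
    """Hypothetical helper: combine glossary entries and people names into prompt lines."""
    lines: list[str] = []
    seen: set[str] = set()
    for term, description in glossary:
        key = term.casefold()
        if key in seen:
            continue
        seen.add(key)
        # Terms with a description are rendered as "term: description".
        lines.append(f"{term}: {description}" if description else term)
    for name in people:
        key = name.casefold()
        if key not in seen:
            seen.add(key)
            lines.append(name)
    return lines


lines = merge_prompt_lines(
    [("Hejare", "Company name"), ("TT", None)],
    ["Anna Berg", "hejare"],  # "hejare" duplicates a glossary term, ignoring case
)
print(lines)  # ['Hejare: Company name', 'TT', 'Anna Berg']
```

Because the result is only guidance for the models, a term in this list may or may not end up corrected in the final transcript.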

{fow_cli-0.2.0 → fow_cli-0.3.0}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "fow-cli"
-version = "0.2.0"
+version = "0.3.0"
 description = "Personal CLI note-taker for turning meeting audio into cleaned meeting manuscripts."
 readme = "README.md"
 requires-python = ">=3.12"

{fow_cli-0.2.0 → fow_cli-0.3.0}/src/fly_on_the_wall/__init__.py RENAMED

@@ -1,3 +1,3 @@
 """Fly on the Wall CLI application."""
-__version__ = "0.2.0"
+__version__ = "0.3.0"

{fow_cli-0.2.0 → fow_cli-0.3.0}/src/fly_on_the_wall/cli.py RENAMED

@@ -9,6 +9,7 @@ from rich.table import Table
 from fly_on_the_wall import __version__
 from fly_on_the_wall.cli_costs import costs_app
+from fly_on_the_wall.cli_glossary import glossary_app
 from fly_on_the_wall.cli_publish import publish_app
 from fly_on_the_wall.cli_speaker_review import speakers_review
 from fly_on_the_wall.cli_watch import watch_app
@@ -78,6 +79,7 @@ app.add_typer(meetings_app, name="meetings")
 meetings_app.add_typer(meeting_speakers_app, name="speakers")
 app.add_typer(refresh_app, name="refresh")
 app.add_typer(secrets_app, name="secrets")
+app.add_typer(glossary_app, name="glossary")
 app.add_typer(watch_app, name="watch")
 app.add_typer(publish_app, name="publish")
 app.add_typer(costs_app, name="costs")

fow_cli-0.3.0/src/fly_on_the_wall/cli_glossary.py ADDED

@@ -0,0 +1,124 @@
+from __future__ import annotations
+from typing import Annotated
+import typer
+from rich.console import Console
+from rich.table import Table
+from fly_on_the_wall.db import database
+from fly_on_the_wall.glossary import (
+    create_glossary_term,
+    get_glossary_term,
+    list_glossary_terms,
+    remove_glossary_term,
+    update_glossary_term,
+)
+glossary_app = typer.Typer(help="Manage transcription and cleanup glossary terms.", no_args_is_help=True)
+console = Console()
+@glossary_app.command("add")
+def glossary_add(
+    term: str,
+    description: Annotated[str | None, typer.Option("--description", "-d", help="Optional context.")] = None,
+) -> None:
+    """Add a word or phrase to the glossary."""
+    with database() as connection:
+        try:
+            created = create_glossary_term(connection, term, description)
+        except ValueError as exc:
+            console.print(str(exc))
+            raise typer.Exit(code=1) from exc
+    console.print(f"Added glossary term: {created.term}")
+@glossary_app.command("list")
+def glossary_list(
+    all_terms: Annotated[bool, typer.Option("--all", help="Include disabled terms.")] = False,
+) -> None:
+    """List glossary terms."""
+    with database() as connection:
+        terms = list_glossary_terms(connection, include_disabled=all_terms)
+    if not terms:
+        console.print("No glossary terms found.")
+        return
+    table = Table(title="Glossary")
+    table.add_column("Term")
+    table.add_column("Description")
+    table.add_column("Enabled")
+    for term in terms:
+        table.add_row(term.term, term.description or "", "yes" if term.enabled else "no")
+    console.print(table)
+@glossary_app.command("show")
+def glossary_show(term: str) -> None:
+    """Show one glossary term."""
+    with database() as connection:
+        found = get_glossary_term(connection, term)
+    if found is None:
+        console.print(f"Glossary term not found: {term}")
+        raise typer.Exit(code=1)
+    console.print(f"Term: {found.term}")
+    console.print(f"Description: {found.description or ''}")
+    console.print(f"Enabled: {'yes' if found.enabled else 'no'}")
+    console.print(f"ID: {found.id}")
+@glossary_app.command("update")
+def glossary_update(
+    term: str,
+    new_term: Annotated[str | None, typer.Option("--term", help="Replace the glossary term text.")] = None,
+    description: Annotated[str | None, typer.Option("--description", "-d", help="Replace the description.")] = None,
+) -> None:
+    """Update a glossary term or description."""
+    with database() as connection:
+        try:
+            updated = update_glossary_term(connection, term, term=new_term, description=description)
+        except ValueError as exc:
+            console.print(str(exc))
+            raise typer.Exit(code=1) from exc
+    console.print(f"Updated glossary term: {updated.term}")
+@glossary_app.command("enable")
+def glossary_enable(term: str) -> None:
+    """Enable a glossary term."""
+    _set_enabled(term, True)
+@glossary_app.command("disable")
+def glossary_disable(term: str) -> None:
+    """Disable a glossary term without deleting it."""
+    _set_enabled(term, False)
+@glossary_app.command("remove")
+def glossary_remove(
+    term: str,
+    yes: Annotated[bool, typer.Option("--yes", "-y", help="Remove without confirmation.")] = False,
+) -> None:
+    """Remove a glossary term."""
+    if not yes and not typer.confirm(f"Remove glossary term '{term}'?", default=False):
+        console.print("Cancelled.")
+        return
+    with database() as connection:
+        removed = remove_glossary_term(connection, term)
+    if not removed:
+        console.print(f"Glossary term not found: {term}")
+        raise typer.Exit(code=1)
+    console.print(f"Removed glossary term: {term}")
+def _set_enabled(term: str, enabled: bool) -> None:
+    with database() as connection:
+        try:
+            updated = update_glossary_term(connection, term, enabled=enabled)
+        except ValueError as exc:
+            console.print(str(exc))
+            raise typer.Exit(code=1) from exc
+    state = "Enabled" if enabled else "Disabled"
+    console.print(f"{state} glossary term: {updated.term}")

{fow_cli-0.2.0 → fow_cli-0.3.0}/src/fly_on_the_wall/db.py RENAMED

@@ -8,7 +8,7 @@ from pathlib import Path
 from fly_on_the_wall.storage import ensure_storage_layout, storage_paths
-SCHEMA_VERSION = 17
+SCHEMA_VERSION = 18
 SCHEMA_STATEMENTS = (
     """
@@ -43,6 +43,16 @@ SCHEMA_STATEMENTS = (
     )
     """,
     """
+    CREATE TABLE IF NOT EXISTS glossary_terms (
+        id TEXT PRIMARY KEY,
+        term TEXT NOT NULL UNIQUE,
+        description TEXT,
+        enabled INTEGER NOT NULL DEFAULT 1,
+        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
+        updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
+    )
+    """,
+    """
     CREATE TABLE IF NOT EXISTS pipeline_stages (
         id INTEGER PRIMARY KEY AUTOINCREMENT,
         meeting_id TEXT NOT NULL,
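
To see how the new `glossary_terms` table behaves, the statement from the hunk above can be run against an in-memory SQLite database. This is a standalone sketch using only the standard library; the row IDs are placeholders rather than the UUIDs the package generates.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    """
    CREATE TABLE IF NOT EXISTS glossary_terms (
        id TEXT PRIMARY KEY,
        term TEXT NOT NULL UNIQUE,
        description TEXT,
        enabled INTEGER NOT NULL DEFAULT 1,
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
    """
)
connection.execute("INSERT INTO glossary_terms(id, term) VALUES (?, ?)", ("1", "Hejare"))

# New rows are enabled unless stated otherwise (DEFAULT 1).
enabled = connection.execute(
    "SELECT enabled FROM glossary_terms WHERE term = ?", ("Hejare",)
).fetchone()[0]

# Inserting the same term again violates the UNIQUE constraint.
try:
    connection.execute("INSERT INTO glossary_terms(id, term) VALUES (?, ?)", ("2", "Hejare"))
    duplicate_error = None
except sqlite3.IntegrityError as exc:
    duplicate_error = type(exc).__name__

# UNIQUE uses SQLite's default BINARY collation, so a different casing is a different term.
connection.execute("INSERT INTO glossary_terms(id, term) VALUES (?, ?)", ("3", "hejare"))
count = connection.execute("SELECT COUNT(*) FROM glossary_terms").fetchone()[0]

print(enabled, duplicate_error, count)  # 1 IntegrityError 2
```

Note that `sqlite3.IntegrityError` is not a subclass of `ValueError`, so code that catches only `ValueError` will not intercept a duplicate-term insert.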

fow_cli-0.3.0/src/fly_on_the_wall/glossary.py ADDED

@@ -0,0 +1,207 @@
+from __future__ import annotations
+from dataclasses import dataclass
+from pathlib import Path
+from sqlite3 import Connection
+from typing import Any
+from uuid import uuid4
+import yaml
+UNSUPPORTED_KEYTERM_CHARS = set("<>{}[]\\")
+@dataclass(frozen=True)
+class GlossaryTerm:
+    id: str
+    term: str
+    description: str | None
+    enabled: bool
+def create_glossary_term(connection: Connection, term: str, description: str | None = None) -> GlossaryTerm:
+    normalized = _normalize_term(term)
+    normalized_description = _normalize_optional(description)
+    term_id = str(uuid4())
+    with connection:
+        connection.execute(
+            """
+            INSERT INTO glossary_terms(id, term, description)
+            VALUES (?, ?, ?)
+            """,
+            (term_id, normalized, normalized_description),
+        )
+    return get_glossary_term(connection, normalized)  # type: ignore[return-value]
+def get_glossary_term(connection: Connection, term_or_id: str) -> GlossaryTerm | None:
+    row = connection.execute(
+        """
+        SELECT * FROM glossary_terms
+        WHERE id = ? OR term = ?
+        """,
+        (term_or_id, term_or_id),
+    ).fetchone()
+    return _term_from_row(row) if row is not None else None
+def list_glossary_terms(connection: Connection, include_disabled: bool = False) -> list[GlossaryTerm]:
+    query = "SELECT * FROM glossary_terms"
+    if not include_disabled:
+        query += " WHERE enabled = 1"
+    query += " ORDER BY lower(term)"
+    return [_term_from_row(row) for row in connection.execute(query).fetchall()]
+def update_glossary_term(
+    connection: Connection,
+    term_or_id: str,
+    *,
+    term: str | None = None,
+    description: str | None = None,
+    enabled: bool | None = None,
+) -> GlossaryTerm:
+    existing = get_glossary_term(connection, term_or_id)
+    if existing is None:
+        raise ValueError(f"Glossary term not found: {term_or_id}")
+    updated_term = existing.term if term is None else _normalize_term(term)
+    updated_description = existing.description if description is None else _normalize_optional(description)
+    updated_enabled = existing.enabled if enabled is None else enabled
+    with connection:
+        connection.execute(
+            """
+            UPDATE glossary_terms
+            SET term = ?,
+                description = ?,
+                enabled = ?,
+                updated_at = CURRENT_TIMESTAMP
+            WHERE id = ?
+            """,
+            (updated_term, updated_description, int(updated_enabled), existing.id),
+        )
+    return get_glossary_term(connection, existing.id)  # type: ignore[return-value]
+def remove_glossary_term(connection: Connection, term_or_id: str) -> bool:
+    existing = get_glossary_term(connection, term_or_id)
+    if existing is None:
+        return False
+    with connection:
+        connection.execute("DELETE FROM glossary_terms WHERE id = ?", (existing.id,))
+    return True
+def glossary_prompt_lines(connection: Connection, legacy_glossary_path: Path | None = None) -> list[str]:
+    lines: list[str] = []
+    seen: set[str] = set()
+    for item in list_glossary_terms(connection):
+        key = item.term.casefold()
+        if key in seen:
+            continue
+        seen.add(key)
+        if item.description:
+            lines.append(f"{item.term}: {item.description}")
+        else:
+            lines.append(item.term)
+    for term in load_glossary_terms(legacy_glossary_path):
+        key = term.casefold()
+        if key not in seen:
+            seen.add(key)
+            lines.append(term)
+    for name in _people_names(connection):
+        key = name.casefold()
+        if key not in seen:
+            seen.add(key)
+            lines.append(name)
+    return lines
+def transcription_keyterms(connection: Connection, legacy_glossary_path: Path | None = None) -> list[str]:
+    terms: list[str] = []
+    seen: set[str] = set()
+    for item in list_glossary_terms(connection):
+        _append_keyterm(terms, seen, item.term)
+    for term in load_glossary_terms(legacy_glossary_path):
+        _append_keyterm(terms, seen, term)
+    for name in _people_names(connection):
+        _append_keyterm(terms, seen, name)
+    return terms[:1000]
+def load_glossary_terms(path: Path | None) -> list[str]:
+    if path is None or not path.exists():
+        return []
+    data = yaml.safe_load(path.read_text())
+    return sorted(set(_collect_terms(data)), key=str.casefold)
+def _append_keyterm(terms: list[str], seen: set[str], value: str) -> None:
+    normalized = " ".join(value.split())
+    key = normalized.casefold()
+    if key in seen or not _valid_keyterm(normalized):
+        return
+    seen.add(key)
+    terms.append(normalized)
+def _valid_keyterm(value: str) -> bool:
+    return (
+        bool(value)
+        and len(value) < 50
+        and len(value.split()) <= 5
+        and not any(char in value for char in UNSUPPORTED_KEYTERM_CHARS)
+    )
+def _people_names(connection: Connection) -> list[str]:
+    return [
+        str(row["display_name"])
+        for row in connection.execute("SELECT display_name FROM people ORDER BY lower(display_name)").fetchall()
+    ]
+def _normalize_term(value: str) -> str:
+    normalized = " ".join(value.split())
+    if not normalized:
+        raise ValueError("Glossary term cannot be empty")
+    return normalized
+def _normalize_optional(value: str | None) -> str | None:
+    if value is None:
+        return None
+    normalized = " ".join(value.split())
+    return normalized or None
+def _term_from_row(row: Any) -> GlossaryTerm:
+    return GlossaryTerm(
+        id=row["id"],
+        term=row["term"],
+        description=row["description"],
+        enabled=bool(row["enabled"]),
+    )
+def _collect_terms(value: Any) -> list[str]:
+    if value is None:
+        return []
+    if isinstance(value, str):
+        normalized = " ".join(value.split())
+        return [normalized] if normalized else []
+    if isinstance(value, list):
+        terms: list[str] = []
+        for item in value:
+            terms.extend(_collect_terms(item))
+        return terms
+    if isinstance(value, dict):
+        terms = []
+        for item in value.values():
+            terms.extend(_collect_terms(item))
+        return terms
+    return []
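
The keyterm filtering in `_append_keyterm` and `_valid_keyterm` above can be tried in isolation. The sketch below restates that logic with hypothetical public names (`valid_keyterm`, `filter_keyterms`) and made-up candidate strings, so it runs without a database.

```python
UNSUPPORTED_KEYTERM_CHARS = set("<>{}[]\\")


def valid_keyterm(value: str) -> bool:
    # Same four checks as the diff: non-empty, under 50 characters,
    # at most five words, and none of the unsupported characters.
    return (
        bool(value)
        and len(value) < 50
        and len(value.split()) <= 5
        and not any(char in value for char in UNSUPPORTED_KEYTERM_CHARS)
    )


def filter_keyterms(candidates: list[str]) -> list[str]:
    terms: list[str] = []
    seen: set[str] = set()
    for value in candidates:
        normalized = " ".join(value.split())  # collapse runs of whitespace
        key = normalized.casefold()           # case-insensitive de-duplication
        if key in seen or not valid_keyterm(normalized):
            continue
        seen.add(key)
        terms.append(normalized)
    return terms


candidates = [
    "Hejare",
    "  Theodora   Tech ",              # whitespace is normalized
    "hejare",                          # duplicate, ignoring case
    "one two three four five six",     # six words
    "x" * 50,                          # 50 characters, not under the limit
    "Project [Alpha]",                 # contains unsupported brackets
    "",
]
kept = filter_keyterms(candidates)
print(kept)  # ['Hejare', 'Theodora Tech']
```

Rejected candidates are dropped silently, so a glossary term that fails these checks can still reach the OpenAI prompts via `glossary_prompt_lines` while never being sent as a transcription keyterm.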

{fow_cli-0.2.0 → fow_cli-0.3.0}/src/fly_on_the_wall/processing.py RENAMED

@@ -13,7 +13,7 @@ from fly_on_the_wall.config import AppConfig
 from fly_on_the_wall.costs import record_openai_usage
 from fly_on_the_wall.embeddings import EmbeddingBackend
 from fly_on_the_wall.exporting import ExportResult, export_markdown_transcript
-from fly_on_the_wall.glossary import load_glossary_terms
+from fly_on_the_wall.glossary import glossary_prompt_lines, transcription_keyterms
 from fly_on_the_wall.meetings import (
     Meeting,
     get_meeting,
@@ -107,8 +107,17 @@ def process_audio(
     existing_provider_run = latest_completed_provider_run(connection, meeting.id)
     if existing_provider_run is None:
         with timed_progress.step("Transcribing audio with ElevenLabs"):
-            resolved_transcribe = transcribe_fn or _run_elevenlabs_transcription
-            provider_run_id = resolved_transcribe(connection, meeting.id, meeting.imported_audio_path, paths)
+            if transcribe_fn is not None:
+                provider_run_id = transcribe_fn(connection, meeting.id, meeting.imported_audio_path, paths)
+            else:
+                keyterms = transcription_keyterms(connection, config.glossary_path)
+                provider_run_id = _run_elevenlabs_transcription(
+                    connection,
+                    meeting.id,
+                    meeting.imported_audio_path,
+                    paths,
+                    keyterms,
+                )
     else:
         timed_progress.message("Reusing completed ElevenLabs transcription")
         provider_run_id = existing_provider_run["id"]
@@ -200,7 +209,7 @@ def _cleanup_transcript(context: RefreshContext, deterministic_transcript: str)
     if context.config.cleanup_mode != "light" or not get_api_key("openai"):
         return TranscriptArtifacts(deterministic_transcript, deterministic_transcript)
-    glossary_terms = load_glossary_terms(context.config.glossary_path)
+    glossary_terms = glossary_prompt_lines(context.connection, context.config.glossary_path)
     cleanup_cache_key = text_sha256(
         "\n".join(
             [
@@ -260,7 +269,10 @@ def _suggest_and_apply_title(
     if meeting.get("title_source") == "manual":
         return
-    title_cache_key = text_sha256("\n".join([DEFAULT_ANALYSIS_MODEL, context.description or "", transcript, analysis]))
+    glossary_terms = glossary_prompt_lines(context.connection, context.config.glossary_path)
+    title_cache_key = text_sha256(
+        "\n".join([DEFAULT_ANALYSIS_MODEL, context.description or "", "\n".join(glossary_terms), transcript, analysis])
+    )
     title_cache_dir = context.paths.artifacts / context.meeting.id / "generated-title"
     cached_title = read_cached_text(title_cache_dir, title_cache_key)
     if cached_title is not None:
@@ -274,6 +286,7 @@ def _suggest_and_apply_title(
                         transcript,
                         analysis,
                         meeting_context=context.description,
+                        glossary_terms=glossary_terms,
                         options=OpenAIRequestOptions(
                             usage_callback=lambda response: _record_openai_usage(
                                 context, DEFAULT_ANALYSIS_MODEL, "title", response
@@ -291,9 +304,13 @@ def _suggest_and_apply_title(
 def _run_elevenlabs_transcription(
-    connection: Connection, meeting_id: str, audio_path: Path, storage: StoragePaths
+    connection: Connection,
+    meeting_id: str,
+    audio_path: Path,
+    storage: StoragePaths,
+    keyterms: list[str] | None = None,
 ) -> str:
-    return run_transcription(connection, meeting_id, audio_path, storage)
+    return run_transcription(connection, meeting_id, audio_path, storage, keyterms=keyterms)
 def _meeting_from_database(connection: Connection, meeting_id: str) -> Meeting:
@@ -398,7 +415,10 @@ def _analyze_transcript(
     if not get_api_key("openai"):
         return fallback_analysis("OPENAI_API_KEY is missing")
-    analysis_cache_key = text_sha256("\n".join([DEFAULT_ANALYSIS_MODEL, context.description or "", transcript]))
+    glossary_terms = glossary_prompt_lines(context.connection, context.config.glossary_path)
+    analysis_cache_key = text_sha256(
+        "\n".join([DEFAULT_ANALYSIS_MODEL, context.description or "", "\n".join(glossary_terms), transcript])
+    )
     analysis_cache_dir = context.paths.artifacts / context.meeting.id / "analysis"
     cached_analysis = read_cached_text(analysis_cache_dir, analysis_cache_key)
     if cached_analysis is not None:
@@ -411,6 +431,7 @@ def _analyze_transcript(
                 AnalysisRequest(
                     transcript,
                     meeting_context=context.description,
+                    glossary_terms=glossary_terms,
                     options=OpenAIRequestOptions(
                         usage_callback=lambda response: _record_openai_usage(
                             context, DEFAULT_ANALYSIS_MODEL, "analysis", response

{fow_cli-0.2.0 → fow_cli-0.3.0}/src/fly_on_the_wall/providers/elevenlabs.py RENAMED

@@ -28,6 +28,7 @@ def transcribe_audio(
     num_speakers: int | None = None,
     diarization_threshold: float | None = None,
     no_verbatim: bool = False,
+    keyterms: list[str] | None = None,
 ) -> dict[str, Any]:
     resolved_api_key = api_key or get_api_key(PROVIDER)
     if not resolved_api_key:
@@ -46,6 +47,8 @@ def transcribe_audio(
         data["num_speakers"] = str(num_speakers)
     if diarization_threshold is not None:
         data["diarization_threshold"] = str(diarization_threshold)
+    if keyterms:
+        data["keyterms"] = json.dumps(keyterms, ensure_ascii=False)
     close_client = client is None
     http_client = client or httpx.Client(timeout=600)
@@ -76,15 +79,17 @@ def run_transcription(
     storage: StoragePaths | None = None,
     client: httpx.Client | None = None,
     api_key: str | None = None,
+    keyterms: list[str] | None = None,
 ) -> str:
     paths = storage or storage_paths()
     provider_run_id = str(uuid4())
     raw_response_path = paths.artifacts / meeting_id / "provider-runs" / f"{provider_run_id}.raw.json"
     raw_response_path.parent.mkdir(parents=True, exist_ok=True)
-    _insert_provider_run(connection, provider_run_id, meeting_id, raw_response_path, "running")
+    settings = {"keyterms": keyterms or []}
+    _insert_provider_run(connection, provider_run_id, meeting_id, raw_response_path, "running", settings)
     try:
-        response = transcribe_audio(audio_path, api_key=api_key, client=client)
+        response = transcribe_audio(audio_path, api_key=api_key, client=client, keyterms=keyterms)
         raw_response_path.write_text(json.dumps(response, indent=2, ensure_ascii=False) + "\n")
         duration = float(response.get("audio_duration_secs") or 0)
         record_service_usage(
@@ -112,6 +117,7 @@ def _insert_provider_run(
     meeting_id: str,
     raw_response_path: Path,
     status: str,
+    settings: dict[str, Any] | None = None,
 ) -> None:
     with connection:
         connection.execute(
@@ -121,11 +127,20 @@ def _insert_provider_run(
                 meeting_id,
                 provider,
                 model,
+                settings_json,
                 raw_response_path,
                 status
-            ) VALUES (?, ?, ?, ?, ?, ?)
+            ) VALUES (?, ?, ?, ?, ?, ?, ?)
             """,
-            (provider_run_id, meeting_id, PROVIDER, MODEL, str(raw_response_path), status),
+            (
+                provider_run_id,
+                meeting_id,
+                PROVIDER,
+                MODEL,
+                json.dumps(settings or {}, sort_keys=True),
+                str(raw_response_path),
+                status,
+            ),
         )
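
The keyterms are serialized with `json.dumps(..., ensure_ascii=False)` before being placed in the form data. The short sketch below shows what that flag changes for non-ASCII names; the sample names are invented for illustration.

```python
import json

keyterms = ["Hejare", "Åsa Öberg"]

# ensure_ascii=False keeps non-ASCII characters as-is in the encoded string.
form_value = json.dumps(keyterms, ensure_ascii=False)

# The default escapes them as \uXXXX sequences instead.
escaped = json.dumps(keyterms)

# Stored run settings use sort_keys=True; an absent keyterm list becomes [].
settings_json = json.dumps({"keyterms": None or []}, sort_keys=True)

print(form_value)     # ["Hejare", "Åsa Öberg"]
print(escaped)        # ["Hejare", "\u00c5sa \u00d6berg"]
print(settings_json)  # {"keyterms": []}
```

Both encodings decode to the same list, so the flag affects only the bytes sent and stored, not the meaning.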

{fow_cli-0.2.0 → fow_cli-0.3.0}/src/fly_on_the_wall/providers/openai_analysis.py RENAMED

@@ -28,6 +28,7 @@ class OpenAIRequestOptions:
 class AnalysisRequest:
     transcript_markdown: str
     meeting_context: str | None = None
+    glossary_terms: list[str] | None = None
     options: OpenAIRequestOptions = field(default_factory=OpenAIRequestOptions)
@@ -36,6 +37,7 @@ class TitleRequest:
     transcript_markdown: str
     analysis_markdown: str
     meeting_context: str | None = None
+    glossary_terms: list[str] | None = None
     options: OpenAIRequestOptions = field(default_factory=OpenAIRequestOptions)
@@ -50,7 +52,7 @@ class ChatCompletionRequest:
 def analyze_meeting(request: AnalysisRequest) -> str:
     return _post_chat_completion(
         ChatCompletionRequest(
-            system_prompt=_system_prompt(request.meeting_context),
+            system_prompt=_system_prompt(request.meeting_context, request.glossary_terms),
             user_prompt=request.transcript_markdown,
             options=request.options,
             timeout_seconds=180,
@@ -61,7 +63,7 @@ def analyze_meeting(request: AnalysisRequest) -> str:
 def suggest_meeting_title(request: TitleRequest) -> str:
     content = _post_chat_completion(
         ChatCompletionRequest(
-            system_prompt=_title_system_prompt(request.meeting_context),
+            system_prompt=_title_system_prompt(request.meeting_context, request.glossary_terms),
             user_prompt=(f"Transcript:\n{request.transcript_markdown}\n\nAnalysis:\n{request.analysis_markdown}"),
             options=request.options,
             timeout_seconds=60,
@@ -148,8 +150,9 @@ None identified.
 """.strip()
-def _system_prompt(meeting_context: str | None) -> str:
+def _system_prompt(meeting_context: str | None, glossary_terms: list[str] | None) -> str:
     context = meeting_context or "none"
+    glossary = _format_glossary(glossary_terms)
     return f"""
 You analyze meeting transcripts for a personal note-taker.
 Return concise Markdown with exactly these headings:
@@ -163,12 +166,17 @@ Return concise Markdown with exactly these headings:
 Keep it short and prioritized. Do not invent facts.
 If a section has no useful content, write "None identified."
 For action items, use: - Owner: task. Due: date or Not mentioned.
+Use the glossary spellings when the transcript appears to refer to these names or domain terms.
+Do not insert glossary terms unless the transcript context supports them.
 Meeting context: {context}
+Known names and terms:
+{glossary}
 """.strip()
-def _title_system_prompt(meeting_context: str | None) -> str:
+def _title_system_prompt(meeting_context: str | None, glossary_terms: list[str] | None) -> str:
     context = meeting_context or "none"
+    glossary = _format_glossary(glossary_terms)
     return f"""
 You name meeting transcripts for a personal note-taker.
 Return only one title, with no Markdown, labels, quotes, or punctuation wrapper.
@@ -177,10 +185,20 @@ Prefer concrete names, projects, organizations, and topics from the transcript.
 Do not include dates unless the date is central to the meeting topic.
 Do not return generic titles like "Meeting Summary" or "Team Meeting".
 If the transcript has no meaningful content, return an empty string.
+Use the glossary spellings when the transcript appears to refer to these names or domain terms.
+Do not insert glossary terms unless the transcript context supports them.
 Meeting context: {context}
+Known names and terms:
+{glossary}
 """.strip()
+def _format_glossary(glossary_terms: list[str] | None) -> str:
+    if not glossary_terms:
+        return "- none"
+    return "\n".join(f"- {term}" for term in glossary_terms)
 def _extract_content(response: dict[str, Any]) -> str:
     try:
         content = response["choices"][0]["message"]["content"]
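
For reference, the text that `_format_glossary` contributes to the system prompts looks like this. The function body is copied from the hunk above under a public name, and the sample terms are illustrative.

```python
from __future__ import annotations


def format_glossary(glossary_terms: list[str] | None) -> str:
    # An empty or missing glossary still yields one bullet, so the
    # "Known names and terms:" section of the prompt is never blank.
    if not glossary_terms:
        return "- none"
    return "\n".join(f"- {term}" for term in glossary_terms)


with_terms = format_glossary(["Hejare: Company name", "TT"])
without_terms = format_glossary(None)

print("Known names and terms:\n" + with_terms)
# Known names and terms:
# - Hejare: Company name
# - TT
```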

{fow_cli-0.2.0 → fow_cli-0.3.0}/src/fly_on_the_wall/providers/openai_cleanup.py RENAMED

@@ -10,7 +10,7 @@ from fly_on_the_wall.secrets import get_api_key
 API_URL = "https://api.openai.com/v1/chat/completions"
 DEFAULT_MODEL = "gpt-5.4-mini"
 DEFAULT_CLEANUP_TIMEOUT_SECONDS = 1800
-CLEANUP_PROMPT_VERSION = "2026-06-04-manuscript-cleanup-v4"
+CLEANUP_PROMPT_VERSION = "2026-06-13-manuscript-cleanup-glossary-v5"
 class OpenAICleanupError(RuntimeError):
@@ -61,7 +61,7 @@ def cleanup_transcript(
 def _system_prompt(glossary_terms: list[str] | None, meeting_context: str | None) -> str:
-    glossary = ", ".join(glossary_terms or []) or "none"
+    glossary = _format_glossary(glossary_terms)
     context = meeting_context or "none"
     return f"""
 You clean meeting transcripts into readable manuscript-style dialogue.
@@ -78,12 +78,21 @@ of an idiom, or used with clear literal/comparative meaning, such as "på samma
 Prefer complete readable sentences over literal STT fragments, but do not summarize,
 invent details, remove uncertainty markers, or add new content.
 Preserve standalone acknowledgements such as yes/no/okay/mm and Swedish ja/nej/okej/mm.
+Use the glossary spellings when the transcript appears to refer to these names or domain terms.
+Do not insert glossary terms unless the transcript context supports them.
 Return only the cleaned manuscript.
 Meeting context: {context}
-Glossary terms: {glossary}
+Known names and terms:
+{glossary}
 """.strip()
+def _format_glossary(glossary_terms: list[str] | None) -> str:
+    if not glossary_terms:
+        return "- none"
+    return "\n".join(f"- {term}" for term in glossary_terms)
 def _extract_content(response: dict[str, Any]) -> str:
     try:
         content = response["choices"][0]["message"]["content"]

{fow_cli-0.2.0 → fow_cli-0.3.0}/src/fly_on_the_wall/publishing.py RENAMED

@@ -133,7 +133,8 @@ def publish_meeting(connection: Connection, meeting_id_or_slug: str, target_iden
     analysis_markdown = _read_analysis_markdown(analysis_path)
     manifest = json.loads(manifest_path.read_text())
     output_path = _published_output_path(connection, meeting, target)
-    content = _obsidian_note(meeting, transcript_markdown, analysis_markdown, manifest)
+    participants = _meeting_participants(connection, meeting["id"])
+    content = _obsidian_note(meeting, transcript_markdown, analysis_markdown, manifest, participants)
     content_hash = _sha256(content)
     output_path.parent.mkdir(parents=True, exist_ok=True)
@@ -271,7 +272,34 @@ def _published_output_path(connection: Connection, meeting: dict, target: Publis
     return target.path / filename
-def _obsidian_note(meeting: dict, transcript_markdown: str, analysis_markdown: str, manifest: dict) -> str:
+def _meeting_participants(connection: Connection, meeting_id: str) -> list[str]:
+    rows = connection.execute(
+        """
+        SELECT DISTINCT people.display_name
+        FROM speaker_assignments
+        JOIN local_speakers ON local_speakers.id = speaker_assignments.local_speaker_id
+        JOIN people ON people.id = speaker_assignments.person_id
+        WHERE local_speakers.meeting_id = ?
+          AND speaker_assignments.status = 'known'
+        ORDER BY lower(people.display_name)
+        """,
+        (meeting_id,),
+    ).fetchall()
+    return [_obsidian_people_link(row["display_name"]) for row in rows]
+def _obsidian_people_link(display_name: str) -> str:
+    safe_name = display_name.replace("[[", "").replace("]]", "").replace("|", "-").strip()
+    return f"[[People/{safe_name}]]"
+def _obsidian_note(
+    meeting: dict,
+    transcript_markdown: str,
+    analysis_markdown: str,
+    manifest: dict,
+    participants: list[str] | None = None,
+) -> str:
     date, time = _date_time(_meeting_timestamp(meeting))
     frontmatter = {
         "title": meeting["title"],
@@ -284,6 +312,7 @@ def _obsidian_note(meeting: dict, transcript_markdown: str, analysis_markdown: s
         "recorded_at": meeting.get("recorded_at"),
         "duration_seconds": meeting.get("duration_seconds"),
         "recording_quality": meeting.get("recording_quality_status"),
+        "participants": participants or None,
         "tags": ["meetings", "fly-on-the-wall"],
     }
     lines = ["---", *_yaml_lines(frontmatter), "---", ""]
@@ -330,7 +359,7 @@ def _yaml_lines(values: dict) -> list[str]:
 def _yaml_scalar(value: object) -> str:
     text = str(value)
-    if re.search(r"[:#\n,]", text):
+    if re.search(r"[:#\n,\[\]{}]", text):
         return json.dumps(text, ensure_ascii=False)
     return text
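
The widened regex in `_yaml_scalar` matters for the new `participants` values: an unquoted `[[People/...]]` in YAML frontmatter would be read as a nested flow sequence rather than a string. The sketch below copies the two helpers from this file under public names and feeds them an invented display name.

```python
import json
import re


def obsidian_people_link(display_name: str) -> str:
    # Strip wiki-link brackets and replace the alias separator so the
    # name cannot break out of the [[...]] link syntax.
    safe_name = display_name.replace("[[", "").replace("]]", "").replace("|", "-").strip()
    return f"[[People/{safe_name}]]"


def yaml_scalar(value: object) -> str:
    text = str(value)
    # Brackets and braces now trigger quoting, alongside ':', '#', newline and ','.
    if re.search(r"[:#\n,\[\]{}]", text):
        return json.dumps(text, ensure_ascii=False)
    return text


link = obsidian_people_link("Anna [[Berg]] | CTO")
quoted = yaml_scalar(link)
plain = yaml_scalar("meetings")

print(link)    # [[People/Anna Berg - CTO]]
print(quoted)  # "[[People/Anna Berg - CTO]]"
print(plain)   # meetings
```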

fow_cli-0.2.0/src/fly_on_the_wall/glossary.py DELETED

@@ -1,31 +0,0 @@
-from __future__ import annotations
-from pathlib import Path
-from typing import Any
-import yaml
-def load_glossary_terms(path: Path | None) -> list[str]:
-    if path is None or not path.exists():
-        return []
-    data = yaml.safe_load(path.read_text())
-    return sorted(set(_collect_terms(data)))
-def _collect_terms(value: Any) -> list[str]:
-    if value is None:
-        return []
-    if isinstance(value, str):
-        return [value]
-    if isinstance(value, list):
-        terms: list[str] = []
-        for item in value:
-            terms.extend(_collect_terms(item))
-        return terms
-    if isinstance(value, dict):
-        terms = []
-        for item in value.values():
-            terms.extend(_collect_terms(item))
-        return terms
-    return []
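
One behavioral difference between the deleted `load_glossary_terms` and its 0.3.0 replacement is the sort order: the old version used the default string ordering, while the new one sorts with `key=str.casefold`. A small comparison with illustrative terms:

```python
terms = {"zeta", "Ants", "hejare", "TT"}

# 0.2.0: default ordering compares code points, so uppercase sorts first.
old_order = sorted(terms)

# 0.3.0: case-insensitive ordering via str.casefold.
new_order = sorted(terms, key=str.casefold)

print(old_order)  # ['Ants', 'TT', 'hejare', 'zeta']
print(new_order)  # ['Ants', 'hejare', 'TT', 'zeta']
```

The order matters because the terms are joined into prompts and cache keys, so the change in ordering alone can alter both for legacy glossary files with mixed-case entries.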