PyPI - lattifai - Versions diffs - 0.1.5__py3-none-any.whl → 0.2.2__py3-none-any.whl - Mend

lattifai 0.1.5__py3-none-any.whl → 0.2.2__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (20)

lattifai/__init__.py +12 -47
lattifai/bin/align.py +26 -2
lattifai/bin/cli_base.py +5 -0
lattifai/client.py +26 -13
lattifai/io/reader.py +1 -2
lattifai/tokenizer/tokenizer.py +284 -0
lattifai/workers/lattice1_alpha.py +33 -11
lattifai-0.2.2.dist-info/METADATA +333 -0
lattifai-0.2.2.dist-info/RECORD +22 -0
lattifai/tokenizers/tokenizer.py +0 -147
lattifai-0.1.5.dist-info/METADATA +0 -444
lattifai-0.1.5.dist-info/RECORD +0 -24
scripts/__init__.py +0 -1
scripts/install_k2.py +0 -520
/lattifai/{tokenizers → tokenizer}/__init__.py +0 -0
/lattifai/{tokenizers → tokenizer}/phonemizer.py +0 -0
{lattifai-0.1.5.dist-info → lattifai-0.2.2.dist-info}/WHEEL +0 -0
{lattifai-0.1.5.dist-info → lattifai-0.2.2.dist-info}/entry_points.txt +0 -0
{lattifai-0.1.5.dist-info → lattifai-0.2.2.dist-info}/licenses/LICENSE +0 -0
{lattifai-0.1.5.dist-info → lattifai-0.2.2.dist-info}/top_level.txt +0 -0

lattifai-0.2.2.dist-info/METADATA ADDED

@@ -0,0 +1,333 @@
+Metadata-Version: 2.4
+Name: lattifai
+Version: 0.2.2
+Summary: Lattifai Python SDK: Seamless Integration with Lattifai's Speech and Video AI Services
+Author-email: Lattifai Technologies <tech@lattifai.com>
+Maintainer-email: Lattice <tech@lattifai.com>
+License: MIT License
+        Copyright (c) 2025 Lattifai.
+        Permission is hereby granted, free of charge, to any person obtaining a copy
+        of this software and associated documentation files (the "Software"), to deal
+        in the Software without restriction, including without limitation the rights
+        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+        copies of the Software, and to permit persons to whom the Software is
+        furnished to do so, subject to the following conditions:
+        The above copyright notice and this permission notice shall be included in all
+        copies or substantial portions of the Software.
+        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+        SOFTWARE.
+Project-URL: Homepage, https://github.com/lattifai/lattifai-python
+Project-URL: Documentation, https://github.com/lattifai/lattifai-python/README.md
+Project-URL: Bug Tracker, https://github.com/lattifai/lattifai-python/issues
+Project-URL: Discussions, https://github.com/lattifai/lattifai-python/discussions
+Project-URL: Changelog, https://github.com/lattifai/lattifai-python/CHANGELOG.md
+Keywords: lattifai,speech recognition,video analysis,ai,sdk,api client
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Intended Audience :: Developers
+Classifier: Intended Audience :: Science/Research
+Classifier: License :: OSI Approved :: Apache Software License
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Operating System :: MacOS :: MacOS X
+Classifier: Operating System :: POSIX :: Linux
+Classifier: Operating System :: Microsoft :: Windows
+Classifier: Topic :: Multimedia :: Sound/Audio
+Classifier: Topic :: Multimedia :: Video
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+Requires-Python: >=3.9
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: lattifai-core>=0.2.0
+Requires-Dist: httpx
+Requires-Dist: python-dotenv
+Requires-Dist: lhotse>=1.26.0
+Requires-Dist: colorful>=0.5.6
+Requires-Dist: pysubs2
+Requires-Dist: praatio
+Requires-Dist: tgt
+Requires-Dist: onnxruntime
+Requires-Dist: resampy
+Requires-Dist: g2p-phonemizer==0.1.1
+Requires-Dist: wtpsplit>=2.1.6
+Provides-Extra: numpy
+Requires-Dist: numpy; extra == "numpy"
+Provides-Extra: test
+Requires-Dist: pytest; extra == "test"
+Requires-Dist: pytest-cov; extra == "test"
+Requires-Dist: ruff; extra == "test"
+Requires-Dist: numpy; extra == "test"
+Provides-Extra: all
+Requires-Dist: numpy; extra == "all"
+Requires-Dist: pytest; extra == "all"
+Requires-Dist: pytest-cov; extra == "all"
+Requires-Dist: ruff; extra == "all"
+Dynamic: license-file
+# LattifAI Python
+[![PyPI version](https://badge.fury.io/py/lattifai.svg)](https://badge.fury.io/py/lattifai)
+<p align="center">
+   🌐 <a href="https://lattifai.com"><b>Official Website</b></a> &nbsp&nbsp | &nbsp&nbsp 🖥️ <a href="https://github.com/lattifai/lattifai-python">GitHub</a> &nbsp&nbsp | &nbsp&nbsp 🤗 <a href="https://huggingface.co/Lattifai/Lattice-1-Alpha">Model</a> &nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://lattifai.com/blogs">Blog</a> &nbsp&nbsp | &nbsp&nbsp <a href="https://discord.gg/gTZqdaBJ"><img src="https://img.shields.io/badge/Discord-Join-5865F2?logo=discord&logoColor=white" alt="Discord" style="vertical-align: middle;"></a>
+</p>
+Advanced forced alignment and subtitle generation powered by [Lattice-1-Alpha](https://huggingface.co/Lattifai/Lattice-1-Alpha) model.
+## Installation
+```bash
+pip install install-k2
+# The installation will automatically detect and use your already installed PyTorch version.
+install-k2  # Install k2
+pip install lattifai
+```
+> **⚠️ Important**: You must run `install-k2` before using the lattifai library.
+## Quick Start
+### Command Line
+```bash
+# Align audio with subtitle
+lattifai align audio.wav subtitle.srt output.srt
+# Convert subtitle format
+lattifai subtitle convert input.srt output.vtt
+```
+#### lattifai align options
+```
+> lattifai align --help
+Usage: lattifai align [OPTIONS] INPUT_AUDIO_PATH INPUT_SUBTITLE_PATH OUTPUT_SUBTITLE_PATH
+  Command used to align audio with subtitles
+Options:
+  -F, --input_format [srt|vtt|ass|txt|auto]  Input Subtitle format.
+  -D, --device [cpu|cuda|mps]                Device to use for inference.
+  --split_sentence                           Smart sentence splitting based on punctuation semantics.
+  --help                                     Show this message and exit.
+```
+#### Understanding --split_sentence
+The `--split_sentence` option performs intelligent sentence re-splitting based on punctuation and semantic boundaries. This is especially useful when processing subtitles that combine multiple semantic units in a single segment, such as:
+- **Mixed content**: Non-speech elements (e.g., `[APPLAUSE]`, `[MUSIC]`) followed by actual dialogue
+- **Natural punctuation boundaries**: Colons, periods, and other punctuation marks that indicate semantic breaks
+- **Concatenated phrases**: Multiple distinct utterances joined together without proper separation
+**Example transformations**:
+```
+Input:  "[APPLAUSE] >> MIRA MURATI: Thank you all"
+Output: ["[APPLAUSE]", ">> MIRA MURATI: Thank you all"]
+Input:  "[MUSIC] Welcome back. Today we discuss AI."
+Output: ["[MUSIC]", "Welcome back.", "Today we discuss AI."]
+```
+This feature helps improve alignment accuracy by:
+1. Respecting punctuation-based semantic boundaries
+2. Separating distinct utterances for more precise timing
+3. Maintaining semantic context for each independent phrase
+**Usage**:
+```bash
+lattifai align --split_sentence audio.wav subtitle.srt output.srt
+```
+### Python API
+```python
+from lattifai import LattifAI
+# Initialize client
+client = LattifAI(
+    api_key: Optional[str] = None,
+    model_name_or_path='Lattifai/Lattice-1-Alpha',
+    device='cpu',  # 'cpu', 'cuda', or 'mps'
+)
+# Perform alignment
+result = client.alignment(
+    audio="audio.wav",
+    subtitle="subtitle.srt",
+    split_sentence=False,
+    output_subtitle_path="output.srt"
+)
+```
+## Supported Formats
+**Audio**: WAV, MP3, FLAC, M4A, OGG
+**Subtitle**: SRT, VTT, ASS, TXT (plain text)
+## API Reference
+### LattifAI
+```python
+LattifAI(
+    api_key: Optional[str] = None,
+    model_name_or_path: str = 'Lattifai/Lattice-1-Alpha',
+    device: str = 'cpu'  # 'cpu', 'cuda', or 'mps'
+)
+```
+### alignment()
+```python
+client.alignment(
+    audio: str,                           # Path to audio file
+    subtitle: str,                        # Path to subtitle/text file
+    format: Optional[str] = None,         # 'srt', 'vtt', 'ass', 'txt' (auto-detect if None)
+    split_sentence: bool = False,         # Smart sentence splitting based on punctuation semantics
+    output_subtitle_path: Optional[str] = None
+) -> str
+```
+**Parameters**:
+- `audio`: Path to the audio file to be aligned
+- `subtitle`: Path to the subtitle or text file
+- `format`: Subtitle format ('srt', 'vtt', 'ass', 'txt'). Auto-detected if None
+- `split_sentence`: Enable intelligent sentence re-splitting (default: False). Set to True when subtitles combine multiple semantic units (non-speech elements + dialogue, or multiple sentences) that would benefit from separate timing alignment
+- `output_subtitle_path`: Output path for aligned subtitle (optional)
+## Examples
+### Basic Text Alignment
+```python
+client = LattifAI()
+client.alignment(
+    audio="speech.wav",
+    subtitle="transcript.txt",
+    format="txt",
+    split_sentence=False,
+    output_subtitle_path="output.srt"
+)
+```
+### Batch Processing
+```python
+from pathlib import Path
+client = LattifAI()
+audio_dir = Path("audio_files")
+subtitle_dir = Path("subtitles")
+output_dir = Path("aligned")
+for audio in audio_dir.glob("*.wav"):
+    subtitle = subtitle_dir / f"{audio.stem}.srt"
+    if subtitle.exists():
+        client.alignment(
+            audio=audio,
+            subtitle=subtitle,
+            output_subtitle_path=output_dir / f"{audio.stem}_aligned.srt"
+        )
+```
+### GPU Acceleration
+```python
+# NVIDIA GPU
+client = LattifAI(device='cuda')
+# Apple Silicon
+client = LattifAI(device='mps')
+# CLI
+lattifai align --device mps audio.wav subtitle.srt output.srt
+```
+## Configuration
+### API Key Setup
+First, create your API key at [https://lattifai.com/dashboard/api-keys](https://lattifai.com/dashboard/api-keys)
+**Recommended: Using .env file**
+Create a `.env` file in your project root:
+```bash
+LATTIFAI_API_KEY=your-api-key
+```
+The library automatically loads the `.env` file (python-dotenv is included as a dependency).
+**Alternative: Environment variable**
+```bash
+export LATTIFAI_API_KEY="your-api-key"
+```
+## Model Information
+**[Lattice-1-Alpha](https://huggingface.co/Lattifai/Lattice-1-Alpha)** features:
+- State-of-the-art alignment precision
+- **Language Support**: Currently supports English only. The upcoming **Lattice-1** release will support English, Chinese, and mixed English-Chinese content.
+- Handles noisy audio and imperfect transcripts
+- Optimized for CPU and GPU (CUDA/MPS)
+**Requirements**:
+- Python 3.9+
+- 4GB RAM recommended
+- ~2GB storage for model files
+## Development
+### Setup
+```bash
+git clone https://github.com/lattifai/lattifai-python.git
+cd lattifai-python
+pip install -e ".[test]"
+./scripts/install-hooks.sh  # Optional: install pre-commit hooks
+```
+### Testing
+```bash
+pytest                        # Run all tests
+pytest --cov=src             # With coverage
+pytest tests/test_basic.py   # Specific test
+```
+### Code Quality
+```bash
+ruff check src/ tests/       # Lint
+ruff format src/ tests/      # Format
+isort src/ tests/            # Sort imports
+```
+## Contributing
+1. Fork the repository
+2. Create a feature branch
+3. Make changes and add tests
+4. Run `pytest` and `ruff check`
+5. Submit a pull request
+## License
+Apache License 2.0
+## Support
+- **Issues**: [GitHub Issues](https://github.com/lattifai/lattifai-python/issues)
+- **Discussions**: [GitHub Discussions](https://github.com/lattifai/lattifai-python/discussions)
+- **Discord**: [Join our community](https://discord.gg/gTZqdaBJ)
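
Note on the `--split_sentence` examples in the README above: the two documented transformations can be approximated with a small regex sketch. This is not the library's implementation (the 0.2.2 metadata lists `wtpsplit` as a dependency, which suggests a model-based splitter). `naive_split` is a hypothetical helper, assumed here only to illustrate the behavior the README describes.

```python
import re
from typing import List

# Bracketed non-speech tags such as [APPLAUSE] or [MUSIC].
_TAG = re.compile(r"(\[[^\]]+\])")
# Whitespace that follows sentence-ending punctuation.
_SENT_END = re.compile(r"(?<=[.!?])\s+")


def naive_split(text: str) -> List[str]:
    """Hypothetical sketch: separate bracketed tags, then split on sentence ends."""
    pieces: List[str] = []
    for chunk in _TAG.split(text):
        chunk = chunk.strip()
        if not chunk:
            continue
        if _TAG.fullmatch(chunk):
            # Keep a non-speech tag as its own segment.
            pieces.append(chunk)
        else:
            # Split ordinary speech on ., ! or ? followed by whitespace.
            pieces.extend(s for s in _SENT_END.split(chunk) if s)
    return pieces


if __name__ == "__main__":
    print(naive_split("[APPLAUSE] >> MIRA MURATI: Thank you all"))
    print(naive_split("[MUSIC] Welcome back. Today we discuss AI."))
```

A regex like this does not split on colons and knows nothing about abbreviations or semantics, so it is only a rough stand-in for the "punctuation semantics" the README mentions.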

lattifai-0.2.2.dist-info/RECORD ADDED

@@ -0,0 +1,22 @@
+lattifai/__init__.py,sha256=JXUg0dT74UyAtKOjewRs9ijr5sl9SYsc6oU_WItY314,1497
+lattifai/base_client.py,sha256=ktFtATjL9pLSJUD-VqeJKA1FHkrsGHX7Uq_x00H7gO8,3322
+lattifai/client.py,sha256=QXbdTuDA5Aap2udu4iig7CVxlgwOIrydpuLlVASs0aA,5145
+lattifai/bin/__init__.py,sha256=7YhmtEM8kbxJtz2-KIskvpLKBZAvkMSceVx8z4fkgQ4,61
+lattifai/bin/align.py,sha256=nQs901SDYmxyH2AXBtjgZGzrpwLaxANQRYP49Bd1AWo,1669
+lattifai/bin/cli_base.py,sha256=y535WXDRX8StloFn9icpfw7nQt0JxuWBIuPMnRxAYy8,392
+lattifai/bin/subtitle.py,sha256=bUWImAHpvyY59Vskqb5loQiD5ytQOxR8lTQRiQ4LyNA,647
+lattifai/io/__init__.py,sha256=vHWRN7MvAch-GUeFqqO-gM57SM-4YOpGUjIxFJdjfPA,671
+lattifai/io/reader.py,sha256=mtgxT5c_BiHbqqJvPE3nf7TIe_OcWgGu1zr6iXasfrk,2591
+lattifai/io/supervision.py,sha256=5UfSsgBhXoDU3-6drDtoD7y8HIiA4xRKZnbOKgeejwM,354
+lattifai/io/writer.py,sha256=1eAEFLlL8kricxRDPFBtVmeC4IiFyFnjbWXvw0VU-q4,2036
+lattifai/tokenizer/__init__.py,sha256=aqv44PDtq6g3oFFKW_l4HSR5ywT5W8eP1dHHywIvBfs,72
+lattifai/tokenizer/phonemizer.py,sha256=SfRi1KIMpmaao6OVmR1h_I_3QU-vrE6D5bh72Afg5XM,1759
+lattifai/tokenizer/tokenizer.py,sha256=Yuo0pLPQnF2uX0Fm5g8i5vtcADn7GeLpSqdGpMJgTww,11492
+lattifai/workers/__init__.py,sha256=s6YfkIq4FDIAzY9sPjRpXnJfszj2repqnMTqydRM5Zw,83
+lattifai/workers/lattice1_alpha.py,sha256=1VFo59EcygEctTHOhkcII8v3_mrj8JEJ8Fcaqk_7LVo,5762
+lattifai-0.2.2.dist-info/licenses/LICENSE,sha256=LNuoH5jpXXNKgjQ3XLwztFq8D3O7kZI-LSg81o4ym2M,1065
+lattifai-0.2.2.dist-info/METADATA,sha256=4vmPOYKsIlvADiw0zUDQ2dbDpe-vOV-o5A0Hs1p7xfg,10971
+lattifai-0.2.2.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+lattifai-0.2.2.dist-info/entry_points.txt,sha256=CwTI2NbJvF9msIHboAfTA99cmDr_HOWoODjS8R64JOw,131
+lattifai-0.2.2.dist-info/top_level.txt,sha256=-OVWZ68YYFcTN13ARkLasp2OUappe9wEVq-CKes7jM4,17
+lattifai-0.2.2.dist-info/RECORD,,
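
Note on the RECORD entries above: each line has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe Base64 encoding of the file's SHA-256 hash with trailing `=` padding removed. The following is a minimal sketch for recomputing such an entry from file bytes; `record_digest` and `record_line` are illustrative names, not part of any packaging tool.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return a RECORD-style digest: sha256=<urlsafe base64 without padding>."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def record_line(path: str, data: bytes) -> str:
    """Build a full RECORD line (path, digest, size in bytes) for one file."""
    return f"{path},{record_digest(data)},{len(data)}"


if __name__ == "__main__":
    # Compare the printed line against the matching entry in RECORD.
    print(record_line("example.txt", b"hello\n"))
```

To check an unpacked wheel, one would read each listed file in binary mode and compare the result of `record_line` with the corresponding RECORD entry; the RECORD file itself is listed without a hash or size.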

lattifai/tokenizers/tokenizer.py DELETED

@@ -1,147 +0,0 @@
-import gzip
-import pickle
-from collections import defaultdict
-from itertools import chain
-from typing import Any, Dict, List, Optional, Tuple
-import torch
-from lattifai.base_client import SyncAPIClient
-from lattifai.io import Supervision
-from lattifai.tokenizers.phonemizer import G2Phonemizer
-PUNCTUATION = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~'
-PUNCTUATION_SPACE = PUNCTUATION + ' '
-STAR_TOKEN = '※'
-GROUPING_SEPARATOR = '✹'
-MAXIMUM_WORD_LENGTH = 40
-class LatticeTokenizer:
-    """Tokenizer for converting Lhotse Cut to LatticeGraph."""
-    def __init__(self, client_wrapper: SyncAPIClient):
-        self.client_wrapper = client_wrapper
-        self.words: List[str] = []
-        self.g2p_model: Any = None  # Placeholder for G2P model
-        self.dictionaries = defaultdict(lambda: [])
-        self.oov_word = '<unk>'
-    @staticmethod
-    def from_pretrained(
-        client_wrapper: SyncAPIClient,
-        model_path: str,
-        device: str = 'cpu',
-        compressed: bool = True,
-    ):
-        """Load tokenizer from exported binary file"""
-        from pathlib import Path
-        words_model_path = f'{model_path}/words.bin'
-        if compressed:
-            with gzip.open(words_model_path, 'rb') as f:
-                data = pickle.load(f)
-        else:
-            with open(words_model_path, 'rb') as f:
-                data = pickle.load(f)
-        tokenizer = LatticeTokenizer(client_wrapper=client_wrapper)
-        tokenizer.words = data['words']
-        tokenizer.dictionaries = defaultdict(list, data['dictionaries'])
-        tokenizer.oov_word = data['oov_word']
-        g2p_model_path = f'{model_path}/g2p.bin' if Path(f'{model_path}/g2p.bin').exists() else None
-        if g2p_model_path:
-            tokenizer.g2p_model = G2Phonemizer(g2p_model_path, device=device)
-        return tokenizer
-    def prenormalize(self, texts: List[str], language: Optional[str] = None) -> List[str]:
-        if not self.g2p_model:
-            raise ValueError('G2P model is not loaded, cannot prenormalize texts')
-        oov_words = []
-        for text in texts:
-            words = text.lower().replace('-', ' ').replace('—', ' ').replace('–', ' ').split()
-            oovs = [w for w in words if w not in self.words]
-            if oovs:
-                oov_words.extend([w for w in oovs if (w not in self.words and len(w) <= MAXIMUM_WORD_LENGTH)])
-        oov_words = list(set(oov_words))
-        if oov_words:
-            indexs = []
-            for k, _word in enumerate(oov_words):
-                if any(_word.startswith(p) and _word.endswith(q) for (p, q) in [('(', ')'), ('[', ']')]):
-                    self.dictionaries[_word] = self.dictionaries[self.oov_word]
-                else:
-                    _word = _word.strip(PUNCTUATION_SPACE)
-                    if not _word or _word in self.words:
-                        indexs.append(k)
-            for idx in sorted(indexs, reverse=True):
-                del oov_words[idx]
-            g2p_words = [w for w in oov_words if w not in self.dictionaries]
-            if g2p_words:
-                predictions = self.g2p_model(words=g2p_words, lang=language, batch_size=len(g2p_words), num_prons=4)
-                for _word, _predictions in zip(g2p_words, predictions):
-                    for pronuncation in _predictions:
-                        if pronuncation and pronuncation not in self.dictionaries[_word]:
-                            self.dictionaries[_word].append(pronuncation)
-            pronunciation_dictionaries: Dict[str, List[List[str]]] = {
-                w: self.dictionaries[w] for w in oov_words if self.dictionaries[w]
-            }
-            return pronunciation_dictionaries
-        return {}
-    def tokenize(self, supervisions: List[Supervision]) -> Tuple[str, Dict[str, Any]]:
-        pronunciation_dictionaries = self.prenormalize([s.text for s in supervisions])
-        response = self.client_wrapper.post(
-            'tokenize',
-            json={
-                'supervisions': [s.to_dict() for s in supervisions],
-                'pronunciation_dictionaries': pronunciation_dictionaries,
-            },
-        )
-        if response.status_code != 200:
-            raise Exception(f'Failed to tokenize texts: {response.text}')
-        result = response.json()
-        lattice_id = result['id']
-        return lattice_id, (result['lattice_graph'], result['final_state'], result.get('acoustic_scale', 1.0))
-    def detokenize(
-        self,
-        lattice_id: str,
-        lattice_results: Tuple[torch.Tensor, Any, Any, float, float],
-        # return_supervisions: bool = True,
-        # return_details: bool = False,
-    ) -> List[Supervision]:
-        emission, results, labels, frame_shift, offset, channel = lattice_results  # noqa: F841
-        response = self.client_wrapper.post(
-            'detokenize',
-            json={
-                'lattice_id': lattice_id,
-                'frame_shift': frame_shift,
-                'results': [t.to_dict() for t in results[0]],
-                'labels': labels[0],
-                'offset': offset,
-                'channel': channel,
-                'destroy_lattice': True,
-            },
-        )
-        if response.status_code != 200:
-            raise Exception(f'Failed to detokenize lattice: {response.text}')
-        result = response.json()
-        # if return_details:
-        #     raise NotImplementedError("return_details is not implemented yet")
-        return [Supervision.from_dict(s) for s in result['supervisions']]
-# Compute average score weighted by the span length
-def _score(spans):
-    if not spans:
-        return 0.0
-    # TokenSpan(token=token, start=start, end=end, score=scores[start:end].mean().item())
-    return round(sum(s.score * len(s) for s in spans) / sum(len(s) for s in spans), ndigits=4)
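
Note on the removed `_score` helper above: it computes an average score weighted by span length. The sketch below reproduces that arithmetic in a self-contained form. The `TokenSpan` dataclass is a stand-in inferred from the comment in the removed code (fields `token`, `start`, `end`, `score`), and its `__len__` returning `end - start` is an assumption; `weighted_score` is an illustrative name.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TokenSpan:
    """Stand-in span type; length is assumed to be the number of frames covered."""
    token: int
    start: int
    end: int
    score: float

    def __len__(self) -> int:
        return self.end - self.start


def weighted_score(spans: List[TokenSpan]) -> float:
    """Average of span scores, weighted by span length, rounded to 4 digits."""
    if not spans:
        return 0.0
    total_frames = sum(len(s) for s in spans)
    return round(sum(s.score * len(s) for s in spans) / total_frames, ndigits=4)


if __name__ == "__main__":
    spans = [TokenSpan(token=1, start=0, end=2, score=0.5),
             TokenSpan(token=2, start=2, end=6, score=1.0)]
    # (0.5 * 2 + 1.0 * 4) / 6
    print(weighted_score(spans))
```

As in the removed code, a non-empty list whose spans all have zero length would raise `ZeroDivisionError`; a caller that can produce such spans would need to guard against that case.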
