lattifai 1.0.5-py3-none-any.whl → 1.2.0-py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- lattifai/__init__.py +11 -12
- lattifai/alignment/lattice1_aligner.py +39 -7
- lattifai/alignment/lattice1_worker.py +135 -147
- lattifai/alignment/tokenizer.py +38 -22
- lattifai/audio2.py +1 -1
- lattifai/caption/caption.py +55 -19
- lattifai/cli/__init__.py +2 -0
- lattifai/cli/caption.py +1 -1
- lattifai/cli/diarization.py +110 -0
- lattifai/cli/transcribe.py +3 -1
- lattifai/cli/youtube.py +11 -0
- lattifai/client.py +32 -111
- lattifai/config/alignment.py +14 -0
- lattifai/config/client.py +5 -0
- lattifai/config/transcription.py +4 -0
- lattifai/diarization/lattifai.py +18 -7
- lattifai/mixin.py +26 -5
- lattifai/transcription/__init__.py +1 -1
- lattifai/transcription/base.py +21 -2
- lattifai/transcription/gemini.py +127 -1
- lattifai/transcription/lattifai.py +30 -2
- lattifai/utils.py +62 -69
- lattifai/workflow/youtube.py +55 -57
- {lattifai-1.0.5.dist-info → lattifai-1.2.0.dist-info}/METADATA +352 -56
- {lattifai-1.0.5.dist-info → lattifai-1.2.0.dist-info}/RECORD +29 -28
- {lattifai-1.0.5.dist-info → lattifai-1.2.0.dist-info}/entry_points.txt +2 -0
- {lattifai-1.0.5.dist-info → lattifai-1.2.0.dist-info}/WHEEL +0 -0
- {lattifai-1.0.5.dist-info → lattifai-1.2.0.dist-info}/licenses/LICENSE +0 -0
- {lattifai-1.0.5.dist-info → lattifai-1.2.0.dist-info}/top_level.txt +0 -0
{lattifai-1.0.5.dist-info → lattifai-1.2.0.dist-info}/METADATA

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: lattifai
-Version: 1.0.5
+Version: 1.2.0
 Summary: Lattifai Python SDK: Seamless Integration with Lattifai's Speech and Video AI Services
 Author-email: Lattifai Technologies <tech@lattifai.com>
 Maintainer-email: Lattice <tech@lattifai.com>
@@ -50,7 +50,8 @@ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
 Requires-Python: <3.15,>=3.10
 Description-Content-Type: text/markdown
 License-File: LICENSE
-Requires-Dist:
+Requires-Dist: k2py>=0.2.1
+Requires-Dist: lattifai-core>=0.6.0
 Requires-Dist: lattifai-run>=1.0.1
 Requires-Dist: python-dotenv
 Requires-Dist: lhotse>=1.26.0
@@ -61,11 +62,13 @@ Requires-Dist: tgt
 Requires-Dist: onnx>=1.16.0
 Requires-Dist: onnxruntime
 Requires-Dist: msgpack
+Requires-Dist: scipy!=1.16.3
 Requires-Dist: g2p-phonemizer>=0.4.0
 Requires-Dist: av
 Requires-Dist: wtpsplit>=2.1.7
+Requires-Dist: modelscope==1.33.0
 Requires-Dist: OmniSenseVoice>=0.4.2
-Requires-Dist: nemo_toolkit_asr[asr]>=2.7.
+Requires-Dist: nemo_toolkit_asr[asr]>=2.7.0rc4
 Requires-Dist: pyannote-audio-notorchdeps>=4.0.2
 Requires-Dist: questionary>=2.0
 Requires-Dist: yt-dlp
@@ -113,11 +116,9 @@ Dynamic: license-file
 
 Advanced forced alignment and subtitle generation powered by [ 🤗 Lattice-1](https://huggingface.co/Lattifai/Lattice-1) model.
 
-> **⚠️ Note on Current Limitations**:
-> 1. **Memory Usage**: We are aware of high memory consumption and are actively working on further optimizations.
-
 ## Table of Contents
 
+- [Core Capabilities](#core-capabilities)
 - [Installation](#installation)
 - [Quick Start](#quick-start)
 - [Command Line Interface](#command-line-interface)
@@ -134,16 +135,45 @@ Advanced forced alignment and subtitle generation powered by [ 🤗 Lattice-1](h
 - [YouTube Processing](#youtube-processing)
 - [Configuration Objects](#configuration-objects)
 - [Advanced Features](#advanced-features)
+- [Audio Preprocessing](#audio-preprocessing)
+- [Long-Form Audio Support](#long-form-audio-support)
 - [Word-Level Alignment](#word-level-alignment)
 - [Smart Sentence Splitting](#smart-sentence-splitting)
-- [Speaker Diarization](#speaker-diarization
+- [Speaker Diarization](#speaker-diarization)
 - [YAML Configuration Files](#yaml-configuration-files)
+- [Architecture Overview](#architecture-overview)
+- [Performance & Optimization](#performance--optimization)
 - [Supported Formats](#supported-formats)
+- [Supported Languages](#supported-languages)
 - [Roadmap](#roadmap)
 - [Development](#development)
 
 ---
 
+## Core Capabilities
+
+LattifAI provides comprehensive audio-text alignment powered by the Lattice-1 model:
+
+| Feature | Description | Status |
+|---------|-------------|--------|
+| **Forced Alignment** | Precise word-level and segment-level synchronization with audio | ✅ Production |
+| **Multi-Model Transcription** | Gemini (100+ languages), Parakeet (24 languages), SenseVoice (5 languages) | ✅ Production |
+| **Speaker Diarization** | Automatic multi-speaker identification with label preservation | ✅ Production |
+| **Audio Preprocessing** | Multi-channel selection, device optimization (CPU/CUDA/MPS) | ✅ Production |
+| **Streaming Mode** | Process audio up to 20 hours with minimal memory footprint | ✅ Production |
+| **Smart Text Processing** | Intelligent sentence splitting and non-speech element separation | ✅ Production |
+| **Universal Format Support** | 30+ caption/subtitle formats with text normalization | ✅ Production |
+| **Configuration System** | YAML-based configs for reproducible workflows | ✅ Production |
+
+**Key Highlights:**
+- 🎯 **Accuracy**: State-of-the-art alignment precision with Lattice-1 model
+- 🌍 **Multilingual**: Support for 100+ languages via multiple transcription models
+- 🚀 **Performance**: Hardware-accelerated processing with streaming support
+- 🔧 **Flexible**: CLI, Python SDK, and Web UI interfaces
+- 📦 **Production-Ready**: Battle-tested on diverse audio/video content
+
+---
+
 ## Installation
 
 ### Step 1: Install SDK
@@ -151,9 +181,6 @@ Advanced forced alignment and subtitle generation powered by [ 🤗 Lattice-1](h
 **Using pip:**
 ```bash
 
-pip install install-k2
-install-k2 --torch-version 2.9.1  # if not set will auto-detect PyTorch version and install compatible k2
-
 pip install lattifai
 ```
 
@@ -167,30 +194,11 @@ uv init my-project
 cd my-project
 source .venv/bin/activate
 
-# Install k2 (required dependency)
-uv pip install install-k2
-uv pip install pip
-uv run install-k2 --torch-version 2.9.1
-
 # Install LattifAI
 uv pip install lattifai
 ```
 
-> **Note**: `install-k2` automatically detects your PyTorch version (up to 2.9) and installs the compatible k2 wheel.
 
-<details>
-<summary><b>install-k2 options</b></summary>
-
-```
-usage: install-k2 [-h] [--system {linux,darwin,windows}] [--dry-run] [--torch-version TORCH_VERSION]
-
-optional arguments:
-  -h, help                         Show this help message and exit
-  --system {linux,darwin,windows}  Override OS detection
-  --dry-run                        Show what would be installed without making changes
-  --torch-version TORCH_VERSION    Specify torch version (e.g., 2.8.0)
-```
-</details>
 
 ### Step 2: Get Your API Key
 
@@ -254,7 +262,7 @@ caption = client.alignment(
 
 That's it! Your aligned subtitles are saved to `aligned.srt`.
 
-### Web Interface
+### 🚧 Web Interface
 
 
 
@@ -312,13 +320,9 @@ That's it! Your aligned subtitles are saved to `aligned.srt`.
 The web interface will automatically open in your browser at `http://localhost:5173`.
 
 **Features:**
-- ✅
-- ✅
-- ✅
-- ✅ Multiple subtitle format support
-- ✅ Built-in transcription with multiple models
-- ✅ API key management interface
-- ✅ Download aligned subtitles in various formats
+- ✅ **Drag-and-Drop Upload**: Visual file upload for audio/video and captions
+- ✅ **Real-Time Progress**: Live alignment progress with detailed status
+- ✅ **Multiple Transcription Models**: Gemini, Parakeet, SenseVoice selection
 
 ---
 
@@ -619,6 +623,78 @@ from lattifai import (
 
 ## Advanced Features
 
+### Audio Preprocessing
+
+LattifAI provides powerful audio preprocessing capabilities for optimal alignment:
+
+**Channel Selection**
+
+Control which audio channel to process for stereo/multi-channel files:
+
+```python
+from lattifai import LattifAI
+
+client = LattifAI()
+
+# Use left channel only
+caption = client.alignment(
+    input_media="stereo.wav",
+    input_caption="subtitle.srt",
+    channel_selector="left",  # Options: "left", "right", "average", or channel index (0, 1, 2, ...)
+)
+
+# Average all channels (default)
+caption = client.alignment(
+    input_media="stereo.wav",
+    input_caption="subtitle.srt",
+    channel_selector="average",
+)
+```
+
+**CLI Usage:**
+```bash
+# Use right channel
+lai alignment align audio.wav subtitle.srt output.srt \
+    media.channel_selector=right
+
+# Use specific channel index
+lai alignment align audio.wav subtitle.srt output.srt \
+    media.channel_selector=1
+```
+
+**Device Management**
+
+Optimize processing for your hardware:
+
+```python
+from lattifai import LattifAI, AlignmentConfig
+
+# Use CUDA GPU
+client = LattifAI(
+    alignment_config=AlignmentConfig(device="cuda")
+)
+
+# Use specific GPU
+client = LattifAI(
+    alignment_config=AlignmentConfig(device="cuda:0")
+)
+
+# Use Apple Silicon MPS
+client = LattifAI(
+    alignment_config=AlignmentConfig(device="mps")
+)
+
+# Use CPU
+client = LattifAI(
+    alignment_config=AlignmentConfig(device="cpu")
+)
+```
+
+**Supported Formats**
+- **Audio**: WAV, MP3, M4A, AAC, FLAC, OGG, OPUS, AIFF, and more
+- **Video**: MP4, MKV, MOV, WEBM, AVI, and more
+- All formats supported by FFmpeg are compatible
+
 ### Long-Form Audio Support
 
 LattifAI now supports processing long audio files (up to 20 hours) through streaming mode. Enable streaming by setting the `streaming_chunk_secs` parameter:
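The streaming example itself is unchanged context and therefore not shown in this hunk. Going by the call shown later in this diff's Performance & Optimization hunk, enabling streaming comes down to a single keyword argument on `client.alignment`; the file paths below are placeholders.

```python
from lattifai import LattifAI

client = LattifAI()

# Streaming mode: audio is processed in fixed-size chunks instead of being loaded whole.
# 600-second chunks match the value used in the Performance & Optimization section;
# smaller chunks lower peak RAM at a small cost in throughput.
caption = client.alignment(
    input_media="long_audio.wav",   # placeholder path
    input_caption="subtitle.srt",   # placeholder path
    streaming_chunk_secs=600.0,     # 10-minute chunks
)
```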
@@ -660,14 +736,18 @@ client = LattifAI(
 )
 ```
 
-**
-
-
-
-
-
-
-
+**Technical Details:**
+
+| Parameter | Description | Recommendation |
+|-----------|-------------|----------------|
+| **Default Value** | 600 seconds (10 minutes) | Good for most use cases |
+| **Memory Impact** | Lower chunks = less RAM usage | Adjust based on available RAM |
+| **Accuracy Impact** | Virtually zero degradation | Our precise implementation preserves quality |
+
+**Performance Characteristics:**
+- ✅ **Near-Perfect Accuracy**: Streaming implementation maintains alignment precision
+- 🚧 **Memory Efficient**: Process 20-hour audio with <10GB RAM (600-sec chunks)
+
 
 ### Word-Level Alignment
 
@@ -708,18 +788,50 @@ caption = client.alignment(
 )
 ```
 
-### Speaker Diarization
+### Speaker Diarization
+
+Speaker diarization automatically identifies and labels different speakers in audio using state-of-the-art models.
+
+**Core Capabilities:**
+- 🎤 **Multi-Speaker Detection**: Automatically detect speaker changes in audio
+- 🏷️ **Smart Labeling**: Assign speaker labels (SPEAKER_00, SPEAKER_01, etc.)
+- 🔄 **Label Preservation**: Maintain existing speaker names from input captions
+- 🤖 **Gemini Integration**: Extract speaker names intelligently during transcription
+
+**How It Works:**
+
+1. **Without Existing Labels**: System assigns generic labels (SPEAKER_00, SPEAKER_01)
+2. **With Existing Labels**: System preserves your speaker names during alignment
+   - Formats: `[Alice]`, `>> Bob:`, `SPEAKER_01:`, `Alice:` are all recognized
+3. **Gemini Transcription**: When using Gemini models, speaker names are extracted from context
+   - Example: "Hi, I'm Alice" → System labels as `Alice` instead of `SPEAKER_00`
+
+**Speaker Label Integration:**
+
+The diarization engine intelligently matches detected speakers with existing labels:
+- If input captions have speaker names → **Preserved during alignment**
+- If Gemini transcription provides names → **Used for labeling**
+- Otherwise → **Generic labels (SPEAKER_00, etc.) assigned**
+* 🚧 **Future Enhancement:**
+  - **AI-Powered Speaker Name Inference**: Upcoming feature will use large language models combined with metadata (video title, description, context) to intelligently infer speaker names, making transcripts more human-readable and contextually accurate
 
-**
+**CLI:**
+```bash
+# Enable speaker diarization during alignment
+lai alignment align audio.wav subtitle.srt output.srt \
+    diarization.enabled=true
 
-
-
-
-
+# With additional diarization settings
+lai alignment align audio.wav subtitle.srt output.srt \
+    diarization.enabled=true \
+    diarization.device=cuda \
+    diarization.min_speakers=2 \
+    diarization.max_speakers=4
 
-
-
-
+# For YouTube videos with diarization
+lai alignment youtube "https://youtube.com/watch?v=VIDEO_ID" \
+    diarization.enabled=true
+```
 
 **Python SDK:**
 ```python
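The Python SDK block that follows is unchanged context and not shown in this hunk. Below is a minimal sketch of the SDK-side equivalent of the CLI calls above, assuming `DiarizationConfig` exposes fields mirroring the dotted CLI keys (`enabled`, `device`, `min_speakers`, `max_speakers`) and that the client accepts it as `diarization_config`, following the pattern the README uses for the other config objects; those names and the per-segment attributes are assumptions, not API confirmed by this diff.

```python
from lattifai import LattifAI, DiarizationConfig  # import location assumed

# Assumption: constructor fields mirror the dotted keys used on the CLI above.
client = LattifAI(
    diarization_config=DiarizationConfig(
        enabled=True,
        device="cuda",
        min_speakers=2,
        max_speakers=4,
    )
)

caption = client.alignment(
    input_media="audio.wav",
    input_caption="subtitle.srt",
)

# The next hunk's context iterates over caption.supervisions; the start/end/speaker/text
# attribute names are assumed here purely for illustration.
for segment in caption.supervisions:
    print(segment.start, segment.end, segment.speaker, segment.text)
```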
@@ -742,6 +854,8 @@ for segment in caption.supervisions:
 
 ### YAML Configuration Files
 
+* **under development**
+
 Create reusable configuration files:
 
 ```yaml
@@ -758,6 +872,125 @@ lai alignment align audio.wav subtitle.srt output.srt \
 
 ---
 
+## Architecture Overview
+
+LattifAI uses a modular, config-driven architecture for maximum flexibility:
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│                       LattifAI Client                        │
+├─────────────────────────────────────────────────────────────┤
+│  Configuration Layer (Config-Driven)                         │
+│  ├── ClientConfig        (API settings)                      │
+│  ├── AlignmentConfig     (Model & device)                    │
+│  ├── CaptionConfig       (I/O formats)                       │
+│  ├── TranscriptionConfig (ASR models)                        │
+│  └── DiarizationConfig   (Speaker detection)                 │
+├─────────────────────────────────────────────────────────────┤
+│  Core Components                                             │
+│  ├── AudioLoader  → Load & preprocess audio                  │
+│  ├── Aligner      → Lattice-1 forced alignment               │
+│  ├── Transcriber  → Multi-model ASR                          │
+│  ├── Diarizer     → Speaker identification                   │
+│  └── Tokenizer    → Intelligent text segmentation            │
+├─────────────────────────────────────────────────────────────┤
+│  Data Flow                                                   │
+│  Input → AudioLoader → Aligner → Diarizer → Caption          │
+│                           ↓                                  │
+│                  Transcriber (optional)                      │
+└─────────────────────────────────────────────────────────────┘
+```
+
+**Component Responsibilities:**
+
+| Component | Purpose | Configuration |
+|-----------|---------|---------------|
+| **AudioLoader** | Load audio/video, channel selection, format conversion | `MediaConfig` |
+| **Aligner** | Forced alignment using Lattice-1 model | `AlignmentConfig` |
+| **Transcriber** | ASR with Gemini/Parakeet/SenseVoice | `TranscriptionConfig` |
+| **Diarizer** | Speaker diarization with pyannote.audio | `DiarizationConfig` |
+| **Tokenizer** | Sentence splitting and text normalization | `CaptionConfig` |
+| **Caption** | Unified data structure for alignments | `CaptionConfig` |
+
+**Data Flow:**
+
+1. **Audio Loading**: `AudioLoader` loads media, applies channel selection, converts to numpy array
+2. **Transcription** (optional): `Transcriber` generates transcript if no caption provided
+3. **Text Preprocessing**: `Tokenizer` splits sentences and normalizes text
+4. **Alignment**: `Aligner` uses Lattice-1 to compute word-level timestamps
+5. **Diarization** (optional): `Diarizer` identifies speakers and assigns labels
+6. **Output**: `Caption` object contains all results, exported to desired format
+
+**Configuration Philosophy:**
+- ✅ **Declarative**: Describe what you want, not how to do it
+- ✅ **Composable**: Mix and match configurations
+- ✅ **Reproducible**: Save configs to YAML for consistent results
+- ✅ **Flexible**: Override configs per-method or globally
+
+---
+
+## Performance & Optimization
+
+### Device Selection
+
+Choose the optimal device for your hardware:
+
+```python
+from lattifai import LattifAI, AlignmentConfig
+
+# NVIDIA GPU (recommended for speed)
+client = LattifAI(
+    alignment_config=AlignmentConfig(device="cuda")
+)
+
+# Apple Silicon GPU
+client = LattifAI(
+    alignment_config=AlignmentConfig(device="mps")
+)
+
+# CPU (maximum compatibility)
+client = LattifAI(
+    alignment_config=AlignmentConfig(device="cpu")
+)
+```
+
+**Performance Comparison** (30-minute audio):
+
+| Device | Time |
+|--------|------|
+| CUDA (RTX 4090) | ~18 sec |
+| MPS (M4) | ~26 sec |
+
+### Memory Management
+
+**Streaming Mode** for long audio:
+
+```python
+# Process 20-hour audio with <10GB RAM
+caption = client.alignment(
+    input_media="long_audio.wav",
+    input_caption="subtitle.srt",
+    streaming_chunk_secs=600.0,  # 10-minute chunks
+)
+```
+
+**Memory Usage** (approximate):
+
+| Chunk Size | Peak RAM | Suitable For |
+|------------|----------|-------------|
+| 600 sec | ~5 GB | Recommended |
+| No streaming | ~10 GB+ | Short audio only |
+
+### Optimization Tips
+
+1. **Use GPU when available**: 10x faster than CPU
+2. **WIP: Enable streaming for long audio**: Process 20+ hour files without OOM
+3. **Choose appropriate chunk size**: Balance memory vs. performance
+4. **Batch processing**: Process multiple files in sequence (coming soon)
+5. **Profile alignment**: Set `client.profile=True` to identify bottlenecks
+
+---
+
 ## Supported Formats
 
 LattifAI supports virtually all common media and subtitle formats:
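Before the unchanged formats table, here is a sketch of the composable configuration style described in the Architecture Overview above. It only combines objects this README documents elsewhere (`AlignmentConfig` from Device Selection, `TranscriptionConfig` from Language Selection); treating `profile` as a plain attribute follows Optimization Tip 5 but is otherwise an assumption.

```python
from lattifai import AlignmentConfig, LattifAI, TranscriptionConfig

# Compose independent config objects into one client ("Composable" in the philosophy list).
client = LattifAI(
    alignment_config=AlignmentConfig(device="mps"),   # as in Device Selection
    transcription_config=TranscriptionConfig(
        model_name="nvidia/parakeet-tdt-0.6b-v3",
        language="de",                                # as in Language Selection
    ),
)

# Optimization Tip 5: enable profiling to identify bottlenecks (attribute form assumed).
client.profile = True

caption = client.alignment(
    input_media="audio.wav",
    input_caption="subtitle.srt",
)
```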
@@ -778,14 +1011,77 @@ LattifAI supports virtually all common media and subtitle formats:
 
 ---
 
+## Supported Languages
+
+LattifAI supports multiple transcription models with different language capabilities:
+
+### Gemini Models (100+ Languages)
+
+**Models**: `gemini-2.5-pro`, `gemini-3-pro-preview`, `gemini-3-flash-preview`
+
+**Supported Languages**: English, Chinese (Mandarin & Cantonese), Spanish, French, German, Italian, Portuguese, Japanese, Korean, Arabic, Russian, Hindi, Bengali, Turkish, Dutch, Polish, Swedish, Danish, Norwegian, Finnish, Greek, Hebrew, Thai, Vietnamese, Indonesian, Malay, Filipino, Ukrainian, Czech, Romanian, Hungarian, Swahili, Tamil, Telugu, Marathi, Gujarati, Kannada, and 70+ more languages.
+
+> **Note**: Requires Gemini API key from [Google AI Studio](https://aistudio.google.com/apikey)
+
+### NVIDIA Parakeet (24 European Languages)
+
+**Model**: `nvidia/parakeet-tdt-0.6b-v3`
+
+**Supported Languages**:
+- **Western Europe**: English (en), French (fr), German (de), Spanish (es), Italian (it), Portuguese (pt), Dutch (nl)
+- **Nordic**: Danish (da), Swedish (sv), Norwegian (no), Finnish (fi)
+- **Eastern Europe**: Polish (pl), Czech (cs), Slovak (sk), Hungarian (hu), Romanian (ro), Bulgarian (bg), Ukrainian (uk), Russian (ru)
+- **Others**: Croatian (hr), Estonian (et), Latvian (lv), Lithuanian (lt), Slovenian (sl), Maltese (mt), Greek (el)
+
+### Alibaba SenseVoice (5 Asian Languages)
+
+**Model**: `iic/SenseVoiceSmall`
+
+**Supported Languages**:
+- Chinese/Mandarin (zh)
+- English (en)
+- Japanese (ja)
+- Korean (ko)
+- Cantonese (yue)
+
+### Language Selection
+
+```python
+from lattifai import LattifAI, TranscriptionConfig
+
+# Specify language for transcription
+client = LattifAI(
+    transcription_config=TranscriptionConfig(
+        model_name="nvidia/parakeet-tdt-0.6b-v3",
+        language="de",  # German
+    )
+)
+```
+
+**CLI Usage:**
+```bash
+lai transcribe run audio.wav output.srt \
+    transcription.model_name=nvidia/parakeet-tdt-0.6b-v3 \
+    transcription.language=de
+```
+
+> **Tip**: Use Gemini models for maximum language coverage, Parakeet for European languages, and SenseVoice for Asian languages.
+
+---
+
 ## Roadmap
 
 Visit our [LattifAI roadmap](https://lattifai.com/roadmap) for the latest updates.
 
-| Date | Release | Features |
+| Date | Model Release | Features |
 |------|---------|----------|
 | **Oct 2025** | **Lattice-1-Alpha** | ✅ English forced alignment<br>✅ Multi-format support<br>✅ CPU/GPU optimization |
-| **Nov 2025** | **Lattice-1** | ✅ English + Chinese + German<br>✅ Mixed languages alignment<br
+| **Nov 2025** | **Lattice-1** | ✅ English + Chinese + German<br>✅ Mixed languages alignment<br>✅ Speaker Diarization<br>✅ Multi-model transcription (Gemini, Parakeet, SenseVoice)<br>✅ Web interface with React<br>🚧 Advanced segmentation strategies (entire/transcription/hybrid)<br>🚧 Audio event detection ([MUSIC], [APPLAUSE], etc.)<br> |
+| **Q1 2026** | **Lattice-2** | ✅ Streaming mode for long audio<br>🔮 40+ languages support<br>🔮 Real-time alignment |
+
+
+
+**Legend**: ✅ Released | 🚧 In Development | 📋 Planned | 🔮 Future
 
 ---
 
{lattifai-1.0.5.dist-info → lattifai-1.2.0.dist-info}/RECORD

@@ -1,44 +1,45 @@
-lattifai/__init__.py,sha256=
-lattifai/audio2.py,sha256=
-lattifai/client.py,sha256=
+lattifai/__init__.py,sha256=l7dIodSCVMHUXQkd8BVGBoDdYojBCh_lyBWlVibynk8,2695
+lattifai/audio2.py,sha256=P3N8_BwiscbetzDbkbj-n8BcMu2vWD6-MvtQvGwWWf0,17448
+lattifai/client.py,sha256=Vqg4vY--6tox9Js0qGWlE7IGeHJVyQeYLTXYtlzPk3w,19020
 lattifai/errors.py,sha256=LyWRGVhQ6Ak2CYn9FBYAPRgQ7_VHpxzNsXI31HXD--s,11291
 lattifai/logging.py,sha256=MbUEeOUFlF92pA9v532DiPPWKl03S7UHCJ6Z652cf0w,2860
-lattifai/mixin.py,sha256=
+lattifai/mixin.py,sha256=wdgxEhgxR--dHXmeiJZ4AQDxEjKo49GLYQ0BXJw3qpk,25206
 lattifai/types.py,sha256=SjYBfwrCBOXlICvH04niFQJ7OzTx7oTaa_npfRkB67U,659
-lattifai/utils.py,sha256=
+lattifai/utils.py,sha256=cMiC5CY6gSMtcOtf_wK1BBMBEfHwc5R_S8_NIoVYk6I,5321
 lattifai/alignment/__init__.py,sha256=ehpkKfjNIYUx7_M-RWD_8Efcrzd9bE-NSm0QgMMVLW0,178
-lattifai/alignment/lattice1_aligner.py,sha256=
-lattifai/alignment/lattice1_worker.py,sha256=
+lattifai/alignment/lattice1_aligner.py,sha256=wm1BWNu4h-b507OAvLV0ITi7g0qaWthOPwvzWFHKyZQ,6251
+lattifai/alignment/lattice1_worker.py,sha256=ls2o3pVChB63OQrElJOmHzYIhCkjBFPt8EsLIVR1sJ0,11104
 lattifai/alignment/phonemizer.py,sha256=fbhN2DOl39lW4nQWKzyUUTMUabg7v61lB1kj8SKK-Sw,1761
 lattifai/alignment/segmenter.py,sha256=mzWEQC6hWZtI2mR2WU59W7qLHa7KXy7fdU6991kyUuQ,6276
-lattifai/alignment/tokenizer.py,sha256=
+lattifai/alignment/tokenizer.py,sha256=JY11uEe-v4KQLoHZuaHgdFqgxR3u_1D9ZXXMnB6hA-Q,22994
 lattifai/caption/__init__.py,sha256=6MM_2j6CaqwZ81LfSy4di2EP0ykvheRjMZKAYDx2rQs,477
-lattifai/caption/caption.py,sha256=
+lattifai/caption/caption.py,sha256=mZYobxuZ8tkJUkZMVvRTrNeGTdmIZYSXTEySQdaGQd8,54595
 lattifai/caption/gemini_reader.py,sha256=GqY2w78xGYCMDP5kD5WGS8jK0gntel2SK-EPpPKTrwU,15138
 lattifai/caption/gemini_writer.py,sha256=sYPxYEmVQcEan5WVGgSrcraxs3QJRQRh8CJkl2yUQ1s,6515
 lattifai/caption/supervision.py,sha256=DRrM8lfKU_x9aVBcLG6xnT0xIJrnc8jzHpzcSwQOg8c,905
 lattifai/caption/text_parser.py,sha256=XDb8KTt031uJ1hg6dpbINglGOTX-6pBcghbg3DULM1I,4633
-lattifai/cli/__init__.py,sha256=
+lattifai/cli/__init__.py,sha256=LafsAf8YfDcfTeJ1IevFcyLm-mNbxpOOnm33OFKtpDM,523
 lattifai/cli/alignment.py,sha256=06em-Uaf6NhSz1ce4dwT2r8n56NrtibR7ZsSkmc18Kc,5954
 lattifai/cli/app_installer.py,sha256=gAndH3Yo97fGRDe2CQnGtOgZZ4k3_v5ftcUo5g6xbSA,5884
-lattifai/cli/caption.py,sha256=
+lattifai/cli/caption.py,sha256=4qQ9DFhxcfaeFMY0TB5I42x4W_gOo2zY6kjXnHnFDms,6313
+lattifai/cli/diarization.py,sha256=GTd2vnTm6cJN6Q3mFP-ShY9bZBl1_zKzWFu-4HHcMzk,4075
 lattifai/cli/server.py,sha256=sXMfOSse9-V79slXUU8FDLeqtI5U9zeU-5YpjTIGyVw,1186
-lattifai/cli/transcribe.py,sha256=
-lattifai/cli/youtube.py,sha256
+lattifai/cli/transcribe.py,sha256=_vHzrdaGiPepQWATqvEDYDjwzfVLAd2i8RjOLkvdb0w,8218
+lattifai/cli/youtube.py,sha256=9M2dpcUCvT7vVbXJCIxJwe9klJXoF2jUeLxiatslYso,6063
 lattifai/config/__init__.py,sha256=Z8OudvS6fgfLNLu_2fvoXartQiYCECOnNfzDt-PfCN4,543
-lattifai/config/alignment.py,sha256=
+lattifai/config/alignment.py,sha256=vLiH150YWvBUiVkFOIO-nPXCB-b8fP9iSZgS79k1Qbg,4586
 lattifai/config/caption.py,sha256=AYOyUJ1xZsX8CBZy3GpLitbcCAHcZ9LwXui_v3vtuso,6786
-lattifai/config/client.py,sha256=
+lattifai/config/client.py,sha256=46b816MiYja3Uan_3wjnhtqDr0M6T-FqEygJ3e50IZc,1664
 lattifai/config/diarization.py,sha256=cIkwCfsYqfMns3i6tKWcwBBBkdnhhmB_Eo0TuOPCw9o,2484
 lattifai/config/media.py,sha256=cjM8eGeZ7ELhmy4cCqHAyogeHItaVqMrPzSwwIx79HY,14856
-lattifai/config/transcription.py,sha256=
+lattifai/config/transcription.py,sha256=_gPJD6cob_jWNdf841nBHhAqJGCxS6PfSyvx2W_vPcM,3082
 lattifai/diarization/__init__.py,sha256=MgBDQ1ehL2qDnZprEp8KqON7CmbG-qaP37gzBsV0jzk,119
-lattifai/diarization/lattifai.py,sha256=
+lattifai/diarization/lattifai.py,sha256=tCnFL6ywITqeKR8YoCsYvyJxNoIwoC6GsnI9zkXNB-Q,3128
 lattifai/server/app.py,sha256=wXYgXc_yGQACtUJdhkfhLsTOQjhhIhDQRiVRny7Ogcs,15455
-lattifai/transcription/__init__.py,sha256=
-lattifai/transcription/base.py,sha256=
-lattifai/transcription/gemini.py,sha256=
-lattifai/transcription/lattifai.py,sha256=
+lattifai/transcription/__init__.py,sha256=vMHciyCEPKhhfM3KjMCeDqnyxU1oghF8g5o5SvpnT_4,2669
+lattifai/transcription/base.py,sha256=v_b1_JGYiBqeMmwns0wHCJ7UOm6j9k-76Uzbr-qmzrs,4467
+lattifai/transcription/gemini.py,sha256=LJSQt9nGqQdEG6ZFXoHWltumyMEM7-Ezy8ss0iPJb7k,12414
+lattifai/transcription/lattifai.py,sha256=EKEdCafgdRWKw_084eD07BqGh2_D-qo3ig3H5X3XYGg,4621
 lattifai/transcription/prompts/README.md,sha256=X49KWSQVdjWxxWUp4R2w3ZqKrAOi6_kDNHh1hMaQ4PE,694
 lattifai/transcription/prompts/__init__.py,sha256=G9b42COaCYv3sPPNkHsGDLOMBuVGKt4mXGYal_BYtYQ,1351
 lattifai/transcription/prompts/gemini/README.md,sha256=rt7f7yDGtaobKBo95LG3u56mqa3ABOXQd0UVgJYtYuo,781
@@ -47,10 +48,10 @@ lattifai/workflow/__init__.py,sha256=GOT9jptXwpIMiNRqJ_LToEt_5Dt0k7XXbLkFzhrl31o
 lattifai/workflow/agents.py,sha256=yEOnxnhcTvr1iOhCorNvp8B76P6nQsLRXJCu_rCYFfM,38
 lattifai/workflow/base.py,sha256=8QoVIBZwJfr5mppJbtUFafHv5QR9lL-XrULjTWD0oBg,6257
 lattifai/workflow/file_manager.py,sha256=IUWW838ta83kfwM4gpW83gsD_Tx-pa-L_RWKjiefQbQ,33017
-lattifai/workflow/youtube.py,sha256=
-lattifai-1.0.
-lattifai-1.0.
-lattifai-1.0.
-lattifai-1.0.
-lattifai-1.0.
-lattifai-1.0.
+lattifai/workflow/youtube.py,sha256=0B1l_8gdz_O0cy2c9AY9wRPizESQrpRuCP4rwvWRxLA,23687
+lattifai-1.2.0.dist-info/licenses/LICENSE,sha256=xGMLmdFJy6Jkz3Hd0znyQLmcxC93FSZB5isKnEDMoQQ,1066
+lattifai-1.2.0.dist-info/METADATA,sha256=9iEaT3muzKIUmIvQ0oqg4DhM_CvZ53jHvk97kHfPNlQ,37399
+lattifai-1.2.0.dist-info/WHEEL,sha256=SmOxYU7pzNKBqASvQJ7DjX3XGUF92lrGhMb3R6_iiqI,91
+lattifai-1.2.0.dist-info/entry_points.txt,sha256=nHZri2VQkPYEl0tQ0dMYTpVGlCOgVWlDG_JtDR3QXF8,545
+lattifai-1.2.0.dist-info/top_level.txt,sha256=tHSoXF26r-IGfbIP_JoYATqbmf14h5NrnNJGH4j5reI,9
+lattifai-1.2.0.dist-info/RECORD,,
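Each RECORD row above has the form `path,sha256=<digest>,<size>`, where the digest is an unpadded urlsafe-base64 SHA-256 of the installed file and the last field is its size in bytes (the standard wheel RECORD convention). A small sketch for checking one entry locally; the installation path is a placeholder.

```python
import base64
import hashlib
from pathlib import Path


def record_entry(path: Path) -> tuple[str, int]:
    """Return (sha256=<urlsafe-b64 digest, unpadded>, size-in-bytes) in RECORD style."""
    data = path.read_bytes()
    digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=")
    return f"sha256={digest.decode('ascii')}", len(data)


# Placeholder path: point at the installed copy of a file named in a RECORD row,
# e.g. lattifai/errors.py, whose unchanged entry above reads
# "sha256=LyWRGVhQ6Ak2CYn9FBYAPRgQ7_VHpxzNsXI31HXD--s,11291".
digest, size = record_entry(Path("site-packages/lattifai/errors.py"))
print(digest, size)
```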
{lattifai-1.0.5.dist-info → lattifai-1.2.0.dist-info}/entry_points.txt

@@ -1,6 +1,7 @@
 [console_scripts]
 lai-align = lattifai.cli.alignment:main
 lai-app-install = lattifai.cli.app_installer:main
+lai-diarize = lattifai.cli.diarization:main
 lai-server = lattifai.cli.server:main
 lai-transcribe = lattifai.cli.transcribe:main
 lai-youtube = lattifai.cli.youtube:main
@@ -11,4 +12,5 @@ laicap-shift = lattifai.cli.caption:main_shift
 [lai_run.cli]
 alignment = lattifai.cli
 caption = lattifai.cli
+diarization = lattifai.cli
 transcribe = lattifai.cli
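The two added rows register the new diarization CLI both as a `lai-diarize` console script and under the `lai_run.cli` plugin group. After installing the wheel, the registrations can be confirmed with the standard library alone (Python 3.10+, matching the Requires-Python range above); this is a generic entry-point lookup, not a lattifai API.

```python
from importlib.metadata import entry_points

# Console scripts contributed by this package (lai-align, lai-diarize, lai-transcribe, ...).
for ep in entry_points(group="console_scripts"):
    if ep.value.startswith("lattifai."):
        print(f"{ep.name} -> {ep.value}")

# The lai_run.cli plugin group now also exposes "diarization".
for ep in entry_points(group="lai_run.cli"):
    print(f"lai_run.cli: {ep.name} -> {ep.value}")
```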
{lattifai-1.0.5.dist-info → lattifai-1.2.0.dist-info}/WHEEL: File without changes
{lattifai-1.0.5.dist-info → lattifai-1.2.0.dist-info}/licenses/LICENSE: File without changes
{lattifai-1.0.5.dist-info → lattifai-1.2.0.dist-info}/top_level.txt: File without changes