lattifai 1.2.1__py3-none-any.whl → 1.3.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (59)
  1. lattifai/__init__.py +20 -0
  2. lattifai/alignment/__init__.py +9 -1
  3. lattifai/alignment/lattice1_aligner.py +175 -54
  4. lattifai/alignment/lattice1_worker.py +47 -4
  5. lattifai/alignment/punctuation.py +38 -0
  6. lattifai/alignment/segmenter.py +3 -2
  7. lattifai/alignment/text_align.py +441 -0
  8. lattifai/alignment/tokenizer.py +134 -65
  9. lattifai/audio2.py +162 -183
  10. lattifai/cli/__init__.py +2 -1
  11. lattifai/cli/alignment.py +5 -0
  12. lattifai/cli/caption.py +111 -4
  13. lattifai/cli/transcribe.py +2 -6
  14. lattifai/cli/youtube.py +7 -1
  15. lattifai/client.py +72 -123
  16. lattifai/config/__init__.py +28 -0
  17. lattifai/config/alignment.py +14 -0
  18. lattifai/config/caption.py +45 -31
  19. lattifai/config/client.py +16 -0
  20. lattifai/config/event.py +102 -0
  21. lattifai/config/media.py +20 -0
  22. lattifai/config/transcription.py +25 -1
  23. lattifai/data/__init__.py +8 -0
  24. lattifai/data/caption.py +228 -0
  25. lattifai/diarization/__init__.py +41 -1
  26. lattifai/errors.py +78 -53
  27. lattifai/event/__init__.py +65 -0
  28. lattifai/event/lattifai.py +166 -0
  29. lattifai/mixin.py +49 -32
  30. lattifai/transcription/base.py +8 -2
  31. lattifai/transcription/gemini.py +147 -16
  32. lattifai/transcription/lattifai.py +25 -63
  33. lattifai/types.py +1 -1
  34. lattifai/utils.py +7 -13
  35. lattifai/workflow/__init__.py +28 -4
  36. lattifai/workflow/file_manager.py +2 -5
  37. lattifai/youtube/__init__.py +43 -0
  38. lattifai/youtube/client.py +1265 -0
  39. lattifai/youtube/types.py +23 -0
  40. lattifai-1.3.0.dist-info/METADATA +678 -0
  41. lattifai-1.3.0.dist-info/RECORD +57 -0
  42. {lattifai-1.2.1.dist-info → lattifai-1.3.0.dist-info}/entry_points.txt +1 -2
  43. lattifai/__init__.py +0 -88
  44. lattifai/alignment/sentence_splitter.py +0 -219
  45. lattifai/caption/__init__.py +0 -20
  46. lattifai/caption/caption.py +0 -1467
  47. lattifai/caption/gemini_reader.py +0 -462
  48. lattifai/caption/gemini_writer.py +0 -173
  49. lattifai/caption/supervision.py +0 -34
  50. lattifai/caption/text_parser.py +0 -145
  51. lattifai/cli/app_installer.py +0 -142
  52. lattifai/cli/server.py +0 -44
  53. lattifai/server/app.py +0 -427
  54. lattifai/workflow/youtube.py +0 -577
  55. lattifai-1.2.1.dist-info/METADATA +0 -1134
  56. lattifai-1.2.1.dist-info/RECORD +0 -58
  57. {lattifai-1.2.1.dist-info → lattifai-1.3.0.dist-info}/WHEEL +0 -0
  58. {lattifai-1.2.1.dist-info → lattifai-1.3.0.dist-info}/licenses/LICENSE +0 -0
  59. {lattifai-1.2.1.dist-info → lattifai-1.3.0.dist-info}/top_level.txt +0 -0
@@ -1,1134 +0,0 @@
- Metadata-Version: 2.4
- Name: lattifai
- Version: 1.2.1
- Summary: Lattifai Python SDK: Seamless Integration with Lattifai's Speech and Video AI Services
- Author-email: Lattifai Technologies <tech@lattifai.com>
- Maintainer-email: Lattice <tech@lattifai.com>
- License: MIT License
-
- Copyright (c) 2025 LattifAI.
-
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
-
- The above copyright notice and this permission notice shall be included in all
- copies or substantial portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- SOFTWARE.
-
- Project-URL: Homepage, https://github.com/lattifai/lattifai-python
- Project-URL: Documentation, https://github.com/lattifai/lattifai-python/blob/main/README.md
- Project-URL: Bug Tracker, https://github.com/lattifai/lattifai-python/issues
- Project-URL: Discussions, https://github.com/lattifai/lattifai-python/discussions
- Project-URL: Changelog, https://github.com/lattifai/lattifai-python/blob/main/CHANGELOG.md
- Keywords: lattifai,speech recognition,video analysis,ai,sdk,api client
- Classifier: Development Status :: 5 - Production/Stable
- Classifier: Intended Audience :: Developers
- Classifier: Intended Audience :: Science/Research
- Classifier: License :: OSI Approved :: Apache Software License
- Classifier: Programming Language :: Python :: 3.10
- Classifier: Programming Language :: Python :: 3.11
- Classifier: Programming Language :: Python :: 3.12
- Classifier: Programming Language :: Python :: 3.13
- Classifier: Programming Language :: Python :: 3.14
- Classifier: Operating System :: MacOS :: MacOS X
- Classifier: Operating System :: POSIX :: Linux
- Classifier: Operating System :: Microsoft :: Windows
- Classifier: Topic :: Multimedia :: Sound/Audio
- Classifier: Topic :: Multimedia :: Video
- Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
- Requires-Python: <3.15,>=3.10
- Description-Content-Type: text/markdown
- License-File: LICENSE
- Requires-Dist: k2py>=0.2.1
- Requires-Dist: lattifai-core>=0.6.0
- Requires-Dist: lattifai-run>=1.0.1
- Requires-Dist: python-dotenv
- Requires-Dist: lhotse>=1.26.0
- Requires-Dist: colorful>=0.5.6
- Requires-Dist: pysubs2
- Requires-Dist: praatio
- Requires-Dist: tgt
- Requires-Dist: onnx>=1.16.0
- Requires-Dist: onnxruntime
- Requires-Dist: msgpack
- Requires-Dist: scipy!=1.16.3
- Requires-Dist: g2p-phonemizer>=0.4.0
- Requires-Dist: av
- Requires-Dist: wtpsplit>=2.1.7
- Requires-Dist: modelscope==1.33.0
- Requires-Dist: error-align-fix>=0.1.2
- Requires-Dist: OmniSenseVoice>=0.4.2
- Requires-Dist: nemo_toolkit_asr[asr]>=2.7.0rc4
- Requires-Dist: pyannote-audio-notorchdeps>=4.0.2
- Requires-Dist: questionary>=2.0
- Requires-Dist: yt-dlp
- Requires-Dist: pycryptodome
- Requires-Dist: google-genai>=1.22.0
- Requires-Dist: fastapi>=0.111.0
- Requires-Dist: uvicorn>=0.30.0
- Requires-Dist: python-multipart>=0.0.9
- Requires-Dist: jinja2>=3.1.4
- Provides-Extra: numpy
- Requires-Dist: numpy; extra == "numpy"
- Provides-Extra: diarization
- Requires-Dist: torch-audiomentations==0.12.0; extra == "diarization"
- Requires-Dist: pyannote.audio>=4.0.2; extra == "diarization"
- Provides-Extra: transcription
- Requires-Dist: OmniSenseVoice>=0.4.0; extra == "transcription"
- Requires-Dist: nemo_toolkit_asr[asr]>=2.7.0rc3; extra == "transcription"
- Provides-Extra: test
- Requires-Dist: pytest; extra == "test"
- Requires-Dist: pytest-cov; extra == "test"
- Requires-Dist: pytest-asyncio; extra == "test"
- Requires-Dist: numpy; extra == "test"
- Provides-Extra: all
- Requires-Dist: numpy; extra == "all"
- Requires-Dist: pytest; extra == "all"
- Requires-Dist: pytest-cov; extra == "all"
- Requires-Dist: pytest-asyncio; extra == "all"
- Requires-Dist: pyannote.audio>=4.0.2; extra == "all"
- Dynamic: license-file
-
- <div align="center">
- <img src="https://raw.githubusercontent.com/lattifai/lattifai-python/main/assets/logo.png" width=256>
-
- [![PyPI version](https://badge.fury.io/py/lattifai.svg)](https://badge.fury.io/py/lattifai)
- [![Python Versions](https://img.shields.io/pypi/pyversions/lattifai.svg)](https://pypi.org/project/lattifai)
- [![PyPI Status](https://pepy.tech/badge/lattifai)](https://pepy.tech/project/lattifai)
- </div>
-
- <p align="center">
- 🌐 <a href="https://lattifai.com"><b>Official Website</b></a> &nbsp;&nbsp; | &nbsp;&nbsp; 🖥️ <a href="https://github.com/lattifai/lattifai-python">GitHub</a> &nbsp;&nbsp; | &nbsp;&nbsp; 🤗 <a href="https://huggingface.co/Lattifai/Lattice-1">Model</a> &nbsp;&nbsp; | &nbsp;&nbsp; 📑 <a href="https://lattifai.com/blogs">Blog</a> &nbsp;&nbsp; | &nbsp;&nbsp; <a href="https://discord.gg/kvF4WsBRK8"><img src="https://img.shields.io/badge/Discord-Join-5865F2?logo=discord&logoColor=white" alt="Discord" style="vertical-align: middle;"></a>
- </p>
-
- # LattifAI: Precision Alignment, Infinite Possibilities
-
- Advanced forced alignment and subtitle generation powered by the [🤗 Lattice-1](https://huggingface.co/Lattifai/Lattice-1) model.
-
- ## Table of Contents
-
- - [Core Capabilities](#core-capabilities)
- - [Installation](#installation)
- - [Quick Start](#quick-start)
-   - [Command Line Interface](#command-line-interface)
-   - [Python SDK (5 Lines of Code)](#python-sdk-5-lines-of-code)
-   - [Web Interface](#web-interface)
- - [CLI Reference](#cli-reference)
-   - [lai alignment align](#lai-alignment-align)
-   - [lai alignment youtube](#lai-alignment-youtube)
-   - [lai transcribe run](#lai-transcribe-run)
-   - [lai caption convert](#lai-caption-convert)
-   - [lai caption shift](#lai-caption-shift)
- - [Python SDK Reference](#python-sdk-reference)
-   - [Basic Alignment](#basic-alignment)
-   - [YouTube Processing](#youtube-processing)
-   - [Configuration Objects](#configuration-objects)
- - [Advanced Features](#advanced-features)
-   - [Audio Preprocessing](#audio-preprocessing)
-   - [Long-Form Audio Support](#long-form-audio-support)
-   - [Word-Level Alignment](#word-level-alignment)
-   - [Smart Sentence Splitting](#smart-sentence-splitting)
-   - [Speaker Diarization](#speaker-diarization)
-   - [YAML Configuration Files](#yaml-configuration-files)
- - [Architecture Overview](#architecture-overview)
- - [Performance & Optimization](#performance--optimization)
- - [Supported Formats](#supported-formats)
- - [Supported Languages](#supported-languages)
- - [Roadmap](#roadmap)
- - [Development](#development)
-
- ---
-
- ## Core Capabilities
-
- LattifAI provides comprehensive audio-text alignment powered by the Lattice-1 model:
-
- | Feature | Description | Status |
- |---------|-------------|--------|
- | **Forced Alignment** | Precise word-level and segment-level synchronization with audio | ✅ Production |
- | **Multi-Model Transcription** | Gemini (100+ languages), Parakeet (25 languages), SenseVoice (5 languages) | ✅ Production |
- | **Speaker Diarization** | Automatic multi-speaker identification with label preservation | ✅ Production |
- | **Audio Preprocessing** | Multi-channel selection, device optimization (CPU/CUDA/MPS) | ✅ Production |
- | **Streaming Mode** | Process audio up to 20 hours with minimal memory footprint | ✅ Production |
- | **Smart Text Processing** | Intelligent sentence splitting and non-speech element separation | ✅ Production |
- | **Universal Format Support** | 30+ caption/subtitle formats with text normalization | ✅ Production |
- | **Configuration System** | YAML-based configs for reproducible workflows | ✅ Production |
-
- **Key Highlights:**
- - 🎯 **Accuracy**: State-of-the-art alignment precision with the Lattice-1 model
- - 🌍 **Multilingual**: Support for 100+ languages via multiple transcription models
- - 🚀 **Performance**: Hardware-accelerated processing with streaming support
- - 🔧 **Flexible**: CLI, Python SDK, and Web UI interfaces
- - 📦 **Production-Ready**: Battle-tested on diverse audio/video content
-
- ---
-
- ## Installation
-
- ### Step 1: Install SDK
-
- **Using pip:**
- ```bash
- pip install lattifai
- ```
-
- **Using uv (Recommended - 10-100x faster):**
- ```bash
- # Install uv if you haven't already
- curl -LsSf https://astral.sh/uv/install.sh | sh
-
- # Create a new project with uv
- uv init my-project
- cd my-project
- uv venv  # create the .venv before activating it
- source .venv/bin/activate
-
- # Install LattifAI
- uv pip install lattifai
- ```
-
- ### Step 2: Get Your API Key
-
- **LattifAI API Key (Required)**
-
- Get your **free API key** at [https://lattifai.com/dashboard/api-keys](https://lattifai.com/dashboard/api-keys).
-
- **Option A: Environment variable (recommended)**
- ```bash
- export LATTIFAI_API_KEY="lf_your_api_key_here"
- ```
-
- **Option B: `.env` file**
- ```bash
- # .env
- LATTIFAI_API_KEY=lf_your_api_key_here
- ```
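-
- If you use a `.env` file, the key still reaches the SDK through the environment; it just has to be loaded first. A minimal sketch using `python-dotenv` (a declared dependency of this package); the explicit `load_dotenv()` call is an assumption for clarity, in case your entry point does not load `.env` automatically:
-
- ```python
- from dotenv import load_dotenv
-
- from lattifai import LattifAI
-
- load_dotenv()  # copies the values from .env into os.environ
-
- client = LattifAI()  # picks up LATTIFAI_API_KEY from the environment
- ```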
-
- **Gemini API Key (Optional - for transcription)**
-
- If you want to use Gemini models for transcription (e.g., `gemini-2.5-pro`), get your **free Gemini API key** at [https://aistudio.google.com/apikey](https://aistudio.google.com/apikey).
-
- ```bash
- # Add to environment variable
- export GEMINI_API_KEY="your_gemini_api_key_here"
-
- # Or add to .env file
- GEMINI_API_KEY=your_gemini_api_key_here # AIzaSyxxxx
- ```
-
- > **Note**: A Gemini API key is only required if you use Gemini models for transcription. It's not needed for alignment or when using other transcription models.
-
- ---
-
- ## Quick Start
-
- ### Command Line Interface
-
- ![CLI Demo](assets/cli.png)
-
- ```bash
- # Align local audio with subtitle
- lai alignment align audio.wav subtitle.srt output.srt
-
- # Download and align YouTube video
- lai alignment youtube "https://youtube.com/watch?v=VIDEO_ID"
- ```
-
- ### Python SDK (5 Lines of Code)
-
- ```python
- from lattifai import LattifAI
-
- client = LattifAI()
- caption = client.alignment(
-     input_media="audio.wav",
-     input_caption="subtitle.srt",
-     output_caption_path="aligned.srt",
- )
- ```
-
- That's it! Your aligned subtitles are saved to `aligned.srt`.
-
- ### 🚧 Web Interface
-
- ![Web Demo](assets/web.png)
-
- 1. **Install the web application (one-time setup):**
- ```bash
- lai-app-install
- ```
-
- This command will:
- - Check if Node.js/npm is installed (and install if needed)
- - Install frontend dependencies
- - Build the application
- - Set up the `lai-app` command globally
-
- 2. **Start the backend server:**
- ```bash
- lai-server
-
- # Custom port (default: 8001)
- lai-server --port 9000
-
- # Custom host
- lai-server --host 127.0.0.1 --port 9000
-
- # Production mode (disable auto-reload)
- lai-server --no-reload
- ```
-
- **Backend Server Options:**
- - `-p, --port` - Server port (default: 8001)
- - `--host` - Host address (default: 0.0.0.0)
- - `--no-reload` - Disable auto-reload for production
- - `-h, --help` - Show help message
-
- 3. **Start the frontend application:**
- ```bash
- lai-app
-
- # Custom port (default: 5173)
- lai-app --port 8080
-
- # Custom backend URL
- lai-app --backend http://localhost:9000
-
- # Don't auto-open browser
- lai-app --no-open
- ```
-
- **Frontend Application Options:**
- - `-p, --port` - Frontend server port (default: 5173)
- - `--backend` - Backend API URL (default: http://localhost:8001)
- - `--no-open` - Don't automatically open browser
- - `-h, --help` - Show help message
-
- The web interface will automatically open in your browser at `http://localhost:5173`.
-
- **Features:**
- - ✅ **Drag-and-Drop Upload**: Visual file upload for audio/video and captions
- - ✅ **Real-Time Progress**: Live alignment progress with detailed status
- - ✅ **Multiple Transcription Models**: Gemini, Parakeet, SenseVoice selection
-
- ---
-
- ## CLI Reference
-
- ### Command Overview
-
- | Command | Description |
- |---------|-------------|
- | `lai alignment align` | Align local audio/video with caption |
- | `lai alignment youtube` | Download & align YouTube content |
- | `lai transcribe run` | Transcribe audio/video or YouTube URL to caption |
- | `lai transcribe align` | Transcribe audio/video and align with generated transcript |
- | `lai caption convert` | Convert between caption formats |
- | `lai caption normalize` | Clean and normalize caption text |
- | `lai caption shift` | Shift caption timestamps |
-
- ### lai alignment align
-
- ```bash
- # Basic usage
- lai alignment align <audio> <caption> <output>
-
- # Examples
- lai alignment align audio.wav caption.srt output.srt
- lai alignment align video.mp4 caption.vtt output.srt alignment.device=cuda
- lai alignment align audio.wav caption.srt output.json \
-     caption.split_sentence=true \
-     caption.word_level=true
- ```
-
- ### lai alignment youtube
-
- ```bash
- # Basic usage
- lai alignment youtube <url>
-
- # Examples
- lai alignment youtube "https://youtube.com/watch?v=VIDEO_ID"
- lai alignment youtube "https://youtube.com/watch?v=VIDEO_ID" \
-     media.output_dir=~/Downloads \
-     caption.output_path=aligned.srt \
-     caption.split_sentence=true
- ```
-
- ### lai transcribe run
-
- Perform automatic speech recognition (ASR) on audio/video files or YouTube URLs to generate timestamped transcriptions.
-
- ```bash
- # Basic usage - local file
- lai transcribe run <input> <output>
-
- # Basic usage - YouTube URL
- lai transcribe run <url> <output_dir>
-
- # Examples - Local files
- lai transcribe run audio.wav output.srt
- lai transcribe run audio.mp4 output.ass \
-     transcription.model_name=nvidia/parakeet-tdt-0.6b-v3
-
- # Examples - YouTube URLs
- lai transcribe run "https://youtube.com/watch?v=VIDEO_ID" output_dir=./output
- lai transcribe run "https://youtube.com/watch?v=VIDEO_ID" output.ass output_dir=./output \
-     transcription.model_name=gemini-2.5-pro \
-     transcription.gemini_api_key=YOUR_GEMINI_API_KEY
-
- # Full configuration with keyword arguments
- lai transcribe run \
-     input=audio.wav \
-     output_caption=output.srt \
-     channel_selector=average \
-     transcription.device=cuda \
-     transcription.model_name=iic/SenseVoiceSmall
- ```
-
- **Parameters:**
- - `input`: Path to audio/video file or YouTube URL (required)
- - `output_caption`: Path for output caption file (for local files)
- - `output_dir`: Directory for output files (for YouTube URLs, defaults to current directory)
- - `media_format`: Media format for YouTube downloads (default: mp3)
- - `channel_selector`: Audio channel selection - "average", "left", "right", or channel index (default: "average")
-   - Note: Ignored when transcribing YouTube URLs with Gemini models
- - `transcription`: Transcription configuration (model_name, device, language, gemini_api_key)
-
- **Supported Transcription Models (More Coming Soon):**
- - `gemini-2.5-pro` - Google Gemini API (requires API key)
-   - Languages: 100+ languages including English, Chinese, Spanish, French, German, Japanese, Korean, Arabic, and more
- - `gemini-3-pro-preview` - Google Gemini API (requires API key)
-   - Languages: 100+ languages (same as gemini-2.5-pro)
- - `nvidia/parakeet-tdt-0.6b-v3` - NVIDIA Parakeet model
-   - Languages: Bulgarian (bg), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Greek (el), Hungarian (hu), Italian (it), Latvian (lv), Lithuanian (lt), Maltese (mt), Polish (pl), Portuguese (pt), Romanian (ro), Slovak (sk), Slovenian (sl), Spanish (es), Swedish (sv), Russian (ru), Ukrainian (uk)
- - `iic/SenseVoiceSmall` - Alibaba SenseVoice model
-   - Languages: Chinese/Mandarin (zh), English (en), Japanese (ja), Korean (ko), Cantonese (yue)
- - More models will be integrated in future releases
-
- **Note:** For transcription with alignment on local files, use `lai transcribe align` instead.
-
- ### lai transcribe align
-
- Transcribe an audio/video file and automatically align the generated transcript with the audio.
-
- This command combines transcription and alignment in a single step, producing precisely aligned captions.
-
- ```bash
- # Basic usage
- lai transcribe align <input_media> <output_caption>
-
- # Examples
- lai transcribe align audio.wav output.srt
- lai transcribe align audio.mp4 output.ass \
-     transcription.model_name=nvidia/parakeet-tdt-0.6b-v3 \
-     alignment.device=cuda
-
- # Using Gemini transcription with alignment
- lai transcribe align audio.wav output.srt \
-     transcription.model_name=gemini-2.5-pro \
-     transcription.gemini_api_key=YOUR_KEY \
-     caption.split_sentence=true
-
- # Full configuration
- lai transcribe align \
-     input_media=audio.wav \
-     output_caption=output.srt \
-     transcription.device=mps \
-     transcription.model_name=iic/SenseVoiceSmall \
-     alignment.device=cuda \
-     caption.word_level=true
- ```
-
- **Parameters:**
- - `input_media`: Path to input audio/video file (required)
- - `output_caption`: Path for output aligned caption file (required)
- - `transcription`: Transcription configuration (model_name, device, language, gemini_api_key)
- - `alignment`: Alignment configuration (model_name, device)
- - `caption`: Caption formatting options (split_sentence, word_level, etc.)
-
- ### lai caption convert
-
- ```bash
- lai caption convert input.srt output.vtt
- lai caption convert input.srt output.json
- # Enable normalization to clean HTML entities and special characters:
- lai caption convert input.srt output.json normalize_text=true
- ```
-
- ### lai caption shift
-
- ```bash
- lai caption shift input.srt output.srt 2.0   # Delay by 2 seconds
- lai caption shift input.srt output.srt -1.5  # Advance by 1.5 seconds
- ```
-
- ---
-
- ## Python SDK Reference
-
- ### Basic Alignment
-
- ```python
- from lattifai import LattifAI
-
- # Initialize client (uses LATTIFAI_API_KEY from environment)
- client = LattifAI()
-
- # Align audio/video with subtitle
- caption = client.alignment(
-     input_media="audio.wav",           # Audio or video file
-     input_caption="subtitle.srt",      # Input subtitle file
-     output_caption_path="output.srt",  # Output aligned subtitle
-     split_sentence=True,               # Enable smart sentence splitting
- )
-
- # Access alignment results
- for segment in caption.supervisions:
-     print(f"{segment.start:.2f}s - {segment.end:.2f}s: {segment.text}")
- ```
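-
- The `start` and `end` values are plain seconds. If you need SRT-style timestamps for custom reporting, a small helper is enough; this sketch relies only on the `supervisions` fields shown above:
-
- ```python
- def to_srt_time(seconds: float) -> str:
-     """Format seconds as an SRT-style HH:MM:SS,mmm timestamp."""
-     ms = round(seconds * 1000)
-     hours, ms = divmod(ms, 3_600_000)
-     minutes, ms = divmod(ms, 60_000)
-     secs, ms = divmod(ms, 1_000)
-     return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"
-
- for index, segment in enumerate(caption.supervisions, start=1):
-     print(index)
-     print(f"{to_srt_time(segment.start)} --> {to_srt_time(segment.end)}")
-     print(segment.text + "\n")
- ```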
-
- ### YouTube Processing
-
- ```python
- from lattifai import LattifAI
-
- client = LattifAI()
-
- # Download YouTube video and align with auto-downloaded subtitles
- caption = client.youtube(
-     url="https://youtube.com/watch?v=VIDEO_ID",
-     output_dir="./downloads",
-     output_caption_path="aligned.srt",
-     split_sentence=True,
- )
- ```
-
- ### Configuration Objects
-
- LattifAI uses a config-driven architecture for fine-grained control:
-
- #### ClientConfig - API Settings
-
- ```python
- from lattifai import LattifAI, ClientConfig
-
- client = LattifAI(
-     client_config=ClientConfig(
-         api_key="lf_your_api_key",  # Or use LATTIFAI_API_KEY env var
-         timeout=30.0,
-         max_retries=3,
-     )
- )
- ```
-
- #### AlignmentConfig - Model Settings
-
- ```python
- from lattifai import LattifAI, AlignmentConfig
-
- client = LattifAI(
-     alignment_config=AlignmentConfig(
-         model_name="Lattifai/Lattice-1",
-         device="cuda",  # "cpu", "cuda", "cuda:0", "mps"
-     )
- )
- ```
-
- #### CaptionConfig - Subtitle Settings
-
- ```python
- from lattifai import LattifAI, CaptionConfig
-
- client = LattifAI(
-     caption_config=CaptionConfig(
-         split_sentence=True,            # Smart sentence splitting (default: False)
-         word_level=True,                # Word-level timestamps (default: False)
-         normalize_text=True,            # Clean HTML entities (default: True)
-         include_speaker_in_text=False,  # Include speaker labels (default: True)
-     )
- )
- ```
-
- #### Complete Configuration Example
-
- ```python
- from lattifai import (
-     LattifAI,
-     ClientConfig,
-     AlignmentConfig,
-     CaptionConfig,
- )
-
- client = LattifAI(
-     client_config=ClientConfig(
-         api_key="lf_your_api_key",
-         timeout=60.0,
-     ),
-     alignment_config=AlignmentConfig(
-         model_name="Lattifai/Lattice-1",
-         device="cuda",
-     ),
-     caption_config=CaptionConfig(
-         split_sentence=True,
-         word_level=True,
-         output_format="json",
-     ),
- )
-
- caption = client.alignment(
-     input_media="audio.wav",
-     input_caption="subtitle.srt",
-     output_caption_path="output.json",
- )
- ```
-
- ### Available Exports
-
- ```python
- from lattifai import (
-     # Client classes
-     LattifAI,
-     # AsyncLattifAI,  # For async support
-
-     # Config classes
-     ClientConfig,
-     AlignmentConfig,
-     CaptionConfig,
-     DiarizationConfig,
-     MediaConfig,
-
-     # I/O classes
-     Caption,
- )
- ```
-
- ---
-
- ## Advanced Features
-
- ### Audio Preprocessing
-
- LattifAI provides powerful audio preprocessing capabilities for optimal alignment:
-
- **Channel Selection**
-
- Control which audio channel to process for stereo/multi-channel files:
-
- ```python
- from lattifai import LattifAI
-
- client = LattifAI()
-
- # Use left channel only
- caption = client.alignment(
-     input_media="stereo.wav",
-     input_caption="subtitle.srt",
-     channel_selector="left",  # Options: "left", "right", "average", or channel index (0, 1, 2, ...)
- )
-
- # Average all channels (default)
- caption = client.alignment(
-     input_media="stereo.wav",
-     input_caption="subtitle.srt",
-     channel_selector="average",
- )
- ```
-
- **CLI Usage:**
- ```bash
- # Use right channel
- lai alignment align audio.wav subtitle.srt output.srt \
-     media.channel_selector=right
-
- # Use specific channel index
- lai alignment align audio.wav subtitle.srt output.srt \
-     media.channel_selector=1
- ```
-
- **Device Management**
-
- Optimize processing for your hardware:
-
- ```python
- from lattifai import LattifAI, AlignmentConfig
-
- # Use CUDA GPU
- client = LattifAI(
-     alignment_config=AlignmentConfig(device="cuda")
- )
-
- # Use specific GPU
- client = LattifAI(
-     alignment_config=AlignmentConfig(device="cuda:0")
- )
-
- # Use Apple Silicon MPS
- client = LattifAI(
-     alignment_config=AlignmentConfig(device="mps")
- )
-
- # Use CPU
- client = LattifAI(
-     alignment_config=AlignmentConfig(device="cpu")
- )
- ```
-
- **Supported Formats**
- - **Audio**: WAV, MP3, M4A, AAC, FLAC, OGG, OPUS, AIFF, and more
- - **Video**: MP4, MKV, MOV, WEBM, AVI, and more
- - All formats supported by FFmpeg are compatible
-
- ### Long-Form Audio Support
-
- LattifAI now supports processing long audio files (up to 20 hours) through streaming mode. Enable streaming by setting the `streaming_chunk_secs` parameter:
-
- **Python SDK:**
- ```python
- from lattifai import LattifAI
-
- client = LattifAI()
-
- # Enable streaming for long audio files
- caption = client.alignment(
-     input_media="long_audio.wav",
-     input_caption="subtitle.srt",
-     output_caption_path="output.srt",
-     streaming_chunk_secs=600.0,  # Process in 10-minute chunks
- )
- ```
-
- **CLI:**
- ```bash
- # Enable streaming with chunk size
- lai alignment align long_audio.wav subtitle.srt output.srt \
-     media.streaming_chunk_secs=300.0
-
- # For YouTube videos
- lai alignment youtube "https://youtube.com/watch?v=VIDEO_ID" \
-     media.streaming_chunk_secs=300.0
- ```
-
- **MediaConfig:**
- ```python
- from lattifai import LattifAI, MediaConfig
-
- client = LattifAI(
-     media_config=MediaConfig(
-         streaming_chunk_secs=600.0,  # Chunk duration in seconds (1-1800), default: 600 (10 minutes)
-     )
- )
- ```
-
- **Technical Details:**
-
- | Parameter | Description | Recommendation |
- |-----------|-------------|----------------|
- | **Default Value** | 600 seconds (10 minutes) | Good for most use cases |
- | **Memory Impact** | Smaller chunks = lower RAM usage | Adjust based on available RAM |
- | **Accuracy Impact** | Virtually zero degradation | Our precise implementation preserves quality |
-
- **Performance Characteristics:**
- - ✅ **Near-Perfect Accuracy**: Streaming implementation maintains alignment precision
- - 🚧 **Memory Efficient**: Process 20-hour audio with <10GB RAM (600-sec chunks)
-
- ### Word-Level Alignment
-
- Enable `word_level=True` to get precise timestamps for each word:
-
- ```python
- from lattifai import LattifAI, CaptionConfig
-
- client = LattifAI(
-     caption_config=CaptionConfig(word_level=True)
- )
-
- caption = client.alignment(
-     input_media="audio.wav",
-     input_caption="subtitle.srt",
-     output_caption_path="output.json",  # JSON preserves word-level data
- )
-
- # Access word-level alignments
- for segment in caption.alignments:
-     if segment.alignment and "word" in segment.alignment:
-         for word_item in segment.alignment["word"]:
-             print(f"{word_item.start:.2f}s: {word_item.symbol} (confidence: {word_item.score:.2f})")
- ```
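-
- Building on the attributes shown above (`start`, `symbol`, `score`), the per-segment word items can be flattened into one chronological list, e.g. to drive a karaoke-style renderer or to flag uncertain words. A sketch against the same `caption` object; the 0.5 threshold is an arbitrary example value:
-
- ```python
- words = []
- for segment in caption.alignments:
-     if segment.alignment and "word" in segment.alignment:
-         words.extend(segment.alignment["word"])
-
- words.sort(key=lambda w: w.start)  # chronological order across segments
- uncertain = [w for w in words if w.score < 0.5]  # flag low-confidence words
- print(f"{len(words)} words aligned, {len(uncertain)} below 0.5 confidence")
- ```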
-
- ### Smart Sentence Splitting
-
- The `split_sentence` option intelligently separates:
- - Non-speech elements (`[APPLAUSE]`, `[MUSIC]`) from dialogue
- - Multiple sentences within a single subtitle
- - Speaker labels from content
-
- ```python
- caption = client.alignment(
-     input_media="audio.wav",
-     input_caption="subtitle.srt",
-     split_sentence=True,
- )
- ```
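-
- To see what splitting changes, align the same input with the option off and on and compare the resulting segments. The exact segmentation depends on the model and input, so treat this as an illustrative sketch:
-
- ```python
- raw = client.alignment(
-     input_media="audio.wav",
-     input_caption="subtitle.srt",
-     split_sentence=False,
- )
- split = client.alignment(
-     input_media="audio.wav",
-     input_caption="subtitle.srt",
-     split_sentence=True,
- )
-
- # A cue like "[APPLAUSE] Thank you. Please sit." would typically become
- # three segments: "[APPLAUSE]", "Thank you.", "Please sit."
- print(len(raw.supervisions), "->", len(split.supervisions), "segments")
- ```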
-
- ### Speaker Diarization
-
- Speaker diarization automatically identifies and labels different speakers in audio using state-of-the-art models.
-
- **Core Capabilities:**
- - 🎤 **Multi-Speaker Detection**: Automatically detect speaker changes in audio
- - 🏷️ **Smart Labeling**: Assign speaker labels (SPEAKER_00, SPEAKER_01, etc.)
- - 🔄 **Label Preservation**: Maintain existing speaker names from input captions
- - 🤖 **Gemini Integration**: Extract speaker names intelligently during transcription
-
- **How It Works:**
-
- 1. **Without Existing Labels**: System assigns generic labels (SPEAKER_00, SPEAKER_01)
- 2. **With Existing Labels**: System preserves your speaker names during alignment
-    - Formats: `[Alice]`, `>> Bob:`, `SPEAKER_01:`, `Alice:` are all recognized
- 3. **Gemini Transcription**: When using Gemini models, speaker names are extracted from context
-    - Example: "Hi, I'm Alice" → System labels as `Alice` instead of `SPEAKER_00`
-
- **Speaker Label Integration:**
-
- The diarization engine intelligently matches detected speakers with existing labels:
- - If input captions have speaker names → **Preserved during alignment**
- - If Gemini transcription provides names → **Used for labeling**
- - Otherwise → **Generic labels (SPEAKER_00, etc.) assigned**
- - 🚧 **Future Enhancement** (AI-Powered Speaker Name Inference): an upcoming feature will use large language models combined with metadata (video title, description, context) to infer speaker names, making transcripts more human-readable and contextually accurate
-
- **CLI:**
- ```bash
- # Enable speaker diarization during alignment
- lai alignment align audio.wav subtitle.srt output.srt \
-     diarization.enabled=true
-
- # With additional diarization settings
- lai alignment align audio.wav subtitle.srt output.srt \
-     diarization.enabled=true \
-     diarization.device=cuda \
-     diarization.min_speakers=2 \
-     diarization.max_speakers=4
-
- # For YouTube videos with diarization
- lai alignment youtube "https://youtube.com/watch?v=VIDEO_ID" \
-     diarization.enabled=true
- ```
-
- **Python SDK:**
- ```python
- from lattifai import LattifAI, DiarizationConfig
-
- client = LattifAI(
-     diarization_config=DiarizationConfig(enabled=True)
- )
-
- caption = client.alignment(
-     input_media="audio.wav",
-     input_caption="subtitle.srt",
-     output_caption_path="output.srt",
- )
-
- # Access speaker information
- for segment in caption.supervisions:
-     print(f"[{segment.speaker}] {segment.text}")
- ```
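-
- To turn the same result into a per-speaker transcript, group the supervisions by their `speaker` field; a minimal sketch using only the fields shown above:
-
- ```python
- from collections import defaultdict
-
- by_speaker = defaultdict(list)
- for segment in caption.supervisions:
-     by_speaker[segment.speaker].append(segment.text)
-
- for speaker, lines in by_speaker.items():
-     print(f"=== {speaker} ===")
-     print(" ".join(lines))
- ```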
-
- ### YAML Configuration Files
-
- 🚧 **Under development**
-
- Create reusable configuration files:
-
- ```yaml
- # config/alignment.yaml
- model_name: "Lattifai/Lattice-1"
- device: "cuda"
- batch_size: 1
- ```
-
- ```bash
- lai alignment align audio.wav subtitle.srt output.srt \
-     alignment=config/alignment.yaml
- ```
-
- ---
-
- ## Architecture Overview
-
- LattifAI uses a modular, config-driven architecture for maximum flexibility:
-
- ```
- ┌─────────────────────────────────────────────────────────────┐
- │                       LattifAI Client                       │
- ├─────────────────────────────────────────────────────────────┤
- │ Configuration Layer (Config-Driven)                         │
- │   ├── ClientConfig        (API settings)                    │
- │   ├── AlignmentConfig     (Model & device)                  │
- │   ├── CaptionConfig       (I/O formats)                     │
- │   ├── TranscriptionConfig (ASR models)                      │
- │   └── DiarizationConfig   (Speaker detection)               │
- ├─────────────────────────────────────────────────────────────┤
- │ Core Components                                             │
- │   ├── AudioLoader  → Load & preprocess audio                │
- │   ├── Aligner      → Lattice-1 forced alignment             │
- │   ├── Transcriber  → Multi-model ASR                        │
- │   ├── Diarizer     → Speaker identification                 │
- │   └── Tokenizer    → Intelligent text segmentation          │
- ├─────────────────────────────────────────────────────────────┤
- │ Data Flow                                                   │
- │   Input → AudioLoader → Aligner → Diarizer → Caption        │
- │                              ↓                              │
- │                        Transcriber (optional)               │
- └─────────────────────────────────────────────────────────────┘
- ```
-
- **Component Responsibilities:**
-
- | Component | Purpose | Configuration |
- |-----------|---------|---------------|
- | **AudioLoader** | Load audio/video, channel selection, format conversion | `MediaConfig` |
- | **Aligner** | Forced alignment using Lattice-1 model | `AlignmentConfig` |
- | **Transcriber** | ASR with Gemini/Parakeet/SenseVoice | `TranscriptionConfig` |
- | **Diarizer** | Speaker diarization with pyannote.audio | `DiarizationConfig` |
- | **Tokenizer** | Sentence splitting and text normalization | `CaptionConfig` |
- | **Caption** | Unified data structure for alignments | `CaptionConfig` |
-
- **Data Flow:**
-
- 1. **Audio Loading**: `AudioLoader` loads media, applies channel selection, converts to numpy array
- 2. **Transcription** (optional): `Transcriber` generates transcript if no caption provided
- 3. **Text Preprocessing**: `Tokenizer` splits sentences and normalizes text
- 4. **Alignment**: `Aligner` uses Lattice-1 to compute word-level timestamps
- 5. **Diarization** (optional): `Diarizer` identifies speakers and assigns labels
- 6. **Output**: `Caption` object contains all results, exported to desired format
-
- **Configuration Philosophy:**
- - ✅ **Declarative**: Describe what you want, not how to do it
- - ✅ **Composable**: Mix and match configurations
- - ✅ **Reproducible**: Save configs to YAML for consistent results
- - ✅ **Flexible**: Override configs per-method or globally (see the sketch below)
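-
- For example, you can set conservative defaults globally and override individual options per call. The keyword overrides shown here (`split_sentence`, `channel_selector`, `streaming_chunk_secs`) are the per-call options documented earlier in this README; the file names are placeholders:
-
- ```python
- from lattifai import LattifAI, CaptionConfig
-
- # Global default: no sentence splitting
- client = LattifAI(caption_config=CaptionConfig(split_sentence=False))
-
- # Per-call overrides for one long, stereo recording
- caption = client.alignment(
-     input_media="interview.wav",
-     input_caption="interview.srt",
-     output_caption_path="interview_aligned.srt",
-     split_sentence=True,         # override the global default
-     channel_selector="left",     # use the left channel only
-     streaming_chunk_secs=600.0,  # stream in 10-minute chunks
- )
- ```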
-
- ---
-
- ## Performance & Optimization
-
- ### Device Selection
-
- Choose the optimal device for your hardware:
-
- ```python
- from lattifai import LattifAI, AlignmentConfig
-
- # NVIDIA GPU (recommended for speed)
- client = LattifAI(
-     alignment_config=AlignmentConfig(device="cuda")
- )
-
- # Apple Silicon GPU
- client = LattifAI(
-     alignment_config=AlignmentConfig(device="mps")
- )
-
- # CPU (maximum compatibility)
- client = LattifAI(
-     alignment_config=AlignmentConfig(device="cpu")
- )
- ```
-
- **Performance Comparison** (30-minute audio):
-
- | Device | Time |
- |--------|------|
- | CUDA (RTX 4090) | ~18 sec |
- | MPS (M4) | ~26 sec |
-
- ### Memory Management
-
- **Streaming Mode** for long audio:
-
- ```python
- # Process 20-hour audio with <10GB RAM
- caption = client.alignment(
-     input_media="long_audio.wav",
-     input_caption="subtitle.srt",
-     streaming_chunk_secs=600.0,  # 10-minute chunks
- )
- ```
-
- **Memory Usage** (approximate):
-
- | Chunk Size | Peak RAM | Suitable For |
- |------------|----------|--------------|
- | 600 sec | ~5 GB | Recommended |
- | No streaming | ~10 GB+ | Short audio only |
-
- ### Optimization Tips
-
- 1. **Use GPU when available**: 10x faster than CPU
- 2. **🚧 Enable streaming for long audio**: Process 20+ hour files without OOM
- 3. **Choose appropriate chunk size**: Balance memory vs. performance
- 4. **Batch processing**: Process multiple files in sequence (built-in batching coming soon; a manual loop is sketched below)
- 5. **Profile alignment**: Set `client.profile=True` to identify bottlenecks
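-
- Until built-in batching lands, a plain sequential loop over one shared client already works and avoids reloading the model per file. A sketch; the `episodes/` layout and file naming are hypothetical:
-
- ```python
- from pathlib import Path
-
- from lattifai import LattifAI
-
- client = LattifAI()  # construct once so the model is loaded once
-
- for media in sorted(Path("episodes").glob("*.wav")):
-     srt = media.with_suffix(".srt")
-     if not srt.exists():
-         continue  # skip media without a matching caption
-     client.alignment(
-         input_media=str(media),
-         input_caption=str(srt),
-         output_caption_path=str(media.with_suffix(".aligned.srt")),
-     )
- ```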
-
- ---
-
- ## Supported Formats
-
- LattifAI supports virtually all common media and subtitle formats:
-
- | Type | Formats |
- |------|---------|
- | **Audio** | WAV, MP3, M4A, AAC, FLAC, OGG, OPUS, AIFF, and more |
- | **Video** | MP4, MKV, MOV, WEBM, AVI, and more |
- | **Caption/Subtitle Input** | SRT, VTT, ASS, SSA, SUB, SBV, TXT, Gemini, and more |
- | **Caption/Subtitle Output** | All input formats + TextGrid (Praat) |
-
- **Tabular Formats:**
- - **TSV**: Tab-separated values with optional speaker column
- - **CSV**: Comma-separated values with optional speaker column
- - **AUD**: Audacity labels format with `[[speaker]]` notation
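-
- For orientation, here is roughly what such rows look like. The exact column order and header conventions are not spelled out in this README, so treat both snippets as hypothetical illustrations rather than a format spec:
-
- ```
- # TSV (hypothetical layout; columns are tab-separated: start, end, speaker, text)
- 12.480   15.200   Alice   Welcome back to the show.
- 15.360   17.900   Bob     Glad to be here.
-
- # AUD / Audacity labels (hypothetical; uses the [[speaker]] notation)
- 12.480   15.200   [[Alice]] Welcome back to the show.
- ```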
-
- > **Note**: If a format is not listed above but commonly used, it's likely supported. Feel free to try it or reach out if you encounter any issues.
-
- ---
-
- ## Supported Languages
-
- LattifAI supports multiple transcription models with different language capabilities:
-
- ### Gemini Models (100+ Languages)
-
- **Models**: `gemini-2.5-pro`, `gemini-3-pro-preview`, `gemini-3-flash-preview`
-
- **Supported Languages**: English, Chinese (Mandarin & Cantonese), Spanish, French, German, Italian, Portuguese, Japanese, Korean, Arabic, Russian, Hindi, Bengali, Turkish, Dutch, Polish, Swedish, Danish, Norwegian, Finnish, Greek, Hebrew, Thai, Vietnamese, Indonesian, Malay, Filipino, Ukrainian, Czech, Romanian, Hungarian, Swahili, Tamil, Telugu, Marathi, Gujarati, Kannada, and 70+ more languages.
-
- > **Note**: Requires a Gemini API key from [Google AI Studio](https://aistudio.google.com/apikey)
-
- ### NVIDIA Parakeet (25 European Languages)
-
- **Model**: `nvidia/parakeet-tdt-0.6b-v3`
-
- **Supported Languages**:
- - **Western Europe**: English (en), French (fr), German (de), Spanish (es), Italian (it), Portuguese (pt), Dutch (nl)
- - **Nordic**: Danish (da), Swedish (sv), Finnish (fi)
- - **Eastern Europe**: Polish (pl), Czech (cs), Slovak (sk), Hungarian (hu), Romanian (ro), Bulgarian (bg), Ukrainian (uk), Russian (ru)
- - **Others**: Croatian (hr), Estonian (et), Latvian (lv), Lithuanian (lt), Slovenian (sl), Maltese (mt), Greek (el)
-
- ### Alibaba SenseVoice (5 Languages)
-
- **Model**: `iic/SenseVoiceSmall`
-
- **Supported Languages**:
- - Chinese/Mandarin (zh)
- - English (en)
- - Japanese (ja)
- - Korean (ko)
- - Cantonese (yue)
-
- ### Language Selection
-
- ```python
- from lattifai import LattifAI, TranscriptionConfig
-
- # Specify language for transcription
- client = LattifAI(
-     transcription_config=TranscriptionConfig(
-         model_name="nvidia/parakeet-tdt-0.6b-v3",
-         language="de",  # German
-     )
- )
- ```
-
- **CLI Usage:**
- ```bash
- lai transcribe run audio.wav output.srt \
-     transcription.model_name=nvidia/parakeet-tdt-0.6b-v3 \
-     transcription.language=de
- ```
-
- > **Tip**: Use Gemini models for maximum language coverage, Parakeet for European languages, and SenseVoice for Asian languages.
-
- ---
-
- ## Roadmap
-
- Visit our [LattifAI roadmap](https://lattifai.com/roadmap) for the latest updates.
-
- | Date | Model Release | Features |
- |------|---------------|----------|
- | **Oct 2025** | **Lattice-1-Alpha** | ✅ English forced alignment<br>✅ Multi-format support<br>✅ CPU/GPU optimization |
- | **Nov 2025** | **Lattice-1** | ✅ English + Chinese + German<br>✅ Mixed languages alignment<br>✅ Speaker Diarization<br>✅ Multi-model transcription (Gemini, Parakeet, SenseVoice)<br>✅ Web interface with React<br>🚧 Advanced segmentation strategies (entire/transcription/hybrid)<br>🚧 Audio event detection ([MUSIC], [APPLAUSE], etc.) |
- | **Q1 2026** | **Lattice-2** | ✅ Streaming mode for long audio<br>🔮 40+ languages support<br>🔮 Real-time alignment |
-
- **Legend**: ✅ Released | 🚧 In Development | 📋 Planned | 🔮 Future
-
- ---
-
- ## Development
-
- ### Setup
-
- ```bash
- git clone https://github.com/lattifai/lattifai-python.git
- cd lattifai-python
-
- # Using uv (recommended)
- curl -LsSf https://astral.sh/uv/install.sh | sh
- uv sync
- source .venv/bin/activate
-
- # Or using pip
- pip install -e ".[test]"
-
- pre-commit install
- ```
-
- ### Testing
-
- ```bash
- pytest                      # Run all tests
- pytest --cov=src            # With coverage
- pytest tests/test_basic.py  # Specific test
- ```
-
- ---
-
- ## Contributing
-
- 1. Fork the repository
- 2. Create a feature branch
- 3. Make changes and add tests
- 4. Run `pytest` and `pre-commit run`
- 5. Submit a pull request
-
- ## License
-
- Apache License 2.0
-
- ## Support
-
- - **Issues**: [GitHub Issues](https://github.com/lattifai/lattifai-python/issues)
- - **Discussions**: [GitHub Discussions](https://github.com/lattifai/lattifai-python/discussions)
- - **Discord**: [Join our community](https://discord.gg/kvF4WsBRK8)