PyPI - voxpulse - Versions diffs - 1.0.0__tar.gz - Mend

voxpulse 1.0.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15)

voxpulse-1.0.0/LICENSE +21 -0
voxpulse-1.0.0/PKG-INFO +106 -0
voxpulse-1.0.0/README.md +76 -0
voxpulse-1.0.0/setup.cfg +4 -0
voxpulse-1.0.0/setup.py +33 -0
voxpulse-1.0.0/voxpulse/__init__.py +5 -0
voxpulse-1.0.0/voxpulse/augment.py +77 -0
voxpulse-1.0.0/voxpulse/features.py +89 -0
voxpulse-1.0.0/voxpulse/inference.py +110 -0
voxpulse-1.0.0/voxpulse/model.py +78 -0
voxpulse-1.0.0/voxpulse.egg-info/PKG-INFO +106 -0
voxpulse-1.0.0/voxpulse.egg-info/SOURCES.txt +13 -0
voxpulse-1.0.0/voxpulse.egg-info/dependency_links.txt +1 -0
voxpulse-1.0.0/voxpulse.egg-info/requires.txt +8 -0
voxpulse-1.0.0/voxpulse.egg-info/top_level.txt +1 -0

voxpulse-1.0.0/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Abhishek Gour
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

voxpulse-1.0.0/PKG-INFO ADDED

@@ -0,0 +1,106 @@
+Metadata-Version: 2.4
+Name: voxpulse
+Version: 1.0.0
+Summary: A lightweight, fast, and AI-powered custom wake word detection system.
+Home-page: https://github.com/itzabhishekgour/VoxPulse
+Author: Abhishek Gour
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Requires-Python: >=3.8
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: numpy
+Requires-Dist: scipy
+Requires-Dist: librosa
+Requires-Dist: sounddevice
+Requires-Dist: audiomentations
+Requires-Dist: tensorflow
+Requires-Dist: soundfile
+Requires-Dist: scikit-learn
+Dynamic: author
+Dynamic: classifier
+Dynamic: description
+Dynamic: description-content-type
+Dynamic: home-page
+Dynamic: license-file
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
+# VoxPulse - Custom Wake Word Detection Framework
+VoxPulse is a lightweight, offline, and 100% private DIY custom wake-word detection library for Python. Instead of relying on pre-trained corporate wake words like "Alexa" or "Hey Siri", VoxPulse empowers developers to train their own voice assistants with any custom name, in any language!
+---
+## Why VoxPulse? (Pros & Cons)
+### Pros (The Good Stuff)
+* **100% Privacy:** Everything runs locally on your machine. No internet required, no voice data is sent to the cloud.
+* **Auto-Data Pipeline:** You just provide raw `.wav` recordings. VoxPulse automatically handles background noise mixing, time-stretching, pitch-shifting, and Mel-Spectrogram feature extraction.
+* **CPU & Battery Efficient:** Features RMS silence gating. While the room is silent (buffer RMS below the energy threshold), the engine skips spectrogram extraction and the neural network, and runs them only when sound is present.
+* **Lightweight:** Uses a custom 2D Convolutional Neural Network (CNN) compiled into TensorFlow Lite (`.tflite`), making it blazing fast even on low-end hardware.
+### Cons (The Limitations)
+* **DIY Approach:** Since it's a custom framework, there is no pre-trained model. You must spend 5 minutes recording your own voice and room noise to use it.
+* **Environment Sensitive:** The accuracy heavily depends on the quality of the background noise (`negative` dataset) you provide during training.
+---
+## How to Use VoxPulse (Quick Start Guide)
+### Step 1: Install the Library
+Install VoxPulse directly via pip:
+```bash
+pip install voxpulse
+```
+### Step 2: Prepare Your Dataset
+Create a folder named `dataset` in your project directory with two sub-folders:
+* **dataset/positive/** - Record and save 10-15 short `.wav` files of you saying your custom wake word (e.g., "Hey Friday"). Keep them around 1 to 1.5 seconds long.
+* **dataset/negative/** - Record a single 5-10 minute `.wav` file of your normal room background noise (fan sounds, typing, distant talking) and place it here.
+### Step 3: Train Your Custom Model
+Create a Python script (e.g., `train.py`) and run the auto-pipeline:
+```python
+from voxpulse.model import VoxPulseTrainer
+# This single command will automatically augment data, extract features, and train the CNN!
+trainer = VoxPulseTrainer(dataset_dir="dataset")
+trainer.train_and_export(epochs=20, export_name="my_custom_model.tflite")
+```
+### Step 4: Run the Inference Engine
+Once your `.tflite` model is generated, you can use it to trigger any Python function in real-time. Create `run.py`:
+```python
+from voxpulse.inference import VoxPulseEngine
+def trigger_my_action():
+    print("Custom Wake Word Detected! Executing action...")
+    # Add your automation code here (e.g., open Spotify, turn on lights)
+# Load your newly trained model
+engine = VoxPulseEngine(model_path="my_custom_model.tflite", threshold=0.70)
+# Start listening; this call blocks until interrupted with Ctrl+C
+engine.start_listening(on_detect_callback=trigger_my_action)
+```
+---
+## Under the Hood (Architecture)
+VoxPulse abstracts away the complexity of audio machine learning. When you call the training function, it executes the following pipeline automatically:
+```mermaid
+graph TD
+    A[Raw Audio: dataset/positive] -->|Step 1: Auto-Augmentation| B[Pitch Shift & Time Stretch]
+    D[Background Noise: dataset/negative] -->|Mix Noise| B
+    B -->|Step 2: Mel-Spectrogram| C[Feature Matrices]
+    C -->|Step 3: CNN Training| E[Keras Sequential Model]
+    E -->|Step 4: Compilation| F[Lightweight TFLite Model]
+```

voxpulse-1.0.0/README.md ADDED

@@ -0,0 +1,76 @@
+# VoxPulse - Custom Wake Word Detection Framework
+VoxPulse is a lightweight, offline, and 100% private DIY custom wake-word detection library for Python. Instead of relying on pre-trained corporate wake words like "Alexa" or "Hey Siri", VoxPulse empowers developers to train their own voice assistants with any custom name, in any language!
+---
+## Why VoxPulse? (Pros & Cons)
+### Pros (The Good Stuff)
+* **100% Privacy:** Everything runs locally on your machine. No internet required, no voice data is sent to the cloud.
+* **Auto-Data Pipeline:** You just provide raw `.wav` recordings. VoxPulse automatically handles background noise mixing, time-stretching, pitch-shifting, and Mel-Spectrogram feature extraction.
+* **CPU & Battery Efficient:** Features RMS silence gating. While the room is silent (buffer RMS below the energy threshold), the engine skips spectrogram extraction and the neural network, and runs them only when sound is present.
+* **Lightweight:** Uses a custom 2D Convolutional Neural Network (CNN) compiled into TensorFlow Lite (`.tflite`), making it blazing fast even on low-end hardware.
+### Cons (The Limitations)
+* **DIY Approach:** Since it's a custom framework, there is no pre-trained model. You must spend 5 minutes recording your own voice and room noise to use it.
+* **Environment Sensitive:** The accuracy heavily depends on the quality of the background noise (`negative` dataset) you provide during training.
+---
+## How to Use VoxPulse (Quick Start Guide)
+### Step 1: Install the Library
+Install VoxPulse directly via pip:
+```bash
+pip install voxpulse
+```
+### Step 2: Prepare Your Dataset
+Create a folder named `dataset` in your project directory with two sub-folders:
+* **dataset/positive/** - Record and save 10-15 short `.wav` files of you saying your custom wake word (e.g., "Hey Friday"). Keep them around 1 to 1.5 seconds long.
+* **dataset/negative/** - Record a single 5-10 minute `.wav` file of your normal room background noise (fan sounds, typing, distant talking) and place it here.
+### Step 3: Train Your Custom Model
+Create a Python script (e.g., `train.py`) and run the auto-pipeline:
+```python
+from voxpulse.model import VoxPulseTrainer
+# This single command will automatically augment data, extract features, and train the CNN!
+trainer = VoxPulseTrainer(dataset_dir="dataset")
+trainer.train_and_export(epochs=20, export_name="my_custom_model.tflite")
+```
+### Step 4: Run the Inference Engine
+Once your `.tflite` model is generated, you can use it to trigger any Python function in real-time. Create `run.py`:
+```python
+from voxpulse.inference import VoxPulseEngine
+def trigger_my_action():
+    print("Custom Wake Word Detected! Executing action...")
+    # Add your automation code here (e.g., open Spotify, turn on lights)
+# Load your newly trained model
+engine = VoxPulseEngine(model_path="my_custom_model.tflite", threshold=0.70)
+# Start listening; this call blocks until interrupted with Ctrl+C
+engine.start_listening(on_detect_callback=trigger_my_action)
+```
+---
+## Under the Hood (Architecture)
+VoxPulse abstracts away the complexity of audio machine learning. When you call the training function, it executes the following pipeline automatically:
+```mermaid
+graph TD
+    A[Raw Audio: dataset/positive] -->|Step 1: Auto-Augmentation| B[Pitch Shift & Time Stretch]
+    D[Background Noise: dataset/negative] -->|Mix Noise| B
+    B -->|Step 2: Mel-Spectrogram| C[Feature Matrices]
+    C -->|Step 3: CNN Training| E[Keras Sequential Model]
+    E -->|Step 4: Compilation| F[Lightweight TFLite Model]
+```
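
Reviewer note: Step 2 of the README depends on a specific folder layout. In this release, `run_augmentation` raises `FileNotFoundError` when `positive/` has no `.wav` files, and `run_feature_extraction` falls back to zero arrays when `negative/` is empty. The sketch below is one way to check the layout before training. The helper names `count_wavs` and `dataset_problems` are hypothetical and are not part of voxpulse.

```python
from pathlib import Path


def count_wavs(dataset_dir):
    """Count .wav files in the two folders the README asks for (hypothetical helper)."""
    root = Path(dataset_dir)
    return {
        name: len(list((root / name).glob("*.wav"))) if (root / name).is_dir() else 0
        for name in ("positive", "negative")
    }


def dataset_problems(dataset_dir):
    """Return human-readable problems; an empty list means the layout looks usable."""
    counts = count_wavs(dataset_dir)
    problems = []
    if counts["positive"] == 0:
        problems.append("no .wav files in positive/")
    if counts["negative"] == 0:
        problems.append("no .wav files in negative/")
    return problems


if __name__ == "__main__":
    for problem in dataset_problems("dataset"):
        print("[WARNING]", problem)
```

Running this before `train_and_export` surfaces an empty `negative/` folder early, which the pipeline itself only reports as a warning.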

voxpulse-1.0.0/setup.cfg ADDED

@@ -0,0 +1,4 @@
+[egg_info]
+tag_build =
+tag_date = 0

voxpulse-1.0.0/setup.py ADDED

@@ -0,0 +1,33 @@
+from setuptools import setup, find_packages
+# Packaging and distribution configuration for setuptools to publish VoxPulse on PyPI.
+# Load the README file to serve as the long description on PyPI
+with open("README.md", "r", encoding="utf-8") as fh:
+    long_description = fh.read()
+setup(
+    name="voxpulse",
+    version="1.0.0",
+    author="Abhishek Gour",
+    description="A lightweight, fast, and AI-powered custom wake word detection system.",
+    long_description=long_description,
+    long_description_content_type="text/markdown",
+    url="https://github.com/itzabhishekgour/VoxPulse",
+    packages=find_packages(),
+    install_requires=[
+        "numpy",
+        "scipy",
+        "librosa",
+        "sounddevice",
+        "audiomentations",
+        "tensorflow",
+        "soundfile",
+        "scikit-learn"
+    ],
+    classifiers=[
+        "Programming Language :: Python :: 3",
+        "License :: OSI Approved :: MIT License",
+        "Operating System :: OS Independent",
+    ],
+    python_requires='>=3.8',
+)

voxpulse-1.0.0/voxpulse/__init__.py ADDED

@@ -0,0 +1,5 @@
+# Re-export inference engine class for cleaner top-level package imports
+from .inference import VoxPulseEngine
+__version__ = "1.0.0"
+__author__ = "Abhishek Gour | AI Developer"

voxpulse-1.0.0/voxpulse/augment.py ADDED

@@ -0,0 +1,77 @@
+import os
+import glob
+import numpy as np
+import librosa
+import soundfile as sf
+from audiomentations import Compose, PitchShift, TimeStretch
+SAMPLE_RATE = 16000
+def load_noise_files(neg_dir):
+    noise_audios = []
+    if os.path.exists(neg_dir):
+        noise_files = glob.glob(os.path.join(neg_dir, "*.wav"))
+        for nf in noise_files:
+            try:
+                y, _ = librosa.load(nf, sr=SAMPLE_RATE, mono=True)
+                noise_audios.append(y)
+                print(f"    Loaded background noise source: {os.path.basename(nf)}")
+            except Exception as e:
+                print(f"    Could not load noise file {nf}: {e}")
+    return noise_audios
+def apply_augmentation(audio, filename_prefix, noise_audios, aug_dir):
+    print(f"    Augmenting '{filename_prefix}' (50 variations)...")
+    augmenter = Compose([
+        PitchShift(min_semitones=-4, max_semitones=4, p=0.8),
+        TimeStretch(min_rate=0.8, max_rate=1.2, p=0.8)
+    ])
+    for i in range(50):
+        aug_audio = augmenter(samples=audio, sample_rate=SAMPLE_RATE)
+        if noise_audios:
+            noise = noise_audios[np.random.randint(len(noise_audios))]
+            if len(noise) >= len(aug_audio):
+                start = np.random.randint(0, len(noise) - len(aug_audio) + 1)
+                noise_chunk = noise[start:start + len(aug_audio)]
+            else:
+                tiles = int(np.ceil(len(aug_audio) / len(noise)))
+                tiled_noise = np.tile(noise, tiles)
+                noise_chunk = tiled_noise[:len(aug_audio)]
+            noise_factor = np.random.uniform(0.02, 0.08)
+            aug_audio = aug_audio + noise_factor * noise_chunk
+            max_val = np.max(np.abs(aug_audio))
+            if max_val > 1.0:
+                aug_audio = aug_audio / max_val
+        aug_path = os.path.join(aug_dir, f"aug_{filename_prefix}_{i}.wav")
+        sf.write(aug_path, aug_audio, SAMPLE_RATE)
+def run_augmentation(dataset_dir):
+    """Callable function to run data augmentation automatically"""
+    print("\n[INFO] Starting Step 1: Data Augmentation Pipeline...")
+    pos_dir = os.path.join(dataset_dir, "positive")
+    aug_dir = os.path.join(dataset_dir, "augmented")
+    neg_dir = os.path.join(dataset_dir, "negative")
+    os.makedirs(pos_dir, exist_ok=True)
+    os.makedirs(aug_dir, exist_ok=True)
+    os.makedirs(neg_dir, exist_ok=True)
+    noise_audios = load_noise_files(neg_dir)
+    wav_files = [f for f in os.listdir(pos_dir) if f.endswith('.wav')]
+    if len(wav_files) == 0:
+        raise FileNotFoundError(f"\n[ERROR] No .wav files found in '{pos_dir}'!\nPlease record 10-15 voice samples of your wake word and place them in 'dataset/positive/' before training.")
+    print(f"[INFO] Found {len(wav_files)} positive samples. Generating 50x variations...")
+    for file_name in wav_files:
+        file_path = os.path.join(pos_dir, file_name)
+        audio, _ = librosa.load(file_path, sr=SAMPLE_RATE, mono=True)
+        prefix = os.path.splitext(file_name)[0]
+        apply_augmentation(audio, prefix, noise_audios, aug_dir)
+    print("[SUCCESS] Data augmentation completed!")
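
Reviewer note: the noise-mixing step in `apply_augmentation` takes a random crop of the background recording when it is long enough, tiles it when it is shorter than the clip, adds it at a small gain, and rescales only if the peak exceeds 1.0. The sketch below restates that logic with plain Python lists instead of NumPy arrays so it runs without dependencies. The function names `crop_or_tile` and `mix` are hypothetical.

```python
import random


def crop_or_tile(noise, target_len, rng=random):
    """Return exactly target_len noise samples: random crop if long enough, else repeat."""
    if not noise:
        raise ValueError("noise must contain at least one sample")
    if len(noise) >= target_len:
        # random.randint is inclusive on both ends, so the last valid start is allowed
        start = rng.randint(0, len(noise) - target_len)
        return noise[start:start + target_len]
    repeats = -(-target_len // len(noise))  # ceiling division
    return (noise * repeats)[:target_len]


def mix(speech, noise, noise_factor):
    """Add scaled noise to speech, then rescale only if the peak exceeds 1.0."""
    chunk = crop_or_tile(noise, len(speech))
    mixed = [s + noise_factor * n for s, n in zip(speech, chunk)]
    peak = max((abs(x) for x in mixed), default=0.0)
    if peak > 1.0:
        mixed = [x / peak for x in mixed]
    return mixed


if __name__ == "__main__":
    # A short noise source is tiled to cover the longer clip.
    print(crop_or_tile([1, 2, 3], 7))
```

The package draws `noise_factor` uniformly from 0.02 to 0.08, so the noise stays well below the level of the speech.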

voxpulse-1.0.0/voxpulse/features.py ADDED

@@ -0,0 +1,89 @@
+import os
+import glob
+import numpy as np
+import librosa
+SAMPLE_RATE = 16000
+DURATION = 1.5
+N_MELS = 40
+N_FFT = 400
+HOP_LENGTH = 160
+def extract_features(file_path, is_negative=False):
+    """Extracts Mel-Spectrogram features from an audio file"""
+    try:
+        audio, _ = librosa.load(file_path, sr=SAMPLE_RATE, mono=True)
+        if is_negative:
+            samples_per_chunk = int(DURATION * SAMPLE_RATE)
+            chunks = []
+            for i in range(0, len(audio) - samples_per_chunk, samples_per_chunk):
+                chunk = audio[i:i + samples_per_chunk]
+                spectrogram = librosa.feature.melspectrogram(
+                    y=chunk, sr=SAMPLE_RATE, n_mels=N_MELS, n_fft=N_FFT, hop_length=HOP_LENGTH
+                )
+                spectrogram_db = librosa.power_to_db(spectrogram, ref=np.max)
+                chunks.append(spectrogram_db.T)
+            return chunks
+        else:
+            target_length = int(DURATION * SAMPLE_RATE)
+            if len(audio) > target_length:
+                audio = audio[:target_length]
+            else:
+                audio = np.pad(audio, (0, max(0, target_length - len(audio))), "constant")
+            spectrogram = librosa.feature.melspectrogram(
+                y=audio, sr=SAMPLE_RATE, n_mels=N_MELS, n_fft=N_FFT, hop_length=HOP_LENGTH
+            )
+            spectrogram_db = librosa.power_to_db(spectrogram, ref=np.max)
+            return [spectrogram_db.T]
+    except Exception as e:
+        print(f"Error processing {file_path}: {e}")
+        return []
+def run_feature_extraction(dataset_dir):
+    """Callable function to extract features automatically"""
+    print("\n[INFO] Starting Step 2: Feature Extraction Pipeline...")
+    aug_dir = os.path.join(dataset_dir, "augmented")
+    neg_dir = os.path.join(dataset_dir, "negative")
+    # 1. PROCESS POSITIVE DATA
+    print("[INFO] Generating spectrograms for positive (augmented) files...")
+    positive_files = glob.glob(os.path.join(aug_dir, "*.wav"))
+    positive_features = []
+    for i, file in enumerate(positive_files):
+        if i % 100 == 0:
+            print(f"    Processing positive files: {i}/{len(positive_files)}")
+        feats = extract_features(file)
+        if feats:
+            positive_features.extend(feats)
+    pos_array = np.array(positive_features)
+    np.save(os.path.join(dataset_dir, "positive_features.npy"), pos_array)
+    # 2. PROCESS NEGATIVE DATA
+    print("\n[INFO] Generating spectrograms for negative (background noise) data...")
+    negative_files = glob.glob(os.path.join(neg_dir, "*.wav"))
+    negative_features = []
+    if len(negative_files) == 0:
+        print("[WARNING] The 'dataset/negative' directory is empty. Using zero-arrays to prevent crash (NOT RECOMMENDED).")
+        # Dummy data fallback just in case user forgets negative noise
+        dummy_feats = np.zeros_like(pos_array[:10])
+        negative_features.extend(dummy_feats)
+    else:
+        for file in negative_files:
+            print(f"    Cutting chunks from: {os.path.basename(file)}")
+            feats = extract_features(file, is_negative=True)
+            if feats:
+                negative_features.extend(feats)
+        target_neg_count = len(positive_features) * 2
+        if len(negative_features) > target_neg_count:
+            np.random.shuffle(negative_features)
+            negative_features = negative_features[:target_neg_count]
+    neg_array = np.array(negative_features)
+    np.save(os.path.join(dataset_dir, "negative_features.npy"), neg_array)
+    print("\n[SUCCESS] Feature extraction completed! The dataset is ready for training.")
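
Reviewer note: the constants in `features.py` fix the size of every feature matrix. The arithmetic below is a sketch that assumes librosa's default `center=True` framing, where a signal of `n` samples yields `1 + n // hop_length` frames. It also counts how many 1.5-second chunks the `range()` bounds in the negative branch produce. The helper names are hypothetical.

```python
SAMPLE_RATE = 16000
DURATION = 1.5
N_MELS = 40
HOP_LENGTH = 160


def frames_for(n_samples, hop_length=HOP_LENGTH):
    # Assumes librosa's default center=True framing.
    return 1 + n_samples // hop_length


def negative_chunk_count(n_samples, samples_per_chunk):
    # Mirrors the range() bounds used for negative files in extract_features.
    return len(range(0, n_samples - samples_per_chunk, samples_per_chunk))


samples_per_clip = int(DURATION * SAMPLE_RATE)          # 24000 samples
feature_shape = (frames_for(samples_per_clip), N_MELS)  # (time frames, mel bands)

print(samples_per_clip, feature_shape)
print(negative_chunk_count(5 * 60 * SAMPLE_RATE, samples_per_clip))
```

Under these assumptions each clip becomes a 151 x 40 matrix, and a 5-minute noise recording yields 199 chunks: because the `range()` stop is `len(audio) - samples_per_chunk`, the final full chunk is skipped when the recording length is an exact multiple of the chunk length.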

voxpulse-1.0.0/voxpulse/inference.py ADDED

@@ -0,0 +1,110 @@
+import os
+import time
+import numpy as np
+import sounddevice as sd
+import librosa
+# Hide TensorFlow C++ log noise; this must be set before TensorFlow is imported to take effect
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+import tensorflow as tf
+DEFAULT_MODEL_PATH = os.path.join(os.getcwd(), "voxpulse_model.tflite")
+class VoxPulseEngine:
+    def __init__(self, model_path=None, threshold=0.60, energy_threshold=0.002):
+        self.sample_rate = 16000
+        self.duration = 1.5
+        self.chunk_duration = 0.5
+        self.buffer_size = int(self.sample_rate * self.duration)
+        self.chunk_size = int(self.sample_rate * self.chunk_duration)
+        self.threshold = threshold
+        self.energy_threshold = energy_threshold
+        self.is_paused = False
+        self.n_mels = 40
+        self.n_fft = 400
+        self.hop_length = 160
+        self.audio_buffer = np.zeros(self.buffer_size, dtype=np.float32)
+        # Load TFLite Model
+        if not model_path:
+            model_path = DEFAULT_MODEL_PATH
+        elif not os.path.isabs(model_path):
+            model_path = os.path.join(os.getcwd(), model_path)
+        if not os.path.exists(model_path):
+            raise FileNotFoundError(f"Model file '{model_path}' not found!")
+        self.interpreter = tf.lite.Interpreter(model_path=model_path)
+        self.interpreter.allocate_tensors()
+        self.input_details = self.interpreter.get_input_details()
+        self.output_details = self.interpreter.get_output_details()
+    def _process_audio(self, indata, frames, time_info, status):
+        """Internal function: Updates the audio buffer"""
+        if self.is_paused:
+            return
+        self.audio_buffer = np.roll(self.audio_buffer, -frames)
+        self.audio_buffer[-frames:] = indata[:, 0]
+    def _predict(self):
+        """Internal function: Generates spectrogram and runs inference using the AI model"""
+        audio_data = self.audio_buffer.copy()
+        # Calculate RMS energy for silence/noise gating
+        rms = np.sqrt(np.mean(audio_data**2))
+        if rms < self.energy_threshold:
+            return 0.0
+        spectrogram = librosa.feature.melspectrogram(
+            y=audio_data, sr=self.sample_rate, n_mels=self.n_mels, n_fft=self.n_fft, hop_length=self.hop_length
+        )
+        features = librosa.power_to_db(spectrogram, ref=np.max).T
+        features = np.expand_dims(features, axis=0)
+        features = np.expand_dims(features, axis=-1).astype(np.float32)
+        self.interpreter.set_tensor(self.input_details[0]['index'], features)
+        self.interpreter.invoke()
+        return self.interpreter.get_tensor(self.output_details[0]['index'])[0][0]
+    def start_listening(self, on_detect_callback):
+        """Main loop function to start listening to the microphone stream in the background"""
+        print("\n[INFO] VoxPulse Library Loaded!")
+        print(f"[INFO] Listening in background (RMS Threshold: {self.energy_threshold})...\n")
+        with sd.InputStream(samplerate=self.sample_rate, channels=1, dtype='float32',
+                            blocksize=self.chunk_size, callback=self._process_audio):
+            try:
+                while True:
+                    time.sleep(self.chunk_duration)
+                    prob = self._predict()
+                    # Live status tracking
+                    rms = np.sqrt(np.mean(self.audio_buffer**2))
+                    if rms < self.energy_threshold:
+                        print(f"  Engine running... [Silent Room] (Energy: {rms:.5f}) Match: 0.00%", end='\r')
+                    else:
+                        print(f"  Engine running... [Active Sound] (Energy: {rms:.5f}) Match: {prob*100:.2f}%", end='\r')
+                    if prob > self.threshold:
+                        print(f"\n[ALERT] WAKE WORD DETECTED! (Confidence: {prob*100:.1f}%)")
+                        # Pause microphone stream processing during execution & cooldown
+                        self.is_paused = True
+                        self.audio_buffer.fill(0.0)
+                        try:
+                            # Trigger the action!
+                            on_detect_callback()
+                        except Exception as cb_err:
+                            print(f"\n[ERROR] Callback error: {cb_err}")
+                        # Post-detect cooldown/pause to avoid double triggers
+                        time.sleep(1.5)
+                        self.audio_buffer.fill(0.0)
+                        self.is_paused = False
+                        print("\nResuming listening...")
+            except KeyboardInterrupt:
+                print("\n[INFO] VoxPulse Engine Stopped.")
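
Reviewer note: the engine keeps the most recent 1.5 seconds of audio (24000 samples at 16 kHz), receives new audio in 0.5-second blocks (8000 samples), and skips the model whenever the buffer's RMS energy is below `energy_threshold`. The sketch below illustrates the same idea without dependencies. It swaps the package's `np.roll` buffer for a `collections.deque` with `maxlen`, and the names `RollingBuffer` and `should_run_model` are hypothetical.

```python
import math
from collections import deque


class RollingBuffer:
    """Fixed-size window over the most recent samples (stand-in for the np.roll buffer)."""

    def __init__(self, size):
        self.samples = deque([0.0] * size, maxlen=size)

    def push(self, block):
        # Appending beyond maxlen silently discards the oldest samples.
        self.samples.extend(block)

    def rms(self):
        return math.sqrt(sum(x * x for x in self.samples) / len(self.samples))


def should_run_model(buffer, energy_threshold=0.002):
    """Gate: only spend time on the spectrogram and the model when there is sound."""
    return buffer.rms() >= energy_threshold


if __name__ == "__main__":
    buf = RollingBuffer(24000)
    print(should_run_model(buf))   # silent buffer
    buf.push([0.1] * 8000)         # one 0.5 s block of non-silent audio
    print(should_run_model(buf))
```

Because the window is three blocks long and advances one block at a time, consecutive predictions share 1.0 second of audio, which is why the engine clears the buffer and pauses after a detection.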

voxpulse-1.0.0/voxpulse/model.py ADDED

@@ -0,0 +1,78 @@
+import os
+import numpy as np
+import tensorflow as tf
+from sklearn.model_selection import train_test_split
+# Import modules for automated pipeline
+from .augment import run_augmentation
+from .features import run_feature_extraction
+DEFAULT_DATASET_DIR = os.path.join(os.getcwd(), "dataset")
+class VoxPulseTrainer:
+    def __init__(self, dataset_dir=None):
+        self.dataset_dir = dataset_dir if dataset_dir else DEFAULT_DATASET_DIR
+        self.model = None
+    def prepare_data(self):
+        """Automated pipeline to augment and extract features before training."""
+        print("\n[AUTO-PIPELINE] Initializing data preparation engine...")
+        run_augmentation(self.dataset_dir)
+        run_feature_extraction(self.dataset_dir)
+        print("[AUTO-PIPELINE] Data preparation complete!\n")
+    def load_data(self):
+        pos_path = os.path.join(self.dataset_dir, 'positive_features.npy')
+        neg_path = os.path.join(self.dataset_dir, 'negative_features.npy')
+        # SMART CHECK: Automatically trigger data preparation if feature files do not exist
+        if not os.path.exists(pos_path) or not os.path.exists(neg_path):
+            print("[WARNING] Features (.npy files) not found. Automatically triggering preparation pipeline...")
+            self.prepare_data()
+        print("[INFO] Loading feature datasets into memory...")
+        pos_data = np.load(pos_path)
+        neg_data = np.load(neg_path)
+        # Labels: Wake word = 1, Background = 0
+        X = np.concatenate([pos_data, neg_data])
+        y = np.concatenate([np.ones(len(pos_data)), np.zeros(len(neg_data))])
+        # Add channel dimension for CNN
+        X = X[..., np.newaxis]
+        return train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
+    def build_model(self, input_shape):
+        print("[INFO] Designing CNN Model architecture...")
+        self.model = tf.keras.Sequential([
+            tf.keras.layers.InputLayer(input_shape=input_shape),
+            tf.keras.layers.Conv2D(16, (3,3), activation='relu'),
+            tf.keras.layers.MaxPooling2D(2,2),
+            tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
+            tf.keras.layers.MaxPooling2D(2,2),
+            tf.keras.layers.Flatten(),
+            tf.keras.layers.Dense(64, activation='relu'),
+            tf.keras.layers.Dropout(0.3),
+            tf.keras.layers.Dense(1, activation='sigmoid')
+        ])
+        self.model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
+    def train_and_export(self, epochs=20, export_name="voxpulse_model.tflite"):
+        X_train, X_test, y_train, y_test = self.load_data()
+        self.build_model(input_shape=(X_train.shape[1], X_train.shape[2], 1))
+        print("[INFO] Starting local training...")
+        self.model.fit(X_train, y_train, epochs=epochs, batch_size=32, validation_data=(X_test, y_test))
+        print("\n[INFO] Converting Keras model to TensorFlow Lite format...")
+        converter = tf.lite.TFLiteConverter.from_keras_model(self.model)
+        tflite_model = converter.convert()
+        save_path = os.path.join(os.getcwd(), export_name)
+        with open(save_path, 'wb') as f:
+            f.write(tflite_model)
+        print(f"[SUCCESS] Model trained and exported successfully to: {save_path}")
+if __name__ == "__main__":
+    trainer = VoxPulseTrainer()
+    trainer.train_and_export()
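
Reviewer note: the size of the network in `build_model` can be checked by hand. The sketch below assumes a 151 x 40 x 1 input (1.5 s of audio at a hop of 160 with librosa's default centering) and the Keras defaults used in the source: 3x3 convolutions with `valid` padding and stride 1, and 2x2 max pooling that floors odd dimensions. The helper names are hypothetical.

```python
def conv_valid(h, w, k=3):
    """Output height/width of a k x k convolution with 'valid' padding and stride 1."""
    return h - k + 1, w - k + 1


def pool(h, w, p=2):
    """Output height/width of p x p max pooling with stride p (odd sizes are floored)."""
    return h // p, w // p


h, w, c = 151, 40, 1          # assumed input: (time frames, mel bands, channels)
params = 0
for filters in (16, 32):      # the two Conv2D + MaxPooling2D stages
    params += (3 * 3 * c + 1) * filters   # kernel weights plus one bias per filter
    h, w = pool(*conv_valid(h, w))
    c = filters

flat = h * w * c              # size of the Flatten output
params += (flat + 1) * 64     # Dense(64)
params += (64 + 1) * 1        # Dense(1, sigmoid)

print((h, w, c), flat, params)
```

Under these assumptions the flattened vector has 9216 values and the model has 594,753 trainable parameters, almost all of them in the first dense layer.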

voxpulse-1.0.0/voxpulse.egg-info/PKG-INFO ADDED

@@ -0,0 +1,106 @@
+Metadata-Version: 2.4
+Name: voxpulse
+Version: 1.0.0
+Summary: A lightweight, fast, and AI-powered custom wake word detection system.
+Home-page: https://github.com/itzabhishekgour/VoxPulse
+Author: Abhishek Gour
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Requires-Python: >=3.8
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: numpy
+Requires-Dist: scipy
+Requires-Dist: librosa
+Requires-Dist: sounddevice
+Requires-Dist: audiomentations
+Requires-Dist: tensorflow
+Requires-Dist: soundfile
+Requires-Dist: scikit-learn
+Dynamic: author
+Dynamic: classifier
+Dynamic: description
+Dynamic: description-content-type
+Dynamic: home-page
+Dynamic: license-file
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
+# VoxPulse - Custom Wake Word Detection Framework
+VoxPulse is a lightweight, offline, and 100% private DIY custom wake-word detection library for Python. Instead of relying on pre-trained corporate wake words like "Alexa" or "Hey Siri", VoxPulse empowers developers to train their own voice assistants with any custom name, in any language!
+---
+## Why VoxPulse? (Pros & Cons)
+### Pros (The Good Stuff)
+* **100% Privacy:** Everything runs locally on your machine. No internet required, no voice data is sent to the cloud.
+* **Auto-Data Pipeline:** You just provide raw `.wav` recordings. VoxPulse automatically handles background noise mixing, time-stretching, pitch-shifting, and Mel-Spectrogram feature extraction.
+* **CPU & Battery Efficient:** Features RMS silence gating. While the room is silent (buffer RMS below the energy threshold), the engine skips spectrogram extraction and the neural network, and runs them only when sound is present.
+* **Lightweight:** Uses a custom 2D Convolutional Neural Network (CNN) compiled into TensorFlow Lite (`.tflite`), making it blazing fast even on low-end hardware.
+### Cons (The Limitations)
+* **DIY Approach:** Since it's a custom framework, there is no pre-trained model. You must spend 5 minutes recording your own voice and room noise to use it.
+* **Environment Sensitive:** The accuracy heavily depends on the quality of the background noise (`negative` dataset) you provide during training.
+---
+## How to Use VoxPulse (Quick Start Guide)
+### Step 1: Install the Library
+Install VoxPulse directly via pip:
+```bash
+pip install voxpulse
+```
+### Step 2: Prepare Your Dataset
+Create a folder named `dataset` in your project directory with two sub-folders:
+* **dataset/positive/** - Record and save 10-15 short `.wav` files of you saying your custom wake word (e.g., "Hey Friday"). Keep them around 1 to 1.5 seconds long.
+* **dataset/negative/** - Record a single 5-10 minute `.wav` file of your normal room background noise (fan sounds, typing, distant talking) and place it here.
+### Step 3: Train Your Custom Model
+Create a Python script (e.g., `train.py`) and run the auto-pipeline:
+```python
+from voxpulse.model import VoxPulseTrainer
+# This single command will automatically augment data, extract features, and train the CNN!
+trainer = VoxPulseTrainer(dataset_dir="dataset")
+trainer.train_and_export(epochs=20, export_name="my_custom_model.tflite")
+```
+### Step 4: Run the Inference Engine
+Once your `.tflite` model is generated, you can use it to trigger any Python function in real-time. Create `run.py`:
+```python
+from voxpulse.inference import VoxPulseEngine
+def trigger_my_action():
+    print("Custom Wake Word Detected! Executing action...")
+    # Add your automation code here (e.g., open Spotify, turn on lights)
+# Load your newly trained model
+engine = VoxPulseEngine(model_path="my_custom_model.tflite", threshold=0.70)
+# Start listening; this call blocks until interrupted with Ctrl+C
+engine.start_listening(on_detect_callback=trigger_my_action)
+```
+---
+## Under the Hood (Architecture)
+VoxPulse abstracts away the complexity of audio machine learning. When you call the training function, it executes the following pipeline automatically:
+```mermaid
+graph TD
+    A[Raw Audio: dataset/positive] -->|Step 1: Auto-Augmentation| B[Pitch Shift & Time Stretch]
+    D[Background Noise: dataset/negative] -->|Mix Noise| B
+    B -->|Step 2: Mel-Spectrogram| C[Feature Matrices]
+    C -->|Step 3: CNN Training| E[Keras Sequential Model]
+    E -->|Step 4: Compilation| F[Lightweight TFLite Model]
+```

voxpulse-1.0.0/voxpulse.egg-info/SOURCES.txt ADDED

@@ -0,0 +1,13 @@
+LICENSE
+README.md
+setup.py
+voxpulse/__init__.py
+voxpulse/augment.py
+voxpulse/features.py
+voxpulse/inference.py
+voxpulse/model.py
+voxpulse.egg-info/PKG-INFO
+voxpulse.egg-info/SOURCES.txt
+voxpulse.egg-info/dependency_links.txt
+voxpulse.egg-info/requires.txt
+voxpulse.egg-info/top_level.txt

voxpulse-1.0.0/voxpulse.egg-info/dependency_links.txt ADDED

@@ -0,0 +1 @@
+

voxpulse-1.0.0/voxpulse.egg-info/requires.txt ADDED

@@ -0,0 +1,8 @@
+numpy
+scipy
+librosa
+sounddevice
+audiomentations
+tensorflow
+soundfile
+scikit-learn

voxpulse-1.0.0/voxpulse.egg-info/top_level.txt ADDED

@@ -0,0 +1 @@
+voxpulse