PyPI - monkeyplug-enhanced - Versions diffs - 2.2.0__tar.gz - Mend

monkeyplug-enhanced 2.2.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

monkeyplug_enhanced-2.2.0/.gitignore +142 -0
monkeyplug_enhanced-2.2.0/LICENSE +29 -0
monkeyplug_enhanced-2.2.0/PKG-INFO +360 -0
monkeyplug_enhanced-2.2.0/README.md +337 -0
monkeyplug_enhanced-2.2.0/pyproject.toml +60 -0
monkeyplug_enhanced-2.2.0/src/monkeyplug/__init__.py +21 -0
monkeyplug_enhanced-2.2.0/src/monkeyplug/data/profanity_list.json +1 -0
monkeyplug_enhanced-2.2.0/src/monkeyplug/groq_config.py +70 -0
monkeyplug_enhanced-2.2.0/src/monkeyplug/monkeyplug.py +2892 -0
monkeyplug_enhanced-2.2.0/src/monkeyplug/separation.py +147 -0

monkeyplug_enhanced-2.2.0/.gitignore ADDED

@@ -0,0 +1,142 @@
+*.wav
+*.mp3
+*.ogg
+*.flac
+*.m4a
+*.m4b
+*.mp4
+*.wma
+*.aac
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+pip-wheel-metadata/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+# Sphinx documentation
+docs/_build/
+# PyBuilder
+target/
+# Jupyter Notebook
+.ipynb_checkpoints
+# IPython
+profile_default/
+ipython_config.py
+# pyenv
+.python-version
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow
+__pypackages__/
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+# SageMath parsed files
+*.sage.py
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# Spyder project settings
+.spyderproject
+.spyproject
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Pyre type checker
+.pyre/
+# Groq API key files
+.groq_key

monkeyplug_enhanced-2.2.0/LICENSE ADDED

@@ -0,0 +1,29 @@
+BSD 3-Clause License
+Copyright (c) 2021, SG
+All rights reserved.
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+1. Redistributions of source code must retain the above copyright notice, this
+   list of conditions and the following disclaimer.
+2. Redistributions in binary form must reproduce the above copyright notice,
+   this list of conditions and the following disclaimer in the documentation
+   and/or other materials provided with the distribution.
+3. Neither the name of the copyright holder nor the names of its
+   contributors may be used to endorse or promote products derived from
+   this software without specific prior written permission.
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

monkeyplug_enhanced-2.2.0/PKG-INFO ADDED

@@ -0,0 +1,360 @@
+Metadata-Version: 2.4
+Name: monkeyplug-enhanced
+Version: 2.2.0
+Summary: Enhanced fork of monkeyplug — censors profanity in audio files using speech recognition with Groq API, AI instrumental generation, and batch processing.
+Project-URL: Homepage, https://github.com/ljbred08/monkeyplug
+Project-URL: Issues, https://github.com/ljbred08/monkeyplug/issues
+Project-URL: Repository, https://github.com/ljbred08/monkeyplug.git
+Author-email: Seth Grover <mero.mero.guero@gmail.com>, Lincoln Brown <link@brown.fm>
+License-File: LICENSE
+Classifier: License :: OSI Approved :: BSD License
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 3
+Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
+Requires-Python: >=3.6
+Requires-Dist: groq>=0.1.0
+Requires-Dist: mmguero==2.0.3
+Requires-Dist: mutagen==1.47.0
+Requires-Dist: numpy>=1.24.0
+Requires-Dist: requests==2.32.5
+Requires-Dist: sherpa-onnx>=1.10.0
+Requires-Dist: soundfile>=0.12.0
+Description-Content-Type: text/markdown
+# monkeyplug-enhanced
+[![Latest Version](https://img.shields.io/pypi/v/monkeyplug-enhanced)](https://pypi.python.org/pypi/monkeyplug-enhanced/) [![VOSK Docker Images](https://github.com/mmguero/monkeyplug/workflows/monkeyplug-build-push-vosk-ghcr/badge.svg)](https://github.com/mmguero/monkeyplug/pkgs/container/monkeyplug) [![Whisper Docker Images](https://github.com/mmguero/monkeyplug/workflows/monkeyplug-build-push-whisper-ghcr/badge.svg)](https://github.com/mmguero/monkeyplug/pkgs/container/monkeyplug)
+**monkeyplug-enhanced** is an enhanced fork of [mmguero/monkeyplug](https://github.com/mmguero/monkeyplug) (available on PyPI as `monkeyplug`). It censors profanity in audio files using speech recognition, detecting profanity timestamps and either muting, beeping, or splicing in instrumental audio using FFmpeg.
+The CLI command is still `monkeyplug` — only the package name changed to avoid conflicting with the original.
+### Enhancements over the original
+- **Groq API** integration (fast, default mode)
+- **AI instrumental generation** via sherpa-onnx source separation
+- **Wildcard/batch processing** with automatic vocal detection
+- **Transcript save/reuse** for faster reprocessing
+- **Config file** support with sensible defaults
+## How It Works
+1. Speech recognition produces word-level timestamps (using Groq, Whisper, or Vosk)
+2. Each word is checked against a built-in profanity list (or your custom list)
+3. FFmpeg creates a cleaned audio file by either muting, beeping, or replacing profanity sections with instrumental audio
+4. Optionally, transcripts can be saved and reused to skip transcription on future runs
+If provided a video file, monkeyplug processes the audio stream and remultiplexes it with the original video stream.
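As a rough illustration of steps 1 and 2 above, the sketch below checks word-level timestamps against a word set and pads each hit. It is not taken from the package: the `Word` class, the `find_censor_spans` function, and the punctuation-stripping rule are all hypothetical, and only the 10 ms padding defaults come from the README.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with its timestamps (hypothetical structure)."""
    text: str
    start: float  # seconds
    end: float    # seconds


def find_censor_spans(words, profanity, pad_pre_ms=10, pad_post_ms=10):
    """Return padded (start, end) spans in seconds for words in `profanity`."""
    spans = []
    for w in words:
        # Assumed normalization: strip surrounding punctuation and lowercase,
        # so "Damn!" and "damn" compare equal.
        token = w.text.strip(".,!?;:\"'()").lower()
        if token in profanity:
            start = max(0.0, w.start - pad_pre_ms / 1000.0)
            end = w.end + pad_post_ms / 1000.0
            spans.append((start, end))
    return spans


words = [Word("well", 0.00, 0.30), Word("Damn!", 0.35, 0.70), Word("okay", 0.80, 1.10)]
spans = find_censor_spans(words, {"damn"})
```

The resulting spans are what a later stage would hand to FFmpeg for muting, beeping, or splicing.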
+## Installation
+```bash
+pip install monkeyplug-enhanced
+```
+Or install from GitHub:
+```bash
+pip install 'git+https://github.com/ljbred08/monkeyplug'
+```
+### Prerequisites
+- **FFmpeg** — install via your OS package manager or from [ffmpeg.org](https://www.ffmpeg.org/download.html)
+- **Python 3.6+**
+- **Groq API key** (for default mode) — see [Groq API Setup](#groq-api-setup)
+- Optional: [Whisper](https://github.com/openai/whisper) or [Vosk](https://github.com/alphacep/vosk-api) for offline recognition
+## Groq API Setup
+The default mode uses Groq's fast Whisper API. Configure your API key using one of these methods (in order of priority):
+**Command-line parameter:**
+```bash
+monkeyplug -i input.mp3 -o output.mp3 --groq-api-key gsk_...
+```
+**Environment variable:**
+```bash
+export GROQ_API_KEY=gsk_...
+```
+**Config file** (`~/.groq/config.json`):
+```json
+{"api_key": "gsk_..."}
+```
+**Project-local file** (add `.groq_key` to `.gitignore`):
+```bash
+echo 'gsk_...' > .groq_key
+```
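The priority order above could be resolved roughly as follows. This is a minimal sketch rather than the code in `groq_config.py`: the function name `resolve_groq_api_key` and its parameters are hypothetical, and it assumes the project-local `.groq_key` file is looked up in the current working directory.

```python
import json
import os
from pathlib import Path


def resolve_groq_api_key(cli_key=None, env=None, home=None, cwd=None):
    """Return the first API key found, checking sources in the documented order."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    cwd = Path.cwd() if cwd is None else Path(cwd)

    # 1. Command-line parameter
    if cli_key:
        return cli_key

    # 2. Environment variable
    if env.get("GROQ_API_KEY"):
        return env["GROQ_API_KEY"]

    # 3. Config file at ~/.groq/config.json
    config_path = home / ".groq" / "config.json"
    if config_path.is_file():
        try:
            key = json.loads(config_path.read_text(encoding="utf-8")).get("api_key")
        except (json.JSONDecodeError, AttributeError):
            key = None  # malformed file: fall through to the next source
        if key:
            return key

    # 4. Project-local .groq_key file (assumed to live in the working directory)
    key_file = cwd / ".groq_key"
    if key_file.is_file():
        key = key_file.read_text(encoding="utf-8").strip()
        if key:
            return key
    return None
```

Passing `env`, `home`, and `cwd` explicitly keeps the lookup easy to exercise without touching the real environment.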
+## Quick Start
+```bash
+# Basic usage: censors profanity using the Groq API and the built-in word list
+monkeyplug -i song.mp3 -o song_clean.mp3
+# Verbose output to see what's happening
+monkeyplug -i song.mp3 -o song_clean.mp3 -v
+# Use local Whisper instead of Groq
+monkeyplug -i song.mp3 -o song_clean.mp3 -m whisper
+```
+## Censorship Modes
+Three modes are available. Priority order: `--mute` > `--beep` > `--instrumental`.
+### Mute
+Silences profanity sections with short fade transitions.
+```bash
+monkeyplug -i song.mp3 -o song_clean.mp3 --mute
+```
+### Beep
+Replaces profanity with a beep tone.
+```bash
+# Basic beep
+monkeyplug -i song.mp3 -o song_clean.mp3 -b
+# Customize beep frequency and mix
+monkeyplug -i song.mp3 -o song_clean.mp3 -b -z 1000 --beep-mix-normalize
+```
+### Instrumental
+Replaces profanity sections with instrumental audio for a professional-sounding clean edit. Supports several sub-modes:
+#### Provide an instrumental file directly
+```bash
+monkeyplug -i explicit.mp3 -o clean.mp3 --instrumental instrumental.mp3
+```
+#### Auto mode (default)
+Searches for an instrumental file using fuzzy matching. If not found, falls back to AI generation.
+```bash
+# Default behavior — searches for matching instrumental, generates if not found
+monkeyplug -i song.mp3 -o song_clean.mp3 --instrumental auto
+# This is also the default when no --instrumental flag is given
+monkeyplug -i song.mp3 -o song_clean.mp3
+```
+AUTO fuzzy matching searches the same directory for audio files with similar names (30% similarity threshold). Examples:
+- `1-satisfied.mp3` → finds `satisfied-inst.mp3`
+- `MySong_v2.mp3` → finds `MySong_instrumental.mp3`
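To give a feel for threshold-based name matching, here is a sketch that uses `difflib.SequenceMatcher` from the Python standard library. The README does not say which similarity metric the package uses, so the metric, the function name `best_instrumental_match`, and the comparison on lowercase file stems are assumptions; only the 30% threshold comes from the text above.

```python
from difflib import SequenceMatcher
from pathlib import Path


def best_instrumental_match(input_name, candidates, threshold=0.30):
    """Return the candidate whose stem is most similar to the input's, or None."""
    target = Path(input_name).stem.lower()
    best_name, best_score = None, threshold
    for name in candidates:
        if name == input_name:
            continue  # never match the input file against itself
        score = SequenceMatcher(None, target, Path(name).stem.lower()).ratio()
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


match = best_instrumental_match(
    "1-satisfied.mp3",
    ["1-satisfied.mp3", "satisfied-inst.mp3", "zzz.mp3"],
)
```

With this metric, `1-satisfied` and `satisfied-inst` share the nine-character run `satisfied`, which puts the pair well above the threshold.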
+#### Prefix search
+Searches for instrumental files using a specific prefix/suffix pattern:
+```bash
+# Searches for: song_inst.mp3, song-inst.mp3, inst_song.mp3, etc.
+monkeyplug -i song.mp3 -o song_clean.mp3 --instrumental prefix --instrumental-prefix inst
+```
+#### AI Generation (force)
+Uses sherpa-onnx to AI-generate instrumental sections for profanity segments. Skips all instrumental file searching.
+```bash
+monkeyplug -i song.mp3 -o song_clean.mp3 --instrumental generate
+```
+The AI separation process:
+1. Extracts profanity segments from the original audio
+2. Concatenates them with configurable padding (default: 1.0s)
+3. Separates vocals from instrumental using a Spleeter model
+4. Splices the AI-generated instrumental back into the original
+Separation models are cached at `~/.cache/monkeyplug/separation_models/` (downloaded on first use).
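Steps 1 and 2 above involve widening each profanity segment by the separation padding before it is handed to the model. The sketch below shows one way to do that, under the assumption that padded windows which overlap are merged into a single window; the README does not confirm this, and `pad_and_merge` is a hypothetical name.

```python
def pad_and_merge(spans, padding=1.0, duration=None):
    """Pad each (start, end) span and merge any spans that now overlap."""
    merged = []
    for start, end in sorted(spans):
        start = max(0.0, start - padding)
        end = end + padding if duration is None else min(duration, end + padding)
        if merged and start <= merged[-1][1]:
            # Overlaps the previous window: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


windows = pad_and_merge([(0.5, 0.75), (2.0, 2.5), (10.0, 10.5)], padding=1.0, duration=11.0)
```

Merging keeps the separation model from processing the same stretch of audio twice when two flagged words sit close together.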
+## Wildcard / Batch Mode
+Process multiple files at once using `*` wildcards:
+```bash
+# Process all MP3s in current directory
+monkeyplug -i "*.mp3" -o "*_clean.mp3" --instrumental generate
+# With verbose output
+monkeyplug -i "*.mp3" -o "*_clean.mp3" -v
+```
+### Vocal detection
+In wildcard mode, monkeyplug automatically detects which files have vocals by transcribing a 10-second sample from the middle of each file. Instrumental files (no speech detected) are skipped.
+With `--instrumental generate`, vocal detection is **skipped by default** (all files are processed) since you're generating instrumentals anyway. Use `--filter-instrumentals` to re-enable it:
+```bash
+# Process all files (default — no vocal detection)
+monkeyplug -i "*.mp3" -o "*_clean.mp3" --instrumental generate
+# Skip files detected as instrumentals
+monkeyplug -i "*.mp3" -o "*_clean.mp3" --instrumental generate --filter-instrumentals
+```
+Files matching the output pattern are automatically skipped (already processed).
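The pattern mapping and the skip rule for already-processed files can be sketched as below. This is an illustration only: `plan_batch` is a hypothetical name, and the sketch assumes each pattern contains exactly one `*` and no other glob characters.

```python
from fnmatch import fnmatchcase


def plan_batch(filenames, in_pattern, out_pattern):
    """Map each input file to its output name, skipping already-processed files."""
    prefix, suffix = in_pattern.split("*", 1)
    plan = {}
    for name in filenames:
        if fnmatchcase(name, out_pattern):
            continue  # looks like a previous output, so skip it
        if not fnmatchcase(name, in_pattern):
            continue
        # The text matched by "*" is whatever sits between prefix and suffix.
        star = name[len(prefix):len(name) - len(suffix)]
        plan[name] = out_pattern.replace("*", star, 1)
    return plan


plan = plan_batch(["a.mp3", "a_clean.mp3", "b.mp3", "notes.txt"], "*.mp3", "*_clean.mp3")
```

Checking the output pattern first matters here, because `a_clean.mp3` also matches the input pattern `*.mp3`.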
+## Transcript Workflow
+Save and reuse transcripts to avoid redundant API calls (up to 22x faster on repeat runs):
+```bash
+# Generate and save transcript alongside output
+monkeyplug -i song.mp3 -o song_clean.mp3 --save-transcript
+# Creates: song_clean.mp3 + song_clean_transcript.json
+# Second run: automatically finds and reuses the transcript
+monkeyplug -i song.mp3 -o song_clean.mp3 --save-transcript
+# Force new transcription (ignore existing transcript)
+monkeyplug -i song.mp3 -o song_clean.mp3 --save-transcript --force-retranscribe
+# Manually specify a transcript to load
+monkeyplug -i song.mp3 -o song_clean_strict.mp3 --input-transcript song_clean_transcript.json -w strict_swears.txt
+```
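The naming convention shown in the comments above (`song_clean.mp3` alongside `song_clean_transcript.json`) can be expressed as a small helper. The function names are hypothetical, and the reuse check is an assumption about how `--force-retranscribe` interacts with an existing transcript file.

```python
from pathlib import Path


def transcript_path_for(output_file):
    """song_clean.mp3 -> song_clean_transcript.json in the same directory."""
    out = Path(output_file)
    return out.with_name(out.stem + "_transcript.json")


def needs_transcription(output_file, force_retranscribe=False):
    """True when no saved transcript can be reused for this output file."""
    return force_retranscribe or not transcript_path_for(output_file).is_file()
```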
+## Custom Profanity Lists
+```bash
+# Use a custom text file (one word per line, or word|replacement)
+monkeyplug -i podcast.mp3 -o podcast_clean.mp3 -w custom_swears.txt
+# Use a custom JSON file (array of strings)
+monkeyplug -i podcast.mp3 -o podcast_clean.mp3 -w custom_swears.json
+# Custom words are merged with the built-in profanity list
+```
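A loader for the two list formats described above might look like the following. It is a sketch under assumptions: the function names are hypothetical, words are lowercased for matching, JSON entries get an empty replacement, and the file type is chosen by extension. How the package actually uses the `replacement` value is not stated in the README.

```python
import json
from pathlib import Path


def parse_swears(text, is_json=False):
    """Return a {word: replacement} dict from the contents of a list file."""
    if is_json:
        # JSON format: an array of strings, no replacements.
        return {str(w).strip().lower(): "" for w in json.loads(text) if str(w).strip()}
    result = {}
    for line in text.splitlines():
        # Text format: one "word" or "word|replacement" entry per line.
        word, _, replacement = line.strip().partition("|")
        if word.strip():
            result[word.strip().lower()] = replacement.strip()
    return result


def load_swears(path, builtin):
    """Merge a custom list file into a copy of the built-in {word: replacement} dict."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    merged = dict(builtin)
    merged.update(parse_swears(text, is_json=path.suffix.lower() == ".json"))
    return merged
```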
+## Config File
+monkeyplug looks for a JSON config file in this order (first found wins):
+1. `./.monkeyplug.json` (current directory — project-specific)
+2. `~/.cache/monkeyplug/config.json` (user-specific)
+If neither exists, a default config is auto-created at `~/.cache/monkeyplug/config.json`:
+```json
+{
+  "pad_milliseconds": 10,
+  "pad_milliseconds_pre": 10,
+  "pad_milliseconds_post": 10,
+  "separation_padding": 1.0,
+  "beep_hertz": 1000
+}
+```
+Config values provide defaults that can be overridden by CLI arguments.
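The layering described above (built-in defaults, then the first config file found, then CLI arguments) can be sketched as follows. The function names are hypothetical, the sketch does not auto-create a default config file, and it assumes that an unset CLI option is represented as `None`.

```python
import json
from pathlib import Path

# Default values as listed in the README's sample config.
DEFAULTS = {
    "pad_milliseconds": 10,
    "pad_milliseconds_pre": 10,
    "pad_milliseconds_post": 10,
    "separation_padding": 1.0,
    "beep_hertz": 1000,
}


def load_config(search_paths):
    """Return settings from the first existing config file, layered over the defaults."""
    settings = dict(DEFAULTS)
    for path in map(Path, search_paths):
        if path.is_file():
            settings.update(json.loads(path.read_text(encoding="utf-8")))
            break  # first found wins
    return settings


def effective_settings(config, cli_args):
    """CLI values that were actually supplied (not None) override the config."""
    merged = dict(config)
    merged.update({k: v for k, v in cli_args.items() if v is not None})
    return merged


settings = effective_settings(load_config([]), {"beep_hertz": 440, "pad_milliseconds": None})
```

A caller would pass `[Path(".monkeyplug.json"), Path.home() / ".cache/monkeyplug/config.json"]` as `search_paths` to mirror the lookup order above.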
+Clean all caches (models, config) with:
+```bash
+monkeyplug --clean-cache
+```
+## Padding Control
+Add padding around profanity for smoother transitions:
+```bash
+# Equal padding on both sides
+monkeyplug -i song.mp3 -o clean.mp3 --pad-milliseconds 100
+# Different pre and post padding
+monkeyplug -i song.mp3 -o clean.mp3 --pad-milliseconds-pre 50 --pad-milliseconds-post 100
+```
+## Full Usage Reference
+```
+usage: monkeyplug <arguments>
+Core Options:
+  -i, --input <string>              Input file, URL, or wildcard pattern
+  -o, --output <string>             Output file or pattern
+  -v [concise|full], --verbose      Verbose output
+  -m [groq|whisper|vosk], --mode    Speech recognition engine (default: groq)
+Censorship Modes:
+  --mute                            Mute profanity (disables instrumental mode)
+  -b, --beep                        Beep instead of silence
+  --instrumental <mode|file>        Instrumental mode: auto, generate, prefix, or file path
+  --instrumental-prefix <string>    Prefix to search for instrumental file (default: AUTO)
+  --instrumental-auto-candidates <int>  Top candidates for AUTO matching (default: 5)
+Profanity:
+  -w, --swears <file>               Custom profanity list (text or JSON)
+  --pad-milliseconds <int>          Padding around profanity (default: 10)
+  --pad-milliseconds-pre <int>      Padding before profanity (default: 10)
+  --pad-milliseconds-post <int>     Padding after profanity (default: 10)
+Beep Options:
+  -z, --beep-hertz <int>            Beep frequency in Hz (default: 1000)
+  --beep-mix-normalize              Normalize audio/beep mix
+  --beep-audio-weight <int>         Non-beeped audio weight (default: 1)
+  --beep-sine-weight <int>          Beep weight (default: 1)
+  --beep-dropout-transition <int>   Dropout transition for beep (default: 0)
+Transcript:
+  --save-transcript                 Save transcript JSON alongside output
+  --input-transcript <file>         Load existing transcript JSON
+  --output-json <file>              Save transcript to specific file
+  --force-retranscribe              Force new transcription
+AI Separation:
+  --separation-padding <seconds>    Context padding for AI generation (default: 1.0)
+  --filter-instrumentals            Filter out instrumental files in wildcard mode with generate
+Audio Output:
+  -f, --format <string>             Output format (default: inferred from extension or "MATCH")
+  -c, --channels <int>              Output channels (default: 2)
+  -s, --sample-rate <int>           Output sample rate (default: 48000)
+  -r, --bitrate <string>            Output bitrate (default: 256K)
+  -a, --audio-params <string>       FFmpeg audio parameters
+  -q, --vorbis-qscale <int>         qscale for libvorbis (default: 5)
+Other:
+  --force                           Process file even if already tagged
+  --clean-cache                     Delete all cached data (models, config) and exit
+Groq Options:
+  --groq-api-key <string>           Groq API key
+  --groq-model <string>             Groq Whisper model (default: whisper-large-v3)
+Whisper Options:
+  --whisper-model-dir <string>      Model directory (default: ~/.cache/whisper)
+  --whisper-model-name <string>     Model name (default: small.en)
+  --torch-threads <int>             CPU inference threads (default: 0)
+VOSK Options:
+  --vosk-model-dir <string>         Model directory (default: ~/.cache/vosk)
+  --vosk-read-frames-chunk <int>    WAV frame chunk (default: 8000)
+```
+## Docker
+Docker images are available for running monkeyplug in containers. See [mmguero/monkeyplug](https://github.com/mmguero/monkeyplug) for available images.
+## Contributing
+Pull requests welcome!
+## Authors
+- **Seth Grover** - Initial work - [mmguero](https://github.com/mmguero)
+- **Lincoln Brown** - Enhanced fork (Groq API, AI generation, batch mode) - [ljbred08](https://github.com/ljbred08)
+## License
+BSD 3-Clause License — see the [LICENSE](LICENSE) file for details.