
nkululeko 0.94.1.tar.gz → 0.94.3.tar.gz


Files changed (184)

{nkululeko-0.94.1 → nkululeko-0.94.3}/CHANGELOG.md RENAMED

@@ -1,6 +1,16 @@
 Changelog
 =========
+Version 0.94.3 (25-07-22)
+--------------------------
+* adding the following features (related to dementia/alzheimer):
+* pause_lognorm_mu, pause_lognorm_sigma, pause_lognorm_ks_pvalue
+* pause_mean_duration, pause_std_duration, pause_cv, proportion_pause_duration (
+Version 0.94.2 (25-06-02)
+--------------------------
+* added better error message: util.py might not have a logger
 Version 0.94.1 (25-04-03)
 --------------------------
 * fixed bug: plot uncertainties had wrong file path
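
Note on the 0.94.3 changelog entry: the new pause features are only named there, not defined. The sketch below shows one plausible way to derive most of them from a list of pause durations, using only the Python standard library. It is not nkululeko's implementation. The helper name `pause_features`, the use of the sample standard deviation, and the definition of `proportion_pause_duration` as total pause time divided by recording length are all assumptions. `pause_lognorm_ks_pvalue` is left out because it needs a Kolmogorov-Smirnov test against the fitted log-normal distribution, which the standard library does not provide.

```python
import math
import statistics


def pause_features(pause_durations, total_duration):
    """Summarise pause durations (in seconds) for one recording.

    Hypothetical helper for illustration only, not taken from nkululeko.
    """
    if len(pause_durations) < 2:
        raise ValueError("need at least two pauses to estimate a spread")
    if total_duration <= 0 or any(d <= 0 for d in pause_durations):
        raise ValueError("durations must be positive")

    # A log-normal fit amounts to fitting a normal distribution to log-durations.
    log_durations = [math.log(d) for d in pause_durations]

    mean_duration = statistics.fmean(pause_durations)
    std_duration = statistics.stdev(pause_durations)  # sample std (assumption)

    return {
        "pause_lognorm_mu": statistics.fmean(log_durations),
        "pause_lognorm_sigma": statistics.stdev(log_durations),
        "pause_mean_duration": mean_duration,
        "pause_std_duration": std_duration,
        # Coefficient of variation: spread relative to the mean.
        "pause_cv": std_duration / mean_duration,
        # Share of the recording spent in pauses (assumed definition).
        "proportion_pause_duration": sum(pause_durations) / total_duration,
    }


features = pause_features([0.5, 0.5, 1.0, 2.0], total_duration=10.0)
```

Working on log-durations keeps the fit to two closed-form estimates, which is why `mu` and `sigma` need no optimiser here.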

nkululeko-0.94.3/PKG-INFO ADDED

@@ -0,0 +1,76 @@
+Metadata-Version: 2.4
+Name: nkululeko
+Version: 0.94.3
+Summary: Machine learning audio prediction experiments based on templates
+Home-page: https://github.com/felixbur/nkululeko
+Author: Felix Burkhardt
+Author-email: fxburk@gmail.com
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Classifier: Development Status :: 3 - Alpha
+Classifier: Topic :: Scientific/Engineering
+Requires-Python: >=3.9
+License-File: LICENSE
+Requires-Dist: audeer>=1.0.0
+Requires-Dist: audformat>=1.3.1
+Requires-Dist: audinterface>=1.0.0
+Requires-Dist: audiofile>=1.0.0
+Requires-Dist: audiomentations==0.31.0
+Requires-Dist: audmetric>=1.0.0
+Requires-Dist: audonnx>=0.7.0
+Requires-Dist: confidence-intervals>=0.0.2
+Requires-Dist: datasets>=2.0.0
+Requires-Dist: imageio>=2.0.0
+Requires-Dist: matplotlib>=3.0.0
+Requires-Dist: numpy>=1.20.0
+Requires-Dist: opensmile>=2.0.0
+Requires-Dist: pandas>=1.0.0
+Requires-Dist: praat-parselmouth>=0.4.0
+Requires-Dist: scikit_learn>=1.0.0
+Requires-Dist: scipy>=1.0.0
+Requires-Dist: seaborn>=0.11.0
+Requires-Dist: sounddevice>=0.4.0
+Requires-Dist: transformers>=4.0.0
+Requires-Dist: umap-learn>=0.5.0
+Requires-Dist: xgboost>=1.0.0
+Requires-Dist: pylatex>=1.0.0
+Provides-Extra: torch
+Requires-Dist: torch>=1.0.0; extra == "torch"
+Requires-Dist: torchvision>=0.10.0; extra == "torch"
+Requires-Dist: torchaudio>=0.10.0; extra == "torch"
+Provides-Extra: torch-cpu
+Requires-Dist: torch>=1.0.0; extra == "torch-cpu"
+Requires-Dist: torchvision>=0.10.0; extra == "torch-cpu"
+Requires-Dist: torchaudio>=0.10.0; extra == "torch-cpu"
+Provides-Extra: torch-nightly
+Requires-Dist: torch; extra == "torch-nightly"
+Requires-Dist: torchvision; extra == "torch-nightly"
+Requires-Dist: torchaudio; extra == "torch-nightly"
+Provides-Extra: spotlight
+Requires-Dist: renumics-spotlight>=1.6.13; extra == "spotlight"
+Requires-Dist: sliceguard>=0.0.35; extra == "spotlight"
+Provides-Extra: tensorflow
+Requires-Dist: tensorflow>=2.0.0; extra == "tensorflow"
+Requires-Dist: tensorflow_hub>=0.12.0; extra == "tensorflow"
+Provides-Extra: all
+Requires-Dist: torch>=1.0.0; extra == "all"
+Requires-Dist: torchvision>=0.10.0; extra == "all"
+Requires-Dist: torchaudio>=0.10.0; extra == "all"
+Requires-Dist: renumics-spotlight>=0.1.0; extra == "all"
+Requires-Dist: sliceguard>=0.1.0; extra == "all"
+Requires-Dist: tensorflow>=2.0.0; extra == "all"
+Requires-Dist: tensorflow_hub>=0.12.0; extra == "all"
+Requires-Dist: shap>=0.40.0; extra == "all"
+Requires-Dist: imblearn>=0.0.0; extra == "all"
+Requires-Dist: cylimiter>=0.0.1; extra == "all"
+Requires-Dist: audtorch>=0.0.1; extra == "all"
+Requires-Dist: splitutils>=0.0.1; extra == "all"
+Dynamic: author
+Dynamic: author-email
+Dynamic: home-page
+Dynamic: license-file
+Dynamic: provides-extra
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
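
Note on the metadata above: every optional dependency carries an `extra == "..."` environment marker, and each `Provides-Extra` name is what `pip install nkululeko[<name>]` selects. The sketch below groups `Requires-Dist` entries by extra, which can help when comparing what `torch`, `spotlight`, or `all` would pull in. The function name `group_by_extra` and the regular expression are illustrative assumptions. The parsing is deliberately simplified and only handles markers of the plain form shown in this file.

```python
import re
from collections import defaultdict

# Matches markers such as: extra == "torch"
EXTRA_RE = re.compile(r"""extra\s*==\s*["']([^"']+)["']""")


def group_by_extra(requires_dist):
    """Group Requires-Dist strings by their extra (None = always installed).

    Hypothetical helper; handles only simple `extra == "name"` markers.
    """
    groups = defaultdict(list)
    for entry in requires_dist:
        requirement, _, marker = entry.partition(";")
        match = EXTRA_RE.search(marker)
        extra = match.group(1) if match else None
        groups[extra].append(requirement.strip())
    return dict(groups)


# A few entries copied from the PKG-INFO diff above.
sample = [
    "audeer>=1.0.0",
    'torch>=1.0.0; extra == "torch"',
    'torchvision>=0.10.0; extra == "torch"',
    'tensorflow>=2.0.0; extra == "tensorflow"',
]
grouped = group_by_extra(sample)
```

For an installed package, `importlib.metadata.requires("nkululeko")` returns the same kind of string list (or `None`), so its result could be passed to `group_by_extra` directly.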

{nkululeko-0.94.1 → nkululeko-0.94.3}/README.md RENAMED

@@ -1,141 +1,96 @@
+## Nkululeko
-- [Overview](#overview)
-  - [Confusion matrix](#confusion-matrix)
-  - [Epoch progression](#epoch-progression)
-  - [Feature importance](#feature-importance)
-  - [Feature distribution](#feature-distribution)
-  - [t-SNE plots](#t-sne-plots)
-  - [Data distribution](#data-distribution)
-  - [Bias checking](#bias-checking)
-  - [Uncertainty](#uncertainty)
-- [Documentation](#documentation)
-- [Installation](#installation)
-- [Usage](#usage)
-  - [ini-file values](#ini-file-values)
-  - [Hello World example](#hello-world-example)
-  - [Features](#features)
-- [License](#license)
-- [Contributing](#contributing)
-- [Citing](#citing)
-## Overview
-A project to detect speaker characteristics by machine learning experiments with a high-level interface.
-The idea is to have a framework (based on e.g. sklearn and torch) that can be used to rapidly and automatically analyse audio data and explore machine learning models based on that data.
-* NEW with nkululeko: [Ensemble learning](http://blog.syntheticspeech.de/2024/06/25/nkululeko-ensemble-classifiers-with-late-fusion/)
-* NEW: [Finetune transformer-models](http://blog.syntheticspeech.de/2024/05/29/nkululeko-how-to-finetune-a-transformer-model/)
-* The latest features can be seen in [the ini-file](./ini_file.md) options that are used to control Nkululeko
-* Below is a [Hello World example](#helloworld) that should set you up fastly, also on [Google Colab](https://colab.research.google.com/drive/1GYNBd5cdZQ1QC3Jm58qoeMaJg3UuPhjw?usp=sharing#scrollTo=4G_SjuF9xeQf), and [with Kaggle](https://www.kaggle.com/felixburk/nkululeko-hello-world-example)
-* [Here's a blog post on how to set up nkululeko on your computer.](http://blog.syntheticspeech.de/2021/08/30/how-to-set-up-your-first-nkululeko-project/)
-* [Here is a slack channel to discuss issues related to nkululeko](https://join.slack.com/t/nkululekoworkspace/shared_invite/zt-2v3q3yfzk-XfNGoqLfp3ts9KfCZpfTyg). Please click the link if interested in contributing.
-* [Here's a slide presentation about nkululeko](docs/nkululeko.pdf)
-* [Here's a video presentation about nkululeko](https://www.youtube.com/playlist?list=PLRceVavtxLg0y2jiLmpnUfiMtfvkK912D)
-* [Here's the 2022 LREC article on nkululeko](http://felix.syntheticspeech.de/publications/Nkululeko_LREC.pdf)
-Here are some examples of typical output:
+Nkululeko is a project to detect speaker characteristics by machine learning experiments with a high-level interface. The idea is to have a framework (based on e.g. sklearn and torch) that can be used to rapidly and automatically analyse audio data and explore machine learning models based on that data.
-### Confusion matrix
-Per default, Nkululeko displays results as a confusion matrix using binning with regression.
+Some abilities that Nkululeko provides: combines acoustic features and machine learning models (including feature selection and features concatenation); performs data exploration, selection and visualization the results; finetuning; ensemble learning models; soft labeling (predicting labels with pre-trained model); and inference the model on a test set.
-<img src="meta/images/conf_mat.png" width="500px"/>
+Nkululeko orchestrates data loading, feature extraction, and model training, allowing you to specify your experiment in a configuration file. The framework handles the process from raw data to trained model and evaluation, making it easy to run machine learning experiments without directly coding in Python.
-### Epoch progression
-The point when overfitting starts can sometimes be seen by looking at the results per epoch:
+## Who is this for?
+Nkululeko is for speech processing learners, researchers and ML practitioners focused on speaker characteristics, e.g., emotion, age, gender, or disorder detection.
-<img src="meta/images/epoch_progression.png" width="500px"/>
+## Installation
-### Feature importance
-Using the *explore* interface, Nkululeko analyses the importance of acoustic features:
-<img src="meta/images/feat_importance.png" width="500px"/>
+Nkululeko requires Python 3.9 or higher with the following build status:
-### Feature distribution
-And can show the distribution of specific features per category:
+![Python 3.10](https://github.com/bagustris/nkululeko/actions/workflows/py310-aud-csv.yml/badge.svg)
+![Python 3.11](https://github.com/bagustris/nkululeko/actions/workflows/py311.yml/badge.svg)
+![Python 3.12](https://github.com/bagustris/nkululeko/actions/workflows/py312.yml/badge.svg)
+![Python 3.13](https://github.com/bagustris/nkululeko/actions/workflows/py313.yml/badge.svg)
-<img src="meta/images/feat_dist.png" width="500px"/>
+Create and activate a virtual Python environment and simply install Nkululeko:
-If there are only two categories, a Mann-Whitney U test for significance is given:
+```bash
+python -m venv .env
+source .env/bin/activate  # specify OS versions, add a separate line for Windows users
+pip install nkululeko
+```
-<img src="meta/images/feat_dist_2.png" width="500px"/>
+Current version: **0.94.1**
-### t-SNE plots
-A t-SNE plot can give you an estimate of whether your acoustic features are useful at all:
+### Optional Dependencies
-<img src="meta/images/tsne.png" width="500px"/>
+Nkululeko supports optional dependencies through extras:
-### Data distribution
-Sometimes, you only want to take a look at your data:
+```bash
+# Install with PyTorch support
+pip install nkululeko[torch]
-<img src="meta/images/data_plot.png" width="500px"/>
+# Install with CPU-only PyTorch
+pip install nkululeko[torch-cpu]
-### Bias checking
-In some cases, you might wonder if there's bias in your data. You can try to detect this with automatically estimated speech properties by visualizing the correlation of target labels and predicted labels.
+# Install with TensorFlow support
+pip install nkululeko[tensorflow]
-<img src="meta/images/emotion-pesq.png" width="500px"/>
-### Uncertainty
-Nkululeko estimates the uncertainty of model decisions (only for classifiers) with entropy over the class probabilities or logits per sample.
-<img src="meta/images/uncertainty.png" width="500px"/>
+# Install all optional dependencies
+pip install nkululeko[all]
+```
+#### Manual Installation Options
+You can also install dependencies manually:
-## Documentation
-The documentation, along with extensions of installation, usage, INI file format, and examples, can be found [nkululeko.readthedocs.io](https://nkululeko.readthedocs.io).
-## Installation
+##### PyTorch Installation
-Create and activate a virtual Python environment and simply run
-```
-pip install nkululeko
-```
-We excluded some packages from the automatic installation because they might depend on your computer and some of them are only needed in special cases. So if the error
-```
-module x not found
-```
-appears, please try
-```
-pip install x
-```
-For many packages, you will need the missing torch package.
-If you don't have a GPU (which is probably true if you don't know what that is), please use
-```
-pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
-```
-else, you can use the default:
+For CPU-only installation (recommended for most users):
+```bash
+pip install torch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 --index-url https://download.pytorch.org/whl/cpu
 ```
+For GPU support (cuda 12.6):
+```bash
 pip install torch torchvision torchaudio
 ```
 Some functionalities require extra packages to be installed, which we didn't include automatically:
-* the SQUIM model needs a special torch version:
-  ```
-  pip uninstall -y torch torchvision torchaudio
-  pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
-  ```
-* the spotlight adapter needs spotlight:
-  ```
-  pip install renumics-spotlight sliceguard
+* For spotlight adapter:
+  ```bash
+  pip install PyYAML  # Install PyYAML first to avoid dependency issues
+  pip install nkululeko[spotlight]
   ```
+Some examples for *ini*-files (which you use to control nkululeko) are in the [examples folder](https://github.com/felixbur/nkululeko/tree/main/examples).
-Some examples for *ini*-files (which you use to control nkululeko) are in the [tests folder](https://github.com/felixbur/nkululeko/tree/main/tests).
-## Usage
+## Documentation
+The documentation, along with extensions of installation, usage, INI file format, and examples, can be found [nkululeko.readthedocs.io](https://nkululeko.readthedocs.io).
-### [ini-file values](./ini_file.md)
-Nkululeko works by specifiying
+## Usage
+### [ini-file values](./ini_file.md)
 Basically, you specify your experiment in an ["ini" file](./ini_file.md) (e.g. *experiment.ini*) and then call one of the Nkululeko interfaces to run the experiment like this:
-  * ```python -m nkululeko.nkululeko --config experiment.ini```
+  ```bash
+  python -m nkululeko.nkululeko --config experiment.ini
+  ```
 A basic configuration looks like this:
-```
+```ini
 [EXP]
 root = ./
 name = exp_emodb
@@ -159,20 +114,10 @@ Here is an overview of the interfaces/modules:
 All of them take *--config <my_config.ini>* as an argument.
-* **nkululeko.nkululeko**: do machine learning experiments combining features and learners
+* **nkululeko.nkululeko**: do machine learning experiments combining features and learners (e.g. opensmile with SVM)
 * **nkululeko.ensemble**: [combine several nkululeko experiments](http://blog.syntheticspeech.de/2024/06/25/nkululeko-ensemble-classifiers-with-late-fusion/) and report on late fusion results
-  * *--config*: which experiments (INI files) to combine
-  * *--method* (optional): majority_voting, mean (default), max, sum, uncertainty, uncertainty_weighted, confidence_weighted, performance_weighted
-  * *--threshold*: uncertainty threshold (1.0 means no threshold)
-  * *--weights*: weights for performance_weighted method (could be from previous UAR, ACC)
-  * *--outfile* (optional): name of CSV file for output (default: ensemble_result.csv)
-  * *--no_labels* (optional): indicate that no ground truth is given
 * **nkululeko.multidb**: do [multiple experiments](http://blog.syntheticspeech.de/2024/01/02/nkululeko-compare-several-databases/), comparing several databases cross and in itself
 * **nkululeko.demo**: [demo the current best model](http://blog.syntheticspeech.de/2022/01/24/nkululeko-try-out-demo-a-trained-model/) on the command line
-  * *--list* (optional) list of input files
-  * *--file* (optional) name of input file
-  * *--folder* (optional) parent folder for input files
-  * *--outfile* (optional) name of CSV file for output
 * **nkululeko.test**: predict a [given data set](http://blog.syntheticspeech.de/2022/09/01/nkululeko-how-to-evaluate-a-test-set-with-a-given-best-model/) with the current best model
 * **nkululeko.explore**: perform [data exploration](http://blog.syntheticspeech.de/2023/05/11/nkululeko-how-to-visualize-your-data-distribution/)
 * **nkululeko.augment**: [augment](http://blog.syntheticspeech.de/2023/03/13/nkululeko-how-to-augment-the-training-set/) the current training data
@@ -182,59 +127,7 @@ All of them take *--config <my_config.ini>* as an argument.
 * **nkululeko.resample**: check on all [sampling rates and change](http://blog.syntheticspeech.de/2023/08/31/how-to-fix-different-sampling-rates-in-a-dataset-with-nkululeko/) to 16kHz
 * **nkululeko.nkuluflag**: a convenient module to specify configuration parameters on the command line. Usage:
-  ```bash
-  $ python -m nkululeko.nkuluflag.py [-h] [--config CONFIG] [--data [DATA ...]] [--label [LABEL ...]] [--tuning_params [TUNING_PARAMS ...]] [--layers [LAYERS ...]] [--model MODEL] [--feat FEAT] [--set SET] [--with_os WITH_OS] [--target TARGET] [--epochs EPOCHS] [--runs RUNS] [--learning_rate LEARNING_RATE] [--drop DROP]
-  ```
-There's my [blog](http://blog.syntheticspeech.de/?s=nkululeko) with tutorials:
-* [Introduction](http://blog.syntheticspeech.de/2021/08/04/machine-learning-experiment-framework/)
-* [Nkulueko FAQ](http://blog.syntheticspeech.de/2022/07/07/nkululeko-faq/)
-* [How to set up your first nkululeko project](http://blog.syntheticspeech.de/2021/08/30/how-to-set-up-your-first-nkululeko-project/)
-* [Setting up a base nkululeko experiment](http://blog.syntheticspeech.de/2021/10/05/setting-up-a-base-nkululeko-experiment/)
-* [How to import a database](http://blog.syntheticspeech.de/2022/01/27/nkululeko-how-to-import-a-database/)
-* [Comparing classifiers and features](http://blog.syntheticspeech.de/2021/10/05/nkululeko-comparing-classifiers-and-features/)
-* [Use Praat features](http://blog.syntheticspeech.de/2022/06/27/how-to-use-selected-features-from-praat-with-nkululeko/)
-* [Combine feature sets](http://blog.syntheticspeech.de/2022/06/30/how-to-combine-feature-sets-with-nkululeko/)
-* [Classifying continuous variables](http://blog.syntheticspeech.de/2022/01/26/nkululeko-classifying-continuous-variables/)
-* [Try out / demo a trained model](http://blog.syntheticspeech.de/2022/01/24/nkululeko-try-out-demo-a-trained-model/)
-* [Perform cross-database experiments](http://blog.syntheticspeech.de/2021/10/05/nkululeko-perform-cross-database-experiments/)
-* [Meta parameter optimization](http://blog.syntheticspeech.de/2021/09/03/perform-optimization-with-nkululeko/)
-* [How to set up wav2vec embedding](http://blog.syntheticspeech.de/2021/12/03/how-to-set-up-wav2vec-embedding-for-nkululeko/)
-* [How to soft-label a database](http://blog.syntheticspeech.de/2022/01/24/how-to-soft-label-a-database-with-nkululeko/)
-* [Re-generate the progressing confusion matrix animation wit a different framerate](demos/plot_faster_anim.py)
-* [How to limit/filter a dataset](http://blog.syntheticspeech.de/2022/02/22/how-to-limit-a-dataset-with-nkululeko/)
-* [Specifying database disk location](http://blog.syntheticspeech.de/2022/02/21/specifying-database-disk-location-with-nkululeko/)
-* [Add dropout with MLP models](http://blog.syntheticspeech.de/2022/02/25/adding-dropout-to-mlp-models-with-nkululeko/)
-* [Do cross-validation](http://blog.syntheticspeech.de/2022/03/23/how-to-do-cross-validation-with-nkululeko/)
-* [Combine predictions per speaker](http://blog.syntheticspeech.de/2022/03/24/how-to-combine-predictions-per-speaker-with-nkululeko/)
-* [Run multiple experiments in one go](http://blog.syntheticspeech.de/2022/03/28/how-to-run-multiple-experiments-in-one-go-with-nkululeko/)
-* [Compare several MLP layer layouts with each other](http://blog.syntheticspeech.de/2022/04/11/how-to-compare-several-mlp-layer-layouts-with-each-other/)
-* [Import features from outside the software](http://blog.syntheticspeech.de/2022/10/18/how-to-import-features-from-outside-the-nkululeko-software/)
-* [Export acoustic features](http://blog.syntheticspeech.de/2024/05/30/nkululeko-export-acoustic-features/)
-* [Explore feature importance](http://blog.syntheticspeech.de/2023/02/20/nkululeko-show-feature-importance/)
-* [Plot distributions for feature values](http://blog.syntheticspeech.de/2023/02/16/nkululeko-how-to-plot-distributions-of-feature-values/)
-* [Show feature importance](http://blog.syntheticspeech.de/2023/02/20/nkululeko-show-feature-importance/)
-* [Augment the training set](http://blog.syntheticspeech.de/2023/03/13/nkululeko-how-to-augment-the-training-set/)
-* [Visualize clusters of acoustic features](http://blog.syntheticspeech.de/2023/04/20/nkululeko-visualize-clusters-of-your-acoustic-features/)
-* [Visualize your data distribution](http://blog.syntheticspeech.de/2023/05/11/nkululeko-how-to-visualize-your-data-distribution/)
-* [Check your dataset](http://blog.syntheticspeech.de/2023/07/11/nkululeko-check-your-dataset/)
-* [Segmenting a database](http://blog.syntheticspeech.de/2023/07/14/nkululeko-segmenting-a-database/)
-* [Predict new labels for your data from public models and check bias](http://blog.syntheticspeech.de/2023/08/16/nkululeko-how-to-predict-labels-for-your-data-from-existing-models-and-check-them/)
-* [Resample](http://blog.syntheticspeech.de/2023/08/31/how-to-fix-different-sampling-rates-in-a-dataset-with-nkululeko/)
-* [Get some statistics on correlation and effect-size](http://blog.syntheticspeech.de/2023/09/05/nkululeko-get-some-statistics-on-correlation-and-effect-size/)
-* [Automatic generation of a latex/pdf report](http://blog.syntheticspeech.de/2023/09/26/nkululeko-generate-a-latex-pdf-report/)
-* [Inspect your data with Spotlight](http://blog.syntheticspeech.de/2023/10/31/nkululeko-inspect-your-data-with-spotlight/)
-* [Automatically stratify your split sets](http://blog.syntheticspeech.de/2023/11/07/nkululeko-automatically-stratify-your-split-sets/)
-* [re-name data column names](http://blog.syntheticspeech.de/2023/11/16/nkululeko-re-name-data-column-names/)
-* [Oversample the training set](http://blog.syntheticspeech.de/2023/11/16/nkululeko-oversample-the-training-set/)
-* [Compare several databases](http://blog.syntheticspeech.de/2024/01/02/nkululeko-compare-several-databases/)
-* [Tweak the target variable for database comparison](http://blog.syntheticspeech.de/2024/03/13/nkululeko-how-to-tweak-the-target-variable-for-database-comparison/)
-* [How to run multiple experiments in one go](http://blog.syntheticspeech.de/2022/03/28/how-to-run-multiple-experiments-in-one-go-with-nkululeko/)
-* [How to finetune a transformer-model](http://blog.syntheticspeech.de/2024/05/29/nkululeko-how-to-finetune-a-transformer-model/)
-* [Ensemble (combine) classifiers with late-fusion](http://blog.syntheticspeech.de/2024/06/25/nkululeko-ensemble-classifiers-with-late-fusion/)
-* [Use train, dev and test splits](https://blog.syntheticspeech.de/2025/03/31/nkululeko-how-to-use-train-dev-test-splits/)
-### <a name="helloworld">Hello World example</a>
+## <a name="helloworld">Hello World example</a>
 * NEW: [Here's a Google colab that runs this example out-of-the-box](https://colab.research.google.com/drive/1Up7t5Nn7VwDPCCEpTg2U7cpZ_PdoEgj-?usp=sharing), and here is the same [with Kaggle](https://www.kaggle.com/felixburk/nkululeko-hello-world-example)
 * [I made a video to show you how to do this on Windows](https://www.youtube.com/playlist?list=PLRceVavtxLg0y2jiLmpnUfiMtfvkK912D)
 * Set up Python on your computer, version >= 3.8
@@ -266,7 +159,7 @@ There's my [blog](http://blog.syntheticspeech.de/?s=nkululeko) with tutorials:
 * Inspect and play around with the [demo configuration file](meta/demos/exp_emodb.ini) that defined your experiment, then re-run.
 * There are many ways to experiment with different classifiers and acoustic feature sets, [all described here](https://github.com/felixbur/nkululeko/blob/main/ini_file.md)
-### Features
+## Features
 The framework is targeted at the speech domain and supports experiments where different classifiers are combined with different feature extractors.
 * Classifiers: Naive Bayes, KNN, Tree, XGBoost, SVM, MLP
@@ -275,6 +168,7 @@ The framework is targeted at the speech domain and supports experiments where di
 * Label encoding
 * Binning (continuous to categorical)
 * Online demo interface for trained models
+* Visualization: confusion matrix, feature importance, feature distribution, epoch progression, t-SNE plot, data distribution, bias checking, uncertainty estimation
 Here's a rough UML-like sketch of the framework (and [here's the real one done with pyreverse](meta/images/classes.png)).
 ![sketch](meta/images/class_diagram.png)
@@ -284,8 +178,89 @@ Currently, the following linear classifiers are implemented (integrated from skl
   and the following ANNs (artificial neural networks)
 * MLP (multi-layer perceptron), CNN (convolutional neural network)
-Here's [an animation that shows the progress of classification done with nkululeko](https://youtu.be/6Y0M382GjvM)
+For visualization, besides confusion matrix, feature importance, feature distribution, t-SNE plot, data distribution (just names a few), Nkululeko can also be used for bias checking, uncertainty estimation, and epoch progression.
+### Bias checking
+<details>
+In some cases, you might wonder if there's bias in your data. You can try to detect this with automatically estimated speech properties by visualizing the correlation of target labels and predicted labels.
+<img src="meta/images/emotion-pesq.png" width="500px"/>
+</details>
+### Uncertainty
+<details>
+Nkululeko estimates the uncertainty of model decisions (only for classifiers) with entropy over the class probabilities or logits per sample.
+<img src="meta/images/uncertainty.png" width="500px"/>
+</details>
+Here's [an animation that shows the progress of classification done with nkululeko](https://youtu.be/6Y0M382GjvM).
+## News
+<details>
+There's Felix [blog](http://blog.syntheticspeech.de/?s=nkululeko) with tutorials below:
+* [Ensemble learning with Nkululeko](http://blog.syntheticspeech.de/2024/06/25/nkululeko-ensemble-classifiers-with-late-fusion/)
+* [Finetune transformer-models with Nkululeko](http://blog.syntheticspeech.de/2024/05/29/nkululeko-how-to-finetune-a-transformer-model/)
+* Below is a [Hello World example for Nkululeko](#helloworld) that should set you up fastly, also on [Google Colab](https://colab.research.google.com/drive/1GYNBd5cdZQ1QC3Jm58qoeMaJg3UuPhjw?usp=sharing#scrollTo=4G_SjuF9xeQf), and [with Kaggle](https://www.kaggle.com/felixburk/nkululeko-hello-world-example)
+* [Thanks to deepwiki, here's an analysis of the source code](https://deepwiki.com/felixbur/nkululeko)
+* [Here's a blog post on how to set up nkululeko on your computer.](http://blog.syntheticspeech.de/2021/08/30/how-to-set-up-your-first-nkululeko-project/)
+* [Here's a slide presentation about nkululeko](docs/nkululeko.pdf)
+* [Here's a video presentation about nkululeko](https://www.youtube.com/playlist?list=PLRceVavtxLg0y2jiLmpnUfiMtfvkK912D)
+* [Here's the 2022 LREC article on nkululeko](http://felix.syntheticspeech.de/publications/Nkululeko_LREC.pdf)
+* [Introduction](http://blog.syntheticspeech.de/2021/08/04/machine-learning-experiment-framework/)
+* [Nkululeko FAQ](http://blog.syntheticspeech.de/2022/07/07/nkululeko-faq/)
+* [How to set up your first nkululeko project](http://blog.syntheticspeech.de/2021/08/30/how-to-set-up-your-first-nkululeko-project/)
+* [Setting up a base nkululeko experiment](http://blog.syntheticspeech.de/2021/10/05/setting-up-a-base-nkululeko-experiment/)
+* [How to import a database](http://blog.syntheticspeech.de/2022/01/27/nkululeko-how-to-import-a-database/)
+* [Comparing classifiers and features](http://blog.syntheticspeech.de/2021/10/05/nkululeko-comparing-classifiers-and-features/)
+* [Use Praat features](http://blog.syntheticspeech.de/2022/06/27/how-to-use-selected-features-from-praat-with-nkululeko/)
+* [Combine feature sets](http://blog.syntheticspeech.de/2022/06/30/how-to-combine-feature-sets-with-nkululeko/)
+* [Classifying continuous variables](http://blog.syntheticspeech.de/2022/01/26/nkululeko-classifying-continuous-variables/)
+* [Try out / demo a trained model](http://blog.syntheticspeech.de/2022/01/24/nkululeko-try-out-demo-a-trained-model/)
+* [Perform cross-database experiments](http://blog.syntheticspeech.de/2021/10/05/nkululeko-perform-cross-database-experiments/)
+* [Meta parameter optimization](http://blog.syntheticspeech.de/2021/09/03/perform-optimization-with-nkululeko/)
+* [How to set up wav2vec embedding](http://blog.syntheticspeech.de/2021/12/03/how-to-set-up-wav2vec-embedding-for-nkululeko/)
+* [How to soft-label a database](http://blog.syntheticspeech.de/2022/01/24/how-to-soft-label-a-database-with-nkululeko/)
+* [Re-generate the progressing confusion matrix animation wit a different framerate](demos/plot_faster_anim.py)
+* [How to limit/filter a dataset](http://blog.syntheticspeech.de/2022/02/22/how-to-limit-a-dataset-with-nkululeko/)
+* [Specifying database disk location](http://blog.syntheticspeech.de/2022/02/21/specifying-database-disk-location-with-nkululeko/)
+* [Add dropout with MLP models](http://blog.syntheticspeech.de/2022/02/25/adding-dropout-to-mlp-models-with-nkululeko/)
+* [Do cross-validation](http://blog.syntheticspeech.de/2022/03/23/how-to-do-cross-validation-with-nkululeko/)
+* [Combine predictions per speaker](http://blog.syntheticspeech.de/2022/03/24/how-to-combine-predictions-per-speaker-with-nkululeko/)
+* [Run multiple experiments in one go](http://blog.syntheticspeech.de/2022/03/28/how-to-run-multiple-experiments-in-one-go-with-nkululeko/)
+* [Compare several MLP layer layouts with each other](http://blog.syntheticspeech.de/2022/04/11/how-to-compare-several-mlp-layer-layouts-with-each-other/)
+* [Import features from outside the software](http://blog.syntheticspeech.de/2022/10/18/how-to-import-features-from-outside-the-nkululeko-software/)
+* [Export acoustic features](http://blog.syntheticspeech.de/2024/05/30/nkululeko-export-acoustic-features/)
+* [Explore feature importance](http://blog.syntheticspeech.de/2023/02/20/nkululeko-show-feature-importance/)
+* [Plot distributions for feature values](http://blog.syntheticspeech.de/2023/02/16/nkululeko-how-to-plot-distributions-of-feature-values/)
+* [Show feature importance](http://blog.syntheticspeech.de/2023/02/20/nkululeko-show-feature-importance/)
+* [Augment the training set](http://blog.syntheticspeech.de/2023/03/13/nkululeko-how-to-augment-the-training-set/)
+* [Visualize clusters of acoustic features](http://blog.syntheticspeech.de/2023/04/20/nkululeko-visualize-clusters-of-your-acoustic-features/)
+* [Visualize your data distribution](http://blog.syntheticspeech.de/2023/05/11/nkululeko-how-to-visualize-your-data-distribution/)
+* [Check your dataset](http://blog.syntheticspeech.de/2023/07/11/nkululeko-check-your-dataset/)
+* [Segmenting a database](http://blog.syntheticspeech.de/2023/07/14/nkululeko-segmenting-a-database/)
+* [Predict new labels for your data from public models and check bias](http://blog.syntheticspeech.de/2023/08/16/nkululeko-how-to-predict-labels-for-your-data-from-existing-models-and-check-them/)
+* [Resample](http://blog.syntheticspeech.de/2023/08/31/how-to-fix-different-sampling-rates-in-a-dataset-with-nkululeko/)
+* [Get some statistics on correlation and effect-size](http://blog.syntheticspeech.de/2023/09/05/nkululeko-get-some-statistics-on-correlation-and-effect-size/)
+* [Automatic generation of a latex/pdf report](http://blog.syntheticspeech.de/2023/09/26/nkululeko-generate-a-latex-pdf-report/)
+* [Inspect your data with Spotlight](http://blog.syntheticspeech.de/2023/10/31/nkululeko-inspect-your-data-with-spotlight/)
+* [Automatically stratify your split sets](http://blog.syntheticspeech.de/2023/11/07/nkululeko-automatically-stratify-your-split-sets/)
+* [re-name data column names](http://blog.syntheticspeech.de/2023/11/16/nkululeko-re-name-data-column-names/)
+* [Oversample the training set](http://blog.syntheticspeech.de/2023/11/16/nkululeko-oversample-the-training-set/)
+* [Compare several databases](http://blog.syntheticspeech.de/2024/01/02/nkululeko-compare-several-databases/)
+* [Tweak the target variable for database comparison](http://blog.syntheticspeech.de/2024/03/13/nkululeko-how-to-tweak-the-target-variable-for-database-comparison/)
+* [How to run multiple experiments in one go](http://blog.syntheticspeech.de/2022/03/28/how-to-run-multiple-experiments-in-one-go-with-nkululeko/)
+* [How to finetune a transformer-model](http://blog.syntheticspeech.de/2024/05/29/nkululeko-how-to-finetune-a-transformer-model/)
+* [Ensemble (combine) classifiers with late-fusion](http://blog.syntheticspeech.de/2024/06/25/nkululeko-ensemble-classifiers-with-late-fusion/)
+* [Use train, dev and test splits](https://blog.syntheticspeech.de/2025/03/31/nkululeko-how-to-use-train-dev-test-splits/)
+</details>
 ## License
 Nkululeko can be used under the [MIT license](https://choosealicense.com/licenses/mit/).
@@ -294,8 +269,8 @@ Nkululeko can be used under the [MIT license](https://choosealicense.com/license
 ## Contributing
 Contributions are welcome and encouraged. To learn more about how to contribute to nkululeko, please refer to the [Contributing guidelines](./CONTRIBUTING.md).
-## Citing
-If you use it, please mention the Nkululeko paper:
+## Citation
+If you use Nkululeko, please cite the paper:
 > F. Burkhardt, Johannes Wagner, Hagen Wierstorf, Florian Eyben and Björn Schuller: Nkululeko: A Tool For Rapid Speaker Characteristics Detection, Proc. Proc. LREC, 2022

{nkululeko-0.94.1 → nkululeko-0.94.3}/nkululeko/augmenting/randomsplicer.py RENAMED

@@ -5,7 +5,7 @@ Code originally by Oliver Pauly
 Based on an idea by Klaus Scherer
-K. R. Scherer, “Randomized splicing: A note on a simple technique for masking speech content”
+K. R. Scherer, “Randomized splicing: A note on a simple technique for masking speech content”
 Journal of Experimental Research in Personality, vol. 5, pp. 155–159, 1971.
 Evaluated in:

{nkululeko-0.94.1 → nkululeko-0.94.3}/nkululeko/augmenting/randomsplicing.py RENAMED

@@ -3,7 +3,7 @@ Code originally by Oliver Pauly
 Based on an idea by Klaus Scherer
-K. R. Scherer, “Randomized splicing: A note on a simple technique for masking speech content”
+K. R. Scherer, “Randomized splicing: A note on a simple technique for masking speech content”
 Journal of Experimental Research in Personality, vol. 5, pp. 155–159, 1971.
 Evaluated in:

{nkululeko-0.94.1 → nkululeko-0.94.3}/nkululeko/augmenting/resampler.py RENAMED

@@ -17,7 +17,7 @@ class Resampler:
     def __init__(self, df, replace, not_testing=True):
         self.SAMPLING_RATE = 16000
         self.df = df
-        self.util = Util("resampler", has_config=not_testing)
+        self.util = Util("resampler", has_config=not not_testing)
         self.util.warn(f"all files might be resampled to {self.SAMPLING_RATE}")
         self.not_testing = not_testing
         self.replace = (
@@ -30,7 +30,7 @@ class Resampler:
         files = self.df.index.get_level_values(0).values
         # replace = eval(self.util.config_val("RESAMPLE", "replace", "False"))
         replace = self.replace
-        if self.not_testing:
+        if not self.not_testing:
             store = self.util.get_path("store")
         else:
             store = "./"
@@ -67,17 +67,25 @@ class Resampler:
             self.df = self.df.set_index(
                 self.df.index.set_levels(new_files, level="file")
             )
-            target_file = self.util.config_val("RESAMPLE", "target", "resampled.csv")
-            # remove encoded labels
-            target = self.util.config_val("DATA", "target", "emotion")
-            if "class_label" in self.df.columns:
-                self.df = self.df.drop(columns=[target])
-                self.df = self.df.rename(columns={"class_label": target})
-            # save file
-            self.df.to_csv(target_file)
-            self.util.debug(
-                "saved resampled list of files to" f" {os.path.abspath(target_file)}"
-            )
+            if not self.not_testing:
+                target_file = self.util.config_val("RESAMPLE", "target", "resampled.csv")
+                # remove encoded labels
+                target = self.util.config_val("DATA", "target", "emotion")
+                if "class_label" in self.df.columns:
+                    self.df = self.df.drop(columns=[target])
+                    self.df = self.df.rename(columns={"class_label": target})
+                # save file
+                self.df.to_csv(target_file)
+                self.util.debug(
+                    "saved resampled list of files to" f" {os.path.abspath(target_file)}"
+                )
+            else:
+                # When running from command line, save to simple resampled.csv
+                target_file = "resampled.csv"
+                self.df.to_csv(target_file)
+                self.util.debug(
+                    f"saved resampled list of files to {os.path.abspath(target_file)}"
+                )
         self.util.debug(f"resampled {succes} files, {error} errors")
@@ -91,7 +99,7 @@ def main():
         df_sample.index, allow_nat=False
     )
     df_sample.head(10)
-    resampler = Resampler(df_sample, not_testing=False)
+    resampler = Resampler(df_sample, False, not_testing=False)
     resampler.resample()
     shutil.copyfile(testfile, "tmp.resample_result.wav")
     shutil.copyfile("tmp.wav", testfile)
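
A reading aid for the resampler.py hunks above, not code from the package: the sketch below mirrors the branch structure the new version introduces. As the hunks read, the configuration is consulted only when `not_testing` is false, and a plain `resampled.csv` in the working directory is used otherwise. The function name `pick_target_file` and the dict standing in for the configuration are hypothetical; the real class goes through `Util.config_val`.

```python
import os


def pick_target_file(not_testing, config=None):
    """Return the absolute path for the resampled file list.

    Hypothetical helper mirroring the branch in the diff above; `config`
    is a plain dict standing in for nkululeko's configuration lookup.
    """
    if not not_testing:
        # config-backed branch: read [RESAMPLE] target, default "resampled.csv"
        section = (config or {}).get("RESAMPLE", {})
        target_file = section.get("target", "resampled.csv")
    else:
        # fallback branch: fixed file name in the current directory
        target_file = "resampled.csv"
    return os.path.abspath(target_file)
```

With this shape, both branches fall back to the same default file name; only an explicit `target` entry in the configuration changes the result.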

{nkululeko-0.94.1 → nkululeko-0.94.3}/nkululeko/autopredict/ap_age.py RENAMED

@@ -1,4 +1,4 @@
-""""
+""" "
 A predictor for age.
 Currently based on audEERING's agender model.
 """

{nkululeko-0.94.1 → nkululeko-0.94.3}/nkululeko/autopredict/ap_arousal.py RENAMED

@@ -1,4 +1,4 @@
-""""
+""" "
 A predictor for emotional arousal.
 Currently based on audEERING's emotional dimension model.
 """

{nkululeko-0.94.1 → nkululeko-0.94.3}/nkululeko/autopredict/ap_gender.py RENAMED

@@ -1,4 +1,4 @@
-""""
+""" "
 A predictor for biological sex.
 Currently based on audEERING's agender model.
 """

{nkululeko-0.94.1 → nkululeko-0.94.3}/nkululeko/autopredict/ap_mos.py RENAMED

@@ -1,4 +1,4 @@
-""""
+""" "
 A predictor for MOS - mean opinion score.
 """

{nkululeko-0.94.1 → nkululeko-0.94.3}/nkululeko/autopredict/ap_pesq.py RENAMED

@@ -1,4 +1,4 @@
-""""
+""" "
 A predictor for PESQ - Perceptual Evaluation of Speech Quality.
 """

{nkululeko-0.94.1 → nkululeko-0.94.3}/nkululeko/autopredict/ap_sdr.py RENAMED

@@ -1,6 +1,6 @@
-""""
+""" "
 A predictor for SDR - Signal to Distortion Ratio.
-as estimated by Scale-Invariant Signal-to-Distortion Ratio (SI-SDR)
+as estimated by Scale-Invariant Signal-to-Distortion Ratio (SI-SDR)
 """
 import ast

{nkululeko-0.94.1 → nkululeko-0.94.3}/nkululeko/autopredict/ap_sid.py RENAMED

@@ -1,4 +1,4 @@
-""""
+""" "
 A predictor for sid - Speaker ID.
 """
