PyPI - nkululeko - Versions diffs - 0.57.0__tar.gz → 0.58.0__tar.gz - Mend

nkululeko 0.57.0.tar.gz → 0.58.0.tar.gz


Files changed (88)

{nkululeko-0.57.0 → nkululeko-0.58.0}/CHANGELOG.md RENAMED

@@ -1,6 +1,12 @@
 Changelog
 =========
+Version 0.58.0
+--------------
+* added dominance predict
+* added MOS predict
+* added PESQ predict
 Version 0.57.0
 --------------
 * renamed autopredict predict

{nkululeko-0.57.0/nkululeko.egg-info → nkululeko-0.58.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: nkululeko
-Version: 0.57.0
+Version: 0.58.0
 Summary: Machine learning audio prediction experiments based on templates
 Home-page: https://github.com/felixbur/nkululeko
 Author: Felix Burkhardt
@@ -17,6 +17,7 @@ License-File: LICENSE
 # Nkululeko
 * [Overview](#overview)
 * [Installation](#installation)
+* [Documentation](https://nkululeko.readthedocs.io)
 * [Usage](#usage)
 * [Hello World](#hello-world-example)
 * [Licence](#licence)
@@ -66,7 +67,11 @@ Sometimes you only want to take a look at your data:
 <img src="meta/images/data_plot.png" width="500px"/>
-## Installatione
+## Documentation
+The documentation, along with extensions of installation, usage, INI file format, and examples, can be found [nkululeko.readthedocs.io](https://nkululeko.readthedocs.io).
+## Installation
 Create and activate a virtual Python environment and simply run
 ```
@@ -106,10 +111,11 @@ Read the [Hello World example](#hello-world-example) for initial usage with Emo-
 Here is an overview of the interfaces:
 * **nkululeko.nkululeko**: doing experiments
-* **nkululeko.demo**: demo the current best model on command line
+* **nkululeko.demo**: demo the current best model on the command line
 * **nkululeko.test**: predict a series of files with the current best model
 * **nkululeko.explore**: perform data exploration
 * **nkululeko.augment**: augment the current training data
+* **nkululeko.predict**: predict a series of files with a given model
 Alternatively, there is a central "experiment" class that can be used by own experiments
@@ -217,6 +223,12 @@ Nkululeko can be used under the [MIT license](https://choosealicense.com/license
 Changelog
 =========
+Version 0.58.0
+--------------
+* added dominance predict
+* added MOS predict
+* added PESQ predict
 Version 0.57.0
 --------------
 * renamed autopredict predict

{nkululeko-0.57.0 → nkululeko-0.58.0}/README.md RENAMED

@@ -1,6 +1,7 @@
 # Nkululeko
 * [Overview](#overview)
 * [Installation](#installation)
+* [Documentation](https://nkululeko.readthedocs.io)
 * [Usage](#usage)
 * [Hello World](#hello-world-example)
 * [Licence](#licence)
@@ -50,7 +51,11 @@ Sometimes you only want to take a look at your data:
 <img src="meta/images/data_plot.png" width="500px"/>
-## Installatione
+## Documentation
+The documentation, along with extensions of installation, usage, INI file format, and examples, can be found [nkululeko.readthedocs.io](https://nkululeko.readthedocs.io).
+## Installation
 Create and activate a virtual Python environment and simply run
 ```
@@ -90,10 +95,11 @@ Read the [Hello World example](#hello-world-example) for initial usage with Emo-
 Here is an overview of the interfaces:
 * **nkululeko.nkululeko**: doing experiments
-* **nkululeko.demo**: demo the current best model on command line
+* **nkululeko.demo**: demo the current best model on the command line
 * **nkululeko.test**: predict a series of files with the current best model
 * **nkululeko.explore**: perform data exploration
 * **nkululeko.augment**: augment the current training data
+* **nkululeko.predict**: predict a series of files with a given model
 Alternatively, there is a central "experiment" class that can be used by own experiments

nkululeko-0.58.0/nkululeko/ap_dominance.py ADDED

@@ -0,0 +1,29 @@
+"""
+A predictor for emotional dominance.
+Currently based on audEERING's emotional dimension model.
+"""
+from nkululeko.util import Util
+from nkululeko.feature_extractor import FeatureExtractor
+import ast
+import nkululeko.glob_conf as glob_conf
+class DominancePredictor:
+    """
+    DominancePredictor
+    predicting dominance with the audEERING emotional dimension model
+    """
+    def __init__(self, df):
+        self.df = df
+        self.util = Util('dominancePredictor')
+    def predict(self, split_selection):
+        self.util.debug(f'predicting dominance for {split_selection} samples')
+        feats_name = "_".join(ast.literal_eval(glob_conf.config['DATA']['databases']))
+        self.feature_extractor = FeatureExtractor(self.df, ['auddim'], feats_name, split_selection)
+        pred_df = self.feature_extractor.extract()
+        pred_vals = pred_df.dominance * 1000
+        return_df = self.df.copy()
+        return_df['dominance_pred'] = pred_vals.astype('int')/1000
+        return return_df
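The last lines of `predict` multiply the predictions by 1000, cast them to integer, and divide by 1000 again. That truncates each value to three decimal places (towards zero) rather than rounding it. A minimal sketch of the same arithmetic in plain Python, without pandas; `truncate` is a hypothetical helper for illustration and not part of nkululeko:

```python
def truncate(value: float, decimals: int) -> float:
    """Cut a float off after `decimals` places, mirroring int(x * 10**n) / 10**n."""
    factor = 10 ** decimals
    # int() drops the fractional part, so this truncates towards zero
    return int(value * factor) / factor


if __name__ == "__main__":
    print(truncate(0.98765, 3))  # truncated value
    print(round(0.98765, 3))     # rounded value, for comparison
```

If rounding to the nearest value is what is wanted, `round` (or `Series.round` in pandas) would be the usual choice; the diff itself does not say which behaviour is intended.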

nkululeko-0.58.0/nkululeko/ap_mos.py ADDED

@@ -0,0 +1,35 @@
+""""
+A predictor for MOS - mean opinion score.
+"""
+from nkululeko.util import Util
+import ast
+import nkululeko.glob_conf as glob_conf
+from nkululeko.feature_extractor import FeatureExtractor
+import numpy as np
+class MOSPredictor:
+    """
+    MOSPredictor
+    predicting MOS
+    """
+    def __init__(self, df):
+        self.df = df
+        self.util = Util('mosPredictor')
+    def predict(self, split_selection):
+        self.util.debug(f'estimating MOS for {split_selection} samples')
+        return_df = self.df.copy()
+        feats_name = "_".join(ast.literal_eval(glob_conf.config['DATA']['databases']))
+        self.feature_extractor = FeatureExtractor(self.df, ['mos'], feats_name, split_selection)
+        result_df = self.feature_extractor.extract()
+        # replace missing values by 0
+        result_df = result_df.fillna(0)
+        result_df = result_df.replace(np.nan, 0)
+        result_df.replace([np.inf, -np.inf], 0, inplace=True)
+        pred_snr = result_df.mos * 100
+        return_df['mos_pred'] = pred_snr.astype('int')/100
+        return return_df
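Before truncating, the MOS predictor replaces missing and infinite predictions with 0. The second replacement of `np.nan` appears redundant, since `fillna(0)` has already removed those values. The same clean-up can be sketched with the standard library only; `sanitize` is a hypothetical name and the list-based interface is an assumption made to keep the example free of pandas:

```python
import math


def sanitize(values, default=0.0):
    """Replace NaN and +/-inf entries with `default`, leaving finite values untouched."""
    # math.isfinite() is False for NaN, inf and -inf
    return [v if math.isfinite(v) else default for v in values]


if __name__ == "__main__":
    print(sanitize([4.2, float("nan"), float("inf"), float("-inf")]))
```

Note that mapping failed predictions to 0 makes them indistinguishable from a genuine score of 0 in later analysis; whether that matters depends on how the `mos_pred` column is used.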

nkululeko-0.58.0/nkululeko/ap_pesq.py ADDED

@@ -0,0 +1,35 @@
+""""
+A predictor for PESQ - Perceptual Evaluation of Speech Quality.
+"""
+from nkululeko.util import Util
+import ast
+import nkululeko.glob_conf as glob_conf
+from nkululeko.feature_extractor import FeatureExtractor
+import numpy as np
+class PESQPredictor:
+    """
+    PESQPredictor
+    predicting PESQ
+    """
+    def __init__(self, df):
+        self.df = df
+        self.util = Util('pesqPredictor')
+    def predict(self, split_selection):
+        self.util.debug(f'estimating PESQ for {split_selection} samples')
+        return_df = self.df.copy()
+        feats_name = "_".join(ast.literal_eval(glob_conf.config['DATA']['databases']))
+        self.feature_extractor = FeatureExtractor(self.df, ['pesq'], feats_name, split_selection)
+        result_df = self.feature_extractor.extract()
+        # replace missing values by 0
+        result_df = result_df.fillna(0)
+        result_df = result_df.replace(np.nan, 0)
+        result_df.replace([np.inf, -np.inf], 0, inplace=True)
+        pred_vals = result_df.pesq * 100
+        return_df['pesq_pred'] = pred_vals.astype('int')/100
+        return return_df

nkululeko-0.58.0/nkululeko/constants.py ADDED

@@ -0,0 +1 @@
+VERSION = '0.58.0'

{nkululeko-0.57.0 → nkululeko-0.58.0}/nkululeko/dataset.py RENAMED

@@ -67,7 +67,6 @@ class Dataset:
         # store the dataframe
         store = self.util.get_path('store')
         store_file = f'{store}{self.name}.pkl'
-        self.util.debug(f'{self.name}: loading ...')
         self.root = self._load_db()
 #        self.got_speaker, self.got_gender = False, False
         if not self.start_fresh and os.path.isfile(store_file):

{nkululeko-0.57.0 → nkululeko-0.58.0}/nkululeko/experiment.py RENAMED

@@ -82,7 +82,8 @@ class Experiment:
                 self.got_speaker = True
             self.datasets.update({d: data})
         self.target = self.util.config_val('DATA', 'target', 'emotion')
-        self.util.debug(f'loaded databases {self.datasets.keys()}')
+        dbs = ','.join(list(self.datasets.keys()))
+        self.util.debug(f'loaded databases {dbs}')
     def _import_csv(self, storage):
         # df = pd.read_csv(storage, header=0, index_col=[0,1,2])
@@ -353,6 +354,14 @@ class Experiment:
                 from nkululeko.ap_snr import SNRPredictor
                 predictor = SNRPredictor(df)
                 df = predictor.predict(sample_selection)
+            elif target == 'mos':
+                from nkululeko.ap_mos import MOSPredictor
+                predictor = MOSPredictor(df)
+                df = predictor.predict(sample_selection)
+            elif target == 'pesq':
+                from nkululeko.ap_pesq import PESQPredictor
+                predictor = PESQPredictor(df)
+                df = predictor.predict(sample_selection)
             elif target == 'arousal':
                 from nkululeko.ap_arousal import ArousalPredictor
                 predictor = ArousalPredictor(df)
@@ -361,8 +370,12 @@ class Experiment:
                 from nkululeko.ap_valence import ValencePredictor
                 predictor = ValencePredictor(df)
                 df = predictor.predict(sample_selection)
+            elif target == 'dominance':
+                from nkululeko.ap_dominance import DominancePredictor
+                predictor = DominancePredictor(df)
+                df = predictor.predict(sample_selection)
             else:
-                self.util.error(f'unkown auto predict target: {target}')
+                self.util.error(f'unknown auto predict target: {target}')
         return df
     def random_splice(self):
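The hunks above extend an `elif` chain that maps a target name to a predictor class and imports the module only when it is needed. One possible alternative, shown here purely as a sketch, is a lookup table combined with `importlib`. The table lists only the targets visible in these hunks; `PREDICTORS`, `resolve_predictor` and `load_predictor_class` are hypothetical names, not nkululeko API:

```python
import importlib

# target name -> (module path, class name), taken from the imports shown in the diff
PREDICTORS = {
    "snr": ("nkululeko.ap_snr", "SNRPredictor"),
    "mos": ("nkululeko.ap_mos", "MOSPredictor"),
    "pesq": ("nkululeko.ap_pesq", "PESQPredictor"),
    "arousal": ("nkululeko.ap_arousal", "ArousalPredictor"),
    "valence": ("nkululeko.ap_valence", "ValencePredictor"),
    "dominance": ("nkululeko.ap_dominance", "DominancePredictor"),
}


def resolve_predictor(target):
    """Return (module path, class name) for a target, or raise for unknown targets."""
    try:
        return PREDICTORS[target]
    except KeyError:
        raise ValueError(f"unknown auto predict target: {target}") from None


def load_predictor_class(target):
    """Import the predictor module lazily and return the class object."""
    module_name, class_name = resolve_predictor(target)
    module = importlib.import_module(module_name)  # imported only on demand
    return getattr(module, class_name)
```

With nkululeko installed, `load_predictor_class("mos")(df).predict(sample_selection)` would correspond to the new `mos` branch. Adding a further target then means adding one table entry instead of another `elif`.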

nkululeko-0.58.0/nkululeko/feats_mos.py ADDED

@@ -0,0 +1,92 @@
+""" feats_mos.py
+predict MOS (mean opinion score)
+adapted from
+from https://pytorch.org/audio/main/tutorials/squim_tutorial.html#sphx-glr-tutorials-squim-tutorial-py
+paper: https://arxiv.org/pdf/2304.01448.pdf
+needs
+pip uninstall -y torch torchvision torchaudio
+pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
+"""
+from nkululeko.util import Util
+from nkululeko.featureset import Featureset
+import os
+import pandas as pd
+import os
+import nkululeko.glob_conf as glob_conf
+import audiofile
+import torch
+import torchaudio
+from torchaudio.pipelines import SQUIM_SUBJECTIVE
+from torchaudio.utils import download_asset
+class MOSSet(Featureset):
+    """Class to predict MOS (mean opinion score)
+    """
+    def __init__(self, name, data_df):
+        """Constructor. is_train is needed to distinguish from test/dev sets, because they use the codebook from the training"""
+        super().__init__(name, data_df)
+        self.device = self.util.config_val('MODEL', 'device', 'cpu')
+        self.model_initialized = False
+    def init_model(self):
+        # load model
+        self.util.debug('loading MOS model...')
+        self.subjective_model = SQUIM_SUBJECTIVE.get_model()
+        NMR_SPEECH = download_asset("tutorial-assets/ctc-decoding/1688-142285-0007.wav")
+        self.WAVEFORM_NMR, SAMPLE_RATE_NMR = torchaudio.load(NMR_SPEECH)
+        self.model_initialized = True
+    def extract(self):
+        """Extract the features or load them from disk if present."""
+        store = self.util.get_path('store')
+        store_format = self.util.config_val('FEATS', 'store_format', 'pkl')
+        storage = f'{store}{self.name}.{store_format}'
+        extract = self.util.config_val('FEATS', 'needs_feature_extraction', False)
+        no_reuse = eval(self.util.config_val('FEATS', 'no_reuse', 'False'))
+        if extract or no_reuse or not os.path.isfile(storage):
+            if not self.model_initialized:
+                self.init_model()
+            self.util.debug('predicting MOS, this might take a while...')
+            emb_series = pd.Series(index = self.data_df.index, dtype=object)
+            length = len(self.data_df.index)
+            for idx, (file, start, end) in enumerate(self.data_df.index.to_list()):
+                signal, sampling_rate = audiofile.read(file, offset=start.total_seconds(), duration=(end-start).total_seconds(), always_2d=True)
+                emb = self.get_embeddings(signal, sampling_rate)
+                emb_series[idx] = emb
+                if idx%10==0:
+                    self.util.debug(f'MOS: {idx} of {length} done')
+            self.df = pd.DataFrame(emb_series.values.tolist(), index=self.data_df.index)
+            self.df.columns = ['mos']
+            self.util.write_store(self.df, storage, store_format)
+            try:
+                glob_conf.config['DATA']['needs_feature_extraction'] = 'false'
+            except KeyError:
+                pass
+        else:
+            self.util.debug('reusing predicted MOS values')
+            self.df = self.util.get_store(storage, store_format)
+            if self.df.isnull().values.any():
+                nanrows = self.df.columns[self.df.isna().any()].tolist()
+                print(nanrows)
+                self.util.error(f'got nan: {self.df.shape} {self.df.isnull().sum().sum()}')
+    def get_embeddings(self, signal, sampling_rate):
+        tmp_audio_name = 'mos_audio_tmp.wav'
+        audiofile.write(tmp_audio_name, signal, sampling_rate)
+        WAVEFORM_SPEECH, SAMPLE_RATE_SPEECH = torchaudio.load(tmp_audio_name)
+        with torch.no_grad():
+            mos = self.subjective_model(WAVEFORM_SPEECH, self.WAVEFORM_NMR)
+        return float(mos[0].numpy())
+    def extract_sample(self, signal, sr):
+        self.init_model()
+        feats = self.get_embeddings(signal, sr)
+        return feats
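The `extract` method follows a compute-or-reuse pattern: predictions are recomputed when extraction is forced or no stored file exists, and loaded from disk otherwise. A minimal, self-contained sketch of that pattern using `pickle`; `load_or_compute` and its parameters are hypothetical, and the real method stores a DataFrame through `self.util.write_store` instead:

```python
import os
import pickle


def load_or_compute(storage, compute, force=False):
    """Return cached results from `storage`, computing and storing them if needed."""
    if force or not os.path.isfile(storage):
        result = compute()  # the expensive step, e.g. running the model
        with open(storage, "wb") as fh:
            pickle.dump(result, fh)
        return result
    # a stored file exists and reuse is allowed
    with open(storage, "rb") as fh:
        return pickle.load(fh)
```

Calling the function twice with the same path runs `compute` only once; passing `force=True` corresponds to the `no_reuse` setting in the diff.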

nkululeko-0.58.0/nkululeko/feats_pesq.py ADDED

@@ -0,0 +1,89 @@
+""" feats_pesq.py
+predict PESQ (Perceptual Evaluation of Speech Quality)
+adapted from
+from https://pytorch.org/audio/main/tutorials/squim_tutorial.html#sphx-glr-tutorials-squim-tutorial-py
+paper: https://arxiv.org/pdf/2304.01448.pdf
+needs
+pip uninstall -y torch torchvision torchaudio
+pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
+"""
+from nkululeko.util import Util
+from nkululeko.featureset import Featureset
+import os
+import pandas as pd
+import os
+import nkululeko.glob_conf as glob_conf
+import audiofile
+import torch
+import torchaudio
+from torchaudio.pipelines import SQUIM_OBJECTIVE
+class PESQSet(Featureset):
+    """Class to predict PESQ (Perceptual Evaluation of Speech Quality)
+    """
+    def __init__(self, name, data_df):
+        """Constructor. is_train is needed to distinguish from test/dev sets, because they use the codebook from the training"""
+        super().__init__(name, data_df)
+        self.device = self.util.config_val('MODEL', 'device', 'cpu')
+        self.model_initialized = False
+    def init_model(self):
+        # load model
+        self.util.debug('loading model...')
+        self.objective_model = SQUIM_OBJECTIVE.get_model()
+        self.model_initialized = True
+    def extract(self):
+        """Extract the features or load them from disk if present."""
+        store = self.util.get_path('store')
+        store_format = self.util.config_val('FEATS', 'store_format', 'pkl')
+        storage = f'{store}{self.name}.{store_format}'
+        extract = self.util.config_val('FEATS', 'needs_feature_extraction', False)
+        no_reuse = eval(self.util.config_val('FEATS', 'no_reuse', 'False'))
+        if extract or no_reuse or not os.path.isfile(storage):
+            if not self.model_initialized:
+                self.init_model()
+            self.util.debug('predicting PESQ, this might take a while...')
+            emb_series = pd.Series(index = self.data_df.index, dtype=object)
+            length = len(self.data_df.index)
+            for idx, (file, start, end) in enumerate(self.data_df.index.to_list()):
+                signal, sampling_rate = audiofile.read(file, offset=start.total_seconds(), duration=(end-start).total_seconds(), always_2d=True)
+                emb = self.get_embeddings(signal, sampling_rate)
+                emb_series[idx] = emb
+                if idx%10==0:
+                    self.util.debug(f'PESQ: {idx} of {length} done')
+            self.df = pd.DataFrame(emb_series.values.tolist(), index=self.data_df.index)
+            self.df.columns = ['pesq']
+            self.util.write_store(self.df, storage, store_format)
+            try:
+                glob_conf.config['DATA']['needs_feature_extraction'] = 'false'
+            except KeyError:
+                pass
+        else:
+            self.util.debug('reusing predicted PESQ values')
+            self.df = self.util.get_store(storage, store_format)
+            if self.df.isnull().values.any():
+                nanrows = self.df.columns[self.df.isna().any()].tolist()
+                print(nanrows)
+                self.util.error(f'got nan: {self.df.shape} {self.df.isnull().sum().sum()}')
+    def get_embeddings(self, signal, sampling_rate):
+        tmp_audio_name = 'pesq_audio_tmp.wav'
+        audiofile.write(tmp_audio_name, signal, sampling_rate)
+        WAVEFORM_SPEECH, SAMPLE_RATE_SPEECH = torchaudio.load(tmp_audio_name)
+        with torch.no_grad():
+            stoi_hyp, pesq_hyp, si_sdr_hyp = self.objective_model(WAVEFORM_SPEECH)
+        return float(pesq_hyp[0].numpy())
+    def extract_sample(self, signal, sr):
+        self.init_model()
+        feats = self.get_embeddings(signal, sr)
+        return feats
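Both feature sets read `no_reuse` by passing the configured string to `eval`, which would execute whatever expression the INI file contains. If the configuration is a standard `configparser` object (an assumption; the diff does not show how `config_val` is implemented), the boolean can be parsed without `eval`. The helper name `read_flag` is hypothetical:

```python
import configparser


def read_flag(config, section, option, default=False):
    """Parse an INI boolean ('true', 'False', 'yes', '0', ...) without using eval."""
    # fallback is returned when the section or the option is missing
    return config.getboolean(section, option, fallback=default)


config = configparser.ConfigParser()
config.read_string("[FEATS]\nno_reuse = True\n")

if __name__ == "__main__":
    print(read_flag(config, "FEATS", "no_reuse"))
    print(read_flag(config, "FEATS", "needs_feature_extraction"))
```

`getboolean` raises `ValueError` for strings that are not recognised booleans, so a malformed setting fails loudly instead of being evaluated as code.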

{nkululeko-0.57.0 → nkululeko-0.58.0}/nkululeko/feature_extractor.py RENAMED

@@ -61,6 +61,12 @@ class FeatureExtractor:
             elif feats_type=='snr':
                 from nkululeko.feats_snr import SNRSet
                 self.featExtractor = SNRSet(f'{store_name}_{self.feats_designation}', self.data_df)
+            elif feats_type=='mos':
+                from nkululeko.feats_mos import MOSSet
+                self.featExtractor = MOSSet(f'{store_name}_{self.feats_designation}', self.data_df)
+            elif feats_type=='pesq':
+                from nkululeko.feats_pesq import PESQSet
+                self.featExtractor = PESQSet(f'{store_name}_{self.feats_designation}', self.data_df)
             elif feats_type=='clap':
                 from nkululeko.feats_clap import Clap
                 self.featExtractor = Clap(f'{store_name}_{self.feats_designation}', self.data_df)

{nkululeko-0.57.0 → nkululeko-0.58.0/nkululeko.egg-info}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: nkululeko
-Version: 0.57.0
+Version: 0.58.0
 Summary: Machine learning audio prediction experiments based on templates
 Home-page: https://github.com/felixbur/nkululeko
 Author: Felix Burkhardt
@@ -17,6 +17,7 @@ License-File: LICENSE
 # Nkululeko
 * [Overview](#overview)
 * [Installation](#installation)
+* [Documentation](https://nkululeko.readthedocs.io)
 * [Usage](#usage)
 * [Hello World](#hello-world-example)
 * [Licence](#licence)
@@ -66,7 +67,11 @@ Sometimes you only want to take a look at your data:
 <img src="meta/images/data_plot.png" width="500px"/>
-## Installatione
+## Documentation
+The documentation, along with extensions of installation, usage, INI file format, and examples, can be found [nkululeko.readthedocs.io](https://nkululeko.readthedocs.io).
+## Installation
 Create and activate a virtual Python environment and simply run
 ```
@@ -106,10 +111,11 @@ Read the [Hello World example](#hello-world-example) for initial usage with Emo-
 Here is an overview of the interfaces:
 * **nkululeko.nkululeko**: doing experiments
-* **nkululeko.demo**: demo the current best model on command line
+* **nkululeko.demo**: demo the current best model on the command line
 * **nkululeko.test**: predict a series of files with the current best model
 * **nkululeko.explore**: perform data exploration
 * **nkululeko.augment**: augment the current training data
+* **nkululeko.predict**: predict a series of files with a given model
 Alternatively, there is a central "experiment" class that can be used by own experiments
@@ -217,6 +223,12 @@ Nkululeko can be used under the [MIT license](https://choosealicense.com/license
 Changelog
 =========
+Version 0.58.0
+--------------
+* added dominance predict
+* added MOS predict
+* added PESQ predict
 Version 0.57.0
 --------------
 * renamed autopredict predict

{nkululeko-0.57.0 → nkululeko-0.58.0}/nkululeko.egg-info/SOURCES.txt RENAMED

@@ -7,7 +7,10 @@ setup.py
 nkululeko/__init__.py
 nkululeko/ap_age.py
 nkululeko/ap_arousal.py
+nkululeko/ap_dominance.py
 nkululeko/ap_gender.py
+nkululeko/ap_mos.py
+nkululeko/ap_pesq.py
 nkululeko/ap_snr.py
 nkululeko/ap_valence.py
 nkululeko/augment.py
@@ -31,8 +34,10 @@ nkululeko/feats_audmodel_dim.py
 nkululeko/feats_clap.py
 nkululeko/feats_import.py
 nkululeko/feats_mld.py
+nkululeko/feats_mos.py
 nkululeko/feats_opensmile.py
 nkululeko/feats_oxbow.py
+nkululeko/feats_pesq.py
 nkululeko/feats_praat.py
 nkululeko/feats_snr.py
 nkululeko/feats_trill.py