torch-l1-snr 0.0.2__py3-none-any.whl → 0.0.4__py3-none-any.whl
- {torch_l1_snr-0.0.2.dist-info → torch_l1_snr-0.0.4.dist-info}/METADATA +35 -24
- torch_l1_snr-0.0.4.dist-info/RECORD +7 -0
- torch_l1_snr-0.0.2.dist-info/RECORD +0 -7
- {torch_l1_snr-0.0.2.dist-info → torch_l1_snr-0.0.4.dist-info}/WHEEL +0 -0
- {torch_l1_snr-0.0.2.dist-info → torch_l1_snr-0.0.4.dist-info}/licenses/LICENSE +0 -0
- {torch_l1_snr-0.0.2.dist-info → torch_l1_snr-0.0.4.dist-info}/top_level.txt +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: torch-l1-snr
-Version: 0.0.2
+Version: 0.0.4
 Summary: L1-SNR loss functions for audio source separation in PyTorch
 Home-page: https://github.com/crlandsc/torch-l1-snr
 Author: Christopher Landschoot
@@ -32,28 +32,29 @@ Dynamic: license-file

 [](https://github.com/crlandsc/torch-l1snr/blob/main/LICENSE) [](https://github.com/crlandsc/torch-l1snr/stargazers)

-# torch-l1-snr
-
 A PyTorch implementation of L1-based Signal-to-Noise Ratio (SNR) loss functions for audio source separation. This package provides implementations and novel extensions based on concepts from recent academic papers, offering flexible and robust loss functions that can be easily integrated into any PyTorch-based audio separation pipeline.

-The core `L1SNRLoss` is based on the loss function described in [1], while `L1SNRDBLoss` and `STFTL1SNRDBLoss` are extensions of the adaptive level-matching regularization technique proposed in [2].
+The core `L1SNRLoss` is based on the loss function described in [[1]](https://arxiv.org/abs/2309.02539), while `L1SNRDBLoss` and `STFTL1SNRDBLoss` are extensions of the adaptive level-matching regularization technique proposed in [[2]](https://arxiv.org/abs/2501.16171).

 ## Features

-- **Time-Domain L1SNR Loss**: A basic, time-domain L1-SNR loss, based on [1].
-- **Regularized Time-Domain L1SNRDBLoss**: An extension of the L1SNR loss with adaptive level-matching regularization from [2], plus an optional L1 loss component.
-- **Multi-Resolution STFT L1SNRDBLoss**: A spectrogram-domain version of the loss from [2], calculated over multiple STFT resolutions.
+- **Time-Domain L1SNR Loss**: A basic, time-domain L1-SNR loss, based on [[1]](https://arxiv.org/abs/2309.02539).
+- **Regularized Time-Domain L1SNRDBLoss**: An extension of the L1SNR loss with adaptive level-matching regularization from [[2]](https://arxiv.org/abs/2501.16171), plus an optional L1 loss component.
+- **Multi-Resolution STFT L1SNRDBLoss**: A spectrogram-domain version of the loss from [[2]](https://arxiv.org/abs/2501.16171), calculated over multiple STFT resolutions.
 - **Modular Stem-based Loss**: A wrapper that combines time and spectrogram domain losses and can be configured to run on specific stems.
 - **Efficient & Robust**: Includes optimizations for pure L1 loss calculation and robust handling of `NaN`/`inf` values and short audio segments.

 ## Installation

-
-
-
-
+[](https://pypi.org/project/torch-l1-snr/) [](https://pypi.org/project/torch-l1-snr/) [](https://pypi.org/project/torch-l1-snr/)
+
+## Install from PyPI
+
+```bash
+pip install torch-l1-snr
+```

-
+## Install from GitHub

 ```bash
 pip install git+https://github.com/crlandsc/torch-l1snr.git
@@ -90,9 +91,13 @@ from torch_l1snr import L1SNRDBLoss
 estimates = torch.randn(4, 32000) # Batch of 4, 32000 samples
 actuals = torch.randn(4, 32000)

-# Initialize the loss function
-# l1_weight=0.1 blends L1SNR with 10% L1 loss
-loss_fn = L1SNRDBLoss(
+# Initialize the loss function with regularization enabled
+# l1_weight=0.1 blends L1SNR+Regularization with 10% L1 loss
+loss_fn = L1SNRDBLoss(
+    name="l1_snr_db_loss",
+    use_regularization=True, # Enable adaptive level-matching regularization
+    l1_weight=0.1 # 10% L1 loss, 90% L1SNR + regularization
+)

 # Calculate loss
 loss = loss_fn(estimates, actuals)
@@ -112,8 +117,11 @@ estimates = torch.randn(4, 32000)
 actuals = torch.randn(4, 32000)

 # Initialize the loss function
-# Uses multiple STFT resolutions by default
-loss_fn = STFTL1SNRDBLoss(
+# Uses multiple STFT resolutions by default: [512, 1024, 2048] FFT sizes
+loss_fn = STFTL1SNRDBLoss(
+    name="stft_l1_snr_db_loss",
+    l1_weight=0.0 # Pure L1SNR (no regularization, no L1)
+)

 # Calculate loss
 loss = loss_fn(estimates, actuals)
@@ -137,9 +145,12 @@ actuals = torch.randn(2, 2, 44100)

 # --- Configuration ---
 loss_fn = MultiL1SNRDBLoss(
-
-
-
+    name="multi_l1_snr_db_loss",
+    weight=1.0, # Overall weight for this loss
+    spec_weight=0.6, # 60% spectrogram loss, 40% time-domain loss
+    l1_weight=0.1, # Use 10% L1, 90% L1SNR+Reg in both domains
+    use_time_regularization=True, # Enable regularization in time domain
+    use_spec_regularization=False # Disable regularization in spec domain
 )
 loss = loss_fn(estimates, actuals)
 print(f"Multi-domain Loss: {loss.item()}")
@@ -155,7 +166,7 @@ The goal of these loss functions is to provide a perceptually-informed and robus

 #### Level-Matching Regularization

-A key feature of `L1SNRDBLoss` is the adaptive regularization term, as described in [2]. This component calculates the difference in decibel-scaled root-mean-square (dBRMS) levels between the estimated and actual signals. An adaptive weight (`lambda`) is applied to this difference, which increases when the model incorrectly silences a non-silent target. This encourages the model to learn the correct output level and specifically avoids the model collapsing to a trivial silent solution when uncertain.
+A key feature of `L1SNRDBLoss` is the adaptive regularization term, as described in [[2]](https://arxiv.org/abs/2501.16171). This component calculates the difference in decibel-scaled root-mean-square (dBRMS) levels between the estimated and actual signals. An adaptive weight (`lambda`) is applied to this difference, which increases when the model incorrectly silences a non-silent target. This encourages the model to learn the correct output level and specifically avoids the model collapsing to a trivial silent solution when uncertain.

 #### Multi-Resolution Spectrogram Analysis

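The "Level-Matching Regularization" paragraph in the METADATA hunk above describes a dBRMS level difference scaled by an adaptive weight. A minimal plain-Python sketch of that idea follows. It is not the package's implementation: the function names `dbrms` and `level_matching_penalty`, the parameters `lam_min`, `lam_max`, and `floor_db`, and the linear rule used for the adaptive weight are all assumptions made for illustration.

```python
import math


def dbrms(signal, eps=1e-8):
    """Return the RMS level of a sequence of samples in decibels."""
    mean_square = sum(x * x for x in signal) / len(signal)
    # 10*log10(mean square) equals 20*log10(RMS); eps keeps silence finite.
    return 10.0 * math.log10(mean_square + eps)


def level_matching_penalty(estimate, target, lam_min=0.1, lam_max=1.0, floor_db=-80.0):
    """Hypothetical adaptive level-matching penalty (illustrative only)."""
    est_db = dbrms(estimate)
    tgt_db = dbrms(target)
    # How far the estimate falls below the target level, in dB.
    shortfall = max(tgt_db - est_db, 0.0)
    # Headroom of the target above the silence floor; zero for a silent target.
    headroom = max(tgt_db - floor_db, 0.0)
    # Assumed rule: the weight grows linearly as the estimate approaches
    # silence while the target is audible, capped at lam_max.
    ratio = min(shortfall / headroom, 1.0) if headroom > 0.0 else 0.0
    lam = lam_min + (lam_max - lam_min) * ratio
    return lam * abs(est_db - tgt_db)


target = [0.5, -0.5, 0.5, -0.5]
print(level_matching_penalty(target, target))             # matched levels
print(level_matching_penalty([0.25, -0.25] * 2, target))  # quieter estimate
print(level_matching_penalty([0.0] * 4, target))          # silenced estimate
```

Under these assumptions, a silenced estimate of an audible target receives the largest weight, which mirrors the stated goal of discouraging collapse to a silent output.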
@@ -194,8 +205,8 @@ The loss functions implemented here are based on the work of the authors of the

 ## References

-[1] K. N. Watcharasupat, C.-W. Wu, Y. Ding, I. Orife, A. J. Hipple, P. A. Williams, S. Kramer, A. Lerch, and W. Wolcott, "A Generalized Bandsplit Neural Network for Cinematic Audio Source Separation," IEEE Open Journal of Signal Processing, 2023.
+[1] K. N. Watcharasupat, C.-W. Wu, Y. Ding, I. Orife, A. J. Hipple, P. A. Williams, S. Kramer, A. Lerch, and W. Wolcott, "A Generalized Bandsplit Neural Network for Cinematic Audio Source Separation," IEEE Open Journal of Signal Processing, 2023. [arXiv:2309.02539](https://arxiv.org/abs/2309.02539)

-[2] K. N. Watcharasupat and A. Lerch, "Separate This, and All of these Things Around It: Music Source Separation via Hyperellipsoidal Queries," arXiv:2501.16171.
+[2] K. N. Watcharasupat and A. Lerch, "Separate This, and All of these Things Around It: Music Source Separation via Hyperellipsoidal Queries," [arXiv:2501.16171](https://arxiv.org/abs/2501.16171).

-[3] K. N. Watcharasupat and A. Lerch, "A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems," Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024.
+[3] K. N. Watcharasupat and A. Lerch, "A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems," Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024. [arXiv:2406.18747](https://arxiv.org/abs/2406.18747)
@@ -0,0 +1,7 @@
+torch_l1_snr-0.0.4.dist-info/licenses/LICENSE,sha256=JdS2Pv6DDs3jvXHACGdcHYdiFMe9EO1XGeHkEHLTr8Y,1079
+torch_l1snr/__init__.py,sha256=pR9jg3fjTKt_suZoVDC67tqB7EWRkbfaXaPP7pYQrlQ,220
+torch_l1snr/l1snr.py,sha256=aqmtNfT_8A0IRI9jiVGwNse3igBvelQGKnjfe23Xh7w,35304
+torch_l1_snr-0.0.4.dist-info/METADATA,sha256=pB7DvZ6BdvCshcDqOTkJNqekh97qXNaPc7tnNzBqJVk,11143
+torch_l1_snr-0.0.4.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+torch_l1_snr-0.0.4.dist-info/top_level.txt,sha256=NfaRND6pcjZ7-035d4XAg8xJuz31EEU210Y9xWeFOxc,12
+torch_l1_snr-0.0.4.dist-info/RECORD,,
@@ -1,7 +0,0 @@
-torch_l1_snr-0.0.2.dist-info/licenses/LICENSE,sha256=JdS2Pv6DDs3jvXHACGdcHYdiFMe9EO1XGeHkEHLTr8Y,1079
-torch_l1snr/__init__.py,sha256=pR9jg3fjTKt_suZoVDC67tqB7EWRkbfaXaPP7pYQrlQ,220
-torch_l1snr/l1snr.py,sha256=aqmtNfT_8A0IRI9jiVGwNse3igBvelQGKnjfe23Xh7w,35304
-torch_l1_snr-0.0.2.dist-info/METADATA,sha256=C8sH_v1T2LYRCwCiLMMJkrnh4l-chYzjZYbD2xyI8gg,10370
-torch_l1_snr-0.0.2.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
-torch_l1_snr-0.0.2.dist-info/top_level.txt,sha256=NfaRND6pcjZ7-035d4XAg8xJuz31EEU210Y9xWeFOxc,12
-torch_l1_snr-0.0.2.dist-info/RECORD,,
File without changes
File without changes
File without changes